Every drug, gene, trial, and paper on this site is pulled live from public biomedical databases. We don't have a private dataset, a paywall, or a curated list of "our" drugs. We just read the same sources a researcher would and try to make them legible to someone who isn't one.
This page walks through every source we query, how the answers get merged when they disagree, how fresh the data is, and, just as importantly, what we don't have.
Different sources give different answers to the same question. Open Targets is conservative; DGIdb is broad; ClinicalTrials.gov is messy but fresh. When sources disagree, we union them and prefer the richer source for display fields.
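As a minimal sketch of that union-and-prefer-richer policy (the source priority order, record fields, and salt-suffix list here are illustrative assumptions, not our actual schema):

```python
# Illustrative sketch of the union-and-prefer-richer merge policy.
# Source names, fields, and the salt list are hypothetical.

# Sources listed from richest display metadata to sparsest.
SOURCE_PRIORITY = ["open_targets", "dgidb", "clinicaltrials", "openfda"]

def normalize(name: str) -> str:
    """Case-fold and strip common salt-form suffixes so the same
    drug reported by two sources collapses to one key."""
    name = name.casefold().strip()
    for salt in (" hydrochloride", " sulfate", " sodium", " mesylate"):
        name = name.removesuffix(salt)
    return name

def merge(records: list[dict]) -> list[dict]:
    """Union records from all sources; when two sources report the
    same drug, keep the display fields from the richer source."""
    merged: dict[str, dict] = {}
    for rec in records:
        key = normalize(rec["name"])
        prev = merged.get(key)
        if prev is None:
            merged[key] = rec
        elif SOURCE_PRIORITY.index(rec["source"]) < SOURCE_PRIORITY.index(prev["source"]):
            merged[key] = rec  # richer source wins the display fields
    return list(merged.values())

drugs = merge([
    {"name": "Imatinib Mesylate", "source": "dgidb"},
    {"name": "imatinib", "source": "open_targets"},
])
print(drugs)  # one merged record; the Open Targets version is kept
```

The union keeps every drug either source knows about; the priority order only decides whose spelling and metadata you see.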
Four places do this merge:
- Drugs already used for a disease. Unions Open Targets + ClinicalTrials.gov + openFDA. Approved = OT phase ≥ 4 or openFDA-label match. In-trials = anything else with a phase, plus any drug that appears in a CT.gov trial for the condition.
- Drugs that act on a gene. Unions Open Targets + DGIdb. Names normalized (salt forms stripped, case-folded), then deduplicated. The display name comes from the cleanest source we have — usually OT.
- Standard treatments for a disease. The 3-card "what doctors typically prescribe" strip on each disease page is the top 3 of the union list, filtered to phase ≥ 4 (FDA-approved). When the disease is a parent classification with no drugs labeled for it specifically (e.g. "spinal cord cancer" — drugs are approved for the sub-types), the section shows an explainer instead of going silent.
- Repurposing candidates for a disease. Pulls Open Targets' top 40 most-associated genes for the disease. For each gene, queries the multi-source drug list (OT + DGIdb) — not just OT — so well-trodden disease genes still surface candidates. Excludes drugs already FDA-approved for this disease (drugs tried at phase 1–3 but never approved are still valid candidates and stay in, tagged "has appeared in trials"). Enriches survivors with PubMed literature counts using [Title/Abstract] field tags, CT.gov trial-status checks, mechanism action type (so a researcher can sanity-check direction-of-effect against disease pathology), and per-drug safety flags (black-box warning, market withdrawal). Ranks by a composite score: (assoc_non_chembl × 0.5 + lit × 0.2 + novelty × 0.3) × safety_modifier, where assoc_non_chembl is the gene–disease association score with the ChEMBL evidence source removed (that source counts existing drug–trial co-occurrence as evidence, which is circular for repurposing), and safety_modifier is 0.55 for withdrawn drugs and 0.85 for drugs with a boxed warning.
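The composite score for repurposing candidates, as a runnable sketch (the weights and safety modifiers are the ones stated above; the keyword arguments and the assumption that each component is scaled 0–1 are ours):

```python
# Sketch of the composite ranking score for repurposing candidates.
# Weights (0.5 / 0.2 / 0.3) and safety modifiers (0.55 / 0.85) are
# from the text; the 0-1 scaling of each component is an assumption.

def composite_score(assoc_non_chembl: float,
                    lit: float,
                    novelty: float,
                    withdrawn: bool = False,
                    boxed_warning: bool = False) -> float:
    """assoc_non_chembl: gene-disease association score with the
    ChEMBL evidence source removed (it is circular for repurposing).
    lit / novelty: literature and novelty components, scaled 0-1."""
    safety_modifier = 1.0
    if withdrawn:
        safety_modifier = 0.55    # market withdrawal dominates
    elif boxed_warning:
        safety_modifier = 0.85    # FDA boxed warning
    base = assoc_non_chembl * 0.5 + lit * 0.2 + novelty * 0.3
    return base * safety_modifier

# A withdrawn drug keeps only 55% of its base score:
print(composite_score(1.0, 1.0, 1.0, withdrawn=True))
```

Note that the safety modifier scales the score rather than excluding the drug: a withdrawn drug with a very strong gene bridge can still appear, just further down the list.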
AI is used in two places, both for translation only, never to fetch new facts.
- Plain-English summaries at the top of every disease, drug, and target page. The model takes the upstream description text from Open Targets and rewrites it in 1–2 sentences for someone with no biology background.
- Per-candidate rationale — the 2–3 sentence "why this might work" explanation on each repurposing candidate. Same model, prompted with structured inputs only: drug name, disease name, bridging gene, mechanism of action, association score, and the evidence types that underlie that score.
The AI doesn't fetch its own data and doesn't make claims beyond what the inputs say.
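As an illustration of what "structured inputs only" means, a candidate payload might look like this (field names and values are made up for the example; this is not our actual prompt):

```python
# Hypothetical shape of the structured inputs handed to the model for
# a per-candidate rationale. The model sees only facts fetched upstream;
# it is never asked to retrieve or recall anything on its own.
import json

candidate = {
    "drug": "aspirin",                      # illustrative values, not a
    "disease": "colorectal cancer",         # real candidate from the site
    "bridging_gene": "PTGS2",
    "mechanism_of_action": "cyclooxygenase inhibitor",
    "association_score": 0.42,              # made-up score
    "evidence_types": ["genetic_association", "literature"],
}

prompt = (
    "Explain in 2-3 plain-English sentences why this drug might be worth "
    "investigating for this disease. Use ONLY the facts below; do not add "
    "claims beyond them.\n\n" + json.dumps(candidate, indent=2)
)
print(prompt)
```

Because the prompt is assembled from fields we already fetched and verified, a wrong rationale can only misstate an input, not smuggle in a new "fact".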
Being clear about what's outside the perimeter:
- No proprietary databases. No DrugBank Plus, Cortellis, MedAdherence, or anything behind a license. Public data only.
- No pricing or availability. Drug pricing varies by country, insurance, and formulation in ways that aren't comparable. We won't guess.
- No EHR or claims data. Real-world evidence from anonymized patient records is powerful, but it's not in scope here.
- openFDA is US-only. Drugs approved in the EU, UK, or Japan but not the US can be missing from approval-status surfaces. We catch some of those via ChEMBL through Open Targets, but not all.
- No automatic direction-of-effect check. We surface each drug's mechanism action type (inhibitor, agonist, antagonist) on candidate cards so a researcher can sanity-check whether the direction matches the disease pathology. We don't automatically validate this — disease-direction annotations (gain-of-function vs loss-of-function) aren't reliably exposed by our sources at the API level.
- No drug-class deduplication. If five drugs of the same mechanism class all bridge to the same gene, all five appear as separate candidates rather than collapsing to a single representative. ATC-class collapsing is a future improvement.
- Curated lists have gaps. Open Targets' "known drugs" aggregation is high-precision but incomplete — that's why we layered in CT.gov, openFDA, and DGIdb. Our answers are still bounded by what these sources know.
- Trial-status is "ever-mentioned", not "ever-succeeded". The "has appeared in trials" chip fires when ANY ClinicalTrials.gov record mentions the drug + disease pair — including control arms, terminated trials, and observational studies. It does not mean the drug worked. The trust-tag tooltip text says so explicitly.
Two things you can give us, both optional:
- An email address, if you sign up.
- Saves and notify-me subscriptions.
We don't collect analytics on what you search, save, or read. We don't sell anything to anyone. We don't have ads.
Found something wrong, or a source we should add? Email hello@repurposex.org. We're especially interested in mismatches between what RepurposeX shows and what a domain expert would expect.