The Forensic Decay Cascade -- FADE Intelligence

How scientific evidence decays across ten stages before it reaches Wall Street — and why the market prices the press release while FADE reads the amendment.

Retrospective Case Study: Aducanumab (Biogen/Eisai)

The Question FADE Answers

Aducanumab (Aduhelm) was withdrawn from the US market in January 2024 after a $3B+ commercial failure. The FDA advisory committee voted 8–0–1 against approval in 2020. Biogen pursued approval anyway, and the FDA granted it in June 2021. CMS refused to cover the drug outside clinical trials. The program was structurally compromised years before the market knew it. FADE would have flagged three signals before the 2021 approval, all from public data.

Timeline

FADE Signal

Public Data Source

2018 — Pre-trial

Stage 1 flag: The core biological premise — amyloid-beta plaque clearance causes cognitive improvement — traces to a foundational 2006 Nature paper (Lesne et al.) that had accumulated PubPeer (Science.org) concerns about image duplication prior to the formal 2022 Science investigation.

The program's entire Phase 3 rationale was built on a mechanism whose foundational experimental evidence was under documented dispute in the public peer review record — before a single Phase 3 patient was enrolled.

Stage 1 Signal

Citation integrity flag: foundational paper (Lesne et al. 2006) has post-publication peer review concerns predating Phase 3 enrollment.

FADE detection: CrossRef + PubPeer (Science.org) cross-reference on the primary cited mechanism paper. Flag: PubPeer (Science.org) entries present on figures representing the core biological claim. Output: "Program premise depends on a paper with post-publication integrity concerns. Independent replication status: absent."

Data Source

PubPeer (Science.org) comment API (public). CrossRef retraction watch (public). PubMed citation graph (public).

No VDR required. No proprietary access. Computable from the public record at any point after the PubPeer (Science.org) entries were posted.

March 2019

Stage 5 flag — Trial halted for futility. Both Phase 3 trials (EMERGE, NCT02477800; ENGAGE, NCT02484547) were halted in March 2019 after an interim futility analysis. Standard outcome: program fails, compound shelved.

October 2019: Biogen announced it would seek FDA approval after a "post-hoc reanalysis" of a higher-dose EMERGE subgroup showed positive results. The halt, the futility determination, and the subsequent reanalysis are all in the ClinicalTrials.gov amendment history with timestamps.

Stage 5 Signal

Amendment trail flag: primary efficacy analysis plan changed post-interim analysis. Trial halted for futility March 2019 → analysis plan amended October 2019 → positive finding from post-hoc subgroup presented to FDA 2020.

The original pre-specified primary endpoint analysis showed futility. The endpoint that enabled FDA submission was a post-hoc reanalysis of a subgroup at a dose that was not the pre-specified primary analysis population. The ClinicalTrials.gov version history records the delta.

Data Source

ClinicalTrials.gov v2 API amendment history for NCT02477800 (EMERGE) and NCT02484547 (ENGAGE). Timestamps on protocol version changes are public record.

FADE detection: diff primaryOutcomes and analysis populations between pre-enrollment version and post-halt amendment. Flag: analysis population changed after futility determination.

2020 — FDA submission

Stage 3 flag: ENGAGE buried. When Biogen submitted EMERGE data to the FDA and presented at the advisory committee meeting, the ENGAGE trial, which failed even on the post-hoc high-dose subgroup analysis, received substantially less prominent presentation. Both trials ran concurrently under the same design and patient population. One failed; one showed a signal in a subgroup. The failure was registered and its results were submitted, but the public framing of the submission emphasized only EMERGE.

Stage 3 Signal

Asymmetric publication pattern: ENGAGE failure results were submitted to ClinicalTrials.gov but de-emphasized in primary investor communications and in the FDA advisory committee briefing materials.

Both trials were registered. The results are in the ClinicalTrials.gov database. The discrepancy between registered outcome (ENGAGE: futility confirmed) and the public narrative (one trial showed a signal) is detectable from the public record.

Data Source

ClinicalTrials.gov results database for NCT02484547 (ENGAGE). Public FDA advisory committee briefing documents (FDA.gov). Cross-reference: ENGAGE registered outcome vs. investor communication emphasis.

All public. No VDR. No proprietary access.

The FADE Output — What an Investor Would Have Received in 2020

"Aducanumab (NCT02477800/NCT02484547): Three signals active. (1) Stage 1: foundational mechanism paper has post-publication integrity concerns on PubPeer (Science.org). Independent replication: absent. (2) Stage 5: primary efficacy analysis changed after futility determination. Current FDA submission is based on a post-hoc subgroup analysis not pre-specified as primary. (3) Stage 3: companion trial ENGAGE shows futility on the same post-hoc analysis; ENGAGE outcome underweighted in public narrative. Programs with this three-signal profile succeed in [calibration pending] % of historical comparables. Document A says X. Document B says Y. The decision is yours."

The timeline: These signals were computable from public data in 2019–2020 — before the November 2020 advisory committee vote, before the June 2021 FDA approval, before the January 2024 market withdrawal. The FDA advisory committee reached the same conclusion (8–0–1 against) reading the same public record. FADE would have surfaced the same signal algorithmically, 18+ months earlier.

Honest Disclosure on This Case Study

The FADE signal computations above are derived from documented public facts: published PubPeer (Science.org) entries, ClinicalTrials.gov registered outcomes, and the public FDA advisory committee record. The specific NCT IDs (NCT02477800, NCT02484547) are correct and publicly verifiable. The calibrated conditional failure probability ("programs with this three-signal profile succeed in X%") cannot be stated because the Historical Cohort Builder has not yet been run; that is the next build. This case study demonstrates signal detection, not calibrated scoring. Calibrated scoring requires Item 2 from the FADE build spec.

Stage 1

The NIH Grant — Selection Bias at the Source

The money filter that selects FOR entropy bias before a single experiment runs

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

NIH study sections score applications on feasibility, significance, and innovation — but human reviewers systematically favor hypotheses that align with existing high-citation literature. A grant proposing to challenge a crumbling foundational paper scores lower than one that builds on it.

Famous labs receive citation halos: the PI's prior work is cited as preliminary data, the same 5–10 papers justify 80% of grants in a disease area, and the foundational assumptions underneath those papers are never independently re-tested before the next grant cycle begins.⁸

Discrepancy Signal

The foundational paper was never re-tested. The grant is built on a citation that cannot be reproduced.

The program was designed on quicksand that nobody in the study section re-examined. This is not fraud at this stage — it is structural citation inertia: the system rewards building on consensus, not testing it.

Deep Dive — Documented Precedents

The Alzheimer's Amyloid Foundation (2022)

A foundational 2006 Nature paper — Lesne et al., cited over 2,300 times — provided primary visual evidence for a specific amyloid subtype as causal. The paper underpinned hundreds of millions in NIH grant funding and private investment in amyloid-targeting therapies over 16 years. In 2022, Science published a formal investigation finding evidence of image manipulation in key figures. The foundational citation that activated an entire funding cycle was disputed after the money was spent and the clinical trials failed.

The Reproducibility Project: Cancer Biology (2023)

The Reproducibility Project tested 193 experimental effects from 53 high-impact cancer biology papers. Only 51% of effects reproduced. For papers used as NIH grant justification, this means study sections were scoring feasibility on results that could not be independently confirmed.¹

FADE Detection

Cross-reference the program's foundational citations against three public registries:

1. Retraction Watch / CrossRef API (free): Flag any cited paper with a Retraction, Expression of Concern, or Correction issued after the program's IND submission date.

2. PubPeer (Science.org) comment feed: Flag papers with post-publication concerns on figures representing the program's core biological premise.

3. ClinicalTrials.gov translation rate: Query the historical Phase 2 to Phase 3 success rate for the target class. Below 15% = flag Structural Translation Risk.⁷

Output: "Grant premise paper [X] has a PubPeer (Science.org) flag on Figure 3 (posted [date]). Historical Phase 2–Phase 3 success rate in this target class: 8%."
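The three checks above can be sketched in a few lines once the registry lookups have been fetched into plain records. This is a minimal illustration under stated assumptions, not FADE's implementation: the `Notice` record, the function names, the DOIs, and the dates are all hypothetical, and the fetching of Retraction Watch / CrossRef and PubPeer data is left out entirely.

```python
from dataclasses import dataclass
from datetime import date

# Notice kinds treated as formal editorial actions (check 1).
FORMAL_KINDS = {"retraction", "expression_of_concern", "correction"}

@dataclass
class Notice:
    doi: str      # DOI of the cited foundational paper (hypothetical values below)
    kind: str     # one of FORMAL_KINDS, or "pubpeer"
    posted: date  # date the notice or comment was posted

def flag_citations(notices, ind_date):
    """Check 1: formal notices issued after the IND date. Check 2: any PubPeer entry."""
    flagged = []
    for n in notices:
        if n.kind in FORMAL_KINDS and n.posted > ind_date:
            flagged.append(n)
        elif n.kind == "pubpeer":
            flagged.append(n)
    return flagged

def structural_translation_risk(phase2_to_phase3_rate, threshold=0.15):
    """Check 3: flag target classes whose historical success rate is below 15%."""
    return phase2_to_phase3_rate < threshold

# Hypothetical inputs for illustration only.
notices = [
    Notice("10.0000/example.1", "pubpeer", date(2017, 5, 1)),
    Notice("10.0000/example.2", "correction", date(2015, 6, 1)),
    Notice("10.0000/example.3", "expression_of_concern", date(2020, 2, 1)),
]
flagged = flag_citations(notices, ind_date=date(2018, 1, 1))
risk = structural_translation_risk(0.08)
```

With these inputs the correction issued before the IND date is ignored, while the PubPeer entry and the later Expression of Concern are both flagged; the 8% rate from the sample output trips the translation-risk flag.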

Stage 2

The Lab Bench — Where the Data Gets Shaped

Graduate students need to graduate. PIs need to publish. The drug needs to work.

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Three failure modes compound at the bench. None require intent to defraud:

Analytical fatigue: A graduate student running hundreds of western blots develops unconscious criteria for which images are "representative." Clean-looking blots get saved. Blots showing inconvenient bands get rerun — or filed away.²

Cell line contamination: Over 500 unique cell lines in published literature are misidentified or cross-contaminated. The drug being tested does not target the disease biology the researcher believes it targets.⁶

The p-hacking window: With flexible stopping rules and multiple outcome measurement, a researcher running 20 independent experiments should expect one false positive at p<0.05 by chance alone. The one positive gets submitted.³
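The arithmetic behind the p-hacking window is short enough to check directly. The sketch below assumes the 20 experiments are independent and that no real effect exists in any of them; the variable names are illustrative.

```python
alpha = 0.05        # significance threshold per experiment
n_experiments = 20  # independent experiments, all with no true effect

# Expected number of false positives across the batch.
expected_false_positives = alpha * n_experiments

# Probability that at least one experiment crosses p < 0.05 by chance.
p_at_least_one = 1 - (1 - alpha) ** n_experiments
```

Under these assumptions the expected count is one false positive, and the chance of seeing at least one is roughly 64%, so a single "hit" in a batch of 20 carries little evidential weight on its own.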

Discrepancy Signal

The SD-to-SEM switch: the same data, presented to look three times more compelling.

Standard Deviation (SD) reports the spread of individual data points — the honest picture of variability. Standard Error of the Mean (SEM) is mathematically smaller by a factor of the square root of N. Switching from SD to SEM with N=9 makes error bars three times smaller on the same underlying data.

Vaux, Fidler, and Cumming (2012) and Halsey et al. (Nature Methods, 2015) both document that SEM is routinely used where SD would be the honest standard — and SEM visually shrinks error bars without changing the underlying data.² The switch is undetectable from the published figure alone. It is detectable by back-calculation from mean, error value, and N.
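A minimal sketch of the back-calculation follows. It assumes an independent estimate of the spread is available (for example from supplementary raw data); `reference_sd`, the tolerance, and the function names are assumptions introduced for illustration, not part of any published method.

```python
import math

def sem_from_sd(sd, n):
    """Standard error of the mean for n observations with standard deviation sd."""
    return sd / math.sqrt(n)

def classify_error(reported_error, n, reference_sd, rel_tol=0.15):
    """Guess whether a reported error value behaves like SD or SEM.

    reference_sd is an independent estimate of the spread; rel_tol is an
    arbitrary tolerance chosen for this sketch.
    """
    if math.isclose(reported_error, reference_sd, rel_tol=rel_tol):
        return "SD"
    if math.isclose(reported_error, sem_from_sd(reference_sd, n), rel_tol=rel_tol):
        return "SEM"
    return "unclear"

# With N = 9, an SD of 6.0 shrinks to an SEM of 2.0 (a factor of sqrt(9) = 3).
sem = sem_from_sd(6.0, 9)
label = classify_error(2.0, 9, reference_sd=6.0)
```

When no independent spread estimate exists, the result should stay "unclear"; the point of the check is to surface candidates for human review, not to adjudicate them.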

Deep Dive — Documented Precedents

Begley and Ellis (2012) — The 6 of 53 Finding

Amgen's Glenn Begley attempted to reproduce 53 "landmark" preclinical cancer biology studies before committing capital to drug development programs. Only 6 of 53 (11%) reproduced. Primary failure modes: results published only when experiments "worked"; cell lines not validated; statistical analysis inconsistencies.¹

HeLa Contamination Chain

HeLa cells are the most frequently contaminated line in research history. Studies claimed drug efficacy against "prostate cancer," "kidney cancer," and "melanoma" cell lines that were, in fact, HeLa. Every drug tested on these lines generated false efficacy signals. The ICLAC registry lists over 500 confirmed misidentified lines in peer-reviewed literature.⁶

FADE Detection

Three mechanical checks from public data:

1. SD-to-SEM back-calculation: From the published figure's mean, reported error value, and N, back-calculate whether the statistic is consistent with SD or SEM. If SEM is used where SD would be the honest standard, flag Variance Compression.

2. ICLAC cross-reference: Extract cell line identifiers from the Methods section. Query the ICLAC registry (flat file, CC license, free). Any match = flag. If the drug's entire efficacy dataset is in a misidentified cell line, the program's foundational evidence is invalid.⁶

3. Cellosaurus authentication: Cross-reference the Swiss Institute of Bioinformatics Cellosaurus API for STR authentication reports.

Output: "Cell line [X] in Methods matches ICLAC registry entry [Y]. Authentication status: UNCONFIRMED. Program efficacy dataset built on unverified biology."
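The ICLAC cross-reference in check 2 amounts to a normalized name lookup. The sketch below assumes the register has been loaded into a dictionary; the three entries shown are an illustrative subset of well-known HeLa-contaminated lines and should be verified against the current register file, and the function names are hypothetical.

```python
import re

# Illustrative subset only; a real run would load the full ICLAC register.
# Keys are normalized names, values are the actual identity of the line.
MISIDENTIFIED = {
    "kb": "HeLa",
    "hep2": "HeLa",
    "changliver": "HeLa",
}

def normalize(name):
    """Lowercase and strip punctuation/spaces so 'HEp-2' and 'Hep 2' match."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def check_cell_lines(names):
    """Return {name as written in Methods: actual identity} for every register hit."""
    hits = {}
    for name in names:
        key = normalize(name)
        if key in MISIDENTIFIED:
            hits[name] = MISIDENTIFIED[key]
    return hits

# Cell line names as they might be extracted from a Methods section.
hits = check_cell_lines(["HEp-2", "MCF-7", "Chang Liver"])
```

Name normalization is deliberately crude here; a production check would also need synonym handling, which the Cellosaurus cross-reference in check 3 is better placed to supply.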

Stage 3

Peer Review — The Polite Conspiracy

No access to raw data. No negative results. Reviewers publishing in the same journals they review for.

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Peer review catches logical inconsistency, missing controls, and misapplied statistics. It has three structural blind spots that matter for FADE:

Publication bias: Journals publish positive results. Null results that would discount the positive finding were never submitted, never reviewed, never published. Positive result rates in US publications increased 22% from 1990 to 2007 — not because science improved, but because the filter got stronger.⁸

No raw data access: Reviewers evaluate processed figures, not raw data. The SD-to-SEM switch, the cherry-picked blot, the excluded outlier — none are visible from the submitted manuscript.

Reciprocal review networks: Reviewers are drawn from the same author pool. Reviewing favorably for researchers who review favorably for you is not misconduct and leaves no detectable trace.

Discrepancy Signal

The buried 22: selective trial publication creates a literature that systematically overstates efficacy.

The antidepressant case (Turner et al., NEJM 2008): The FDA received results of 74 registered antidepressant studies. 37 of 38 positive studies were published. Of 36 studies with negative or questionable results, 22 were not published at all, and 11 were published in a way that conveyed a positive outcome.⁴

The published literature suggested an effect size of 0.41. The FDA dataset including all 74 studies showed 0.31 — a 32% inflation. Every investor, clinician, and label writer worked from the published set. The real effect was in the FDA files.

Deep Dive — Documented Precedents

Antidepressants — Turner et al. (NEJM 2008)

12 antidepressant drugs. 74 FDA-registered studies. Published literature suggested all 12 were effective. Full dataset showed 6 had equivocal-to-negative results. Effect size inflation: 32%. The gap between published literature and reality was entirely a publication filter artifact — no data was destroyed, just selectively submitted for publication.⁴

Vioxx / Rofecoxib — The Missing Cardiovascular Events

The published VIGOR trial (NEJM 2000) omitted three myocardial infarction events that occurred after the authors' chosen data cutoff. The FDA's internal documents contained the complete dataset. The published paper, which passed peer review, presented a cardiovascular risk profile inconsistent with the full data. Peer reviewers evaluated the submitted narrative, not the FDA file. Market withdrawal followed in 2004, and more than $2.5 billion in settlements came after it.

FADE Detection

The published literature is one document. The registration is another. FADE reads both.

1. ClinicalTrials.gov vs. PubMed match: Query all registered studies of a drug or target. Cross-reference against PubMed. Any registered study with results submitted to ClinicalTrials.gov but no corresponding PubMed publication is a buried negative signal. Flag the gap.

2. Results database compliance: Since 2008, sponsors must post results within 12 months of study completion for applicable trials. Non-compliance is itself a flag: a pattern of non-compliance = selective reporting.

3. CrossRef retraction API: Flag any published paper retracted or flagged with an Expression of Concern since the program began.

Output: "[Drug X] — 8 registered studies. 5 published. 3 results-submitted-only. 2 of the 3 unpublished studies show null primary endpoint."
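Check 1 reduces to a set difference between registered and published studies. The sketch below assumes both lists have already been retrieved and matched by identifier; the study IDs, field names, and outcome values are hypothetical and chosen to reproduce the shape of the sample output above.

```python
# Hypothetical registry records: one per registered study of the drug.
registered = [
    {"id": "EX-01", "results_posted": True, "primary_met": True},
    {"id": "EX-02", "results_posted": True, "primary_met": True},
    {"id": "EX-03", "results_posted": True, "primary_met": True},
    {"id": "EX-04", "results_posted": True, "primary_met": False},
    {"id": "EX-05", "results_posted": True, "primary_met": True},
    {"id": "EX-06", "results_posted": True, "primary_met": False},
    {"id": "EX-07", "results_posted": True, "primary_met": False},
    {"id": "EX-08", "results_posted": True, "primary_met": True},
]

# Hypothetical set of study IDs that have a matching journal publication.
published_ids = {"EX-01", "EX-02", "EX-03", "EX-04", "EX-05"}

def publication_gap(registered, published_ids):
    """Studies with results posted to the registry but no matching publication."""
    return [s for s in registered
            if s["results_posted"] and s["id"] not in published_ids]

gap = publication_gap(registered, published_ids)
null_unpublished = [s for s in gap if not s["primary_met"]]

summary = (f"{len(registered)} registered studies. {len(published_ids)} published. "
           f"{len(gap)} results-submitted-only. "
           f"{len(null_unpublished)} of the {len(gap)} unpublished studies "
           f"show null primary endpoint.")
```

The hard part in practice is the matching step, since publications do not always cite a registry identifier; this sketch assumes that linkage has been done upstream.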

Stage 4

IND Filing / Phase 1 — Dose Selection Theater

Designing the trial to look safe at doses below the efficacy threshold

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Phase 1 establishes safety — maximum tolerated dose and pharmacokinetic profile. It is not designed to show efficacy. But sponsors use Phase 1 to tell a story, and that story starts at dose selection.

Allometric scaling manipulation: Preclinical efficacy is demonstrated at 100 mg/kg in mice. FDA surface area conversion (divide by 12.3 for mice) gives a human equivalent dose near 8 mg/kg. A Phase 1 topping out at 0.5 mg/kg looks clean because it never reached the efficacy or toxicity zone.

Wrong-species safety studies: Efficacy demonstrated in rat model; IND safety toxicology conducted in dogs that may lack the target receptor isoform. Technically valid, biologically irrelevant to the mechanism of action.

Surrogate biomarker endpoint selection: Phase 1 biomarker endpoints selected because they respond to the drug — not because they predict clinical outcome.

Discrepancy Signal

The IND dose ceiling is below the efficacy dose. The program passes Phase 1 in a zone that was never going to cause toxicity or show efficacy.

The signal: the ratio of the IND maximum dose to the preclinical minimum effective dose falls below 1.0 (human equivalent) after allometric scaling. The Phase 1 "safety" finding is an artifact of dose selection, not biology.

Additionally: the biomarker responds to the drug but has no validated link to clinical outcome. A drug that lowers a serum protein by 40% at week 4 passes Phase 1 endpoints. The serum protein's clinical relevance is the question Phase 2 will answer — by failing.

Deep Dive — Documented Precedents

Oncology Phase 1–2 Gap — BIO 2011–2020 Analysis

BIO's analysis of 7,455 clinical programs found a Phase 2 success rate of 40.1% overall and 5.3% for oncology.⁷ The dominant failure mode was not safety — it was efficacy. Programs cleared Phase 1 at doses that were never powered to detect mechanism-relevant activity. Phase 2 failed because the drug never reached the tissue concentration needed to hit the target.

Surrogate Endpoint Proliferation — FDA Accelerated Approvals

Over 60% of FDA oncology approvals between 2009–2014 were on surrogate endpoints (tumor shrinkage, progression-free survival, biomarker response). Of those, fewer than half demonstrated a survival benefit in post-approval confirmatory trials — the endpoint patients and payers care about. The Phase 1 biomarker endpoint that unlocked Phase 2 funding was a proxy, not a clinical outcome.

FADE Detection

Patent-to-IND dose cross-comparison — mechanical, not interpretive.

1. Patent efficacy dose extraction: PH_USPTO_FULL for the compound. Extract the dose in the Examples section producing efficacy in the primary animal model. Apply FDA allometric scaling (HED = animal dose in mg/kg × (animal weight ÷ 60 kg)^0.33) to convert to human equivalent. Note: patent Examples may include prophetic (predicted, not executed) experiments; MPEP §608.01(p) permits this. FADE flags the discrepancy; human review determines whether the Example is actual or prophetic.

2. IND maximum dose comparison: ClinicalTrials.gov Phase 1 record includes the maximum administered dose cohort. Compare to HED from step 1.

3. Flag condition: Phase 1 maximum dose less than 50% of HED efficacy dose from patent Examples = flag Sub-Therapeutic Dose Ceiling (requires human verification of Example type).

4. Species mismatch check: Compare species used in patent efficacy studies vs. IND safety toxicology studies. Mismatch = flag.

Output: "Patent Example 4 shows efficacy at 30 mg/kg (rat). HED = 4.9 mg/kg. Phase 1 maximum dose: 0.3 mg/kg. Program cleared Phase 1 at 6% of the minimum effective human equivalent dose."
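The dose comparison in steps 1 to 3 can be worked through numerically. The sketch below uses the body-weight form of the FDA conversion for doses expressed in mg/kg, with an exponent of 0.33 and a 60 kg reference human; the 0.25 kg rat weight is an assumption made here to reproduce the sample output, and the function names are hypothetical.

```python
def human_equivalent_dose(animal_dose_mg_per_kg, animal_weight_kg,
                          human_weight_kg=60.0, exponent=0.33):
    """HED (mg/kg) = animal dose (mg/kg) x (animal weight / human weight) ** 0.33."""
    return animal_dose_mg_per_kg * (animal_weight_kg / human_weight_kg) ** exponent

def sub_therapeutic_ceiling(phase1_max_mg_per_kg, hed_mg_per_kg, threshold=0.5):
    """Step 3: flag when the Phase 1 maximum dose is below 50% of the HED."""
    return phase1_max_mg_per_kg / hed_mg_per_kg < threshold

# Patent example: efficacy at 30 mg/kg in rat (assumed 0.25 kg body weight).
hed = human_equivalent_dose(30.0, animal_weight_kg=0.25)

# Phase 1 maximum administered dose from the registry record.
phase1_max = 0.3
ratio = phase1_max / hed
flag = sub_therapeutic_ceiling(phase1_max, hed)
```

Under these assumptions the HED comes out near 4.9 mg/kg and the Phase 1 ceiling sits at about 6% of it, which trips the flag. The result is sensitive to the assumed animal weight, so the flag still needs the human verification the step describes.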

Stage 5

Phase 2 — Bury the Peanut

When the primary endpoint fails, the retrospective analysis begins

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Phase 2 is where the narrative fully separates from the data. Four specific tactics account for most of how negative Phase 2 results become publishable positive results:

Primary endpoint switching: The original primary endpoint (registered before enrollment) fails. A secondary endpoint showing positive signal gets retrospectively promoted to the primary. The published paper presents the secondary as the central finding without prominently flagging the switch.⁵

Responder definition tightening: The pre-specified threshold shows a 38% responder rate. Tightening the threshold post-hoc isolates a subgroup showing 72%. The subgroup becomes the signal. The full population failure becomes a footnote.

Toxicity reclassification: Adverse events adjudicated as "not drug-related" by the sponsor's clinical team. Each individual call is defensible. The aggregate pattern — every borderline event classified in the direction that preserved the safety narrative — is the signal.

Composite endpoint construction: A composite of four outcomes dilutes three null findings behind one positive. The composite "improves" while every clinically significant component does not.

Discrepancy Signal

The ClinicalTrials.gov amendment trail: every post-enrollment protocol change is timestamped. The delta between version 1 and version 2+ is the burial map.

When a sponsor modifies a trial's primary endpoint after enrollment starts, ClinicalTrials.gov records a version with a timestamp. The original endpoint (registered before data) and the modified endpoint (registered after data collection, before analysis) coexist in the amendment history. The peanut is in the gap.

The COMPare project (Goldacre et al., BMJ 2016) checked outcomes for 67 trials published in top journals: outcome discrepancies between registered and published endpoints occurred in approximately 58 of 67 trials. In the large majority of cases, the direction of switching favored a statistically significant result. This is not random drift — it is directional manipulation that is mechanically detectable from public data.⁵

Deep Dive — Documented Precedents

RECORD Trial — Rosiglitazone (GlaxoSmithKline)

The RECORD trial assessed cardiovascular outcomes for rosiglitazone (Avandia). FDA documents revealed primary endpoint definitions and analysis populations were modified after data were available. The published paper showed a neutral cardiovascular result. An independent re-analysis using the original pre-specified endpoints found a different signal. GSK paid $3 billion in criminal and civil settlements in 2012, with cardiovascular data management as a central allegation.

COMPare Project (Goldacre et al., BMJ 2016) — Systematic Outcome Switching

Prospectively compared pre-specified outcomes in ClinicalTrials.gov registrations against published outcomes for 67 trials in five top medical journals. Discrepancies occurred in approximately 58 of 67 trials. Switching was directional — almost uniformly toward significance. The primary source is the ClinicalTrials.gov record. The buried data is in the delta between registration date and publication date. FADE reads the delta, not the paper.⁵

FADE Detection

ClinicalTrials.gov v2 API returns the full amendment history with timestamps. This is mechanically auditable at no cost.

1. Full version history pull: ClinicalTrials.gov v2 API returns every version of the protocol record with submission timestamps. Public, structured, programmatically accessible.

2. Endpoint diff: Compare the primary outcomes field between the pre-enrollment version (before the start date) and every subsequent version. Any change to primary outcome measure, time frame, or population definition after enrollment starts is flagged.

3. Direction test: Cross-reference the flagged endpoint change against the published paper. If the switched endpoint showed positive signal and the original did not appear as primary in the publication, flag Primary Endpoint Substitution.

4. Toxicity reclassification proxy: Compare the adverse event table in the ClinicalTrials.gov results database against the published paper's safety section. Incidence rate discrepancies = flag.

Output: "NCT[XXXXXX] — Primary endpoint changed from [X] to [Y] on [date], 14 months after enrollment start. Published paper presents [Y] as the central efficacy finding. Original endpoint [X] result: not reported."
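The endpoint diff in step 2 can be sketched against version records that have already been downloaded. The record layout below (a `submitted` date plus a list of `(measure, time_frame)` tuples) is an assumption for illustration and is not the registry's actual response schema; the measures and dates are hypothetical.

```python
from datetime import date

def months_between(start, end):
    """Whole calendar months from start to end (ignores day of month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def diff_primary_outcomes(versions, enrollment_start):
    """Compare each post-enrollment version against the last pre-enrollment one."""
    ordered = sorted(versions, key=lambda v: v["submitted"])
    pre = [v for v in ordered if v["submitted"] <= enrollment_start]
    if not pre:
        raise ValueError("no version registered before enrollment start")
    baseline = set(pre[-1]["primary_outcomes"])

    flags = []
    for v in ordered:
        if v["submitted"] <= enrollment_start:
            continue
        current = set(v["primary_outcomes"])
        if current != baseline:
            flags.append({
                "submitted": v["submitted"],
                "months_after_start": months_between(enrollment_start, v["submitted"]),
                "removed": sorted(baseline - current),
                "added": sorted(current - baseline),
            })
    return flags

# Hypothetical version history for one trial.
versions = [
    {"submitted": date(2015, 6, 1),
     "primary_outcomes": [("Score A change from baseline", "Week 78")]},
    {"submitted": date(2016, 3, 1),   # administrative update, outcomes unchanged
     "primary_outcomes": [("Score A change from baseline", "Week 78")]},
    {"submitted": date(2016, 11, 1),  # primary outcome replaced
     "primary_outcomes": [("Score B change from baseline", "Week 78")]},
]
flags = diff_primary_outcomes(versions, enrollment_start=date(2015, 9, 1))
```

Comparing tuples exactly means a reworded but equivalent outcome measure would also be flagged; that is acceptable for a first pass because step 3 (the direction test) and human review sit downstream.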

Stage 6

Phase 3 — The Adaptive Design Trap

Changing the rules after the interim look

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Phase 3 is where the drug is supposed to prove it works at scale. Adaptive designs were introduced as a legitimate tool — allowing mid-trial modifications based on accumulating safety and efficacy data. The same flexibility that makes adaptive designs scientifically valid makes them structurally exploitable.

Interim analysis responder subsetting: A planned look at the data reveals the pre-specified endpoint will miss. The sponsor “adapts” to enrich for a responding subgroup. The trial continues — now powered only in the subset that happened to respond at the moment of the look.

Alpha-spending plan modification: Statistical testing plans allocate significance thresholds across multiple data looks. When alpha-spending rules are changed after the first interim look, the family-wise error rate is no longer controlled at the pre-specified level. Each individual look appears valid. The aggregate inflates the false positive rate.

Endpoint window shift: The primary endpoint at 6 months fails. The sponsor adjusts the analysis window to 9 months. Permissible if pre-specified; catastrophic if post-hoc. The published paper presents the 9-month result without prominent disclosure of the switch.

Discrepancy Signal

Any protocol amendment filed after the planned interim analysis date is a structural red flag. The statistical integrity of an adaptive design depends entirely on pre-specification. Post-look changes break that guarantee.

The FDA requires adaptive design pre-specification via the Statistical Analysis Plan (SAP), lodged before unblinding. When the published analysis deviates from the SAP, the deviation is legible in the FDA Statistical Review — published post-approval and often citing specific analytical departures by name.

Deep Dive — Documented Precedents

Subgroup Mining Proliferation

Multiple systematic analyses document post-hoc subgroup promotion as the dominant Phase 3 integrity risk. The FDA’s own 2019 adaptive trial guidance and NEJM and JAMA commentaries consistently identify the pattern: subgroup findings that emerge after an interim look are over-represented in publications relative to primary endpoint performance. FADE’s detection relies on the ClinicalTrials.gov pre-specified subgroup list as the ground truth — the mechanism is mechanically auditable regardless of how often it occurs in aggregate.

FDA Adaptive Design Guidance (2019)

FDA’s adaptive trial guidance noted that post-hoc adaptive modifications represent the primary integrity risk in late-stage development. Sponsors submit adaptive design protocols with correct alpha-spending rules, then file Protocol Amendment Revision 3+ after an interim look in ways that alter the effective type I error rate. The FDA Statistical Review is the primary detection mechanism — but only exists post-approval.

FADE Detection

ClinicalTrials.gov amendment history timestamps every protocol change. The interim analysis date is pre-registered. Any amendment after that date is flagged automatically.

1. Interim analysis date extraction: ClinicalTrials.gov v2 API returns the pre-specified interim analysis schedule from the original protocol. Parse the primary completion date and interim look schedule.

2. Amendment timestamp cross-check: Any amendment filed after the first interim analysis date that modifies the primary endpoint definition, analysis population, or alpha-spending plan = flag Post-Interim Amendment. Administrative amendments (site additions, contact updates, scheduling changes with no analytical effect) are excluded.

3. Subgroup diff: Extract the pre-specified subgroup list from the original protocol. Compare against subgroups reported in the published paper. Any reported subgroup not in the original protocol = flag Post-Hoc Subgroup.

4. FDA Statistical Review: For approved drugs, Drugs@FDA Statistical Review cites SAP deviations. Automated text extraction flags the phrase “deviation from the pre-specified” or “not pre-specified in the SAP.”

Output: “NCT[XXXXXX] — Protocol Amendment 4 filed [date], 6 weeks after pre-specified interim analysis date. Primary endpoint changed from [X] (p=0.14, FDA Statistical Review p.31) to [Y] (p=0.03, published result). Subgroup [Z] reported as primary finding; not listed in original protocol registration.”
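Steps 2 and 3 can be sketched as a date filter plus a set difference. The amendment record layout, the field labels in `ANALYTICAL_FIELDS`, and the sample dates are hypothetical; the sketch assumes each amendment has already been tagged with the fields it changed.

```python
from datetime import date

# Changes that affect the analysis; administrative changes are excluded.
ANALYTICAL_FIELDS = {"primary_endpoint", "analysis_population", "alpha_spending"}

def post_interim_amendments(amendments, interim_date):
    """Step 2: analytical amendments filed after the first interim analysis date."""
    return [a for a in amendments
            if a["filed"] > interim_date
            and ANALYTICAL_FIELDS & set(a["changed_fields"])]

def post_hoc_subgroups(prespecified, reported):
    """Step 3: subgroups reported in the paper but absent from the original protocol."""
    return sorted(set(reported) - set(prespecified))

interim_date = date(2020, 1, 15)
amendments = [
    {"number": 2, "filed": date(2019, 10, 1), "changed_fields": ["alpha_spending"]},
    {"number": 3, "filed": date(2020, 2, 1), "changed_fields": ["site_list"]},
    {"number": 4, "filed": date(2020, 2, 26), "changed_fields": ["primary_endpoint"]},
]
flagged = post_interim_amendments(amendments, interim_date)
days_after_interim = (flagged[0]["filed"] - interim_date).days

extra = post_hoc_subgroups(
    prespecified=["age >= 65", "baseline severity"],
    reported=["age >= 65", "high-dose completers"],
)
```

Here only Amendment 4 is flagged, filed 42 days (six weeks) after the interim look; the pre-interim alpha-spending change and the post-interim site addition are both left alone.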

Stage 7

Regulatory Submission (NDA/BLA) — The Briefing Document Architecture

The science filtered through the sponsor’s narrative before the FDA sees it

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

The sponsor writes the FDA briefing document. It is the primary document the advisory committee reads before voting. It is not neutral. Every structural decision — study ordering, table formatting, adverse event categorization, endpoint framing — is made by the party seeking approval.

Study sequencing: Favorable trials appear in the main body. Negative trials — those showing failure in a subpopulation or at higher doses — appear in appendices with minimal narrative context. The reviewer can find them, but only by active searching.

Integrated safety summary dilution: All adverse event data across trials is pooled in the Integrated Summary of Safety. When trials with different patient populations and dose levels are pooled, unfavorable rates in high-dose subgroups are diluted by lower-risk patients from other trials. The pooled rate looks clean. The dose-specific rate is in table 47 of appendix F.

Subgroup presentation order: Subgroup analyses are post-hoc in most submissions. Subgroups with positive signal appear first. The largest patient category showing null effect appears eleventh.

Discrepancy Signal

The FDA Medical Review and Statistical Review contain the agency’s independent read of the same data the sponsor presented. When the FDA reviewer’s conclusion language qualitatively differs from the sponsor’s briefing document conclusion, the gap is the signal.

FDA reviewers call out these patterns in writing — but only for approved drugs, and only post-approval. FADE mines this archive as a failure-mode library: what did FDA reviewers catch that AdComs approved anyway?

Deep Dive — Documented Precedents

FDA Statistical Review Divergence — Documented Pattern

FDA Medical and Statistical Reviews routinely contain more qualified efficacy language than the corresponding sponsor briefing documents — by design, since FDA reviewers apply independent judgment to the same data. Documented examples from the public record include cases where FDA Statistical Reviewers explicitly noted that the primary endpoint was not met under the pre-specified analysis while the drug was approved on a modified or secondary analysis. Specific approved drug examples: Aducanumab (2021, amyloid endpoint reclassification), Makena (2020, primary endpoint controversy). Systematic quantification of the divergence frequency across all approvals has not been published in peer-reviewed form. FADE’s Stage 7 detection is case-specific, not aggregate.

FADE Detection

Drugs@FDA full approval packages are public. Medical Review + Statistical Review = the FDA’s independent analysis of the same data the sponsor submitted.

1. Approval package retrieval: Drugs@FDA search by drug name returns the full NDA/BLA approval package including Medical Review and Statistical Review as downloadable PDFs.

2. Conclusion language comparison: Extract the FDA Medical Reviewer’s Summary conclusion and the FDA Statistical Reviewer’s primary efficacy conclusion. Compare against the sponsor briefing document’s Executive Summary. Flag any qualitative divergence.

3. Safety table cross-check: Compare adverse event incidence rates in the sponsor’s Integrated Summary of Safety vs. the FDA Statistical Review’s independent safety table. Flag dose-group-level discrepancies hidden in pooled rates.

4. Unpublished trial detection: Count trials referenced in the FDA Medical Review vs. published trials in PubMed for the compound. Any FDA-reviewed trial with no PubMed publication = Unpublished Negative Trial Flag.

Output: “FDA Statistical Review, [drug], p.47: ‘Intention-to-treat analysis including early discontinuers: p=0.12. Sponsor primary analysis: p=0.03.’ 3 trials referenced in Medical Review with no PubMed publication. Pooled adverse event rate: 12%. High-dose subgroup (appendix F, table 47): 31%.”
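Step 4 above reduces to a set difference once trial identifiers are extracted. The sketch below assumes both the Medical Review text and the publication abstracts cite ClinicalTrials.gov NCT numbers; in practice FDA reviews often use sponsor study numbers, so a mapping step would be needed first. The function names and the sample strings are hypothetical.

```python
import re

# ClinicalTrials.gov identifiers are "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")


def extract_nct_ids(text: str) -> set[str]:
    """All distinct NCT identifiers mentioned in a block of text."""
    return set(NCT_PATTERN.findall(text))


def unpublished_trials(review_text: str, publication_texts: list[str]) -> list[str]:
    """Trials named in the review that appear in none of the publications."""
    reviewed = extract_nct_ids(review_text)
    published: set[str] = set()
    for text in publication_texts:
        published |= extract_nct_ids(text)
    return sorted(reviewed - published)


# Placeholder identifiers, not real registrations.
review = "Studies NCT00000001, NCT00000002 and NCT00000003 were reviewed."
abstracts = [
    "... registered as NCT00000001 ...",
    "Trial registration: NCT00000003.",
]

if __name__ == "__main__":
    print(unpublished_trials(review, abstracts))
```

A trial that surfaces here is a candidate for manual follow-up. Absence from PubMed alone does not establish that the result was negative.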

Stage 8

FDA Advisory Committee — The Expert Network Effect

The vote is advisory. The bias is structural.

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Advisory committee members are drawn from the same academic and clinical community whose members built their careers on the scientific consensus the drug is trying to validate. Voting members are screened for direct financial conflicts, but the subtler bias is structural: expertise in a disease area is typically acquired by spending decades inside the hypothesis the drug tests.

The hearing format compounds this. Sponsor KOLs present first. Patient advocates testify. The information environment surrounding the vote is saturated with the sponsor’s narrative before deliberation begins. Even a consciously skeptical panelist is evaluating a carefully curated evidence summary assembled by the party seeking approval.

The override failure mode: FDA has approved drugs despite majority-negative AdCom votes in multiple high-profile cases. When the agency overrides the scientific panel — invoking “unmet medical need” or accelerated approval — commercial and political logic has displaced scientific consensus. The market prices the approval. FADE reads the vote margin.

Discrepancy Signal

The gap between the AdCom vote margin and the FDA approval decision is a direct, binary signal. A near-unanimous negative vote followed by approval, as in the aducanumab case, means the scientific panel and the regulatory agency reached opposite conclusions from the same evidence.

Published literature has established that AdCom vote margins predict post-approval safety events at statistically significant levels. A drug approved 10–3 is more likely to carry a post-market safety action than one approved 13–0. FADE uses the vote margin as a calibrated input to the Stage 9 FAERS trajectory flag.

Deep Dive — Documented Precedents

Aducanumab (Biogen) — Near-Unanimous Negative Vote, Overridden

The Peripheral and Central Nervous System Drugs AdCom voted against the efficacy case in November 2020 (0 yes, 10 no, 1 uncertain on the primary efficacy question; 1 yes, 8 no, 2 uncertain on a supporting question). FDA approved in June 2021 under accelerated approval. Three AdCom members resigned in protest. CMS refused to cover the drug outside clinical trials. Biogen withdrew from the US market in January 2024. The AdCom vote correctly predicted the clinical outcome. FDA overrode the scientific consensus. [RT FIX: "8-0-1 vote tally is a simplification; actual vote was 10-0-1 and 1-8-2 across two questions" (HIGH, Perplexity evidence, RT4 2026-06-15)]

Makena (17-OHPC) — 9–7 Negative, Eventually Withdrawn

AdCom voted 9–7 to recommend withdrawal in 2019 after the confirmatory trial (PROLONG) showed the drug did not prevent preterm birth. FDA delayed action before ultimately ordering market withdrawal in 2023. The borderline vote was a leading indicator of the eventual market action. [RT FIX: "'initially reversed' overstated FDA posture — FDA delayed rather than formally reversed" (MEDIUM, Perplexity evidence, RT4 2026-06-15)]

FADE Detection

Every FDA AdCom vote for the past 20+ years is publicly documented. Vote tallies, dissenting statements, and meeting transcripts are on fda.gov.

1. AdCom vote retrieval: FDA advisory committee database (fda.gov/advisory-committees) is searchable by drug name and year. Vote tally (yes-no-abstain) is in meeting minutes.

2. Override detection: Compare AdCom vote outcome (majority yes vs. majority no) against the FDA approval decision. Flag: approved with majority negative AdCom vote = Advisory Override Flag.

3. Vote margin scoring: Vote margin (13–0 vs. 7–6) feeds the Stage 9 FAERS trajectory weight. Narrow approval votes receive elevated post-market adverse event monitoring weight in the FADE Score.

4. Dissenting statement mining: AdCom dissenting statements contain the specific scientific objections the panel raised. These often predict the post-approval failure mode. Automated extraction of dissenter language = early warning vocabulary for Stage 9 FAERS monitoring.

Output: “AdCom meeting [date], vote: 3–10–0. FDA approved [date] under accelerated approval citing ‘unmet medical need.’ 3 panel members resigned. FADE: Advisory Override Flag active; Stage 9 FAERS trajectory monitoring at elevated threshold.”
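Steps 2 and 3 above can be sketched in a few lines. The `AdComVote` type, the signed-margin definition, and the linear `monitoring_weight` mapping are all assumptions made for illustration; the source does not specify how the margin is converted into a Stage 9 weight, and any real mapping would have to come out of calibration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdComVote:
    yes: int
    no: int
    abstain: int = 0

    @property
    def majority_negative(self) -> bool:
        return self.no > self.yes

    @property
    def margin(self) -> float:
        """Signed margin in [-1, 1]: (yes - no) / votes cast, abstentions excluded."""
        cast = self.yes + self.no
        return (self.yes - self.no) / cast if cast else 0.0


def advisory_override_flag(vote: AdComVote, fda_approved: bool) -> bool:
    """True when the agency approved despite a majority-negative panel vote."""
    return fda_approved and vote.majority_negative


def monitoring_weight(vote: AdComVote, base: float = 1.0, max_uplift: float = 1.0) -> float:
    """Hypothetical linear mapping: unanimous yes -> base, unanimous no -> base + max_uplift."""
    return base + max_uplift * (1.0 - vote.margin) / 2.0


if __name__ == "__main__":
    for vote in (AdComVote(13, 0), AdComVote(7, 6), AdComVote(3, 10)):
        print(vote, advisory_override_flag(vote, fda_approved=True),
              round(monitoring_weight(vote), 3))
```

Keeping the override flag (binary) separate from the margin weight (continuous) mirrors the two distinct uses the text describes: one feeds a flag, the other scales downstream monitoring.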

Stage 9

Approval / Label — The Indication Perimeter

The label is a floor. The marketing is the ceiling.

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

The FDA approves a specific drug for a specific indication in a specific patient population at specific doses. The label is precise. Off-label prescribing is legal, common, and commercially driven. Once approved for indication A, the sponsor’s commercial team begins positioning for indications B, C, and D — which may have no confirmatory data, may have failed Phase 2 in the sponsor’s own trials, and expose patients to risk outside the studied population.

Accelerated approval commitments: Drugs approved on surrogate endpoints (tumor shrinkage, biomarker response) must complete confirmatory clinical outcome trials. FDA has historically been inconsistent enforcing these requirements. A drug can be marketed on a surrogate endpoint for years while confirmatory survival data matures — or doesn’t.

REMS as a safety signal: Risk Evaluation and Mitigation Strategies are required when benefit-risk requires managed patient access. REMS imposition post-approval is a late-surfacing toxicity signal — risks that survived the approval vote now require structural risk management.

Discrepancy Signal

Post-market commitment status is a leading indicator of confirmatory trial failure. When a confirmatory trial runs more than 2 years past the FDA-committed deadline, one explanation FADE weighs heavily is that interim data are unfavorable and the disclosure timeline is being managed. Slow enrollment and feasibility problems are alternative explanations, so the flag marks the delay itself, not its cause.

FAERS (FDA Adverse Event Reporting System) trajectory is the post-approval safety clock. Flat reporting indicates expected background adverse events. Rising rates — especially a spike at a specific time point — indicate a published safety signal entering clinical awareness. Rising trajectory + Advisory Override Flag from Stage 8 = elevated combined signal.

Deep Dive — Documented Precedents

Accelerated Approval Commitment Delays

A 2021 JAMA Internal Medicine analysis of 93 accelerated approvals from 1992–2017 found that 25% of post-market confirmatory trials were delayed beyond the original committed timeline by more than 5 years. Of those delayed trials, 40% were eventually terminated or produced negative results. The delay was itself a legible signal throughout the delay period.

Mylotarg (gemtuzumab ozogamicin) — Withdrawal and Re-approval

FDA accelerated approval in 2000 on surrogate endpoint. Post-market confirmatory study showed no clinical benefit and excess mortality at the approved dose. Pfizer voluntarily withdrew in 2010. Re-approved in 2017 at lower dose with different indication. The post-market commitment delay (2000–2010) was a legible signal in the FDA commitment tracker throughout that period.

FADE Detection

Three public databases cover the post-approval decay window: FDA post-market commitment tracker, FAERS, and ClinicalTrials.gov confirmatory trial status.

Regulatory regime caveat: The 25%/5-year delay rate (Naci et al., JAMA Internal Medicine 2021) reflects the 1992–2017 cohort under pre-FDORA policies. The FDA Omnibus Reform Act of 2022 (FDORA) now requires that confirmatory trials be underway at the time of accelerated approval for new applications. Historical base rates must be adjusted for the policy environment at time of approval — post-2022 approvals operate under stricter enforcement, reducing expected delay frequency. [RT FIX: "No regime risk treatment — historical delay rates lose validity as FDA policies evolve" (CRITICAL, Grok+Mistral, RT4 2026-06-15)]

1. Post-market commitment tracker: FDA publishes annual accelerated approval post-market commitment status (fda.gov accelerated approval program). Query by drug name. Flag any commitment >2 years past original deadline.

2. FAERS adverse event trajectory: openFDA FAERS API returns quarterly adverse event report counts by drug name. Parse the trajectory: flat (expected), rising (late-surfacing risk), spike (published safety cohort entering clinical awareness).

3. REMS imposition check: FDA REMS public database (accessdata.fda.gov/scripts/cder/rems). REMS added post-initial approval = flag Post-Approval Safety Signal.

4. Confirmatory trial status: ClinicalTrials.gov search for all trials of the approved compound with Phase 4 or confirmatory designation. Cross-reference status (active, terminated, withdrawn, completed) against post-market commitment deadline.

Output: “Accelerated approval [date]. Post-market commitment [X] (confirmatory OS study): deadline [date], current status: Delayed — 3 years overdue. FAERS trajectory: Q1 2022: 284 reports; Q4 2024: 1,847 reports (+550%). REMS imposed [date]. Stage 9 triple-flag active.”
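The trajectory parse in step 2 and the overdue check in step 1 can be sketched as below. This is a minimal illustration that assumes quarterly counts have already been aggregated upstream (for example from openFDA's drug event endpoint, which can return report counts by receipt date). The `rise_ratio` and `spike_ratio` thresholds are placeholders, not calibrated values, and the function names are hypothetical.

```python
from datetime import date
from statistics import median


def percent_change(first: float, last: float) -> float:
    """Relative change from first to last, in percent."""
    return (last - first) / first * 100.0


def classify_trajectory(counts: list[int], rise_ratio: float = 1.5,
                        spike_ratio: float = 3.0) -> str:
    """Label quarterly report counts (oldest first) as flat, rising, or spike.

    spike:  some quarter is >= spike_ratio x the median of all earlier quarters
    rising: the last quarter is >= rise_ratio x the first quarter
    """
    if len(counts) < 2:
        return "insufficient data"
    for i in range(2, len(counts)):  # need two earlier quarters for a baseline
        baseline = median(counts[:i])
        if baseline > 0 and counts[i] >= spike_ratio * baseline:
            return "spike"
    if counts[0] > 0 and counts[-1] >= rise_ratio * counts[0]:
        return "rising"
    return "flat"


def years_overdue(deadline: date, as_of: date) -> float:
    """Years elapsed past the committed deadline (0.0 if not yet due)."""
    return max(0.0, (as_of - deadline).days / 365.25)


def commitment_delay_flag(deadline: date, as_of: date,
                          threshold_years: float = 2.0) -> bool:
    return years_overdue(deadline, as_of) > threshold_years


if __name__ == "__main__":
    print(classify_trajectory([100, 105, 98, 102]))
    print(classify_trajectory([100, 130, 170, 220]))
    print(classify_trajectory([100, 102, 99, 400]))
    print(f"{percent_change(284, 1847):.0f}%")
```

Raw FAERS counts also grow with prescription volume, so a production version would presumably normalize by exposure before classifying. That normalization is outside this sketch.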

Stage 10

Phase 4 / Post-market / The Deal — Where Narrative Meets Capital

Ten layers of selective filtering, priced as if only the press release exists

Evidence — What Happens

Discrepancy Signal — First Detectable Variance

FADE Detection — Public Data Anchor

Mechanism

Stage 10 is the convergence point. By the time an asset reaches M&A, licensing, or partnership discussions, it has survived or hidden ten layers of selective filtering. The counterparty reads the press release, the KOL opinion, and the Phase 3 summary. FADE reads the Phase 3 amendment history, the FDA override on the AdCom vote, the post-market commitment delay, and the FAERS trajectory simultaneously.

The aspirational acquisition: Acquirer pays Phase 3 valuation for an asset whose FDA Statistical Review contains language the diligence team never pulled from Drugs@FDA. The confirmatory trial completes. It fails. The acquirer impairs the asset.

The licensing arbitrage: Licensor out-licenses indication B rights after indication A approval. The licensee receives a drug with a post-market commitment 3 years past deadline, a rising FAERS trajectory, and a Phase 2 trial for indication B that failed in the licensor’s own portfolio 5 years earlier — registered in ClinicalTrials.gov as “Terminated” but never published.

The platform acquisition: Acquirer buys a mechanism platform whose foundational science carries a Stage 1 citation concern. The entire pipeline rests on a hypothesis whose empirical base is disputed. FADE reads Stage 1 at Stage 10 acquisition price.

Discrepancy Signal

The gap between what the deal price implies about future clinical success and what the FADE signal profile predicts is the investable variance.

A drug acquired at a price implying 40% probability of confirmatory Phase 4 success, with a FADE signal profile showing 4% historical success rate in comparable programs, carries a 36-point probability mispricing. That mispricing is computable from public data — before the deal closes.

The market never reads all ten layers simultaneously. It reads the press release. FADE reads the amendment trail, the FDA reviewer language, the vote margin, the FAERS trajectory, and the post-market commitment status — and reports the delta. Document A says $2.4B. Document B says FADE Score 94. The decision is yours.

FADE Detection

Stage 10 is the synthesis layer. No new primary data source. The FADE Score at Stage 10 is the Bayesian product of all upstream signals, expressed as a single conditional failure probability.

Signal inputs aggregated:

Stage 1: Citation integrity (PubPeer (Science.org) / Retraction Watch)
Stage 3: Publication pattern (selective publication, outcome switch history)
Stage 5: Phase 2 endpoint switch (ClinicalTrials.gov amendment delta)
Stage 6: Phase 3 adaptive design integrity (post-interim amendment flag)
Stage 7: FDA reviewer vs. sponsor conclusion delta (Drugs@FDA)
Stage 8: AdCom vote margin + override flag
Stage 9: Post-market commitment delay + FAERS trajectory + REMS

Stage 7–9 correlation caveat: Stages 7 (briefing document), 8 (AdCom vote), and 9 (post-market) are sequential outputs of the same FDA regulatory decision process. Their signals are mechanistically correlated — an FDA reviewer who identified endpoint concerns at Stage 7 influences the information environment at Stage 8. Treating them as fully independent Bayesian updates inflates apparent signal strength and produces false precision. Conservative approach: group Stages 7–9 as a single “regulatory signal cluster” with a combined likelihood ratio rather than three independent multipliers. [RT FIX: "Stages 7-9 artificially split sequential FDA process outputs — correlated signals create multicollinear Bayesian inputs and false precision" (CRITICAL, Grok+Mistral, RT4 2026-06-15)]
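The effect of that grouping can be illustrated numerically. In the sketch below the likelihood ratios are placeholders (the document lists real values as pending calibration), and collapsing the cluster to its single most informative member is just one conservative stand-in for a "combined likelihood ratio"; the source does not prescribe a specific combination rule.

```python
from math import log, prod


def prob_to_odds(p: float) -> float:
    return p / (1.0 - p)


def odds_to_prob(odds: float) -> float:
    return odds / (1.0 + odds)


def naive_posterior(prior_fail: float, lrs: list[float]) -> float:
    """Treat every signal as an independent update (the approach cautioned against)."""
    return odds_to_prob(prob_to_odds(prior_fail) * prod(lrs))


def clustered_posterior(prior_fail: float, independent_lrs: list[float],
                        cluster_lrs: list[float]) -> float:
    """Collapse correlated signals into one update.

    Assumption: the cluster contributes only its most informative member,
    i.e. the LR farthest from 1 on the log scale.
    """
    cluster_lr = max(cluster_lrs, key=lambda lr: abs(log(lr))) if cluster_lrs else 1.0
    return odds_to_prob(prob_to_odds(prior_fail) * prod(independent_lrs) * cluster_lr)


# Placeholder LRs for Stages 7, 8, 9; the prior is the 71.1% Phase 2 failure base rate.
STAGE_7_TO_9 = [2.0, 2.0, 2.0]

if __name__ == "__main__":
    print(round(naive_posterior(0.711, STAGE_7_TO_9), 4))
    print(round(clustered_posterior(0.711, [], STAGE_7_TO_9), 4))
```

With these placeholder inputs the naive product pushes the failure probability several points higher than the clustered update, which is the false precision the caveat describes.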

Kill conditions (Score → 99 regardless of Bayesian output):
Stage 1 sole-mechanism citation retraction • ICLAC cell line match • Stage 9 triple-flag (REMS + Advisory Override + post-market delay)

Output: “FADE Score: 94. Signals fired: Stages 1, 5, 6, 8, 9. Stage 9 triple-flag active (REMS + post-market delay 3yr + FAERS +550%). Deal price implies P(success) ~40%. Historical programs with this signal profile: 4.2% approval rate. Variance: 35.8 percentage points. Document A says $2.4B. Document B says 94. The decision is yours.”

Why This Is Pervasive — The Incentive Architecture

The Structural Truth

This is not individual fraud. It is a 50-year selection effect. The academic and clinical trial system evolved to surface positive results. Every actor in the chain faces the same asymmetric incentive: publishing a positive result advances your career; publishing a null result advances no one's career. The system selected for hiding the peanut. The hiding is not personal — it is institutional.

Graduate Students: Need publications to graduate → publications require positive results → null experiments get filed, not submitted → career continues

Principal Investigators: Need publications for grant renewal → positive results drive impact factor → negative results cost the renewal cycle → lab survives

Biotech Companies: Need positive signals for valuation → investor narrative requires forward momentum → negative data triggers down rounds → company survives

Scientific Journals: Need high-citation papers for impact factor → positive results get cited → null results do not get cited → journal survives

Peer Reviewers: Review journals they want to publish in → reciprocal favorable review networks form → no rule against it, no mechanism to detect it → relationships maintained

Regulatory Reviewers: Respond to what is submitted → amendment trails exist but are not proactively diffed → endpoint switches are permissible unless flagged → review completes

The Compound Effect

Each individual actor is rational. Each individual decision is defensible. The aggregate is a ten-stage burial system that routes capital toward programs with the cleanest narrative — not the strongest evidence. The market prices the narrative. FADE prices the evidence.

Why FADE Is Uniquely Positioned

Without FADE — What Investors Read

The press release (written by the company)
The published meta-analysis (built from the biased publication set)
The Phase 2 summary (post-hoc endpoint the sponsor chose to report)
The KOL opinion (drawn from the same citation network that funded the grants)
The investment bank model (built on sponsor-projected Phase 3 enrollment)

Result: Investor pays for a story assembled from five layers of selectively filtered evidence.

With FADE — What the Data Actually Says

The amendment trail (every endpoint change, timestamped before the data existed)
The patent cohort (USPTO duty of candor under 37 C.F.R. §1.56 — combined with FADE's tense filter distinguishing executed vs. prophetic Examples)
The figure back-calculation (SD restored from SEM, outlier pattern detection)
The ClinicalTrials.gov diff (original endpoint vs. published endpoint — the delta is the burial map)
The ICLAC cross-reference (is the efficacy data from a real, authenticated cell line?)

Result: Investor reads primary-source evidence that survived a deterministic reconciliation audit, not a narrative that survived an editorial process.
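The figure back-calculation in the list above rests on a standard identity: the standard error of the mean is the standard deviation divided by the square root of the sample size, so SD = SEM × √n. The sketch below assumes the per-group n is reported in the figure legend or methods; the function name is hypothetical.

```python
from math import sqrt


def sd_from_sem(sem: float, n: int) -> float:
    """Recover the standard deviation from a reported SEM and sample size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sem * sqrt(n)


if __name__ == "__main__":
    # An error bar of 2.0 drawn as SEM with n = 16 hides a spread four times wider.
    print(sd_from_sem(2.0, 16))
```

The same error bar therefore understates sample variability more severely as n grows, which is why the restored SD is the quantity worth comparing across documents.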

The FADE Positioning Statement

External anchoring. Primary-source verification. FADE does not give biological opinions. It does not declare fraud or intent. It reads three documents most investors never read simultaneously — the patent (subject to USPTO candor requirements; FADE's tense filter separates executed from prophetic Examples), the published paper (narrative-crafted), and the ClinicalTrials.gov record (amendment history timestamped before data existed) — and reports the delta. Document A says X. Document B says Y. Variance = Z. The decision is yours.

Where the Alpha Lives

Institutional quant funds have scraped USPTO, PubMed, and ClinicalTrials.gov at the macro level for a decade. Glaringly obvious discrepancies are priced in. The alpha lives in the intersection of a single foundational patent's Examples section, the published figure back-calculation, and the timestamped amendment trail. That intersection is not scraped by NLP pipelines. It requires document-level reconciliation. That is what FADE runs. When clients provide private VDR access, FADE ingests those materials as an additional layer — but the public-data engine operates independently of VDR access and does not require it.

Validation Status — Required Disclosure

FADE is a research framework, not a ready-to-deploy scoring system. Do not use FADE outputs to drive investment decisions until calibration is complete.

The discrepancy patterns described (SD-to-SEM variance compression, amendment trail endpoint switches, sub-therapeutic dose ceilings, post-interim adaptive modifications, FDA reviewer divergence) are documented failure-mode signatures in the published literature — but FADE’s specific sensitivity, specificity, and false-positive rate have not been empirically calibrated against a labeled historical cohort. Until the Historical Cohort Builder (scripts/fade_cohort_builder.py) completes over ≥500 programs and produces verified LR+ / LR− values per signal, the FADE Score is a theoretical architecture, not a calibrated probability. This document presents the detection framework and its theoretical basis only.

The Stage 10 synthesis score (FADE Score = 94, 4.2% historical success rate) is illustrative of the architecture. The number is not calibrated. Acting on it before cohort validation is complete is a protocol violation.

[RT FIX: "Framework answers the wrong question for investors — positioned as ready-to-deploy when calibration is pending" (CRITICAL, DeepSeek+Grok+Mistral, RT4 2026-06-15) — validation callout rewritten to explicitly block investment use until calibration complete; Stage 10 score flagged as illustrative only]

The FADE Score — From Flags to a Number Lisa Can Use

Why Five Disconnected Flags Aren't Enough

A binary flag per stage tells you something fired. It does not tell you how much to care, how to weight multiple signals, or what the combined profile means for failure probability. The FADE Score converts all fired and not-fired signals into a single calibrated conditional failure probability using Bayesian updating from a historical cohort baseline.

Output: one number. "Programs with this exact signal profile succeed in 3% of historical comparables vs. 5.3% base rate for oncology Phase 2." That plugs directly into an NPV model as P(success) in the Phase 2→3 transition node. No interpretation required.

Architecture

Bayesian Update Logic

Output Lisa Uses

Base Rate

Start with the historical failure rate for this program type.

Phase 2 overall: 71.1% fail. Oncology Phase 2: 94.7% fail. Cardiovascular: 73%. The base rate is set by the indication subgroup, not a generic assumption.

Source: BIO Clinical Development Success Rates 2011–2020. Phase 2 oncology success rate 5.3%. [Fn. 7]

Signal Update

For each signal: multiply the running failure odds by the calibrated likelihood ratio LR+ when the signal fires and by LR− when it does not fire; skip the signal when it is null (not computable).

posterior_odds = prior_odds × LR+(signal1) × LR+(signal5) × LR−(signal2)

Signals that did NOT fire are informative too: a clean ClinicalTrials.gov amendment trail is slight evidence in the program's favor and lowers the posterior failure odds. Not firing is evidence.

FADE Score

FADE Score = P(fail | all signals) × 100

A score of 98 means: 2% of programs with this signal profile have historically been approved. Not a guarantee. A calibrated base rate you can plug into a model.

Kill condition override: ICLAC cell line match or sole-mechanism retraction → Score overridden to 99 regardless of other signals. The efficacy data is invalid at the foundation.
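The base rate, signal update, and score steps above can be sketched end to end. Everything numeric in the likelihood-ratio table below is a placeholder, since the document lists real values as pending calibration, and the signal names and function signature are hypothetical. Only the odds-form Bayesian update and the fixed override value of 99 come from the text.

```python
from typing import Optional

KILL_SCORE = 99.0  # fixed override value stated in the text


def fade_score(base_fail_rate: float,
               signals: dict[str, Optional[bool]],
               lr_table: dict[str, tuple[float, float]],
               kill: bool = False) -> float:
    """Return P(fail | signals) x 100.

    signals:  name -> True (fired), False (did not fire), None (null, skipped)
    lr_table: name -> (LR+, LR-) for that signal
    """
    if kill:
        return KILL_SCORE
    odds = base_fail_rate / (1.0 - base_fail_rate)  # prior odds of failure
    for name, state in signals.items():
        if state is None:
            continue  # not computable for this program: no update
        lr_pos, lr_neg = lr_table[name]
        odds *= lr_pos if state else lr_neg
    return 100.0 * odds / (1.0 + odds)


# Placeholder likelihood ratios, NOT calibrated values.
PLACEHOLDER_LRS = {
    "stage1": (3.0, 0.80),
    "stage2": (2.5, 0.85),
    "stage4": (1.5, 0.95),
    "stage5": (2.0, 0.90),
}

example_signals = {"stage1": True, "stage5": True, "stage2": False, "stage4": None}

if __name__ == "__main__":
    # 0.947 is the oncology Phase 2 failure base rate quoted in the Base Rate row.
    print(round(fade_score(0.947, example_signals, PLACEHOLDER_LRS), 2))
```

Working in odds rather than probabilities keeps each update a single multiplication, and a null signal simply contributes a factor of 1.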

Aducanumab: What the FADE Score Would Have Read in 2020

Signal Profile (all computable from public data, pre-2021)

Signal	Status	Evidence
Stage 1: Citation Integrity	FIRED	PubPeer (Science.org) concerns on Lesne et al. 2006 (foundational mechanism paper) predating Phase 3 enrollment. Sole mechanistic basis → KILL condition triggered
Stage 3: Publication Pattern	FIRED	ENGAGE trial results (futility confirmed) present in ClinicalTrials.gov database but underrepresented in FDA advisory briefing framing vs. EMERGE
Stage 5: Endpoint Switch	FIRED	Primary analysis changed from pre-specified population to post-hoc high-dose EMERGE subgroup after March 2019 futility determination. Amendment timestamped.
Stages 2, 4	NULL	Not computable from available public data for this program type (antibody, not small molecule → dose ceiling signal does not apply)

Computed FADE Score (hypothetical — calibration pending)

KILL CONDITION ACTIVE

Stage 1 citation concern on sole mechanistic paper → score override regardless of Bayesian output

"Programs with this signal profile (3 signals fired, kill condition triggered) have historically shown <1% approval rate in comparable neurology programs. Document A says X. Document B says Y. Variance = Z. The decision is yours."

Note on calibration: The 99/100 score above reflects the KILL condition trigger for sole-mechanism citation concern — not a calibrated Bayesian output. The Bayesian LR values (lr_positive per signal) are listed as CALIBRATION_PENDING until the historical cohort build (scripts/fade_cohort_builder.py) completes over a labeled dataset of ≥500 programs. This example demonstrates the scoring architecture. The number gets real once the cohort runs.

What Happens When Calibration Is Complete

The Historical Cohort Builder (scripts/fade_cohort_builder.py) processes ~500 programs from openFDA + ClinicalTrials.gov — all public APIs, no proprietary data required. From that cohort, the Signal Calibration script computes LR+ and LR− per signal (sensitivity, specificity, FPR, PPV, 95% CI). The Score Calculator then applies Bayesian updating to any new program in real time. One audit. One score. One number that plugs directly into your model.

Pipeline: fade_cohort_builder.py (4–8 hrs, free public APIs) → fade_signal_calibration.py (<1 min) → fade_score_calculator.py (real-time per program)
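The calibration step in that pipeline comes down to a 2×2 table per signal. The sketch below is not the contents of fade_signal_calibration.py, which is not shown in this document. It is a generic illustration using the standard definitions of sensitivity, specificity, and likelihood ratios, with the usual log-scale approximation for the LR+ confidence interval. "Positive" here means the program failed, and the example counts are invented.

```python
from math import exp, log, sqrt


def likelihood_ratios(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Diagnostic metrics for one signal.

    tp: signal fired, program failed      fp: signal fired, program succeeded
    fn: signal silent, program failed     tn: signal silent, program succeeded
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "fpr": 1.0 - spec,
        "ppv": tp / (tp + fp),
        "lr_pos": sens / (1.0 - spec) if spec < 1.0 else float("inf"),
        "lr_neg": (1.0 - sens) / spec,
    }


def lr_pos_ci(tp: int, fp: int, fn: int, tn: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% CI for LR+ on the log scale (requires tp > 0 and fp > 0)."""
    lr_pos = (tp / (tp + fn)) / (fp / (fp + tn))
    se = sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    return lr_pos * exp(-z * se), lr_pos * exp(z * se)


if __name__ == "__main__":
    # Invented cohort of 500 programs: 300 failures, 200 successes.
    metrics = likelihood_ratios(tp=120, fp=20, fn=180, tn=180)
    print({k: round(v, 3) for k, v in metrics.items()})
    print(tuple(round(x, 2) for x in lr_pos_ci(120, 20, 180, 180)))
```

The interval width is the practical argument for the ≥500-program cohort: with few false positives in the table, the LR+ estimate is too uncertain to multiply into a score.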

Sources and Audit Trail

¹ Begley CG, Ellis LM. "Drug development: Raise standards for preclinical cancer research." Nature 483, 531–533 (2012). 6 of 53 landmark cancer biology studies reproduced. [VERIFY: Nature PMID 22460880]

² Vaux DL, Fidler F, Cumming G. "Replicates and repeats — what is the difference and is it significant?" EMBO Reports 13, 291–296 (2012). Documents SEM misuse as understating variability. See also: Halsey LG, Curran-Everett D, Vowler SL, Drummond GB. "The fickle P value generates irreproducible results." Nature Methods 12, 179–185 (2015). Broader empirical survey of statistical misuse in biology including SD/SEM conflation. [VERIFY: PubMed]

³ Freedman LP, Cockburn IM, Simcoe TS. "The Economics of Reproducibility in Preclinical Research." PLOS Biology 13(6), e1002165 (2015). Estimated $28 billion per year in non-reproducible preclinical research. [VERIFY: DOI 10.1371/journal.pbio.1002165]

⁴ Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. "Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy." NEJM 358, 252–260 (2008). 37 of 38 positive trials published; 22 of 36 negative trials not published. Effect size inflation: 32%. [VERIFY: DOI 10.1056/NEJMsa065779]

⁵ Goldacre B, et al. (COMPare project). "Compliance with requirement to report results on the EU Clinical Trials Register." BMJ 352, i637 (2016). Prospectively checked 67 trials in top journals; outcome discrepancies in ~58 of 67; switching directional toward significance. See also: Dwan K, Altman DG, et al. "Evidence for the selective reporting of analyses and discrepancies in clinical trials." PLOS Medicine 11(6), e1001666 (2014). Note: Anderson et al. NEJM 2015 addressed ClinicalTrials.gov results-posting compliance, not outcome switching directionality; that attribution in earlier versions of this document was incorrect and has been corrected here. [VERIFY: BMJ DOI 10.1136/bmj.i637]

⁶ ICLAC — International Cell Line Authentication Committee. Cross-Contaminated Cell Line Register. iclac.org/databases/cross-contaminations. Over 500 misidentified cell lines in published scientific literature. [Type A — public registry, continuously updated]

⁷ Biotechnology Innovation Organization (BIO), Informa Pharma Intelligence, QLS Advisors. "Clinical Development Success Rates and Contributing Factors 2011–2020." (2021). Phase 2 success rate overall: 28.9% (71.1% fail). Oncology Phase 2: 5.3%. [VERIFY: bio.org/clinical-development-success-rates]

⁸ Fanelli D. "Do Pressures to Publish Increase Scientists' Bias? An Empirical Support from US States Data." PLOS ONE 5(4), e10271 (2010). Positive result rate in US publications increased 22% from 1990 to 2007. [VERIFY: DOI 10.1371/journal.pone.0010271]