Texas Sharpshooter Fallacy

aka Clustering Illusion Fallacy · Post-Designation Fallacy · Data Dredging Fallacy

Cherry-picking a data cluster and building a story around it after the fact, ignoring everything that doesn't fit.

Illustration: Texas Sharpshooter Fallacy
WHAT IT IS

The glitch, explained plainly.

Imagine you throw a whole bag of darts at a wall with your eyes closed. Then you walk up, find the three darts that happen to be close together, draw a circle around them, and tell your friends you're an amazing dart player. You're not lying that those darts are close — but you're hiding all the other darts that went everywhere. That's what this fallacy is: picking the part that looks good and pretending the messy rest doesn't exist.

The Texas Sharpshooter Fallacy occurs when someone sifts through a large or complex dataset, identifies an apparent pattern or cluster after the fact, and then presents it as though it were a predicted or meaningful finding — while discarding all the surrounding data that fails to support the conclusion. Unlike forming a genuine hypothesis before examining evidence, this fallacy reverses the scientific process: the 'target' is drawn around whatever already looks interesting. It is especially pernicious because the patterns discovered are often real in a superficial statistical sense — they just aren't meaningful, having emerged from random variation in sufficiently large data. The fallacy is a core driver of the replication crisis in science, where researchers engage in 'HARKing' (Hypothesizing After Results are Known) and selectively report only the variables that yielded statistical significance.

SOUND FAMILIAR?

Where it shows up.

  1. 01 After a breakup, looking back and picking out all the 'red flags' supposedly missed, constructing a narrative of inevitable doom while forgetting all the genuinely positive signals.
  2. 02 Reading a daily horoscope that says 'you'll receive unexpected news,' and when any message arrives, thinking the horoscope was eerily accurate — ignoring the dozens of days it was wrong.
IN DIFFERENT DOMAINS

Where it shows up at work.

The same glitch looks different depending on the terrain. Finance, medicine, a relationship, a team — same mechanism, different costume.

Finance & investing

Investors and analysts frequently back-test trading strategies across decades of market data, then highlight the specific combination of indicators that would have yielded the highest returns — without accounting for the thousands of indicator combinations that failed. This leads to overfitted models that perform well historically but collapse on new data.

Medicine & diagnosis

Epidemiologists face constant pressure from communities reporting apparent disease clusters. When boundaries are drawn around observed cases after the fact rather than defined in advance, nearly any region can appear to have an abnormally high rate of illness, leading to false alarms, wasted resources, and public fear about non-existent environmental hazards.

HOW TO SPOT IT

Ask yourself…

  • Did I define what I was looking for before I looked at the data, or did I find this pattern after browsing?
  • How many other patterns, variables, or clusters did I examine and discard before landing on this one?
HOW TO DEFEND AGAINST IT

The playbook.

  • Pre-register hypotheses: Before analyzing data, write down exactly what you expect to find and what would count as evidence for or against it.
  • Apply the 'how many barns?' test: Ask how many variables, subgroups, or time windows you examined before finding this pattern. The more you looked, the less meaningful any single cluster is.
FAMOUS CASES

In history.

  • The Erin Brockovich case: Hexavalent chromium in a California town's water was blamed for a cancer cluster, resulting in a $333 million settlement, but subsequent epidemiological analysis showed the cancer rate was no higher than — and actually slightly below — the general population.
  • Nostradamus prophecy interpretations: Over 1,000 vague quatrains are retroactively matched to modern events by selecting the few that seem to fit while ignoring the vast majority that don't.
  • The 1980s Texas cancer cluster scare: Elevated cancer rates in certain Texas counties triggered public alarm and investigation, but rigorous statistical analysis found no consistent environmental cause — the clusters were consistent with random variation.
  • The Bible Code phenomenon: Researchers claimed to find hidden prophetic messages in the Torah by selecting specific letter-spacing patterns, ignoring the fact that similarly 'meaningful' patterns can be found in any sufficiently long text.
WHERE IT COMES FROM
Academic origin

Epidemiologist Seymour Grufferman, 1977. First appeared in a paper on Hodgkin's disease clustering (Cancer, 39: 1829–1833), using the sharpshooter metaphor to warn against post-hoc interpretation of disease clusters. The concept gained broader traction in critical thinking and statistics during the 1990s.

Evolutionary origin

Ancestral humans who over-detected patterns — seeing a predator in rustling grass even when it was just wind — survived more often than those who under-detected. This asymmetry in the cost of errors (Type I errors being cheaper than Type II errors) selected for brains that aggressively find clusters and assign meaning, even at the expense of frequent false positives.

IN AI SYSTEMS

How the machines inherit it.

Machine learning models are highly susceptible to this fallacy during training. When models are evaluated on the same data used to find patterns (overfitting), they 'draw the bullseye' around noise in the training set. Feature selection algorithms that test thousands of variables and report only the significant ones without correction replicate the fallacy at computational scale. In LLMs, cherry-picked benchmark results or selectively reported evaluation metrics can create a misleading picture of model capability.

Read more on Wikipedia
FREE FIELD ZINE

10 glitches quietly running your life.

A free field-zine PDF — ten cognitive glitches named, illustrated, with a defense move for each. Plus the weekly Glitch Report on Fridays — one bias named, two spotted in the wild, one defense move. Unsubscribe any time.

EXPLORE MORE

Related glitches.

LAUNCH PRICE

You read about it. Now drill it.

This page taught you the name. The deck turns the name into reflex. 1,100+ swipeable scenarios, 1,100+ defenses, 650+ detection prompts — spaced-repetition Swipe Deck, unlimited Spot-the-Bias Quiz, Defense Playbook, Pre-Flight, My Blindspots, Cheat Sheets, Field Guide e-book. $39.53$59.

Unlock the full kit

Everything below — yours forever. Pay once, use across every device.

Launch price — first 100 readers, $20 off. Auto-applied at checkout.
$59 $39.53
one-time payment · lifetime access
  • All interactive digital cards — search, filter, flip, shuffle on any device
  • Five training modes — Spot-the-Bias Quiz, Swipe Deck, Pre-Flight, Diagnose, Blindspots
  • Curated Lenses + Decision Templates + Defense Playbook
  • Printable Deck PDFs + Field Guide e-book + Cheat Sheets + Anki Export
  • Every future improvement, included
Get the full kit  $39.53

30-day refund · no questions asked

Unlock the full kit

Everything below — yours forever. Pay once, use across every device.

Launch price — first 100 readers, $20 off. Auto-applied at checkout.
$59 $39.53
one-time payment · lifetime access
  • All interactive digital cards — search, filter, flip, shuffle on any device
  • Five training modes — Spot-the-Bias Quiz, Swipe Deck, Pre-Flight, Diagnose, Blindspots
  • Curated Lenses + Decision Templates + Defense Playbook
  • Printable Deck PDFs + Field Guide e-book + Cheat Sheets + Anki Export
  • Every future improvement, included
Get the full kit  $39.53

30-day refund · no questions asked