Berkson's Paradox

aka Berkson's Bias · Berkson's Fallacy · Collider Bias

Two unrelated traits appearing inversely linked because the sample only includes people who have at least one of them.

Illustration: Berkson's Paradox
WHAT IT IS

The glitch, explained plainly.

Imagine you only visit restaurants that are either really tasty or really pretty inside — you'd never go somewhere that's both ugly and bad. After a while, you'd start thinking 'Hmm, the restaurants with great food always look ugly, and the pretty ones always have bad food.' But that's only because you never saw all the ugly restaurants with bad food — they're invisible to you because you'd never eat there. The pattern is fake; it's just because of how you picked where to eat.

Berkson's Paradox occurs when restricting observations to a pre-filtered subset of a population creates an illusory negative correlation between two traits that are actually independent or even positively correlated in the general population. The filtering mechanism acts as a 'collider' — a common effect caused by both variables — and conditioning on that collider mathematically induces a spurious inverse relationship. For example, among hospitalized patients, two unrelated diseases may appear negatively associated because a patient without one disease must have had the other disease (or something else) to be admitted in the first place. The paradox is ubiquitous in any setting where observation requires passing through a selection gate based on a combination of attributes, from college admissions to dating pools to celebrity fame.

SOUND FAMILIAR?

Where it shows up.

  1. A medical researcher studies patients at a hospital and finds that patients with diabetes appear less likely to have gallbladder disease. She begins writing a paper suggesting diabetes may be protective. She hasn't considered that both conditions independently cause hospitalization, and people with neither condition are absent from her sample entirely.
  2. Marcus reviews his university's student data and discovers that students with high SAT scores tend to have lower GPAs, and vice versa. He theorizes that standardized test ability somehow undermines classroom performance. He doesn't realize that students who scored low on both metrics were rejected, while those who excelled at both likely chose more prestigious schools.
  3. A venture capitalist notices that among the startups in her portfolio, the ones with the most brilliant technical founders tend to have weaker business strategies, while the ones with sharp business plans tend to have mediocre technology. She concludes that technical genius and business acumen are inversely related — without recognizing that startups lacking both qualities never made it past her investment screening.
  4. A film critic writes a column arguing that big-budget Hollywood movies are artistically worse than indie films, citing review scores. He doesn't account for the fact that low-budget films with poor artistic quality rarely get wide theatrical distribution or critical attention, so his visible sample of indie films is already filtered for quality in a way blockbusters are not.
  5. A data scientist building a credit-scoring model trains it on historical loan applicants and finds that applicants with high income tend to have lower credit scores. She uses this to weight income negatively. She hasn't recognized that the training data only includes people who applied for loans — individuals with both low income and low credit scores may have self-selected out of applying, creating a filtered sample.

IN DIFFERENT DOMAINS

Where it shows up at work.

The same glitch looks different depending on the terrain. Finance, medicine, a relationship, a team — same mechanism, different costume.

Finance & investing

Credit risk models trained on approved loan applicants may find spurious negative correlations between income and credit history quality, because applicants lacking both attributes were already filtered out during the approval process, distorting the perceived relationship between financial indicators.

Medicine & diagnosis

Hospital-based case-control studies frequently discover false protective associations between unrelated diseases because the sample only includes people sick enough to be hospitalized — the absence of one condition in a patient implies something else caused their admission, creating an artifactual inverse relationship.

Education & grading

Within selective universities, GPA and standardized test scores may appear negatively correlated among enrolled students, even though they are positively correlated in the general applicant population, because the admissions filter excluded students who scored low on both and the top scorers on both may have gone elsewhere.

Relationships

People commonly perceive that attractive partners tend to have worse personalities because their dating pool is filtered — they never date people who are both unattractive and unkind, creating a false trade-off between looks and character within their observed sample.

Tech & product

Recommendation algorithms trained on user engagement data may learn spurious negative correlations between content attributes (e.g., popularity vs. depth of engagement) because the training data only includes content that passed a visibility threshold, excluding items that scored low on both dimensions.

Workplace & hiring

Among employees at competitive firms, technical skill and interpersonal skill may appear inversely related because hiring filters screen out candidates who lack both, while candidates exceptionally strong in both may have been recruited by more prestigious competitors.

Politics & media

Media consumers may believe that politicians who are charismatic tend to be less substantive, and vice versa, because politicians who are neither charismatic nor substantive never gain enough visibility to enter the observer's awareness.

HOW TO SPOT IT

Ask yourself…

  • Am I drawing conclusions from a sample that was pre-filtered by some selection criterion that depends on the very variables I'm analyzing?
  • Who is missing from this dataset — what kinds of individuals or cases would never appear in my sample, and could their absence be creating a false pattern?
  • If I imagined adding back all the cases that were excluded by the selection process, would this apparent negative correlation still hold?
HOW TO DEFEND AGAINST IT

The playbook.

  • Always ask 'Who is missing from this sample?' before drawing conclusions about correlations — mentally reconstruct the full population including those excluded by the selection process.
  • Draw a simple causal diagram (directed acyclic graph) with your two variables and the selection criterion; if both variables point into the selection variable (a collider), conditioning on it will create a spurious association.
  • When you notice a surprising negative correlation between two desirable traits, treat it as a red flag for Berkson's Paradox rather than evidence of a genuine trade-off.
  • Seek out data from the general population rather than relying on convenience samples drawn from filtered institutions or personal experience.
  • Run a 'selection simulation': mentally add back the excluded group and ask whether the correlation would survive in the complete dataset.
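
The causal-diagram step can be sketched in a few lines (`is_collider` is a hypothetical helper written for this illustration, not a function from any causal-inference library; the edges encode Berkson's original hospital example):

```python
def is_collider(dag, node, a, b):
    """True if both a and b have a directed edge into `node`,
    i.e. conditioning on `node` can induce a spurious a-b association."""
    return node in dag.get(a, set()) and node in dag.get(b, set())

# Edges as parent -> set of children. Both diseases cause hospitalization,
# so hospitalization is a collider between them.
dag = {
    "diabetes": {"hospitalized"},
    "cholecystitis": {"hospitalized"},
}

print(is_collider(dag, "hospitalized", "diabetes", "cholecystitis"))
```

If the check returns True, any correlation measured within the conditioned sample (here, hospitalized patients) is suspect.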
FAMOUS CASES

In history.

  • Joseph Berkson's 1946 Mayo Clinic study found a spurious negative association between diabetes and cholecystitis in hospitalized patients, which was entirely an artifact of studying only hospitalized individuals rather than the general population.
  • During the COVID-19 pandemic, early hospital-based studies suggested smoking might be protective against severe COVID-19. This was later identified as Berkson's Paradox: both smoking-related illness and severe COVID-19 independently increase the chance of admission, so restricting the sample to hospitalized patients induces a spurious protective association.
  • The 'obesity paradox' in cardiovascular disease — where obese patients with heart disease appeared to have better survival outcomes — has been partly attributed to Berkson's bias arising from conditioning on hospitalization or disease diagnosis.
WHERE IT COMES FROM
Academic origin

Joseph Berkson, 1946. Formalized in his paper 'Limitations of the Application of Fourfold Table Analysis to Hospital Data', published in Biometrics Bulletin and based on his analysis of hospital admission data at the Mayo Clinic. The concept gained wider acceptance after David Sackett's 1979 work on biases in analytic research, which provided strong empirical evidence for the paradox.

Evolutionary origin

Ancestral environments demanded rapid pattern detection from whatever information was locally available. Humans evolved to draw conclusions from the samples they could directly observe — their tribe, their territory, their immediate experience — without the statistical sophistication to recognize that their sample might be systematically filtered. In small-group survival contexts, the available sample was often close enough to the full population that this shortcut worked reasonably well.

IN AI SYSTEMS

How the machines inherit it.

Machine learning models trained on non-representative datasets are highly susceptible to Berkson's Paradox. When training data is filtered through a selection process (e.g., only approved loan applicants, only users who engaged, only patients who were hospitalized), models learn spurious negative correlations between features that are actually independent or positively correlated in the general population. This leads to biased predictions, unfair algorithmic decisions in hiring and lending, and recommendation systems that systematically undervalue balanced content. Social media algorithms trained on viral or niche content may falsely learn that popularity and engagement depth are inversely related.

