Data Story
The Two Ledgers
A double-entry audit: where IMDB and TMDB disagree on the money.
film, imdb, tmdb, finance, data-quality, audit
Dataset scope
- Films: 7168
- Years: 1914–2024
- Budget pairs ($): 859
- Revenue pairs ($): 637
The chart focuses on titles with values on both ledgers. USD-only mode keeps comparisons honest without currency conversion.
Hypothesis
Disagreement in reported budgets and revenue is systematic — driven by missingness, older releases, and metadata practices — not pure noise.
Question: How consistent are financial metrics between IMDB and TMDB for the same titles?
Method: Compare IMDB raw budget/gross text against TMDB budget/revenue, measuring log-ratio gaps and outlier density by decade.
Prediction: Older decades and sparse metadata show larger gaps and more outliers.
Test: Compute log-gap distributions by decade and inspect extreme outliers.
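The method and test above can be sketched in a few lines of Python. This is a minimal illustration, not the story's actual pipeline: the function names, the `(year, imdb_value, tmdb_value)` tuple layout, the base-10 logarithm, and the tenfold outlier threshold are all assumptions made for the example.

```python
import math
from collections import defaultdict
from statistics import median


def decade_of(year):
    """Map a release year to the start of its decade, e.g. 1957 -> 1950."""
    return (year // 10) * 10


def log_gap(imdb_value, tmdb_value):
    """Return log10(tmdb / imdb): 0 means agreement, +/-1 a tenfold gap.

    Non-positive values cannot be compared on a log scale, so they yield None.
    """
    if imdb_value <= 0 or tmdb_value <= 0:
        return None
    return math.log10(tmdb_value / imdb_value)


def summarize_by_decade(pairs, outlier_threshold=1.0):
    """Group log gaps by decade and report size, typical gap, and outlier share.

    `pairs` is an iterable of (year, imdb_value, tmdb_value) tuples.
    `outlier_threshold` is an assumed cutoff on |log gap| (1.0 = tenfold).
    """
    gaps = defaultdict(list)
    for year, imdb_value, tmdb_value in pairs:
        gap = log_gap(imdb_value, tmdb_value)
        if gap is not None:
            gaps[decade_of(year)].append(gap)

    summary = {}
    for decade, values in sorted(gaps.items()):
        abs_gaps = [abs(g) for g in values]
        outliers = sum(1 for g in abs_gaps if g >= outlier_threshold)
        summary[decade] = {
            "n": len(values),
            "median_abs_gap": median(abs_gaps),
            "outlier_share": outliers / len(values),
        }
    return summary
```

A log ratio treats a twofold overstatement and a twofold understatement symmetrically, which is why it suits budgets that span several orders of magnitude. Comparing `median_abs_gap` and `outlier_share` across decades is one way to check the prediction that older decades disagree more.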
Narrative Arc
Act I
Two columns type themselves in: IMDB on the left, TMDB on the right.
Act II
Threads stretch between the numbers; most hold steady, some fray and snap.
Act III
Audit stamps reveal outliers — the cases where the ledgers tell different stories.
Datasets
- imdb.films
- tmdb.movies
- 14_two_ledgers.json
Limitations
- IMDB finances are text fields; currency and estimates can differ.
- TMDB budgets/revenue are incomplete and not always audited.
- USD-only is a cleaner slice, not a universal truth.
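The first and third limitations suggest a parsing step before any comparison: keep only IMDB text values that are plainly in US dollars, and skip everything else rather than convert it. The sketch below shows one way to do that. The function name `parse_usd` and the sample strings are hypothetical, and the real IMDB text may follow other conventions.

```python
import re

# Accept an optional "US" prefix, a dollar sign, then digits with thousands
# separators. Anything else (other currency symbols, prefixes such as "A$")
# is treated as non-USD and skipped rather than converted.
_USD_AMOUNT = re.compile(r"^\s*(?:US)?\$\s*([\d,]+)")


def parse_usd(raw):
    """Return the dollar amount in `raw` as an int, or None if not plain USD.

    Trailing notes such as "(estimated)" are ignored; this is an assumption
    about how the text field is written, not a documented format.
    """
    if not raw:
        return None
    match = _USD_AMOUNT.match(raw)
    if match is None:
        return None
    digits = match.group(1).replace(",", "")
    return int(digits) if digits else None
```

For example, `parse_usd("$25,000,000 (estimated)")` gives `25000000`, while `parse_usd("£3,000,000")` gives `None` and the title drops out of the USD-only slice. Dropping rather than converting keeps exchange-rate choices from becoming another source of disagreement between the ledgers.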