Alternative data · Federal Reserve Beige Book
~440 books · 4 lenses · withholding overlay from 1998 · never revised
Macro regime · latest book
Recession · next 12m
Inflation > 2% · next 12m
regime classifies the latest book · probabilities fit by logistic regression on the loaded panel vs NBER recession dates and CPI>2% (synthetic labels here; swap in real NBER/CPI for live use)
Four-lens history
raw diffusion score by book, 1970–present · anchored z kept in the data (raw_ladder + z_anchored) · grey bands = NBER recessions · scroll / drag to zoom
showing 1970 – 2026 · 440 books
District × lens — latest book
colour spread is the dispersion (σ)
Topic-attention router
word-share drift · latest book
What firms are saying
paraphrased, never verbatim · latest book
Labor — soft signal vs hard withholding
the blended panel · withholding starts 1998 (Daily Treasury Statement), so the hard line only overlays the recent era
showing 1970 – 2026
This screen turns the Federal Reserve's Beige Book — the qualitative survey of business conditions published eight times a year across the twelve Fed districts — into numbers you can track. It reads the words firms and contacts use and scores the direction they imply, then layers a few forecasts on top.
Each book is downloaded and split into district-by-topic pieces by a parser (BeautifulSoup). The wording is scored two ways: a fixed finance dictionary (Loughran–McDonald) for tone and uncertainty, and a hand-built "adjective ladder" for direction. The probability models are ordinary logistic regressions fit with statsmodels. A separate hard-data series, daily tax withholding from the U.S. Treasury's Daily Treasury Statement, is filtered and overlaid on the labor read from 1998 onward.
The adjective ladder maps intensity words to fixed points on a −2 to +2 scale: robust/strong = +2, grew/moderate = +1, modest/slight = +0.5, flat = 0, soft/declined = −1, collapsed = −2. A passage's score is the average of whatever ladder words appear in it; a district's score is the average across its sections; a lens is the average across districts. So +1.6 means "language averaging between moderate and robust," not a percentage — it's a tone reading, never a growth rate. Because real prose mixes intensities, scores cluster well inside ±2; a clean +2 needs strong language in every district with no qualifiers, which is rare.
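The averaging chain described above (passage, then district, then lens) can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names, the exact-word matching, and the choice to skip passages with no ladder words are all assumptions; only the ladder values come from the text.

```python
import re
from statistics import mean

# Ladder values as listed in the text; everything else here is illustrative.
LADDER = {
    "robust": 2.0, "strong": 2.0,
    "grew": 1.0, "moderate": 1.0,
    "modest": 0.5, "slight": 0.5,
    "flat": 0.0,
    "soft": -1.0, "declined": -1.0,
    "collapsed": -2.0,
}

def score_passage(text):
    """Average the ladder values of every ladder word found in the passage.

    Assumption: exact lowercase word matches only ("strong" counts,
    "strongly" does not). Returns None when no ladder word appears.
    """
    words = re.findall(r"[a-z]+", text.lower())
    hits = [LADDER[w] for w in words if w in LADDER]
    return mean(hits) if hits else None

def score_district(passages):
    """Average the passage scores for one district, skipping unscored passages."""
    scores = [s for s in map(score_passage, passages) if s is not None]
    return mean(scores) if scores else None

def score_lens(districts):
    """Average the district scores; `districts` maps district name -> list of passages."""
    scores = [s for s in map(score_district, districts.values()) if s is not None]
    return mean(scores) if scores else None
```

For example, "Demand was strong and hiring grew." contains one +2 word and one +1 word, so the passage scores +1.5. A single softer passage elsewhere in the district pulls the district average down, which is why readings cluster well inside ±2.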
The scores are never revised: a book's number uses only that book's text, so it can't change when later books arrive, provided the scoring rule itself stays frozen. The rule is versioned (v1); any future change runs as a new version beside the old one and never overwrites history.
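One way to picture the versioning guarantee is a write-once store keyed by book and rule version. The sketch below is hypothetical throughout: `RULES`, `register`, `ScoreHistory`, the book id, and the toy rules are invented for illustration and are not the dashboard's implementation.

```python
# Hypothetical write-once registry of scoring rules, keyed by version tag.
RULES = {}

def register(version, rule):
    """Add a rule under a new version tag; an existing tag can never be replaced."""
    if version in RULES:
        raise ValueError(f"rule {version!r} is frozen")
    RULES[version] = rule

class ScoreHistory:
    """Stores one score per (book_id, version); stored values are never overwritten."""

    def __init__(self):
        self._scores = {}

    def record(self, book_id, version, text):
        key = (book_id, version)
        if key not in self._scores:
            # First time this book is scored under this version: compute and keep.
            self._scores[key] = RULES[version](text)
        return self._scores[key]

# Toy stand-in rule, only for illustration (not the real ladder scoring).
register("v1", lambda text: text.lower().count("strong") - text.lower().count("weak"))
```

A later rule would be added with `register("v2", ...)` and recorded under its own key, so the v1 number for any book stays exactly what it was on the day it was first computed.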
Two forecasts sit on top of the descriptive scores, each a logistic regression. The probability of recession in the next 12 months is fit on the growth score, its change since the previous book, and the risk score, and is trained against the official NBER recession dates. The probability of inflation above 2% in the next 12 months is fit on the inflation score, its change since the previous book, and the bottlenecks score, and is trained against whether CPI inflation later exceeded 2%.
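To make the inputs concrete, here is a small sketch of how the three recession features could be assembled and turned into a probability once coefficients exist. The fitting step itself is omitted, and the intercept and coefficients in the usage note are made-up numbers, not fitted values; the function names are also assumptions.

```python
import math

def recession_features(growth, risk):
    """Build one feature row per book, starting from the second book.

    Each row is (growth score, change in growth since the previous book,
    risk score). `growth` and `risk` are equal-length lists ordered by date.
    """
    rows = []
    for i in range(1, len(growth)):
        rows.append((growth[i], growth[i] - growth[i - 1], risk[i]))
    return rows

def logistic_probability(intercept, coefs, row):
    """Apply the logistic link: p = 1 / (1 + exp(-(intercept + coefs . row)))."""
    z = intercept + sum(c * x for c, x in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-z))
```

With illustrative coefficients such as `intercept=-1.0` and `coefs=(-1.5, -2.0, 1.0)`, a book with a weaker growth score, a falling growth score, or a higher risk score yields a higher recession probability. The inflation model would follow the same shape with the inflation score, its change, and the bottlenecks score.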
These are genuine forecasts, a different kind of thing from the descriptive scores, and they carry real uncertainty. To avoid hindsight, any historical view of a probability is computed "as it would have read at the time" — each past point uses only the data and model coefficients available on that date.
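The "as it would have read at the time" rule amounts to a walk-forward loop: refit at each date using only what was known then. The sketch below is one possible reading, with hypothetical names. It also assumes that a 12-month-ahead label only becomes usable once its outcome window has closed, which the text does not spell out.

```python
def as_of_history(dates, features, labels, label_known_on, fit, predict, min_train=2):
    """Return (date, probability) pairs computed without hindsight.

    For each date t, the model is fit only on earlier rows whose label was
    already known on that date, then used to score row t. `fit` and `predict`
    are caller-supplied callables; `dates` and `label_known_on` must be
    comparable values (for example integers or ISO date strings).
    """
    out = []
    for t, date in enumerate(dates):
        # Earlier books whose outcome had already been observed by `date`.
        train = [i for i in range(t) if label_known_on[i] <= date]
        if len(train) < min_train:
            out.append((date, None))  # not enough history yet
            continue
        model = fit([features[i] for i in train], [labels[i] for i in train])
        out.append((date, predict(model, features[t])))
    return out
```

Plugging in a real logistic fit for `fit` and `predict` gives the historical probability line; a trivial base-rate model (the mean of the known labels) is enough to check that no future information leaks into any past point.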
The data updates with each Beige Book release. Because the book is never revised, every reading is a true point-in-time vintage — the number you actually had in hand that day. Use Download CSV to export the full history.
Figures in this preview build are synthetic placeholders for layout; the live deployment is driven by the real pipeline output.