Data insights, methodology deep dives, and industry analysis.
Pipeline data comparing ATS feeds to job boards across 907M records: Indeed's dedup rate hits 89%, while ATS feeds stay under 2%.
Six ML models enrich 907M job records with SOC codes, skills from 37K+ terms, salary predictions under 15% MAPE, and more.
907 million job records parsed to 95.1% state coverage and 92.5% city coverage. Here is how location parsing works at scale.
4.47 billion raw observations become 907 million unique jobs. A 79.7% dedup rate requires more than hashing.
Across 907M job records from 2022 to 2025, remote postings fell from 3.5% to 2.3% before rebounding, while hybrid postings doubled to 3.9M.
How a two-stage pipeline extracts 37,000+ skills from 907M job records with 84.6% coverage using Aho-Corasick matching and NLP filtering.
How we built a salary prediction model with MAPE under 15% using 50M observations, and why returning -1 beats a bad guess.
An analysis of salary disclosure rates across 865M job observations shows that transparency-law states reach 24.6% coverage vs. 21.4% in states without mandates.
A field-by-field breakdown of 907M enriched job records across 22 sources, with 82 fields and real coverage rates by source and vintage.
Raw job postings give you 5-7 fields. Enrichment produces 85+. Real coverage numbers from 907M records show a 3.8x seniority multiplier.
Get 5,000 enriched records tailored to your criteria. Free, no commitment.