Data Quality Metrics
Transparent metrics on scraping accuracy, data freshness, validation processes, and quality control procedures.
Last updated: June 1, 2026
Scraping Frequency
RateAPI maintains a continuous scraping schedule to ensure data freshness while respecting source website resources.
Validation Process
Every scraped rate passes through multiple validation layers before being served via the API.
Schema Validation
All extracted data must conform to our schema: rate (number), apr (number or null), points (number), term (months), productType (canonical category).
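As an illustration only, the schema check above could be sketched as follows. The function name, the set of canonical categories, and the rule that term must be a positive whole number are assumptions made for this example, not RateAPI's actual implementation.

```python
from numbers import Real

# Hypothetical canonical categories; the real list is not stated in this document.
CANONICAL_PRODUCT_TYPES = {"mortgage", "auto_loan", "heloc", "personal_loan"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def validate_schema(record):
    """Return a list of schema errors; an empty list means the record conforms."""
    errors = []
    if not _is_number(record.get("rate")):
        errors.append("rate must be a number")
    apr = record.get("apr")
    # A missing apr key is treated like null here (an assumption).
    if apr is not None and not _is_number(apr):
        errors.append("apr must be a number or null")
    if not _is_number(record.get("points")):
        errors.append("points must be a number")
    term = record.get("term")
    # Assumption: term is a positive whole number of months.
    if not (isinstance(term, int) and not isinstance(term, bool) and term > 0):
        errors.append("term must be a positive whole number of months")
    product_type = record.get("productType")
    if not isinstance(product_type, str) or product_type not in CANONICAL_PRODUCT_TYPES:
        errors.append("productType must be a canonical category")
    return errors
```

A record passes when the returned list is empty; otherwise each entry names the field that failed.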
Range Checks
Rates must fall within expected ranges: mortgage rates 3-12%, auto loans 2-25%, HELOCs 5-15%, personal loans 5-36%. Out-of-range values are flagged.
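A minimal sketch of the range check, using the ranges listed above. The category keys, the function name, and the choice to treat the bounds as inclusive are assumptions for illustration.

```python
# Expected ranges in percent, taken from the text above.
# The dictionary keys are hypothetical category names.
EXPECTED_RANGES = {
    "mortgage": (3.0, 12.0),
    "auto_loan": (2.0, 25.0),
    "heloc": (5.0, 15.0),
    "personal_loan": (5.0, 36.0),
}


def is_out_of_range(product_type, rate):
    """Return True when the rate should be flagged as out of range."""
    # An unknown product_type raises KeyError rather than passing silently.
    low, high = EXPECTED_RANGES[product_type]
    # Assumption: both bounds are inclusive.
    return not (low <= rate <= high)
```

For example, a 12.5% mortgage rate would be flagged, while a 5.0% HELOC rate sits on the boundary and passes under the inclusive-bounds assumption.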
Consistency Checks
APR must be greater than or equal to rate. 15-year rates should be lower than 30-year rates. Jumbo rates should be near conforming rates.
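The three consistency rules above could be expressed roughly like this. The function names are hypothetical, and because the text does not quantify "near", the 1.0 percentage point tolerance for jumbo rates is purely an assumed placeholder.

```python
def apr_at_least_rate(rate, apr):
    """APR must be greater than or equal to the rate."""
    # A null APR cannot be compared, so it passes this check (an assumption).
    return apr is None or apr >= rate


def term_ordering_ok(rate_15y, rate_30y):
    """15-year rates are expected to be lower than 30-year rates."""
    return rate_15y < rate_30y


def jumbo_near_conforming(jumbo, conforming, tolerance=1.0):
    """Jumbo rates are expected to be near conforming rates.

    tolerance is in percentage points; 1.0 is an assumed value.
    """
    return abs(jumbo - conforming) <= tolerance
```

Since the text phrases the term and jumbo rules with "should", a failed check of those two would more plausibly raise a warning for review than reject the data outright.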
Historical Comparison
Changes exceeding 50 basis points in 24 hours trigger review. Complete rate disappearance triggers investigation.
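One basis point is 0.01 percentage points, so the 50 basis point threshold corresponds to a move of 0.50 percentage points. The sketch below assumes both observations fall within the same 24-hour window and that a disappeared rate is represented as None; the function name and return labels are hypothetical.

```python
REVIEW_THRESHOLD_BPS = 50  # from the text above


def classify_change(previous_rate, current_rate):
    """Classify a rate change between two observations taken within 24 hours.

    Rates are in percent. Returns "investigate", "review", or "ok".
    """
    if current_rate is None:
        # The rate has disappeared entirely.
        return "investigate"
    # 1 percentage point = 100 basis points; round to absorb float noise.
    change_bps = round(abs(current_rate - previous_rate) * 100, 6)
    # "Exceeding" is read as strictly greater than the threshold.
    return "review" if change_bps > REVIEW_THRESHOLD_BPS else "ok"
```

Under this reading, a move from 6.50% to 7.10% (60 basis points) triggers review, while a move from 6.50% to 7.00% (exactly 50 basis points) does not.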
Cross-Source Verification
When possible, rates are verified against multiple pages within the same institution's website.
Error Handling
When errors occur, we follow a systematic process to minimize impact on data quality.
Scrape Failures
If a scrape fails, we retry with increasing backoff intervals (1h, 4h, 12h). After 3 failures, the source is marked for manual review. Previous valid data is retained with a staleness flag.
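The retry schedule described above might be modeled as shown here. This sketch assumes the three delays apply to retries after the initial failure and that "3 failures" means three failed retries; the text leaves that detail open, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Delays before the first, second, and third retry, from the text above.
RETRY_DELAYS = [timedelta(hours=1), timedelta(hours=4), timedelta(hours=12)]


def next_action(failed_retries, failed_at):
    """Decide what to do after a failed scrape attempt.

    failed_retries: number of retries that have already failed
                    (0 right after the initial failure).
    failed_at: datetime of the most recent failure.
    Returns ("retry", when) or ("manual_review", None).
    """
    if failed_retries >= len(RETRY_DELAYS):
        # All scheduled retries are exhausted.
        return ("manual_review", None)
    return ("retry", failed_at + RETRY_DELAYS[failed_retries])
```

Throughout this process the previously valid data would stay in place with its staleness flag, so the schedule only governs when the next attempt happens.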
Parsing Errors
When a change in page structure breaks a parser, we detect it through empty or malformed results. AI-assisted extraction then attempts recovery, and human review follows if needed.
Data Anomalies
Anomalous data (rate jumps, impossible values) is quarantined and excluded from API responses until reviewed. We never serve unverified anomalous data.
Historical Accuracy
We track accuracy by drawing random samples of scraped rates and manually verifying them against the source websites. Discrepancies are investigated and corrected.
Quality Guarantees
No Stale Data
Data older than 7 days is marked with a staleness flag. Data older than 14 days is excluded from default API responses.
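The two age thresholds above can be sketched as a simple classification. The function name and the "fresh", "stale", and "excluded" labels are assumptions for illustration, and "older than" is read as strictly greater than the threshold.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=7)     # flagged as stale beyond this age
EXCLUDE_AFTER = timedelta(days=14)  # dropped from default responses beyond this age


def freshness(observed_at, now):
    """Classify a rate by the age of its observation."""
    age = now - observed_at
    if age > EXCLUDE_AFTER:
        return "excluded"
    if age > STALE_AFTER:
        return "stale"
    return "fresh"


# Example: a rate observed 10 days ago is served with a staleness flag.
now = datetime(2026, 6, 1, tzinfo=timezone.utc)
status = freshness(now - timedelta(days=10), now)
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the check easy to test.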
Source Attribution
Every rate includes the source URL where it was observed, allowing independent verification.
Timestamp Transparency
Every response includes an observed_at timestamp showing exactly when the data was collected.
Correction Lineage
When corrections are made, we maintain full audit history. Original values are preserved with correction reason codes.
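One way to picture such an audit history is an append-only list of corrections attached to each rate. The class names, field names, and the "PARSE_ERROR" reason code below are all hypothetical; the actual storage model and reason codes are not described in this document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Correction:
    old_value: float
    new_value: float
    reason_code: str   # hypothetical code, e.g. "PARSE_ERROR"
    corrected_at: str  # ISO 8601 timestamp


@dataclass
class RateRecord:
    rate: float
    history: list = field(default_factory=list)

    def correct(self, new_value, reason_code, corrected_at):
        # Record the old value before overwriting it, so nothing is lost.
        self.history.append(
            Correction(self.rate, new_value, reason_code, corrected_at)
        )
        self.rate = new_value

    @property
    def original_value(self):
        # The first correction holds the value as originally scraped.
        return self.history[0].old_value if self.history else self.rate
```

Because corrections are only ever appended, the full sequence of values and reasons remains available for later review.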
Questions About Data Quality?
We're committed to transparency. If you have questions about our data collection or find discrepancies, please reach out.