Data Methodology

Last reviewed May 2026

Beta notice. Sovereign Industry Report is in active development. Data coverage, classification accuracy, and feature availability will improve over time. Do not rely on this platform as your sole source for compliance or legal decisions. See our Terms of Service and Privacy Policy.

What we track

S.I.R. monitors legislation directly relevant to seven regulated industries across all 50 U.S. states: e-liquid / vapor products, hemp / CBD, functional mushrooms, recreational marijuana, medicinal marijuana, kratom, and peptides.

We track bills at the state level only. Federal legislation and municipal ordinances are outside current scope. Active legislative sessions are prioritized; bills from prior sessions are retained but may not receive ongoing enrichment.

Data sources

Bill data comes from two external APIs, both free-tier:

LegiScan: Primary source for e-liquid, hemp, mushrooms, recreational marijuana, and medicinal marijuana. LegiScan provides bill text, sponsors, vote records, committee assignments, and action history. We operate within the LegiScan free-tier API quota (~30,000 requests/month). Bill text is fetched on initial discovery and periodically refreshed when status changes.
Open States: Supplementary source for kratom and peptides. Open States aggregates legislative data from official state APIs and websites. We stay within the free-tier rate limit (250 requests/day, 10/minute). Open States coverage is strong for most states but can lag official publication by 24–48 hours during session peaks.
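As an illustration of how a client can stay inside the Open States limits quoted above (10 requests/minute and 250 requests/day), here is a minimal sliding-window sketch. The class name `RateLimiter` and its interface are hypothetical and are not taken from the S.I.R. codebase; only the two limits come from the text.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter for several (max_calls, window_seconds) limits.

    Hypothetical helper for illustration; not S.I.R.'s actual implementation.
    """

    def __init__(self, limits, clock=time.monotonic, sleep=time.sleep):
        self.limits = limits      # e.g. [(10, 60), (250, 86400)]
        self.calls = deque()      # timestamps of earlier calls, oldest first
        self.clock = clock
        self.sleep = sleep

    def wait_time(self):
        """Seconds to wait until one more call fits under every limit."""
        now = self.clock()
        longest = max(window for _, window in self.limits)
        # Forget calls that are older than the longest window.
        while self.calls and now - self.calls[0] >= longest:
            self.calls.popleft()
        wait = 0.0
        for max_calls, window in self.limits:
            recent = [t for t in self.calls if now - t < window]
            if len(recent) >= max_calls:
                # This call must age out before another request is allowed.
                blocker = recent[len(recent) - max_calls]
                wait = max(wait, blocker + window - now)
        return wait

    def acquire(self):
        """Block (via sleep) until a request is allowed, then record it."""
        wait = self.wait_time()
        if wait > 0:
            self.sleep(wait)
        self.calls.append(self.clock())


# Limits quoted in the text for the Open States free tier.
open_states_limiter = RateLimiter([(10, 60), (250, 86400)])
```

Calling `open_states_limiter.acquire()` before each request would pause the caller whenever either limit is about to be exceeded. The clock and sleep functions are injectable so the behavior can be checked without real waiting.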

Both APIs are third-party aggregators — their data is derived from official state legislature websites, but may lag official publication by hours to days. S.I.R. does not scrape official state websites directly.

Bill inclusion rules

Not every bill touching a relevant keyword enters the platform. Inclusion is governed by a two-pass keyword filter:

Domain match — the bill title or description must contain a term clearly associated with the industry (e.g. vapor, kratom, mitragyna, cannabis).
Legislative relevance — a secondary filter removes bills that only mention the industry incidentally (e.g. broad budget bills that list vapor tax revenue as one of dozens of line items).

Bills that pass both filters are fetched, stored in our repository, and made available on the platform. Bills that fail either filter are ignored; no partial records are created.
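The two-pass filter described above could look roughly like the following sketch. The term lists, the budget markers, and the rule used for "incidental" mentions (a budget-style title that does not itself name the industry) are all assumptions made for illustration; the actual S.I.R. vocabularies and relevance logic are not documented here.

```python
import re

# Hypothetical vocabularies; the real lists are not published on this page.
DOMAIN_TERMS = {
    "vapor": ["vapor", "e-liquid"],
    "kratom": ["kratom", "mitragyna"],
    "cannabis": ["cannabis", "marijuana"],
}
# Hypothetical markers of broad budget bills.
BUDGET_MARKERS = ["appropriation", "omnibus", "general fund"]


def domain_match(text):
    """Pass 1: return the set of industries whose terms appear in the text."""
    lowered = text.lower()
    return {
        industry
        for industry, terms in DOMAIN_TERMS.items()
        if any(re.search(r"\b" + re.escape(term), lowered) for term in terms)
    }


def should_include(title, description):
    """Apply both passes; a bill failing either one is ignored entirely."""
    if not domain_match(f"{title} {description}"):
        return False  # pass 1 failed: no industry term at all
    budget_like = any(marker in title.lower() for marker in BUDGET_MARKERS)
    if budget_like and not domain_match(title):
        return False  # pass 2 failed: industry is only mentioned in passing
    return True
```

For example, `should_include("An act regulating kratom sales", "Sets a minimum purchase age")` passes both filters, while a general appropriations act whose description merely lists vapor tax revenue is rejected by the second pass.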

Bills are deduplicated before storage. If LegiScan and Open States return the same bill, only one record is kept. The source of record is preserved in the bill's metadata.
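A minimal sketch of cross-source deduplication follows. It assumes each record carries `state`, `session`, `number`, and `source` fields and that the first record seen wins; these field names and the tie-breaking rule are illustrative assumptions, since the text does not say which source takes precedence.

```python
import re


def bill_key(bill):
    """Build a source-independent identity: (state, session, bill number).

    "HB 123", "H.B. 0123" and "hb123" all normalize to "HB123".
    """
    cleaned = re.sub(r"[^A-Z0-9]", "", bill["number"].upper())
    match = re.fullmatch(r"([A-Z]+)0*(\d+)", cleaned)
    number = f"{match.group(1)}{match.group(2)}" if match else cleaned
    return (bill["state"].upper(), bill["session"], number)


def deduplicate(bills):
    """Keep one record per key; the kept record retains its own `source`."""
    kept = {}
    for bill in bills:
        kept.setdefault(bill_key(bill), bill)  # first record seen wins (assumption)
    return list(kept.values())


records = [
    {"state": "TX", "session": "2025", "number": "HB 123", "source": "legiscan"},
    {"state": "tx", "session": "2025", "number": "H.B. 0123", "source": "openstates"},
    {"state": "TX", "session": "2025", "number": "SB 5", "source": "openstates"},
]
unique = deduplicate(records)
```

Here `unique` holds two records, and the surviving HB 123 entry still names the source it came from in its `source` field.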

Bill status classification

Each bill is assigned one of seven normalized statuses regardless of how the source API labels it:

Status	Meaning
introduced	Filed; not yet assigned to committee
in_committee	Referred to or actively in committee
passed_one_chamber	Passed House or Senate, awaiting the other
passed_both_chambers	Passed both chambers, awaiting governor
signed	Signed into law by the governor
vetoed	Vetoed by the governor
dead	Failed, tabled, or session ended without passage

Status is mapped from the source API's progress codes at ingest time. Ambiguous statuses default to in_committee. Status changes detected between pipeline runs trigger bill alert emails for subscribed users.
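The mapping and change detection can be sketched as below. The source labels on the left of `STATUS_MAP` are illustrative stand-ins, not the actual progress codes returned by LegiScan or Open States; only the seven normalized statuses and the `in_committee` default come from the text. Whether newly discovered bills also count as a "change" is not stated, so this sketch leaves them out.

```python
# Illustrative source labels (assumption) mapped to the seven normalized statuses.
STATUS_MAP = {
    "introduced": "introduced",
    "referred": "in_committee",
    "engrossed": "passed_one_chamber",
    "enrolled": "passed_both_chambers",
    "signed": "signed",
    "vetoed": "vetoed",
    "failed": "dead",
}
DEFAULT_STATUS = "in_committee"  # ambiguous or unknown labels fall back here


def normalize_status(source_label):
    """Map a source label to a normalized status, defaulting when unsure."""
    key = (source_label or "").strip().lower()
    return STATUS_MAP.get(key, DEFAULT_STATUS)


def status_changes(previous, current):
    """Compare two runs ({bill_id: status}) and return {bill_id: (old, new)}.

    Bills that appear only in `current` are skipped in this sketch.
    """
    return {
        bill_id: (previous[bill_id], new_status)
        for bill_id, new_status in current.items()
        if bill_id in previous and previous[bill_id] != new_status
    }
```

The dictionary returned by `status_changes` is the kind of input an alerting step would need in order to email subscribers about each bill that moved.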

AI enrichment

Bill text is often long, technical, and written in legislative language. To make bills more accessible, S.I.R. runs an optional AI enrichment pass during each pipeline run, after bills have been fetched and normalized, using Google Gemini 2.5 Flash-Lite (free tier).

AI enrichment produces:

A plain-English summary of the bill's intent and key provisions
An industry impact classification (positive / negative / neutral / mixed)
A brief rationale for the impact classification

What AI does not do:

AI does not determine whether a bill is legally binding or enforceable
AI does not provide legal advice or compliance guidance
AI-generated summaries may contain errors, omissions, or mischaracterizations
Thin bills (no retrievable text) receive no AI enrichment and are flagged as such

AI enrichment is a convenience layer, not an authoritative interpretation. Always verify against the official bill text before acting on any S.I.R. summary.

To stay within free-tier rate limits, enrichment waits 12 seconds between bills (at most five bills per minute). Not every bill is enriched on every pipeline run; priority is given to newly introduced and recently status-changed bills.
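The queueing behavior described above can be sketched as follows. The field names (`text`, `is_new`, `status_changed`), the per-run `limit`, and the `enrich` callable are hypothetical; the 12-second delay, the skipping of thin bills, and the priority given to new and status-changed bills come from the text.

```python
import time

DELAY_SECONDS = 12  # pause between bills, as stated in the text


def priority(bill):
    """New or recently status-changed bills go first; everything else after."""
    return 0 if bill.get("is_new") or bill.get("status_changed") else 1


def enrichment_queue(bills, limit):
    """Select up to `limit` bills for this run; thin bills (no text) are skipped."""
    candidates = [bill for bill in bills if bill.get("text")]
    return sorted(candidates, key=priority)[:limit]  # stable sort keeps input order


def enrich_all(queue, enrich, sleep=time.sleep):
    """Call `enrich(bill)` for each queued bill, pausing between calls."""
    results = {}
    for index, bill in enumerate(queue):
        if index > 0:
            sleep(DELAY_SECONDS)
        results[bill["id"]] = enrich(bill)
    return results
```

At 12 seconds per bill a run can process five bills per minute, so a cap such as `limit` is one way to keep a single run from taking arbitrarily long.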

Human overrides

S.I.R. administrators can manually override any AI-generated field (summary, impact classification, impact rationale) or bill status for any bill. Overrides are stored separately from the pipeline data and take precedence over automated values.

Overrides are used to correct factual errors in AI output, update bills whose source-API status has not yet been refreshed, or flag bills that are clearly relevant but were missed by the keyword filters.

Bills with active overrides display the corrected values. The override itself is not surfaced to end users — only the corrected data is shown.
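Override precedence can be illustrated with a small merge function. The field names below are assumptions based on the fields listed above (summary, impact classification, impact rationale, status); the actual storage layout is not described on this page.

```python
# Fields an administrator may override (names are hypothetical).
OVERRIDABLE_FIELDS = ("summary", "impact", "impact_rationale", "status")


def apply_overrides(pipeline_bill, override):
    """Return the bill as shown to users: override values win where present.

    The pipeline record is left untouched, and nothing in the result marks
    which values came from an override.
    """
    merged = dict(pipeline_bill)
    for field in OVERRIDABLE_FIELDS:
        if override and override.get(field) is not None:
            merged[field] = override[field]
    return merged


pipeline_bill = {
    "id": "TX-HB123",
    "status": "in_committee",
    "summary": "AI-generated summary",
    "impact": "neutral",
}
override = {"status": "signed", "summary": None}
displayed = apply_overrides(pipeline_bill, override)
```

In this example `displayed["status"]` is `"signed"` while the summary and impact fall through from the pipeline data, because the override leaves them unset.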

Update schedule

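The data pipeline runs automatically once per day at 11:00 AM UTC via GitHub Actions. That is 7 AM Eastern / 6 AM Central during daylight saving time, and one hour earlier by local clock (6 AM Eastern / 5 AM Central) during standard time. Each run: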

Fetches new and updated bills from LegiScan and Open States
Merges, deduplicates, and normalizes bill records
Applies AI enrichment to unenriched or newly-changed bills
Updates state overview summaries and heatmap data
Rebuilds RSS feeds
Commits updated JSON to the repository; triggers a site redeploy
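GitHub Actions evaluates scheduled workflows in UTC, so the local wall-clock time of the 11:00 UTC run shifts by one hour when daylight saving time starts or ends. The short sketch below works this out with fixed UTC offsets; the function name is hypothetical and the offsets are the standard ones for U.S. Eastern and Central time.

```python
from datetime import datetime, timedelta, timezone

# Standard UTC offsets in hours for the Eastern and Central zones.
OFFSETS_HOURS = {"EDT": -4, "EST": -5, "CDT": -5, "CST": -6}


def local_run_time(zone_abbrev, utc_hour=11, utc_minute=0):
    """Convert the daily UTC run time to local HH:MM for a fixed-offset zone."""
    # The calendar date is arbitrary; only the time of day matters here.
    run_utc = datetime(2026, 1, 1, utc_hour, utc_minute, tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=OFFSETS_HOURS[zone_abbrev]))
    return run_utc.astimezone(local_zone).strftime("%H:%M")


for zone in ("EDT", "EST", "CDT", "CST"):
    print(zone, local_run_time(zone))
```

The loop prints 07:00 for EDT, 06:00 for EST and CDT, and 05:00 for CST.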

The pipeline run timestamp and bill count are embedded in every deployed build (visible on bill detail pages as “Last pipeline run”). If the pipeline fails, the previously deployed data remains live and the team is alerted.
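One common way to get the "previous data stays live on failure" behavior is to replace the published file only after a run has fully succeeded. The sketch below shows that pattern under stated assumptions: the function name, the JSON field names (`last_pipeline_run`, `bill_count`, `bills`), and the single-file layout are hypothetical, and a real pipeline would also alert the team in the failure branch.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def publish(path, build_records):
    """Run `build_records()` and replace the JSON at `path` only on success.

    Returns True if new data was published, False if the old file was kept.
    """
    try:
        records = build_records()
    except Exception:
        return False  # previous file is untouched; alerting would go here

    payload = {
        "last_pipeline_run": datetime.now(timezone.utc).isoformat(),
        "bill_count": len(records),
        "bills": records,
    }
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, path)
    return True
```

Because `os.replace` swaps the file in a single step, readers see either the old complete file or the new complete file, never a half-written one.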

Manual pipeline runs can be triggered via GitHub Actions workflow dispatch for urgent updates between scheduled runs.

Reporting errors

If you find a bill that is missing, miscategorized, or contains an incorrect AI summary, use the feedback button on any bill detail page or contact us through the About page. Corrections are reviewed by a human and applied as overrides within one business day.

For questions about our data practices or privacy, see our Privacy Policy and Terms of Service.