New Big Local News dataset tracks the millions pouring into California campaigns
The Big Local News site now offers a refined dataset that can help local journalists track the money flooding California campaigns ahead of elections this November.
The new project, recently added to our data portal, features dozens of data files derived from CAL-ACCESS, the jumbled, dirty and difficult government database that tracks campaign and lobbying activity in Sacramento. It will update daily with the latest filings.
The records can be used to investigate the money powering campaigns for the statehouse, plus the hundreds of millions being spent this fall to influence voters on high-stakes ballot questions about sports gambling, flavored tobacco and abortion.
The data-processing pipeline was first developed by the California Civic Data Coalition, an open-source effort that drew hundreds of contributions from developers and journalists at news organizations around the world.
Today’s release includes files that were never previously documented, including tidied-up cuts of the most important campaign-finance filings. Joins are handled, fields are cleaned and amended filings are excluded.
The new release includes the following key files. A full list is available on the coalition’s site.
| File | Description |
|---|---|
| Form460Filing | Periodic disclosure reports filed by recipient committees |
| Form460ScheduleAItem | Itemized monetary contributions |
| Form460ScheduleCItem | Itemized non-monetary contributions |
| Form496Filing | Late independent expenditure reports |
| Form497Filing | Late contribution reports |
With mail-in balloting only a month away, now is the time to dig in. Register for an account at biglocalnews.org and you can quickly download comma-delimited data from the “California campaign finance data” project.
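For a file downloaded by hand, a minimal sketch of loading it with pandas might look like the following. The column names are the ones that appear in the Form460Filing file; the `load_form460` helper and the two sample rows are hypothetical stand-ins, not real filings, so swap in the path to your own download.

```python
import io

import pandas as pd

# Made-up rows standing in for a downloaded Form460Filing.csv (not real filings)
SAMPLE_CSV = """filing_id,filer_id,date_filed,filer_lastname,total_contributions
1001,501,2022-07-29,EXAMPLE COMMITTEE A,2500.00
1002,502,not-a-date,EXAMPLE COMMITTEE B,100.50
"""


def load_form460(source):
    """Read a Form460Filing CSV from a file path or file-like object."""
    # Keep identifiers as strings so they are never treated as numbers
    df = pd.read_csv(source, dtype={"filing_id": str, "filer_id": str})
    # Unparseable dates become NaT instead of raising an error
    df["date_filed"] = pd.to_datetime(df["date_filed"], errors="coerce")
    return df


# Replace io.StringIO(SAMPLE_CSV) with the path to your downloaded file
sample_df = load_form460(io.StringIO(SAMPLE_CSV))
print(sample_df.dtypes)
```

Reading the ID columns as strings is a cautious choice here: it keeps them from being reinterpreted as integers or floats when you later join files or export the data.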
The records are also easily available using the Big Local News API and Python client. Here’s a snippet showing how, once you register an API key, you can rapidly read in a file.
    # Import the necessary utilities
    # You will need to `pip install bln pandas`
    import bln
    import pandas as pd

    # Connect `pandas` with our API
    bln.pandas.register(pd)

    # Set the `project_id` to our new campaign-finance project
    project_id = "UHJvamVjdDo2MDVjNzdiYS0wODI4LTRlOTEtOGM3OC03ZjA4NGI2ZDEwZWE="

    # Put your API key here
    your_api_token = "<put your token here>"

    # Read in the dataframe from the Big Local News API
    df = pd.read_bln(project_id, "Form460Filing.csv", your_api_token)

    # Parse the date column
    df['date_filed'] = pd.to_datetime(df.date_filed, errors="coerce")

    # Filter down to key values from Form 460 filings
    trimmed_df = df[[
        'filing_id',
        'filer_id',
        'date_filed',
        'filer_lastname',
        'statement_type',
        'from_date',
        'thru_date',
        'total_contributions',
        'total_expenditures_made',
        'ending_cash_balance',
    ]]

    # Exclude future-dated filings and show the most recent to land
    (
        trimmed_df[trimmed_df.date_filed < "2030-01-01"]
        .sort_values("date_filed", ascending=False)
        .head(5)
    )
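Once the dataframe is loaded, one possible next step is to rank committees by the cash reported on their most recent filing. The sketch below assumes a dataframe with the `filer_id`, `filer_lastname`, `date_filed` and `ending_cash_balance` columns shown above, with `date_filed` already parsed. The `latest_cash_on_hand` helper and the demo rows are hypothetical and are not part of the `bln` client.

```python
import pandas as pd


def latest_cash_on_hand(df, n=5):
    """Return the n filers with the largest balance on their latest filing."""
    # Drop rows whose filing date could not be parsed
    dated = df.dropna(subset=["date_filed"])
    # Keep only the most recent filing for each filer
    latest = dated.sort_values("date_filed").drop_duplicates("filer_id", keep="last")
    # Rank those filings by the reported ending cash balance
    return latest.nlargest(n, "ending_cash_balance")[
        ["filer_id", "filer_lastname", "date_filed", "ending_cash_balance"]
    ]


# Made-up rows for illustration only (not real filings)
demo_df = pd.DataFrame(
    {
        "filer_id": ["501", "501", "502", "503"],
        "filer_lastname": ["EXAMPLE A", "EXAMPLE A", "EXAMPLE B", "EXAMPLE C"],
        "date_filed": pd.to_datetime(
            ["2022-04-28", "2022-07-29", "2022-07-29", None]
        ),
        "ending_cash_balance": [1000.0, 4000.0, 9000.0, 50000.0],
    }
)

result = latest_cash_on_hand(demo_df, n=2)
print(result)
```

With real data you would pass in `trimmed_df` after applying the future-date filter, so that a mistyped filing date does not get picked as a committee's latest report.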
All of the code that powers the project is available as open-source software. If you need help understanding the data or getting a story done, reach out to [email protected]. We’re here to help.
About Big Local News
From its base at Stanford University, Big Local News gathers data, builds tools and collaborates with reporters to produce journalism that makes an impact. Its website at biglocalnews.org offers a free archiving service for journalists to store and share data. Learn more by visiting our about page.