Three easy ways to automatically upload data to Big Local News
The biglocalnews.org website offers a free archiving service that allows any journalist to store and share massive datasets. Log in. Create a project. Drag and drop a couple files. Boom, you’re done.
That’s pretty handy. But even a few minutes of manual work can be an obstacle.
That’s where the Big Local News API comes in.
With a little computer code, it allows you to automatically upload data, of any shape or size, on whatever schedule you’d like, for free. Learn how to use it and you can say goodbye to GitHub file limits, hacky workarounds and costly cloud storage. Then say hello to the perfect backend for any web-scraping data hoarder.
That’s more than handy. That’s special.
To illustrate how easy it is, here are examples of three techniques for uploading any dataset into our system.
1. The bln Python client
All of the Big Local News API methods require the same three inputs:
- An API key
- The unique identifier of a project
- The file you aim to upload
Any registered user at biglocalnews.org can acquire an API key by logging in and visiting the “API keys” section of the settings menu. Click “Generate Key” and then copy the entry that’s created.
You can find the unique identifier of your Big Local News project by selecting it from the “My Projects” page and clicking on the “Manage” tab. The long string printed underneath the “Access via our API” entry is what you want. You can also find it at the end of the URL of your project’s page.
Once you’ve got all that, you’re ready to plug into our system. One method is the Python wrapper on the API. It can be easily installed from the Python Package Index.
pipenv install bln
Now you’re able to import our client into your code.
from bln import Client
Initialize it with your API key.
client = Client(token="your_api_key")
If the token is set to the BLN_API_TOKEN environment variable, it doesn’t have to be provided.
client = Client()
And upload any comma-delimited file on your filesystem.
client.upload_file(your_project_id, "./your-data/your-file.csv")
That’s all it takes. Run the code and your file should show up in a matter of minutes. The many other methods supported by the API are covered in our documentation.
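If you have a whole folder of scraped files, the same upload_file call can be looped. What follows is a minimal sketch rather than an official recipe: the upload_folder helper, its pattern argument and the optional client parameter are names invented for this example, and it relies only on the Client and upload_file calls shown above.

```python
from pathlib import Path


def upload_folder(project_id, folder, client=None, pattern="*.csv"):
    """Upload every file in `folder` that matches `pattern`, one at a time.

    Returns the names of the files that were sent, in sorted order.
    """
    if client is None:
        # Imported lazily so the helper can be tried out without the package.
        from bln import Client

        # With no arguments, the client reads the BLN_API_TOKEN variable.
        client = Client()

    uploaded = []
    for file_path in sorted(Path(folder).glob(pattern)):
        if file_path.is_file():
            client.upload_file(project_id, str(file_path))
            uploaded.append(file_path.name)
    return uploaded


# Example call, using the same placeholders as the snippets above:
# upload_folder(your_project_id, "./your-data")
```

Passing in your own client object is optional. It is there so you can reuse one connection across several calls, or swap in a stand-in while testing.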
2. Our pandas extension
The bln package also includes extensions for reading and writing to biglocalnews.org with the popular pandas data analysis library.
First import pandas as usual.
import pandas as pd
Then import the Big Local News Python package.
import bln
Now connect it with your pd module.
bln.pandas.register(pd)
Your standard pd object now contains a set of custom methods for interacting with biglocalnews.org. This is accomplished by “monkey patching” the pandas library.
You can now read in files from biglocalnews.org as a pandas dataframe using the read_bln function and our three standard inputs.
df = pd.read_bln(your_project_id, your_file_name, your_api_token)
You can write a file to biglocalnews.org just as easily using our custom to_bln dataframe accessor. It requires the same three inputs.
df.to_bln(your_project_id, your_file_name, your_api_token)
As with the Python client, if the token is set to the BLN_API_TOKEN environment variable, it doesn’t have to be passed in.
The to_bln accessor will also accept any of the standard configuration options offered by pandas writer functions, like to_csv. Here’s an example using the index input.
df.to_bln(your_project_id, your_file_name, index=False)
That’s the whole deal. Learn more in our extension’s documentation.
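Put together, the steps above might look something like the sketch below. The archive_frame and register_extension helpers are hypothetical names made up for this example. They wrap only the register, read_bln and to_bln calls described in this section, and the index=False default is one illustrative choice, not a requirement.

```python
def register_extension():
    """Patch pandas with the Big Local News methods and return the module."""
    # Imported lazily so archive_frame can be tested without either package.
    import pandas as pd
    import bln

    bln.pandas.register(pd)
    return pd


def archive_frame(df, project_id, file_name, **writer_options):
    """Send a dataframe to biglocalnews.org through the to_bln accessor.

    Extra keyword arguments are handed to to_bln untouched. Returns the
    options that were used.
    """
    # Leave out the row index unless the caller asks to keep it.
    writer_options.setdefault("index", False)
    df.to_bln(project_id, file_name, **writer_options)
    return writer_options


# Example usage, with the same placeholders as the snippets above:
# pd = register_extension()
# df = pd.read_bln(your_project_id, your_file_name, your_api_token)
# archive_frame(df.drop_duplicates(), your_project_id, "your-file-clean.csv")
```

The write step here leans on the BLN_API_TOKEN environment variable, so no token is passed to to_bln.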
3. Our upload-files GitHub Action
Take it to the next level and have someone else run your code for you using GitHub’s powerful Actions framework. Big Local News’s upload-files action allows you to automate the ingestion of comma-delimited data with a few lines of YAML markup.
This bit will upload a single CSV file.
name: Example action
# GitHub requires a trigger. This one allows manual runs; swap in a schedule if you prefer.
on:
  workflow_dispatch:
jobs:
  job:
    name: Upload file
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Upload file to biglocalnews.org
        uses: biglocalnews/upload-files@v2
        with:
          api-key: ${{ secrets.BLN_API_KEY }}
          project-id: ${{ secrets.BLN_PROJECT_ID }}
          path: your-file.csv
Tweak that final step to upload a directory.
      - name: Upload folder to biglocalnews.org
        uses: biglocalnews/upload-files@v2
        with:
          api-key: ${{ secrets.BLN_API_KEY }}
          project-id: ${{ secrets.BLN_PROJECT_ID }}
          path: your-folder/
You can see a fully functioning example in the news-homepages repository, where data dumps are automatically uploaded to biglocalnews.org each night.
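The action uploads whatever sits at the path you give it, so an earlier step in the workflow has to put a file there. Here is a minimal sketch of a script that could do that job, assuming your scraper collects its results as a list of dictionaries. The write_rows helper is a hypothetical name, and the sketch uses only the Python standard library.

```python
import csv
from pathlib import Path


def write_rows(path, rows):
    """Write a list of dictionaries to a CSV file and return the row count."""
    if not rows:
        raise ValueError("No rows to write; refusing to create an empty file.")

    out_path = Path(path)
    # Create the target folder if the workflow has not made it yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)

    with out_path.open("w", newline="", encoding="utf-8") as f:
        # The column order is taken from the keys of the first row.
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Example call, writing to the same path the upload step points at:
# write_rows("your-file.csv", rows_from_your_scraper)
```

Run a script like this in a step after the checkout and before the upload, and the file will be waiting when the upload-files action looks for it.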