👉🏽 👉🏽 👉🏽 Full writeup: Flat Data Project 👈🏽 👈🏽 👈🏽
Flat Editor is a VSCode extension that steps you through the process of creating a Flat Data Action, which makes it easy to fetch data and commit it to your repository.
Flat Data is a GitHub action which makes it easy to fetch data and commit it to your repository as flatfiles. The action is intended to be run on a schedule, retrieving data from any supported target and creating a commit if there is any change to the fetched data. Flat Data builds on the “git scraping” approach pioneered by Simon Willison to offer a simple pattern for bringing working datasets into your repositories and versioning them, because developing against local datasets is faster and easier than working with data over the wire.
To use Flat Editor, first install the extension.
If you're starting from an empty repository, invoke the VSCode Command Palette via the shortcut Cmd+Shift+P and select Initialize Flat YML File.
This will generate a flat.yml file in the .github/workflows directory, and will open a GUI through which you can configure your Flat action.
At any time, you can view the raw content of the underlying YML file via the View the raw YAML button in the GUI, or via the button at the top right of your VSCode workspace.
Changes to flat.yml are saved automatically when using the GUI, but feel free to save manually via Cmd+S if the habit is as deeply engrained for you as it is for us 😀
Currently, Flat supports ingesting data from the following two sources:
- Publicly accessible HTTP endpoint
- SQL Database (accessible via connection string)
Flat assumes that you'd like to run your workflow on a given schedule, and to that end exposes a GUI for specifying a CRON job as part of the action definition. We've selected a handful of default values, but feel free to enter any valid CRON string here. We'll even validate the CRON for you as you type!
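To make the schedule check concrete, here is a minimal sketch of a validator for five-field CRON strings. It is not the extension's actual validation code: the names FIELD_RANGES, isValidItem, and isValidCron are hypothetical, and the sketch only accepts numeric fields built from `*`, lists, ranges, and steps (no month or weekday names).

```typescript
// Inclusive numeric range for each of the five standard cron fields.
const FIELD_RANGES: ReadonlyArray<readonly [number, number]> = [
  [0, 59], // minute
  [0, 23], // hour
  [1, 31], // day of month
  [1, 12], // month
  [0, 6], // day of week (0 = Sunday)
];

// Checks one comma-separated item such as "*", "*/5", "10", or "1-5/2".
function isValidItem(item: string, min: number, max: number): boolean {
  const match = /^(?:\*|(\d+)(?:-(\d+))?)(?:\/(\d+))?$/.exec(item);
  if (match === null) return false;
  const loText = match[1];
  const hiText = match[2];
  const stepText = match[3];
  // A step of zero never advances, so reject it.
  if (stepText !== undefined && Number(stepText) < 1) return false;
  // No captured number means the item was "*" or "*/n".
  if (loText === undefined) return true;
  const lo = Number(loText);
  const hi = hiText === undefined ? lo : Number(hiText);
  return min <= lo && lo <= hi && hi <= max;
}

// Returns true when the expression has exactly five in-range fields.
function isValidCron(expression: string): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== FIELD_RANGES.length) return false;
  return fields.every((field, i) => {
    const range = FIELD_RANGES[i];
    if (range === undefined) return false;
    return field.split(",").every((item) => isValidItem(item, range[0], range[1]));
  });
}
```

For example, isValidCron("*/5 * * * *") accepts a five-minute schedule, while isValidCron("60 * * * *") rejects the out-of-range minute.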
To create an HTTP action, you'll be asked for the following inputs:
- A result filename (the filename and extension that your results will be written to, e.g., vaccination-data.json).
- A publicly accessible URL (we'll try to render a helpful inline preview of the response from this endpoint if we can)
- An optional path to a postprocessing script, if you wish to perform further transformation or work on the fetched data
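As a rough illustration of how the HTTP inputs above could map onto a workflow step, here is a small sketch. The action reference githubocto/flat@v3 and the input names http_url, downloaded_filename, and postprocess are assumptions about the Flat Data action that this README does not confirm, and HttpInputs, FlatStep, and buildHttpStep are hypothetical names. Compare against the flat.yml the extension generates for you.

```typescript
// Values collected by the HTTP form.
interface HttpInputs {
  filename: string; // result filename, e.g. "vaccination-data.json"
  url: string; // publicly accessible endpoint
  postprocess?: string; // optional path to a postprocessing script
}

// One workflow step; the keys under `with` are assumed input names.
interface FlatStep {
  name: string;
  uses: string;
  with: Record<string, string>;
}

function buildHttpStep(inputs: HttpInputs): FlatStep {
  // Only plain http(s) URLs make sense for a public endpoint.
  if (!/^https?:\/\/\S+$/i.test(inputs.url)) {
    throw new Error("Expected an http(s) URL, got: " + inputs.url);
  }
  const withInputs: Record<string, string> = {
    http_url: inputs.url,
    downloaded_filename: inputs.filename,
  };
  // The postprocessing script is optional, so only emit it when provided.
  if (inputs.postprocess !== undefined) {
    withInputs["postprocess"] = inputs.postprocess;
  }
  return {
    name: "Fetch data",
    uses: "githubocto/flat@v3", // assumed action reference
    with: withInputs,
  };
}
```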
To create a SQL action, you'll be asked for the following inputs:
- A result filename (the filename and extension that your results will be written to, e.g., vaccination-data.json).
- A path to a SQL query
- A database connection string *
- An optional path to a postprocessing script, if you wish to perform further transformation or work on the fetched data
* Note that we will encrypt this value and create a GitHub secret in your repository for this connection string. No sensitive data will be committed to your repository. Keep in mind that your repository must have an upstream remote on github.com in order for us to create the secret.
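The sketch below shows one way the SQL inputs could end up in a workflow step without the connection string ever appearing in the file: the step refers to a repository secret by name instead. The action reference githubocto/flat@v3 and the input names sql_connstring, sql_queryfile, downloaded_filename, and postprocess are assumptions that this README does not confirm, and SqlInputs, SqlStep, buildSqlStep, and the secret name are hypothetical.

```typescript
// Values collected by the SQL form, plus the name of the stored secret.
interface SqlInputs {
  filename: string; // result filename, e.g. "vaccination-data.json"
  queryFile: string; // path to the SQL query file
  secretName: string; // name of the GitHub secret holding the connection string
  postprocess?: string; // optional path to a postprocessing script
}

interface SqlStep {
  name: string;
  uses: string;
  with: Record<string, string>;
}

function buildSqlStep(inputs: SqlInputs): SqlStep {
  // GitHub secret names use letters, digits, and underscores, cannot start
  // with a digit, and cannot start with the GITHUB_ prefix.
  const name = inputs.secretName;
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(name) || /^GITHUB_/i.test(name)) {
    throw new Error("Invalid secret name: " + name);
  }
  const withInputs: Record<string, string> = {
    // Reference the secret by name; the raw connection string is never written.
    sql_connstring: "${{ secrets." + name + " }}",
    sql_queryfile: inputs.queryFile,
    downloaded_filename: inputs.filename,
  };
  if (inputs.postprocess !== undefined) {
    withInputs["postprocess"] = inputs.postprocess;
  }
  return {
    name: "Fetch data",
    uses: "githubocto/flat@v3", // assumed action reference
    with: withInputs,
  };
}
```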
After you've added the requisite steps to your Flat action, push your changes to your GitHub repository. Your workflow should run automatically. Additionally, under the hood, the extension lists your optional postprocessing and/or SQL query files as workflow triggers, meaning the workflow will run any time these files change. You can run your workflows manually, too, thanks to the workflow_dispatch: {} value that the extension adds to your Flat action.
workflow_dispatch: {}
push:
  paths:
    - .github/workflows/flat.yml
    - ./rearrange-vax-data.ts
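The trigger block above can also be expressed as data. The following is a minimal sketch, not the extension's own code, of assembling the triggers from a CRON string and the list of postprocessing or SQL query files. WorkflowTriggers and buildTriggers are hypothetical names; the schedule, push.paths, and workflow_dispatch keys follow standard GitHub Actions workflow syntax.

```typescript
// The `on:` section of a workflow, restricted to the triggers Flat uses.
interface WorkflowTriggers {
  workflow_dispatch: Record<string, never>;
  schedule: { cron: string }[];
  push: { paths: string[] };
}

function buildTriggers(cron: string, watchedFiles: string[]): WorkflowTriggers {
  // The workflow file itself always triggers a run when it changes.
  const paths = [".github/workflows/flat.yml", ...watchedFiles];
  return {
    workflow_dispatch: {}, // enables manual runs
    schedule: [{ cron }],
    // Drop duplicates in case one script is shared by several steps.
    push: { paths: paths.filter((p, i) => paths.indexOf(p) === i) },
  };
}
```

Calling buildTriggers("0 0 * * *", ["./rearrange-vax-data.ts"]) yields the same two push paths shown above, plus a daily schedule.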
Deploy a new version with:
First make sure you're a part of the githubocto marketplace team here.
- Get a PAT here (first time)
- vsce login githubocto (first time)
- vsce publish [minor|major|patch] (this will create a new version and update package.json accordingly)
- git push the change to package.json
If you run into any trouble or have questions, feel free to open an issue. Sharing your flat.yml with us in the issue will help us understand what might be happening.
❤️ GitHub OCTO