Python library for loading GIS raster data to standard cloud-based data warehouses that don't natively support raster data.
Raster Loader is currently tested on Python 3.8, 3.9, 3.10, and 3.11.
The Raster Loader documentation is available at raster-loader.readthedocs.io.
pip install raster-loader
git clone https://github.com/cartodb/raster-loader
cd raster-loader
pip install .
There are two ways you can use Raster Loader:
- Using the CLI by running carto in your terminal
- Using Raster Loader as a Python library (import raster_loader)
After installing Raster Loader, you can run the CLI by typing carto in your terminal.
Currently, Raster Loader supports uploading raster data to BigQuery.
Accessing BigQuery with Raster Loader requires the GOOGLE_APPLICATION_CREDENTIALS environment variable to be set to the path of a JSON file containing your BigQuery credentials. See the GCP documentation for more information.
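Before running any command, it can help to confirm that the variable is set and points to a readable JSON file. The following is a minimal sketch using only the Python standard library; the helper name check_bigquery_credentials is hypothetical and not part of Raster Loader, and it does not validate the credentials themselves.

```python
import json
import os


def check_bigquery_credentials(env=None):
    """Return the credentials path if it looks usable, otherwise raise.

    Hypothetical helper, not part of Raster Loader.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Credentials file not found: {path}")
    # Only confirm that the file parses as JSON; its contents are not checked.
    with open(path, encoding="utf-8") as fh:
        json.load(fh)
    return path
```

Calling check_bigquery_credentials() with no arguments reads the current process environment.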
Two commands are available:
carto bigquery upload loads raster data from a local file to a BigQuery table.
At a minimum, the carto bigquery upload command requires a file_path to a local raster file that can be read by GDAL and processed with rasterio. It also requires the project (the GCP project name) and dataset (the BigQuery dataset name) parameters. There are also additional parameters, such as table (BigQuery table name) and overwrite (to overwrite existing data).
For example:
carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --overwrite
This command uploads the TIFF file at /path/to/my/raster/file.tif to the table my-bigquery-table in the dataset my-bigquery-dataset of the GCP project my-gcp-project. If the table already contains data, this data will be overwritten because the --overwrite flag is set.
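When the upload is driven from a script, the same command can be assembled as an argument list. This is a sketch only: build_upload_command is a hypothetical helper, and it uses just the flags shown in the example above.

```python
import shlex


def build_upload_command(file_path, project, dataset, table=None, overwrite=False):
    """Assemble the argument list for `carto bigquery upload`.

    Hypothetical helper, not part of Raster Loader.
    """
    cmd = [
        "carto", "bigquery", "upload",
        "--file_path", file_path,
        "--project", project,
        "--dataset", dataset,
    ]
    if table is not None:
        cmd += ["--table", table]
    if overwrite:
        cmd.append("--overwrite")
    return cmd


cmd = build_upload_command(
    "/path/to/my/raster/file.tif",
    "my-gcp-project",
    "my-bigquery-dataset",
    table="my-bigquery-table",
    overwrite=True,
)
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it; keeping the arguments as a list avoids shell-quoting problems with paths that contain spaces.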
Use the carto bigquery describe command to retrieve information about a raster file stored in a BigQuery table.
At a minimum, this command requires a GCP project name, a BigQuery dataset name, and a BigQuery table name.
For example:
carto bigquery describe \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table
After installing Raster Loader, you can import the package into your Python project. For example:
from raster_loader import rasterio_to_bigquery, bigquery_to_records
Currently, Raster Loader supports uploading raster data to BigQuery. Accessing BigQuery with Raster Loader requires the GOOGLE_APPLICATION_CREDENTIALS environment variable to be set to the path of a JSON file containing your BigQuery credentials. See the GCP documentation for more information.
You can use Raster Loader to upload a local raster file to an existing BigQuery table using the rasterio_to_bigquery() function:
rasterio_to_bigquery(
    file_path='path/to/raster.tif',
    project_id='my-project',
    dataset_id='my_dataset',
    table_id='my_table',
)
This function returns True if the upload was successful.
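Because success is reported through the return value, a caller may want to check it explicitly. The wrapper below is a hedged sketch: upload_raster and its uploader parameter are hypothetical, and how rasterio_to_bigquery() behaves on failure (returning a falsy value or raising) is an assumption not stated here.

```python
def upload_raster(file_path, project_id, dataset_id, table_id, uploader=None):
    """Upload a raster and raise if the call does not report success.

    Hypothetical wrapper. `uploader` defaults to Raster Loader's
    rasterio_to_bigquery and can be replaced by a stub for testing.
    """
    if uploader is None:
        # Imported lazily so this code loads without Raster Loader installed.
        from raster_loader import rasterio_to_bigquery as uploader

    ok = uploader(
        file_path=file_path,
        project_id=project_id,
        dataset_id=dataset_id,
        table_id=table_id,
    )
    # Assumption: anything other than True means the upload did not succeed.
    if ok is not True:
        raise RuntimeError(f"Upload of {file_path} did not report success")
    return ok
```

Injecting the uploader keeps the wrapper testable without BigQuery access.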
You can also access and inspect a raster file from a BigQuery table using the bigquery_to_records() function:
records_df = bigquery_to_records(
    project_id='my-project',
    dataset_id='my_dataset',
    table_id='my_table',
)
This function returns a DataFrame containing a sample of rows from the raster table on BigQuery (10 rows by default).
See CONTRIBUTING.md for information on how to contribute to this project.
ROADMAP.md contains a list of features and improvements planned for future versions of Raster Loader.