diff --git a/notebooks/08-crop-raster-data.ipynb b/notebooks/08-crop-raster-data.ipynb index 98fed3d5..15c45d83 100644 --- a/notebooks/08-crop-raster-data.ipynb +++ b/notebooks/08-crop-raster-data.ipynb @@ -1,459 +1,2966 @@ { "cells": [ - { - "cell_type": "markdown", - "id": "049e0dc8", - "metadata": {}, - "source": [ - "- How can I crop my raster data to the area of interest?\n", - "\n", - "- Crop raster data with a bounding box.\n", - "- Crop raster data with a polygon.\n", - "- Match two raster datasets in different CRS.\n", - "\n", - "It is quite common that the raster data you have in hand is too large to process, or not all the pixels are relevant to your area of interest (AoI). In both situations, you should consider cropping your raster data before performing data analysis.\n", - "\n", - "In this episode, we will introduce how to crop raster data into the desired area. We will use one Sentinel-2 image over Amsterdam as the example raster data, and introduce how to crop your data to different types of AoIs.\n", - "\n", - "## Introduce the Data\n", - "\n", - "We will use the results of the satellite image search: `search.json`, which is generated in an exercise from\n", - "[Episode 5: Access satellite imagery using Python](05-access-data.md).\n", - "\n", - "If you would like to work with the data for this lesson without downloading data on-the-fly, you can download the\n", - "raster data using this [link](https://figshare.com/ndownloader/files/36028100). Save the `geospatial-python-raster-dataset.tar.gz`\n", - "file in your current working directory, and extract the archive file by double-clicking on it or by running the\n", - "following command in your terminal `tar -zxvf geospatial-python-raster-dataset.tar.gz`. Use the file `geospatial-python-raster-dataset/search.json`\n", - "(instead of `search.json`) to get started with this lesson.\n", - "\n", - "We also use the cropped fields polygons `fields_cropped.shp`, which was generated in an exercise from [Episode 7: Vector data in python](07-vector-data-in-python.md).\n", - "\n", - "## Align the CRS of the raster and the vector data\n", - "\n", - "We load a true color image using `pystac` and `rioxarray` and check the shape of the raster:" - ] - }, { "cell_type": "code", - "execution_count": null, - "id": "d0489d44", + "execution_count": 1, + "id": "4bfec5c9", "metadata": {}, "outputs": [], "source": [ "import pystac\n", - "import rioxarray\n", - "\n", - "# Load image and inspect the shape\n", - "items = pystac.ItemCollection.from_file(\"search.json\")\n", - "raster = rioxarray.open_rasterio(items[1].assets[\"visual\"].href) # Select a true color image\n", - "print(raster.shape)" - ] - }, - { - "cell_type": "markdown", - "id": "02232372", - "metadata": {}, - "source": [ - "``` output\n", - "(3, 10980, 10980)\n", - "```\n", - "\n", - "This will perform a “lazy” loading of the image, i.e. the image will not be loaded into the memory until necessary, but we can still access some attributes, e.g. the shape of the image.\n", - "\n", - "The large size of the raster data makes it time and memory consuming to visualize in its entirety. Instead, we can fetch and plot the overviews of the raster. “Overviews” are precomputed lower resolution representations of a raster, stored in the same COG that contains the original raster." + "items = pystac.ItemCollection.from_file('rhodes_sentinel-2.json')" ] }, { "cell_type": "code", - "execution_count": null, - "id": "ef68f951", + "execution_count": 2, + "id": "b0e1c9e0", "metadata": {}, "outputs": [], "source": [ - "# Get the overview asset\n", - "raster_overview = rioxarray.open_rasterio(items[1].assets[\"visual\"].href, overview_level=3)\n", - "print(raster_overview.shape)\n", - "\n", - "# Visualize it\n", - "raster_overview.plot.imshow(figsize=(8,8))" + "item = items[0]" ] }, { - "cell_type": "markdown", - "id": "60e7742d", + "cell_type": "code", + "execution_count": 3, + "id": "8679df22", "metadata": {}, + "outputs": [], "source": [ - "\n", - "\n", - "As we can see, the overview image is much smaller compared to the original true color image.\n", - "\n", - "To align the raster and vector data, we first check each coordinate system. For raster data, we use `pyproj.CRS`:" + "visual_href = item.assets['visual'].href" ] }, { "cell_type": "code", - "execution_count": null, - "id": "f952bf8f", + "execution_count": 4, + "id": "11b8548b", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
<xarray.DataArray (band: 3, y: 10980, x: 10980)>\n", + "[361681200 values with dtype=uint8]\n", + "Coordinates:\n", + " * band (band) int64 1 2 3\n", + " * x (x) float64 5e+05 5e+05 5e+05 ... 6.098e+05 6.098e+05 6.098e+05\n", + " * y (y) float64 4.1e+06 4.1e+06 4.1e+06 ... 3.99e+06 3.99e+06\n", + " spatial_ref int64 0\n", + "Attributes:\n", + " AREA_OR_POINT: Area\n", + " OVR_RESAMPLING_ALG: AVERAGE\n", + " _FillValue: 0\n", + " scale_factor: 1.0\n", + " add_offset: 0.0
<xarray.DataArray (band: 3, y: 4523, x: 4828)>\n", + "[65511132 values with dtype=uint8]\n", + "Coordinates:\n", + " * band (band) int64 1 2 3\n", + " * x (x) float64 5.615e+05 5.615e+05 ... 6.098e+05 6.098e+05\n", + " * y (y) float64 4.035e+06 4.035e+06 4.035e+06 ... 3.99e+06 3.99e+06\n", + " spatial_ref int64 0\n", + "Attributes:\n", + " AREA_OR_POINT: Area\n", + " OVR_RESAMPLING_ALG: AVERAGE\n", + " scale_factor: 1.0\n", + " add_offset: 0.0\n", + " _FillValue: 0