-
Notifications
You must be signed in to change notification settings - Fork 2
/
fast_track_grass_and_python.qmd
489 lines (386 loc) · 15.3 KB
/
fast_track_grass_and_python.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
---
title: "Fast track to GRASS & Python in Jupyter Notebooks"
author: "Veronica Andreo"
date: 2024-03-25
date-modified: today
format:
html:
toc: true
code-tools: true
code-copy: true
code-fold: false
categories: [GRASS GIS, Python, Jupyter, Intermediate]
engine: jupyter
execute:
eval: false
jupyter: python3
---
Python, a widely used general-purpose, high-level programming language provides
a powerful scripting interface for geospatial data processing. Being easy-to-use
yet powerful, it enables users to efficiently exploit the capabilities of the
GRASS GIS software. Python scripts for GRASS GIS can be written at high level
(GRASS GIS tools) as well as at low level (GRASS GIS libraries) through dedicated
interfaces. Indeed, GRASS GIS is distributed with a set of
[python packages](https://grass.osgeo.org/grass-stable/manuals/libpython/index.html)
to provide functionalities at different levels.
In this tutorial, we will focus on two packages:
[`grass.script`](https://grass.osgeo.org/grass-stable/manuals/libpython/script_intro.html)
and [`grass.jupyter`](https://grass.osgeo.org/grass-stable/manuals/libpython/grass.jupyter.html),
which provide Python interface to launch GRASS GIS tools in scripts and offer
classes and setup functions for running GRASS GIS in Jupyter Notebooks,
respectively.
We will show two different use cases:
- A. You are mostly a Python user and only need to run a certain GRASS GIS tool on your spatial data to get a specific output
- B. You are mostly a GRASS GIS user that wants to use GRASS GIS from a Python environment or combine GRASS GIS with other Python packages.
Let's first go through the main functions of GRASS GIS Python packages.
### Python package `grass.script`
The **grass.script** package or GRASS GIS Python Scripting Library provides
functions for calling GRASS GIS tools within Python scripts. The most
commonly used functions include:
- `run_command`: used when there is no text output or the text output does not need to be further processed
- `read_command`: used when the output of the tools is of text type
- `parse_command`: used with tools that output machine readable text output
- `write_command`: used with tools that expect text input, either in the form of a file or from stdin
There are several wrapper functions for frequently used tools, too.
For example:
- To get info from a raster, script.raster.raster_info() is used: `gs.raster_info('dsm')`
- To get info of a vector, script.vector.vector_info() is used: `gs.vector_info('roads')`
- To list the raster in a location, script.core.list_grouped() is used: `gs.list_grouped(type=['raster'])`
- To obtain the computational region, script.core.region() is used: `gs.region()`
- To run raster algebra with r.mapcalc, script.raster.mapcalc() is used: `gs.mapcalc()`
The grass.script package also comes with different functions that are useful
when you are writing your own GRASS GIS tools or converting your scripts or
workflows into GRASS GIS tools. Some examples of these functions are:
`append_uuid`, `use_temp_region`, `del_temp_region`, `parse_key_val`, etc.
:::{.callout-note}
Visit the grass.script documentation for more details and examples:
<https://grass.osgeo.org/grass-stable/manuals/libpython/script_intro.html>
:::
### Python package `grass.jupyter`
The **grass.jupyter** library improves the integration of GRASS and Jupyter, and
provides different classes to facilitate GRASS maps visualization:
- `Map`: 2D rendering
- `Map3D`: 3D rendering
- `InteractiveMap`: interactive map visualization with folium or ipyleaflet
- `SeriesMap`: visualizations of a series of raster or vector maps
- `TimeSeriesMap`: visualization of space-time datasets
:::{.callout-note}
Visit the grass.jupyter documentation for more details and examples:
<https://grass.osgeo.org/grass-stable/manuals/libpython/grass.jupyter.html>
:::
## Let's get started!
### Setup
This tutorial can be run locally. You need to have **GRASS GIS 8.4+** and
**Jupyter** installed. For part A, please download these
[Sentinel 2](https://grass.osgeo.org/sampledata/north_carolina/nc_sentinel_utm17n.zip)
scenes and move the unzipped download into the directory where you are running
this tutorial.
For part B, we asume that you have downloaded the North Carolina
[sample dataset](https://grass.osgeo.org/sampledata/north_carolina/nc_basic_spm_grass7.zip),
i.e., there's an existing GRASS project.
Be sure you also have the following Python libraries installed in your
environment: `folium` or `ipyleaflet`, `numpy`, `seaborn`, `matplotlib`, `pandas`.
The first thing we need to do is to
*import GRASS GIS python packages*. In order to do so, we need to
*add GRASS GIS python package to PATH*. Let's see how we do that.
```{python}
# import standard Python packages
import os
import sys
import subprocess
from pathlib import Path
```
```{python}
# check where GRASS GIS python packages are and add them to PATH
sys.path.append(
subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)
```
```{python}
# import GRASS GIS python packages
import grass.script as gs
import grass.jupyter as gj
```
:::{.callout-note}
We recommend Windows users to review how to run [GRASS GIS in Jupyter Notebooks on Windows](JupyterOnWindows_OSGeo4W_Tutorial.qmd).
:::
### A. Use GRASS GIS tools within your Python spatial workflows
Now, let's assume you have some raster data you want to process with GRASS GIS
tools, eg. Sentinel 2 satellite data, to obtain texture indices. The first thing
you'll need to do is to **create a GRASS GIS project** to import your data. As
we saw already in a previous [fast track](fast_track.qmd) tutorial, GRASS
projects are folders where we store spatial data with the same spatial
reference. These projects can be placed wherever you want, including a temporary
directory if you are mostly interested in the outputs only.
So, let's *create a project in a temporary directory* to import, i.e. read, our
data with GRASS GIS. The `gs.create_project()` function allows us to create a
GRASS project passing different information. For example, we can use the EPSG
code of the data CRS or directly pass a georeferenced file.
```{python}
# Create a temporary folder where to place our GRASS project
import tempfile
tempdir = tempfile.TemporaryDirectory()
```
```{python}
# Create a project in the temporary directory
gs.create_project(path=tempdir.name,
name="nc_sentinel",
epsg="32617")
```
Alternatively, use a georeferenced file to read the spatial reference
information from:
```{python}
# gs.create_project(path=tempdir.name, name="nc_sentinel", filename="path/to/georef/file", overwrite=True)
```
Now that we created a project, let's **start a GRASS GIS** session there.
```{python}
# Start GRASS in the recently created project
session = gj.init(Path(tempdir.name, "nc_sentinel"))
```
We are now ready to **import data** into the recently created project. Let's use
a for loop to import all 10 m resolution bands. These are level 2A surface
reflectance data for blue, green, red and near infrared Sentinel 2 bands.
```{python}
import shutil
shutil.unpack_archive("./nc_sentinel_utm17n/S2A_MSIL2A_20220304T160151_N0400_R097_T17SQV_20220304T215812.zip", "./nc_sentinel_utm17n")
files = sorted(Path('./nc_sentinel_utm17n/S2A_MSIL2A_20220304T160151_N0400_R097_T17SQV_20220304T215812.SAFE/GRANULE/L2A_T17SQV_A034986_20220304T160221/IMG_DATA/R10m').glob('*B*.jp2'))
files
```
```{python}
for file in files:
name = str(file)[-11:-4]
print("importing " + name)
gs.run_command("r.import", input=file, output=name)
```
Let's check the files we just imported are there:
```{python}
gs.list_grouped(type="raster")["PERMANENT"]
```
Let's have a quick look at one of the imported bands. We can use the `InteractiveMap`
class from the grass.jupyter package to **visualize** it.
```{python}
m = gj.InteractiveMap()
m.add_raster("B08_10m")
m.show()
```
Next step is to do some **processing or analysis** with the imported data.
Since we'll be creating new raster maps, we first
need to set our *computational region* to the extent and resolution of one of our
imported bands.
```{python}
# Set computational region
gs.run_command("g.region", raster="B08_10m", flags="p")
```
```{python}
m = gj.InteractiveMap(tiles="OpenStreetMap")
m.add_raster("B08_10m")
m.show()
```
It is common to estimate texture measures over panchromatic bands. Since
we do not have one in Sentinel 2 data, we'll create a synthetic one by averaging
blue, green and red bands.
```{python}
# Create synthetic pan band
gs.mapcalc("pan = (B02_10m + B03_10m + B04_10m) / 3")
```
Now that we have the synthetic pan band, let's estimate some texture measures
with the [r.texture](https://grass.osgeo.org/grass-stable/manuals/r.texture.html)
tool.
```{python}
gs.run_command("r.texture",
input="pan",
output="pan",
size=5,
method="contrast,corr")
```
```{python}
gs.list_grouped(type="raster", pattern="pan*")["PERMANENT"]
```
```{python}
t = gj.InteractiveMap(tiles="OpenStreetMap")
t.add_raster("pan_Contr")
t.add_raster("pan_Corr")
t.show()
```
Finally, we can **export** our texture maps out of GRASS GIS and use them
somewhere else or load them into a webGIS.
```{python}
texture_maps = gs.list_grouped(type="raster", pattern="pan_*")["PERMANENT"]
texture_maps
```
```{python}
for texture in texture_maps:
gs.run_command("r.out.gdal", input=texture, output=f"{texture}.tif", format="GTiff")
```
This use case follows the
Extract-Transform-Load (ETL) process common in production systems. Indeed, this
approach allows to include GRASS tools into such workflows. These type of
tasks could be automatized in scripts to be run without even starting GRASS GIS
using the `--exec` tool... but that's material for a different tutorial :)
### B. Use Python tools within GRASS GIS workflows
This case is more relevant for GRASS users who want to combine GRASS GIS
with other Python tools for their data processing and analysis workflows.
Several GRASS users store most or all of their projects in a single folder,
which has traditionally been called `grassdata`. When this is the case, to
**start GRASS GIS in an existing project**, we also need to provide the path
to such a folder.
```{python}
# Start GRASS
session = gj.init("~/grassdata/nc_basic_spm_grass7/PERMANENT")
# alternatively
# session = gj.init("~/grassdata/nc_basic_spm_grass7")
# session = gj.init("~/grassdata", "nc_basic_spm_grass7", "PERMANENT")
```
We are now within a GRASS project, let's **obtain information** about it, like
CRS details, region settings, list of raster and vector maps, etc.
```{python}
# Print project's CRS
gs.parse_command("g.proj", flags="g")["srid"]
```
```{python}
# Print computational region
gs.region()
```
```{python}
# List raster maps
gs.list_grouped(["raster"])
```
Let's obtain metadata about the *elevation* raster map.
```{python}
# Raster info
gs.raster_info("elevation")
```
If we would only need to know or use the minimum value of the *elevation* raster,
we can get it as follows:
```{python}
gs.raster_info("elevation")["min"]
```
Let's now **visualize** raster and vector maps with a different `grass.jupyter`
class, the non-interactive `Map` class. This class creates and displays GRASS
maps as PNG files. We basically instantiate the class first, add maps and maps'
elements and finally show the result. There are 2 ways of calling display
(`d.*`) modules:
- replace `.` by `_` as in `m.d_rast()`
- use `run()` as in `m.run("d.rast")`
```{python}
# Instantiate the Map class
m = gj.Map(width=400)
```
The *Map* class will by default use the first raster or vector extent to set the
display extent. You could however also use the current computational region with
`use_region=True` or call a previously saved computational region (different
than the current) with the argument `saved_region`.
```{python}
# Add maps and map elements
m.d_rast(map="elevation")
m.d_vect(map="streams")
m.d_legend(raster="elevation", at=(50, 95, 85, 90), flags="b")
```
```{python}
# Display the result
m.show()
```
We can save our displayed maps by calling the `save()` method, i.e., `m.save()`.
For the Map class it will output a PNG file, while for the InteractiveMap class
an HTML.
#### GRASS GIS & NumPy
Let's now see how to convert our **GRASS rasters into numpy arrays**. Having our
raster maps as numpy arrays opens up a world of possibilities in terms of
visualization and data analysis and modeling. We won't go into anything complex
here, but we'll show how to read rasters into numpy arrays, plot them, modify
them and then write them back into GRASS.
```{python}
# Import required libraries
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from grass.script import array as garray
```
```{python}
# Read elevation as numpy array
elev = garray.array(mapname="elevation", null="nan")
print(elev.shape)
```
```{python}
# Estimate array average
print(np.average(elev))
```
```{python}
# Plot elev histogram
sns.set_style('darkgrid')
sns.histplot(data=elev.ravel(), labkde=True)
```
Let's modify our array and write it back into GRASS GIS. For this, we create a
new copy of the GRASS elevation map first as shown below.
```{python}
elev_2 = garray.array(mapname="elevation")
elev_2 *= 2
```
```{python}
# Plot elev*2
sns.histplot(data=[elev.ravel(), elev_2.ravel()], kde=True)
plt.legend(labels=["elevation * 2", "elevation"])
```
![](img/grass_python_histogram.png){.preview-image}
Now we write the modified array into a GRASS raster map and check it's actually
there.
```{python}
elev_2.write(mapname="elevation_2", overwrite=True)
```
```{python}
gs.list_grouped(type="raster", pattern="elev*")
```
#### GRASS GIS & Pandas
Let's now explore how to convert text outputs into pandas data frames. We will
get elevation univariate statistics for each land use class and parse the output
into a pandas data frame.
```{python}
import pandas as pd
from io import StringIO
```
```{python}
stats = gs.read_command("r.univar",
flags="t",
map="elevation",
zones="landuse",
separator="comma")
df = pd.read_csv(StringIO(stats))
df
```
Next, we plot the mean elevation per class as follows:
```{python}
plt.figure(figsize=(10, 5))
plt.bar(df['label'], df['mean'])
plt.ylabel('Elevation')
plt.title('Mean elevation by land cover type')
plt.xticks(rotation=90)
plt.show()
```
Similarly, if we need to do analysis with the attributes of GRASS vector maps,
it is also possible to read the attribute table as a pandas data frame. Let's
see an example with the census vector map:
```{python}
census = gs.parse_command("v.db.select", map="census", format="json")["records"]
df = pd.DataFrame(census)
df
```
Once the attribute table is a data frame, we can, e.g., filter data by a
condition and plot the results.
```{python}
fam_size_3 = df[df["FAM_SIZE"] > 3.0]
```
```{python}
fam_size_3.plot.scatter(x="FAM_SIZE", y="OWNER_U")
```
## Final remarks
In this tutorial, we have demonstrated, with very simple examples, how to use
GRASS GIS tools together with Python, putting a special focus on data import
and export as well as format conversions. Expert GRASS or Python users can then
implement their workflows combining tools accordingly.
**Enjoy!** {{< fa rocket >}}
***
:::{.smaller}
The development of this tutorial was funded by the US
[National Science Foundation (NSF)](https://www.nsf.gov/),
award [2303651](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2303651).
:::