This section of the documentation will help you understand how to work with SpectralIndices.jl using DataFrames.jl as input.
This tutorial relies on data stored in data. To access it, we are going to use the following:
using SpectralIndices, DataFrames
df = load_dataset("spectral", DataFrame)
first(df, 5)
1 | 0.269054 | 297.328 | 0.100795 | 0.306206 | Urban | 0.165764 | 0.251949 | 0.132227 | 0.08985 |
2 | 0.281264 | 297.108 | 0.08699 | 0.267596 | Urban | 0.160979 | 0.217917 | 0.124404 | 0.0738588 |
3 | 0.28422 | 297.436 | 0.0860275 | 0.258384 | Urban | 0.140203 | 0.200098 | 0.120994 | 0.0729375 |
4 | 0.254479 | 297.204 | 0.103916 | 0.25958 | Urban | 0.163976 | 0.216735 | 0.135981 | 0.0877325 |
5 | 0.269535 | 297.098 | 0.109306 | 0.273234 | Urban | 0.18126 | 0.219554 | 0.15035 | 0.0905925 |
The columns of this dataset hold Landsat 8 surface reflectance values for samples from three different classes. The samples were taken over Oporto. The data comes from spyndex, and this tutorial is meant to closely mirror the Python version.
This dataset specifically contains three different classes:
unique(df[!, "class"])
3-element Vector{Any}:
 "Urban"
 "Water"
 "Vegetation"
To reflect that, we are going to calculate three different indices: NDVI for vegetation, NDWI for water, and NDBI for urban areas.
NDVI
NDVI: Normalized Difference Vegetation Index
* Application Domain: vegetation
* Bands/Parameters: Any["N", "R"]
* Formula: (N-R)/(N+R)
* Reference: https://ntrs.nasa.gov/citations/19740022614
NDWI
NDWI: Normalized Difference Water Index
* Application Domain: water
* Bands/Parameters: Any["G", "N"]
* Formula: (G-N)/(G+N)
* Reference: https://doi.org/10.1080/01431169608948714
NDBI
NDBI: Normalized Difference Built-Up Index
* Application Domain: urban
* Bands/Parameters: Any["S1", "N"]
* Formula: (S1-N)/(S1+N)
* Reference: http://dx.doi.org/10.1080/01431160304987
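To make the three formulas above concrete, here is a minimal sketch in plain Julia that evaluates them by hand for the first sample of the dataset. It does not call SpectralIndices.jl: the helper normalized_difference and the lowercase variable names are hypothetical, introduced only for illustration, and the band values are copied from the first row of the dataset (green = SR_B3, red = SR_B4, nir = SR_B5, swir1 = SR_B6).

```julia
# Hypothetical helper, not part of SpectralIndices.jl: generic normalized difference
normalized_difference(a, b) = (a - b) / (a + b)

# Band values of the first sample in the dataset (rounded as displayed)
green, red, nir, swir1 = 0.132227, 0.165764, 0.269054, 0.306206

ndvi = normalized_difference(nir, red)     # (N-R)/(N+R)
ndwi = normalized_difference(green, nir)   # (G-N)/(G+N)
ndbi = normalized_difference(swir1, nir)   # (S1-N)/(S1+N)

println((ndvi, ndwi, ndbi))
```

Up to the rounding of the displayed inputs, these three values should agree with the first row that compute_index returns for this dataset.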
We have multiple ways to feed this data to SpectralIndices.jl to generate our indices. We will try to cover most of them here.
A straightforward way to compute the indices is to feed a DataFrame to compute_index. To do this, we first need to build the new DataFrame. We can explore which bands we need by accessing the bands field of each index:
NDVI.bands
2-element Vector{Any}:
 "N"
 "R"
NDWI.bands
2-element Vector{Any}:
 "G"
 "N"
NDBI.bands
2-element Vector{Any}:
 "S1"
 "N"
In this case we only need the Green, Red, NIR and SWIR1 bands. Since compute_index expects the bands to have the same names as they have in the bands field, we need to select the specific columns that we want out of the dataset and rename them. We can do this easily with select:
params = select(df, :SR_B3=>:G, :SR_B4=>:R, :SR_B5=>:N, :SR_B6=>:S1)
first(params, 5)
Row | G | R | N | S1 |
1 | 0.132227 | 0.165764 | 0.269054 | 0.306206 |
2 | 0.124404 | 0.160979 | 0.281264 | 0.267596 |
3 | 0.120994 | 0.140203 | 0.28422 | 0.258384 |
4 | 0.135981 | 0.163976 | 0.254479 | 0.25958 |
5 | 0.15035 | 0.18126 | 0.269535 | 0.273234 |
Now our dataset is ready, and we just need to call the compute_index function:
idx = compute_index(["NDVI", "NDWI", "NDBI"], params)
first(idx, 5)
Row | NDVI | NDWI | NDBI |
1 | 0.237548 | -0.340973 | 0.0645838 |
2 | 0.271989 | -0.386671 | -0.0249016 |
3 | 0.339326 | -0.402815 | -0.0476153 |
4 | 0.216278 | -0.303482 | 0.00992348 |
5 | 0.195821 | -0.283852 | 0.0068146 |
The result is a new DataFrame with the desired indices as columns.
Another way to obtain this is to pass single-column DataFrames as kwargs, one per band. Here each single-column DataFrame is defined directly in the call:
idx = compute_index(["NDVI", "NDWI", "NDBI"];
    G = select(df, :SR_B3=>:G),
    N = select(df, :SR_B5=>:N),
    R = select(df, :SR_B4=>:R),
    S1 = select(df, :SR_B6=>:S1))
first(idx, 5)
Row | NDVI | NDWI | NDBI |
1 | 0.237548 | -0.340973 | 0.0645838 |
2 | 0.271989 | -0.386671 | -0.0249016 |
3 | 0.339326 | -0.402815 | -0.0476153 |
4 | 0.216278 | -0.303482 | 0.00992348 |
5 | 0.195821 | -0.283852 | 0.0068146 |
Alternatively, you can build a Dict of band vectors from the DataFrame, going back to an example we saw on the previous page:
params = Dict("G" => df[!, "SR_B3"], "N" => df[!, "SR_B5"], "R" => df[!, "SR_B4"], "S1" => df[!, "SR_B6"])
Dict{String, Vector{Any}} with 4 entries:
  "S1" => [0.306206, 0.267596, 0.258384, 0.25958, 0.273234, 0.32954, 0.271721, …
  "N" => [0.269054, 0.281264, 0.28422, 0.254479, 0.269535, 0.277153, 0.26563, …
  "G" => [0.132227, 0.124404, 0.120994, 0.135981, 0.15035, 0.152303, 0.135885,…
  "R" => [0.165764, 0.160979, 0.140203, 0.163976, 0.18126, 0.19754, 0.170026, …
The computation is done in the same way:
ndvi, ndwi, ndbi = compute_index(["NDVI", "NDWI", "NDBI"], params)
3-element Vector{Any}:
 [0.23754793677807357, 0.2719887844338796, 0.33932578974960087, 0.21627773595727137, 0.19582071673377036, 0.16771383579896465, 0.21944767233340506, 0.2251996432295527, 0.1655330261746833, 0.2675545906704802 … 0.810365666144593, 0.8104049969776344, 0.7616768543153676, 0.8027222040013119, 0.7929365431300779, 0.7862750574070626, 0.8080303042462863, 0.8025822103946664, 0.7135886988619672, 0.7672440264304153]
 [-0.3409734444357916, -0.38667135030536093, -0.4028151808767594, -0.3034817907083952, -0.28385153077628394, -0.29071730449057526, -0.32313861250513676, -0.3563320964589312, -0.24060392753715099, -0.34356689100134846 … -0.7698492602846995, -0.7547124120206541, -0.7128263753013682, -0.7716516398212895, -0.7491201313937117, -0.7510114068441064, -0.7257608604061496, -0.7401234567901236, -0.6752241340558899, -0.7074355283543386]
 [0.06458384035045028, -0.02490161425500128, -0.04761531780788457, 0.009923476645422341, 0.006814596455672831, 0.08634934501415456, 0.01133569522728392, 0.03875665342611921, 0.006910176170362171, -0.0322322650047355 … -0.47115094032591764, -0.46672499804111056, -0.40825671490715415, -0.5414949557901297, -0.43083696212857336, -0.43525525151156264, -0.4700842430846934, -0.4585879184008887, -0.4050436713235448, -0.44864683453438614]
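Note that the Dict above holds Vector{Any} values. If concretely typed vectors are preferred downstream, one option is to convert them with broadcasting. The sketch below is plain Julia and does not call SpectralIndices.jl; it mimics the situation with the first five NIR and red values of the dataset stored in Any-typed vectors, and all variable names are illustrative assumptions.

```julia
# First five NIR and red values from the dataset, stored as Vector{Any}
# to mimic the element type of the Dict shown above
nir_any = Any[0.269054, 0.281264, 0.28422, 0.254479, 0.269535]
red_any = Any[0.165764, 0.160979, 0.140203, 0.163976, 0.18126]

# Broadcasting the Float64 constructor yields concretely typed vectors
nir = Float64.(nir_any)
red = Float64.(red_any)

# Element-wise NDVI formula, (N-R)/(N+R), written by hand for illustration
ndvi_first5 = (nir .- red) ./ (nir .+ red)

println(eltype(ndvi_first5))
```

The same Float64.(v) conversion could presumably be applied to each vector returned by compute_index, assuming all of its elements are numeric.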
Just be careful with the naming: SpectralIndices.jl brings all the indices defined in indices into the namespace. The all-caps names of the indices are reserved for them, as we illustrated at the beginning of this tutorial:
NDVI
NDVI: Normalized Difference Vegetation Index
* Application Domain: vegetation
* Bands/Parameters: Any["N", "R"]
* Formula: (N-R)/(N+R)
* Reference: https://ntrs.nasa.gov/citations/19740022614
The two steps can be merged by providing the values directly as kwargs:
ndvi, ndwi, ndbi = compute_index(["NDVI", "NDWI", "NDBI"];
    G = df[!, "SR_B3"],
    N = df[!, "SR_B5"],
    R = df[!, "SR_B4"],
    S1 = df[!, "SR_B6"])
3-element Vector{Any}:
 [0.23754793677807357, 0.2719887844338796, 0.33932578974960087, 0.21627773595727137, 0.19582071673377036, 0.16771383579896465, 0.21944767233340506, 0.2251996432295527, 0.1655330261746833, 0.2675545906704802 … 0.810365666144593, 0.8104049969776344, 0.7616768543153676, 0.8027222040013119, 0.7929365431300779, 0.7862750574070626, 0.8080303042462863, 0.8025822103946664, 0.7135886988619672, 0.7672440264304153]
 [-0.3409734444357916, -0.38667135030536093, -0.4028151808767594, -0.3034817907083952, -0.28385153077628394, -0.29071730449057526, -0.32313861250513676, -0.3563320964589312, -0.24060392753715099, -0.34356689100134846 … -0.7698492602846995, -0.7547124120206541, -0.7128263753013682, -0.7716516398212895, -0.7491201313937117, -0.7510114068441064, -0.7257608604061496, -0.7401234567901236, -0.6752241340558899, -0.7074355283543386]
 [0.06458384035045028, -0.02490161425500128, -0.04761531780788457, 0.009923476645422341, 0.006814596455672831, 0.08634934501415456, 0.01133569522728392, 0.03875665342611921, 0.006910176170362171, -0.0322322650047355 … -0.47115094032591764, -0.46672499804111056, -0.40825671490715415, -0.5414949557901297, -0.43083696212857336, -0.43525525151156264, -0.4700842430846934, -0.4585879184008887, -0.4050436713235448, -0.44864683453438614]
You are free to choose whichever method you prefer; there is no meaningful difference in speed between them:
@time ndvi, ndwi, ndbi = compute_index(["NDVI", "NDWI", "NDBI"], params)
3-element Vector{Any}:
 [0.23754793677807357, 0.2719887844338796, 0.33932578974960087, 0.21627773595727137, 0.19582071673377036, 0.16771383579896465, 0.21944767233340506, 0.2251996432295527, 0.1655330261746833, 0.2675545906704802 … 0.810365666144593, 0.8104049969776344, 0.7616768543153676, 0.8027222040013119, 0.7929365431300779, 0.7862750574070626, 0.8080303042462863, 0.8025822103946664, 0.7135886988619672, 0.7672440264304153]
 [-0.3409734444357916, -0.38667135030536093, -0.4028151808767594, -0.3034817907083952, -0.28385153077628394, -0.29071730449057526, -0.32313861250513676, -0.3563320964589312, -0.24060392753715099, -0.34356689100134846 … -0.7698492602846995, -0.7547124120206541, -0.7128263753013682, -0.7716516398212895, -0.7491201313937117, -0.7510114068441064, -0.7257608604061496, -0.7401234567901236, -0.6752241340558899, -0.7074355283543386]
 [0.06458384035045028, -0.02490161425500128, -0.04761531780788457, 0.009923476645422341, 0.006814596455672831, 0.08634934501415456, 0.01133569522728392, 0.03875665342611921, 0.006910176170362171, -0.0322322650047355 … -0.47115094032591764, -0.46672499804111056, -0.40825671490715415, -0.5414949557901297, -0.43083696212857336, -0.43525525151156264, -0.4700842430846934, -0.4585879184008887, -0.4050436713235448, -0.44864683453438614]
@time ndvi, ndwi, ndbi = compute_index(["NDVI", "NDWI", "NDBI"];
    G = df[!, "SR_B3"],
    N = df[!, "SR_B5"],
    R = df[!, "SR_B4"],
    S1 = df[!, "SR_B6"])
3-element Vector{Any}:
 [0.23754793677807357, 0.2719887844338796, 0.33932578974960087, 0.21627773595727137, 0.19582071673377036, 0.16771383579896465, 0.21944767233340506, 0.2251996432295527, 0.1655330261746833, 0.2675545906704802 … 0.810365666144593, 0.8104049969776344, 0.7616768543153676, 0.8027222040013119, 0.7929365431300779, 0.7862750574070626, 0.8080303042462863, 0.8025822103946664, 0.7135886988619672, 0.7672440264304153]
 [-0.3409734444357916, -0.38667135030536093, -0.4028151808767594, -0.3034817907083952, -0.28385153077628394, -0.29071730449057526, -0.32313861250513676, -0.3563320964589312, -0.24060392753715099, -0.34356689100134846 … -0.7698492602846995, -0.7547124120206541, -0.7128263753013682, -0.7716516398212895, -0.7491201313937117, -0.7510114068441064, -0.7257608604061496, -0.7401234567901236, -0.6752241340558899, -0.7074355283543386]
 [0.06458384035045028, -0.02490161425500128, -0.04761531780788457, 0.009923476645422341, 0.006814596455672831, 0.08634934501415456, 0.01133569522728392, 0.03875665342611921, 0.006910176170362171, -0.0322322650047355 … -0.47115094032591764, -0.46672499804111056, -0.40825671490715415, -0.5414949557901297, -0.43083696212857336, -0.43525525151156264, -0.4700842430846934, -0.4585879184008887, -0.4050436713235448, -0.44864683453438614]
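One caveat when timing in Julia: the first call to a function includes compilation time, so a single @time measurement can overstate the cost of whichever method happens to run first. A minimal sketch of a fairer comparison, using only Base Julia, is to run a warm-up call and then time a second call. The function normalized_difference below is a hypothetical stand-in for an index computation, not the SpectralIndices.jl implementation, and the random inputs are placeholders rather than the tutorial data.

```julia
# Hypothetical stand-in for an index computation (not the package's code)
normalized_difference(a, b) = (a .- b) ./ (a .+ b)

# Placeholder reflectance-like inputs, offset to keep the denominator away from zero
nir = rand(10_000) .+ 0.1
red = rand(10_000) .+ 0.1

# Warm-up call: triggers compilation for these argument types
normalized_difference(nir, red)

# Second call: measures the already-compiled code
elapsed = @elapsed normalized_difference(nir, red)
println("elapsed seconds: ", elapsed)
```

For more careful measurements, the BenchmarkTools.jl package provides @btime and @benchmark, which repeat the call many times.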