Update testdata-minimal scripts #691

Draft
wants to merge 8 commits into
base: main-dev
8 changes: 0 additions & 8 deletions pyaerocom/scripts/testdata-minimal/TM5_subset.sh

This file was deleted.

47 changes: 0 additions & 47 deletions pyaerocom/scripts/testdata-minimal/calc_example_coldata.py

This file was deleted.

28 changes: 0 additions & 28 deletions pyaerocom/scripts/testdata-minimal/create_subsets_emep.sh

This file was deleted.

72 changes: 0 additions & 72 deletions pyaerocom/scripts/testdata-minimal/create_subsets_ghost.py

This file was deleted.

@@ -1,4 +1,5 @@
# Scripts for test dataset creation of pyaerocom

This directory consists of scripts to create the minimal test dataset needed
for automatic testing and continuous integration of pyaerocom. The scripts need access to Met Norway's
internal file storage and are therefore
@@ -8,8 +9,26 @@ they are included in the main pyaerocom gihub repository anyway.
The minimal test data created from these scripts will usually go to the subdirectory `~/MyPyaerocom/testdata-minimal`
Example model and observation data can be found in sub-directories `modeldata` and `obsdata`, respectively.

At this time only `create_subset_ebas.py` is running with the
latest version of pyaerocom
``` bash
python -m scripts.testdata-minimal --help
```

``` man
Usage: python -m scripts.testdata-minimal [OPTIONS] COMMAND [ARGS]...

  Create minimal test datasets for pyaerocom

Options:
  --help  Show this message and exit.

Commands:
  Aeronet    minimal Aeronet dataset
  Colocated  collocated data example
  EBAS       minimal EBAS dataset
  EMEP       minimal EMEP dataset
  GHOST      minimal GHOST dataset
  TM5        minimal TM5 dataset
```

## Data usage guidelines

@@ -18,31 +37,34 @@ The data is generally NOT intended to be downloaded and used. If you download th
general data policy terms and restrictions of each provided dataset apply. These will be listed in the following.

### AERONET data

See: [https://aeronet.gsfc.nasa.gov/new_web/data_usage.html](https://aeronet.gsfc.nasa.gov/new_web/data_usage.html)

### EBAS data

See: [https://ebas.nilu.no/](https://ebas.nilu.no/)

Under "Data policy".

### Model data

- TM5 :Courtesy of Twan van Noije (KNMI)
- TM5: Courtesy of Twan van Noije (KNMI)

### Satellite data

- MODIS: start with the [MODIS landing page](https://modis.gsfc.nasa.gov/data/)

## Updating testdata for CI

**Note:** The test data has to be updated by hand for CI to pick up the changes!

Howto for that:
```

``` bash
cd ~/MyPyaerocom
mkdir -p ~/tmp
tar -cvzf ~/tmp/testdata-minimal.tar.gz testdata-minimal
```

The resulting file `~/tmp/testdata-minimal.tar.gz` then needs to be copied to the right place.
Please ask your fellow developers in case you do not know how to do that.
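
For anyone who prefers to script the packaging step, the `tar` call above can be approximated with Python's standard `tarfile` module. This is only a minimal sketch: the helper names `pack_testdata` and `list_archive` are hypothetical and not part of pyaerocom, and the demo runs on a throwaway directory rather than the real `~/MyPyaerocom`.

```python
from __future__ import annotations

import tarfile
import tempfile
from pathlib import Path


def pack_testdata(base_dir: Path, out_file: Path, name: str = "testdata-minimal") -> Path:
    """Pack base_dir/name into a gzipped tarball.

    Roughly mirrors `cd base_dir && tar -czf out_file name`.
    (Hypothetical helper, not part of pyaerocom.)
    """
    src = base_dir / name
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    # same role as `mkdir -p ~/tmp` in the shell version
    out_file.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(out_file, "w:gz") as tar:
        # arcname keeps the archive paths relative to base_dir
        tar.add(src, arcname=name)
    return out_file


def list_archive(archive: Path) -> list[str]:
    """Return the sorted member names of a gzipped tarball."""
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    # Demo on a temporary directory standing in for ~/MyPyaerocom
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "MyPyaerocom"
        obsdata = base / "testdata-minimal" / "obsdata"
        obsdata.mkdir(parents=True)
        (obsdata / "dummy.txt").write_text("placeholder\n")
        archive = pack_testdata(base, Path(tmp) / "tmp" / "testdata-minimal.tar.gz")
        print(list_archive(archive))
```

Listing the archive before copying it is a cheap way to confirm that every path starts with `testdata-minimal/`, which is what the shell command produces.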


Empty file.
13 changes: 13 additions & 0 deletions scripts/testdata-minimal/__main__.py
@@ -0,0 +1,13 @@
import typer

from . import aeronet, coldata, ebas, emep, ghost, tm5

main = typer.Typer(help="Create minimal test datasets for pyaerocom", add_completion=False)
main.command(name="Aeronet")(aeronet.main)
main.command(name="Colocated")(coldata.main)
main.command(name="EBAS")(ebas.main)
main.command(name="EMEP")(emep.main)
main.command(name="GHOST")(ghost.main)
main.command(name="TM5")(tm5.main)

main()
20 changes: 10 additions & 10 deletions ...estdata-minimal/create_subsets_aeronet.py → scripts/testdata-minimal/aeronet.py
100755 → 100644
@@ -1,7 +1,5 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Goal
Minimal Aeronet subset for testing purposes
"""

import os
@@ -10,13 +8,10 @@
from pathlib import Path

import numpy as np
import typer

import pyaerocom as pya

OUTBASE = Path(pya.const._TESTDATADIR).joinpath("obsdata")

if not OUTBASE.exists():
    OUTBASE.mkdir()
from tests.fixtures.data_access import DataForTests

MIN_NUM_VALID = 300

@@ -36,7 +31,12 @@
]

revision_files = {}
if __name__ == "__main__":


def main(
    out_path: Path = typer.Argument(DataForTests("obsdata").path, exists=True, dir_okay=True)
):
    """minimal Aeronet dataset"""

    loaded = {}
    for name, varlist in NETWORKS.items():
@@ -95,7 +95,7 @@

    for name, data in loaded.items():
        data_id = IDS[name]
        outdir = OUTBASE.joinpath(data_id)
        outdir = out_path / data_id
        # make sure to remove old data
        if outdir.exists():
            print("REMOVING EXISTING DATA FOR {}".format(data_id))
35 changes: 35 additions & 0 deletions scripts/testdata-minimal/coldata.py
@@ -0,0 +1,35 @@
from pathlib import Path

import typer

import pyaerocom as pya
from tests.fixtures.data_access import DataForTests
from tests.fixtures.tm5 import CHECK_PATHS

MOD_PATH = DataForTests(CHECK_PATHS.tm5aod).path
OUT_PATH = DataForTests("coldata").path


def main(
    mod_path: Path = typer.Argument(MOD_PATH, exists=True, dir_okay=True),
    out_path: Path = typer.Argument(OUT_PATH, exists=True, dir_okay=True),
):
    """collocated data example"""

    mod = pya.GriddedData(mod_path)
    obs = pya.io.ReadAeronetSunV3("AeronetSunV3L2Subset.daily").read("od550aer")

    coldata = pya.colocation.colocate_gridded_ungridded(mod, obs)
    coldata.to_netcdf(out_path)
    print(coldata.calc_statistics())

    coldata.plot_coordinates()

    mod = mod.sel(latitude=(0, 3), longitude=(0, 4))
    cgg = pya.colocation.colocate_gridded_gridded(mod, mod)
    cgg.data = cgg.data[:, :3]

    cgg.plot_scatter()
    cgg.to_netcdf(out_path)

    pya.plot.mapping.plot_nmb_map_colocateddata(cgg)