Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subset_MODIS_l2 #260

Closed
wants to merge 47 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
4dc8c94
Start on control script for raster data module development.
dwfncar Jan 3, 2024
61d323d
Merge branch 'develop' into develop_swath
dwfncar Mar 18, 2024
581562e
Merge branch 'develop' into develop_swath
davidfillmore Apr 22, 2024
19403b8
Added background satellite datasets section.
dwfncar Apr 23, 2024
3569773
Merge branch 'develop' into develop_swath
dwfncar May 28, 2024
e9d9e93
Added comments in open_sat_obs modis_l2 block.
dwfncar May 30, 2024
aa121b6
Start on time_interval_subset.subset_MODIS_l2.
dwfncar May 30, 2024
9b6ed9d
Added example obs datadir.
dwfncar Jun 18, 2024
937cbc4
Comment for filename nomenclature.
dwfncar Jun 23, 2024
045be26
Comment for filename nomenclature.
dwfncar Jun 23, 2024
b7fd42c
Added time_interval_subset.subset_MODIS_l2.
dwfncar Jun 23, 2024
d3d4c13
Merge branch 'develop' into develop_swath
dwfncar Jun 25, 2024
147ae45
Added observation.generate_obs_grid.
dwfncar Jul 11, 2024
72e1723
Fixed modis_l2 reader import.
dwfncar Jul 13, 2024
2fdd9af
Added MODIS AOD Combined to obs variables.
dwfncar Jul 13, 2024
45f08e7
Removed some unused config parameters.
dwfncar Jul 13, 2024
a42be5f
Moved obs_grid config params up one level.
dwfncar Jul 13, 2024
86a9162
Renamed MODIS L2 example.
dwfncar Jul 13, 2024
467b450
Merge branch 'develop' into develop_swath
dwfncar Jul 13, 2024
1e02e30
Moved generate_obs_grid to analysis class.
dwfncar Jul 13, 2024
b29eea2
Added analysis.setup_obs_grid.
dwfncar Jul 13, 2024
8e6ab48
Initialize obs gridded data and counts.
dwfncar Jul 13, 2024
8bb739f
Adding update_obs_grid.
dwfncar Jul 14, 2024
841a4a2
Added Latitude and Longitude to config.
dwfncar Jul 14, 2024
1c67cd5
Print obs_time update.
dwfncar Jul 14, 2024
c621b5a
Convert obs time to timestamp.
dwfncar Jul 14, 2024
6b1cf0b
Save.
dwfncar Jul 14, 2024
8f0089d
Save.
dwfncar Jul 14, 2024
2483fb5
Use subset_MODIS_l2 for file selection in time window.
dwfncar Jul 14, 2024
a5e4f1a
Added AOD range to YAML config.
dwfncar Jul 14, 2024
3a4ea5b
Added update_obs_grid call.
dwfncar Jul 14, 2024
03cf3f8
Renamed some methods.
dwfncar Jul 14, 2024
9223587
Added normalize_obs_gridded_data.
dwfncar Jul 14, 2024
bbc9c62
Save.
dwfncar Jul 14, 2024
83a39c3
Added obs_gridded_dataset.
dwfncar Jul 15, 2024
e83d5e1
Added time coordinate to obs_gridded_dataset.
dwfncar Jul 15, 2024
b9163d0
Start on model config block.
dwfncar Jul 25, 2024
44bd118
Added time_chunking_with_obs_gridding option.
dwfncar Jul 26, 2024
5ece2b3
Start on regrid to obs grid example.
dwfncar Aug 4, 2024
640e332
Start on regrid to obs grid example.
dwfncar Aug 4, 2024
8cce230
Added self.da_obs_grid.
dwfncar Aug 4, 2024
5874b5e
Added target_grid option to regrid_util.setup_regridder.
dwfncar Aug 4, 2024
ba0f10d
Added target_grid option to regrid_util.setup_regridder.
dwfncar Aug 4, 2024
6cc5b96
Output gridded data.
dwfncar Aug 13, 2024
7d03972
Added data and count parameters in obs_gridded_dataset.
dwfncar Aug 13, 2024
d99ad69
Added debugging attempts in grid_util module.
dwfncar Aug 13, 2024
bef41e4
Added temporary data normalization to top level process_modis_l2.py s…
dwfncar Aug 13, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/background/satellite_datasets.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
Satellite Datasets
==================

Satellite datasets

1 change: 1 addition & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ MONETIO please refer to:
background/supported_plots
background/supported_stats
background/time_chunking
background/satellite_datasets
background/gridded_datasets

.. toctree::
Expand Down
56 changes: 56 additions & 0 deletions examples/process_swath_data/control_modis_l2.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
analysis:
# 2020 September 9 - September 11 253 - 255
start_time: '2020-09-09'
end_time: '2020-09-11'
time_interval: '24h'
output_dir: $HOME/Plots
debug: True
time_chunking_with_obs_gridding: True
regrid: True
target_grid: obs_grid

obs_grid:
start_time: '2020-09-09'
end_time: '2020-09-11'
ntime: 72
nlat: 180
nlon: 360

obs:
Terra_MODIS:
# MOD04_L2.AYYYYDDD.HHMM.0XX.timestamp.hdf
debug: False
obs_type: 'sat_swath_clm'
sat_type: 'modis_l2'
filename: $HOME/Data/MODIS/Terra/C61/2020/*/MOD04_L2.*.hdf
variables:
AOD_550_Dark_Target_Deep_Blue_Combined:
minimum: 0.0
maximum: 10.0
scale: 0.001
units: none

Aqua_MODIS:
# MYD04_L2.AYYYYDDD.HHMM.0XX.timestamp.hdf
debug: False
obs_type: 'sat_swath_clm'
sat_type: 'modis_l2'
filename: $HOME/Data/MODIS/Aqua/C61/2020/*/MYD04_L2.*.hdf
variables:
AOD_550_Dark_Target_Deep_Blue_Combined:
minimum: 0.0
maximum: 10.0
scale: 0.001
units: none

model:
MERRA2:
regrid:
base_grid: $HOME/Data/Grids/merra2_grid.nc
method: bilinear
mapping:
Terra_MODIS:
TOTEXTTAU: AOD_550_Dark_Target_Deep_Blue_Combined
Aqua_MODIS:
TOTEXTTAU: AOD_550_Dark_Target_Deep_Blue_Combined

31 changes: 31 additions & 0 deletions examples/process_swath_data/process_modis_l2.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
from melodies_monet import driver

an = driver.analysis()
an.control = 'control_modis_l2.yaml'
an.read_control()

an.setup_obs_grid()
# print(an.obs_grid)

# an.setup_regridders()

for time_interval in an.time_intervals:

print(time_interval)

an.open_obs(time_interval=time_interval)
an.update_obs_gridded_data()

an.normalize_obs_gridded_data()
print(an.obs_gridded_dataset)

for param in ['Terra_MODIS_AOD_550_Dark_Target_Deep_Blue_Combined',
'Aqua_MODIS_AOD_550_Dark_Target_Deep_Blue_Combined']:
param_data = an.obs_gridded_dataset[param + '_data'].values
param_count = an.obs_gridded_dataset[param + '_count'].values
mask = (param_count > 0)
param_data[mask] = param_data[mask] / param_count[mask]
an.obs_gridded_dataset[param + '_data'].values = param_data

an.obs_gridded_dataset.to_netcdf('MODIS.nc')

108 changes: 100 additions & 8 deletions melodies_monet/driver.py
Original file line number Diff line number Diff line change
Expand Up @@ -239,7 +239,7 @@ def rename_vars(self):
self.obj = self.obj.rename({v:d['rename']})
self.variable_dict[d['rename']] = self.variable_dict.pop(v)

def open_sat_obs(self,time_interval=None):
def open_sat_obs(self, time_interval=None, control_dict=None):
"""Methods to opens satellite data observations.
Uses in-house python code to open and load observations.
Alternatively may use the satpy reader.
Expand Down Expand Up @@ -282,10 +282,15 @@ def open_sat_obs(self,time_interval=None):
self.obj = mio.sat._mopitt_l3_mm.open_dataset(self.file, ['column','pressure_surf','apriori_col',
'apriori_surf','apriori_prof','ak_col'])
elif self.sat_type == 'modis_l2':
from monetio import modis_l2
# from monetio import modis_l2
print('Reading MODIS L2')
self.obj = modis_l2.read_mfdataset(
self.file, self.variable_dict, debug=self.debug)
flst = tsub.subset_MODIS_l2(self.file,time_interval)
# self.obj = mio.sat._modis_l2_mm.read_mfdataset(
# self.file, self.variable_dict, debug=self.debug)
self.obj = mio.sat._modis_l2_mm.read_mfdataset(
flst, self.variable_dict, debug=self.debug)
# self.obj = granules, an OrderedDict of Datasets, keyed by datetime_str,
# with variables: Latitude, Longitude, Scan_Start_Time, parameters, ...
elif self.sat_type == 'tropomi_l2_no2':
#from monetio import tropomi_l2_no2
print('Reading TROPOMI L2 NO2')
Expand All @@ -297,7 +302,7 @@ def open_sat_obs(self,time_interval=None):
except ValueError as e:
print('something happened opening file:', e)
return

def filter_obs(self):
"""Filter observations based on filter_dict.

Expand Down Expand Up @@ -646,10 +651,16 @@ def __init__(self):
self.save = None
self.read = None
self.time_chunking_with_gridded_data = False # Default to False
self.time_chunking_with_obs_gridding = False # Default to False
self.regrid = False # Default to False
self.target_grid = None
self.obs_regridders = None
self.model_regridders = None
self.obs_grid = None
self.obs_edges = None
self.obs_gridded_data = {}
self.obs_gridded_count = {}
self.obs_gridded_dataset = None

def __repr__(self):
return (
Expand Down Expand Up @@ -725,6 +736,8 @@ def read_control(self, control=None):
# set time_chunking_with_gridded_data option, regrid option, and target_grid
if 'time_chunking_with_gridded_data' in self.control_dict['analysis'].keys():
self.time_chunking_with_gridded_data = self.control_dict['analysis']['time_chunking_with_gridded_data']
if 'time_chunking_with_obs_gridding' in self.control_dict['analysis'].keys():
self.time_chunking_with_obs_gridding = self.control_dict['analysis']['time_chunking_with_obs_gridding']
if 'regrid' in self.control_dict['analysis'].keys():
self.regrid = self.control_dict['analysis']['regrid']
if 'target_grid' in self.control_dict['analysis'].keys():
Expand Down Expand Up @@ -818,8 +831,11 @@ def setup_regridders(self):
"""
from .util import regrid_util
if self.regrid:
self.obs_regridders = regrid_util.setup_regridder(self.control_dict, config_group='obs')
self.model_regridders = regrid_util.setup_regridder(self.control_dict, config_group='model')
if self.target_grid == 'obs_grid':
self.model_regridders = regrid_util.setup_regridder(self.control_dict, config_group='model', target_grid=self.da_obs_grid)
else:
self.obs_regridders = regrid_util.setup_regridder(self.control_dict, config_group='obs')
self.model_regridders = regrid_util.setup_regridder(self.control_dict, config_group='model')

def open_models(self, time_interval=None,load_files=True):
"""Open all models listed in the input yaml file and create a :class:`model`
Expand Down Expand Up @@ -956,11 +972,87 @@ def open_obs(self, time_interval=None, load_files=True):
if load_files:
if o.obs_type in ['sat_swath_sfc', 'sat_swath_clm', 'sat_grid_sfc',\
'sat_grid_clm', 'sat_swath_prof']:
o.open_sat_obs(time_interval=time_interval)
o.open_sat_obs(time_interval=time_interval, control_dict=self.control_dict)
else:
o.open_obs(time_interval=time_interval, control_dict=self.control_dict)
self.obs[o.label] = o

def setup_obs_grid(self):
from .util import grid_util
ntime = self.control_dict['obs_grid']['ntime']
nlat = self.control_dict['obs_grid']['nlat']
nlon = self.control_dict['obs_grid']['nlon']
self.obs_grid, self.obs_edges = grid_util.generate_uniform_grid(
self.control_dict['obs_grid']['start_time'],
self.control_dict['obs_grid']['end_time'],
ntime, nlat, nlon)

self.da_obs_grid = xr.DataArray(dims=['lon', 'lat'],
coords={'lon': self.obs_grid['longitude'],
'lat': self.obs_grid['latitude']})
# print(self.da_obs_grid)

for obs in self.control_dict['obs']:
for var in self.control_dict['obs'][obs]['variables']:
print('initializing gridded data and counts ', obs, var)
self.obs_gridded_data[obs + '_' + var] = np.zeros([ntime, nlon, nlat], dtype=np.float32)
self.obs_gridded_count[obs + '_' + var] = np.zeros([ntime, nlon, nlat], dtype=np.int32)

def update_obs_gridded_data(self):
from .util import grid_util
"""
"""
for obs in self.obs:
for obs_time in self.obs[obs].obj:
print('updating obs time: ', obs, obs_time)
obs_timestamp = pd.to_datetime(
obs_time, format='%Y%j%H%M').timestamp()
# print(obs_timestamp)
for var in self.obs[obs].obj[obs_time]:
key = obs + '_' + var
print(key)
n_obs = self.obs[obs].obj[obs_time][var].size
grid_util.update_data_grid(
self.obs_edges['time_edges'],
self.obs_edges['lon_edges'],
self.obs_edges['lat_edges'],
np.full(n_obs, obs_timestamp, dtype=np.float32),
self.obs[obs].obj[obs_time].coords['lon'].values.flatten(),
self.obs[obs].obj[obs_time].coords['lat'].values.flatten(),
self.obs[obs].obj[obs_time][var].values.flatten(),
self.obs_gridded_count[key],
self.obs_gridded_data[key])

def normalize_obs_gridded_data(self):
from .util import grid_util
"""
"""
self.obs_gridded_dataset = xr.Dataset()

for obs in self.obs:
for obs_time in self.obs[obs].obj:
print('normalizing obs time: ', obs, obs_time)
for var in self.obs[obs].obj[obs_time]:
key = obs + '_' + var
print(key)
grid_util.normalize_data_grid(
self.obs_gridded_count[key],
self.obs_gridded_data[key])
da_data = xr.DataArray(
self.obs_gridded_data[key],
dims=['time', 'lon', 'lat'],
coords={'time': self.obs_grid['time'],
'lon': self.obs_grid['longitude'],
'lat': self.obs_grid['latitude']})
da_count = xr.DataArray(
self.obs_gridded_count[key],
dims=['time', 'lon', 'lat'],
coords={'time': self.obs_grid['time'],
'lon': self.obs_grid['longitude'],
'lat': self.obs_grid['latitude']})
self.obs_gridded_dataset[key + '_data'] = da_data
self.obs_gridded_dataset[key + '_count'] = da_count


def pair_data(self, time_interval=None):
"""Pair all observations and models in the analysis class
Expand Down
53 changes: 48 additions & 5 deletions melodies_monet/util/grid_util.py
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ def update_sparse_data_grid(time_edges, x_edges, y_edges,

def normalize_sparse_data_grid(count_grid, data_grid):
"""
Normalize accumuxed data on a uniform grid
Normalize accumulated data on a uniform grid

Parameters
count_grid (dict): number of obs points in grid cell
Expand Down Expand Up @@ -134,7 +134,7 @@ def update_data_grid(time_edges, x_edges, y_edges,

def normalize_data_grid(count_grid, data_grid):
"""
Normalize accumuxed data on a uniform grid
Normalize accumulated data on a uniform grid

Parameters
count_grid (np.array): number of obs points in grid cell
Expand All @@ -144,9 +144,52 @@ def normalize_data_grid(count_grid, data_grid):
None
"""
data_grid[count_grid == 0] = np.nan
data_grid[count_grid > 0] /= count_grid[count_grid > 0]
# data_grid[count_grid > 0] /= count_grid[count_grid > 0]

def generate_uniform_grid(paired_dims,start,end,obstime,ntime,nlat,nlon):
"""
mask = np.where(count_grid > 0)
data_grid[mask] = data_grid[mask] / count_grid[mask]
"""

"""
ntime, nx, ny = data_grid.shape
for i_time in range(ntime):
for i_x in range(nx):
for i_y in range(ny):
if count_grid[i_time, i_x, i_y] > 0:
print(count_grid[i_time, i_x, i_y], data_grid[i_time, i_x, i_y])
data_grid[i_time, i_x, i_y] = data_grid[i_time, i_x, i_y] / count_grid[i_time, i_x, i_y]
print(count_grid[i_time, i_x, i_y], data_grid[i_time, i_x, i_y])
"""

def generate_uniform_grid(start, end, ntime, nlat, nlon):
import pandas as pd
start_timestamp = pd.to_datetime(start).timestamp()
end_timestamp = pd.to_datetime(end).timestamp()

ntime = ntime
nlat = nlat
nlon = nlon
lon0 = -180

# generate uniform grid
time_edges = np.linspace(start_timestamp, end_timestamp, ntime+1, endpoint=True, dtype=float)
time_grid = 0.5 * (time_edges[0:ntime] + time_edges[1:ntime+1])
lat_edges = np.linspace(-90, 90, nlat+1, endpoint=True, dtype=float)
lat_grid = 0.5 * (lat_edges[0:nlat] + lat_edges[1:nlat+1])
lat_min, lat_max = lat_edges[0:nlat], lat_edges[1:nlat+1]
lon_edges = np.linspace(lon0, lon0 + 360, nlon+1, endpoint=True, dtype=float)
lon_grid = 0.5 * (lon_edges[0:nlon] + lon_edges[1:nlon+1])

grid = {'longitude':lon_grid,
'latitude':lat_grid,
'time':time_grid}
edges = {'time_edges':time_edges,'lon_edges':lon_edges,'lat_edges':lat_edges}

return grid, edges


def generate_uniform_grid_with_paired_dims(paired_dims,start,end,obstime,ntime,nlat,nlon):
import pandas as pd
#import xarray as xr

Expand All @@ -173,4 +216,4 @@ def generate_uniform_grid(paired_dims,start,end,obstime,ntime,nlat,nlon):
'latitude':lat_grid,
'time':time_grid}
edges = {'time_edges':time_edges,'lon_edges':lon_edges,'lat_edges':lat_edges}
return grid,edges,time_stamps
return grid,edges,time_stamps
12 changes: 9 additions & 3 deletions melodies_monet/util/regrid_util.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
import xarray as xr


def setup_regridder(config, config_group='obs'):
def setup_regridder(config, config_group='obs', target_grid=None):
"""
Setup regridder for observations or model

Expand All @@ -25,8 +25,14 @@ def setup_regridder(config, config_group='obs'):
print('regrid_util: xesmf module not found')
raise

target_file = os.path.expandvars(config['analysis']['target_grid'])
ds_target = xr.open_dataset(target_file)
print('setup_regridder.target_grid')
print(target_grid)

if target_grid is not None:
ds_target = target_grid
else:
target_file = os.path.expandvars(config['analysis']['target_grid'])
ds_target = xr.open_dataset(target_file)

regridder_dict = dict()

Expand Down
21 changes: 20 additions & 1 deletion melodies_monet/util/time_interval_subset.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,4 +34,23 @@ def subset_OMPS_l2(file_path,timeinterval):
for j in fst:
interval_files.append(j)
return interval_files


def subset_MODIS_l2(file_path,timeinterval):
'''Dependent on filenaming convention
MOD04_L2.AYYYYDDD.HHMM.collection.timestamp.hdf
MYD04_L2.AYYYYDDD.HHMM.collection.timestamp.hdf
'''
import pandas as pd
from glob import glob
import fnmatch
all_files = glob(file_path)
interval_files = []
subset_interval = pd.date_range(start=timeinterval[0],end=timeinterval[-1],freq='h',inclusive='left')

for i in subset_interval:
fst = fnmatch.filter(all_files,'*M?D04_L2.A{}*.hdf'.format(i.strftime('%Y%j.%H')))
fst.sort()
for j in fst:
interval_files.append(j)
return interval_files

Loading