
# [Enhancement]: Temporal averaging performance #688

Closed
chengzhuzhang opened this issue Aug 28, 2024 · 6 comments · Fixed by #689
Labels: type: enhancement (New enhancement request)

@chengzhuzhang (Collaborator) commented Aug 28, 2024

Is your feature request related to a problem?

This may not be a high-priority issue, but I think it is worthwhile to document here:
While refactoring e3sm_diags with xcdat, I used the temporal.climatology operation to compute the annual cycle of a data stream with hourly data for 30 years (ntime = 262800) at one grid point. Using xcdat is slower than the native function from e3sm_diags. This is probably expected, because the xcdat code path is more sophisticated in reading and writing the data. However, I also noticed that operations on an xarray.DataArray appear to be much slower than the equivalent NumPy operations, so the native e3sm_diags climo function should be optimized accordingly to get better performance.
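
For context, here is a minimal, hypothetical timing sketch (not from e3sm_diags) of the kind of overhead meant above; the array size matches the ntime = 262800 case:

```python
import time

import numpy as np
import xarray as xr

# Hypothetical data: ~30 years of hourly values at one grid point.
data = np.random.rand(262800)
da = xr.DataArray(data, dims="time", name="PRECT")

t0 = time.perf_counter()
for _ in range(100):
    da.mean()  # goes through xarray's labeled-dimension machinery
print(f"xarray.DataArray: {time.perf_counter() - t0:.3f} s")

t0 = time.perf_counter()
for _ in range(100):
    data.mean()  # plain NumPy reduction on the underlying buffer
print(f"numpy.ndarray:    {time.perf_counter() - t0:.3f} s")
```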

Describe the solution you'd like

We can check whether there are places where performance gains are possible.

Describe alternatives you've considered

No response

Additional context

Attaching example code and performance data.

### 1. Using `temporal.climatology` from xcdat

```python
import xcdat as xc

file_path = '/global/cfs/cdirs/e3sm/e3sm_diags/postprocessed_e3sm_v2_data_for_e3sm_diags/20221103.v2.LR.amip.NGD_v3atm.chrysalis/arm-diags-data/PRECT_sgpc1_198501_201412.nc'
ds = xc.open_dataset(file_path)
ds_annual_cycle = ds.temporal.climatology("PRECT", "month")

# real    0m48.785s
# user    0m46.604s
# sys     0m12.709s
```

### 2. With the external `climo` function from E3SM Diags

```python
import xcdat as xc
from e3sm_diags.driver.utils.climo_xr import climo

file_path = '/global/cfs/cdirs/e3sm/e3sm_diags/postprocessed_e3sm_v2_data_for_e3sm_diags/20221103.v2.LR.amip.NGD_v3atm.chrysalis/arm-diags-data/PRECT_sgpc1_198501_201412.nc'
ds = xc.open_dataset(file_path)
ds_annual_cycle = climo(ds, "PRECT", "ANNUALCYCLE")

# real    0m23.995s
# user    0m19.630s
# sys     0m16.825s
```

### 3. Similar implementation but with cdat

```python
import cdms2
from e3sm_diags.driver.utils.climo import climo

file_path = '/global/cfs/cdirs/e3sm/e3sm_diags/postprocessed_e3sm_v2_data_for_e3sm_diags/20221103.v2.LR.amip.NGD_v3atm.chrysalis/arm-diags-data/PRECT_sgpc1_198501_201412.nc'
ds = cdms2.open(file_path)
ds_annual_cycle = climo(ds("PRECT"), "ANNUALCYCLE")

# real    0m8.332s
# user    0m5.654s
# sys     0m11.613s
```

### 4. Using cdat and cdutil for the climatology

```python
import cdms2
import cdutil

file_path = '/global/cfs/cdirs/e3sm/e3sm_diags/postprocessed_e3sm_v2_data_for_e3sm_diags/20221103.v2.LR.amip.NGD_v3atm.chrysalis/arm-diags-data/PRECT_sgpc1_198501_201412.nc'
ds = cdms2.open(file_path)
ds_annual_cycle = cdutil.ANNUALCYCLE.climatology(ds("PRECT"))

# real    3m40.007s
# user    3m34.030s
# sys     0m12.117s
```

@tomvothecoder (Collaborator) commented
Possible next steps:

  1. Try flox with the e3sm_diags and xCDAT temporal APIs (see the sketch after this list)
  2. Reference #490, "Explore `flox` for improving Xarray's `groupby()` performance"
  3. Analyze the codebase for other bottlenecks besides grouping
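
A minimal sketch of how flox could be tried here (assuming flox is installed; the file is the one from the examples above):

```python
import xarray as xr
import xcdat as xc
from flox.xarray import xarray_reduce

file_path = "PRECT_sgpc1_198501_201412.nc"  # or the full path from the examples above

# (a) Let xarray dispatch groupby reductions to flox; this is on by default
#     in recent xarray versions when flox is installed.
xr.set_options(use_flox=True)
ds = xc.open_dataset(file_path)
ds_clim = ds.temporal.climatology("PRECT", "month")

# (b) Call flox directly for an (unweighted) month-of-year climatology.
clim = xarray_reduce(ds["PRECT"], ds["time"].dt.month, func="mean")
```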

@chengzhuzhang (Collaborator, Author) commented
Regarding the results: Cases 2, 3, and 4 are identical, while the xcdat result from Case 1 is slightly off.

```
Case 1: [1.63433977e-08 1.73700556e-08 2.73745702e-08 3.22052784e-08
 3.28640795e-08 3.27481651e-08 3.03053831e-08 2.27138450e-08
 2.60270063e-08 2.38527367e-08 1.89776266e-08 1.71358785e-08]
Case 2: [1.63593346e-08 1.73546146e-08 2.73492017e-08 3.22492671e-08
 3.28317165e-08 3.28051981e-08 3.02749046e-08 2.27307623e-08
 2.59688303e-08 2.38724820e-08 1.89765019e-08 1.71450951e-08]
Case 3: [1.63593346e-08 1.73546146e-08 2.73492017e-08 3.22492671e-08
 3.28317165e-08 3.28051981e-08 3.02749046e-08 2.27307623e-08
 2.59688303e-08 2.38724820e-08 1.89765019e-08 1.71450951e-08]
Case 4: [1.63593346e-08 1.73546146e-08 2.73492017e-08 3.22492671e-08
 3.28317165e-08 3.28051981e-08 3.02749046e-08 2.27307623e-08
 2.59688303e-08 2.38724820e-08 1.89765019e-08 1.71450951e-08]
```
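
As a rough check (a hypothetical snippet, not from the thread), the Case 1 vs. Case 2 discrepancy can be quantified from the values above; it is on the order of 0.1–0.2%:

```python
import numpy as np

case1 = np.array([1.63433977e-08, 1.73700556e-08, 2.73745702e-08, 3.22052784e-08,
                  3.28640795e-08, 3.27481651e-08, 3.03053831e-08, 2.27138450e-08,
                  2.60270063e-08, 2.38527367e-08, 1.89776266e-08, 1.71358785e-08])
case2 = np.array([1.63593346e-08, 1.73546146e-08, 2.73492017e-08, 3.22492671e-08,
                  3.28317165e-08, 3.28051981e-08, 3.02749046e-08, 2.27307623e-08,
                  2.59688303e-08, 2.38724820e-08, 1.89765019e-08, 1.71450951e-08])

# Maximum relative difference across the 12 months (~0.2%), possibly due to
# differences in weighting (e.g., time-bounds weighting) between implementations.
print(np.max(np.abs(case1 - case2) / case2))
```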

@tomvothecoder (Collaborator) commented
Do the runtimes you captured include I/O?

@chengzhuzhang (Collaborator, Author) commented
> Do the runtimes you captured include I/O?

Yes, they include everything from imports to the end.

@tomvothecoder (Collaborator) commented Aug 29, 2024

I believe I've found the root cause of the biggest performance bottleneck:

  • Generating labeled time coordinates (i.e., assigning groups) and adding them to the existing time dimension alongside the existing coordinates, then performing the Xarray groupby, is extremely slow (I'm not sure why; it appears to be an Xarray issue).

My initial solution (sketched below):

  • Replace the time coordinates directly with the labeled time coordinates (this happens downstream anyway), then group on the "new" time coordinates.
  • This cuts the runtime from ~48 secs to ~8 secs.
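
A self-contained sketch of the slow vs. fast grouping paths (illustrative only, not the actual PR #689 code; a toy dataset stands in for the hourly PRECT file):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy dataset: 30 years of hourly data at one point.
time_coord = pd.date_range("1985-01-01", periods=262800, freq="h")
ds = xr.Dataset(
    {"PRECT": ("time", np.random.rand(262800))},
    coords={"time": time_coord},
)

# Slow path: keep the original time index and add the group labels as an
# auxiliary coordinate on the time dimension, then group by the labels.
ds_slow = ds.assign_coords(month_label=("time", ds.time.dt.month.data))
clim_slow = ds_slow["PRECT"].groupby("month_label").mean()

# Fast path: replace the time coordinate values with the labels themselves
# and group directly on the dimension coordinate.
ds_fast = ds.assign_coords(time=ds.time.dt.month.data)
clim_fast = ds_fast["PRECT"].groupby("time").mean()
```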

I just opened PR #689 to explore and address this issue.

@chengzhuzhang (Collaborator, Author) commented
Thank you for looking into this! It looks like we should also be able to get some performance improvement by applying your solution to the refactored E3SM Diags code, which we can address later.
