There are three sets of ETo estimations at each station that we compare for various reasons. These are daily station estimates, long-term estimates, and raster-based estimates. We use the following prefixes for these:
s_eto = CIMIS reported station ETo. These are the station data as reported contemporaneously with the raster data calculations. That means we have an estimate for every day on which we have Spatial CIMIS estimations. These are stored in the station table. There are about 500K entries in this table, one for each day and each station.
r_eto = Spatial CIMIS calculated ETo. These are the ETo estimations that are calculated each day from the combination of the station data and the GOES-estimated Rs. We have these for every day Spatial CIMIS was calculated as well. These are stored in the raster table. There are about 1M entries in this table, more than in the station table, since we get the estimation regardless of whether the station reported a result for that particular day or not.
lt_eto = Long term average ETo. These are long term averages as supplied by the CIMIS program. We have a daily long term estimate, but these are independent of year, since they are averages. These are kept in the cimis_15day table. There are 134 stations, and 366 days per station, in that table.
In addition, we often use 15-day window running averages of these data. We do this especially when summarizing our yearly data into 52 weekly average values. These values are what are used for calculating errors and differences. They are also used as inputs to the FFT transformations that further summarize our yearly ETo estimations into 5 FFT parameters: three powers and two phase components. We add a 15 to the prefix designations: s15_eto, r15_eto, lt15_eto.
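As a rough illustration of the summarization described above, the following Python sketch computes a 15-day running average that wraps around the year and then reduces a series to five FFT parameters. The function names, the names p1, p2, h1, h2, and the amplitude normalization are assumptions made for this example (only p0 appears in this issue); the actual Spatial CIMIS code may define the powers and phases differently.

```python
import numpy as np

def running_mean_circular(daily, window=15):
    """Centered running mean over `window` days, wrapping around the year ends.

    Assumes an odd window so the average is centered on each day.
    """
    daily = np.asarray(daily, dtype=float)
    half = window // 2
    # Pad with the end of the year in front and the start of the year behind.
    padded = np.concatenate([daily[-half:], daily, daily[:half]])
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def fft_params(series):
    """Return (p0, p1, p2, h1, h2) for one year of evenly spaced values.

    p0 is the mean, p1 and p2 are the amplitudes of the first two harmonics,
    and h1 and h2 are their phases in radians (an assumed convention).
    """
    series = np.asarray(series, dtype=float)
    coeffs = np.fft.rfft(series) / len(series)
    p0 = coeffs[0].real
    p1, p2 = 2 * np.abs(coeffs[1]), 2 * np.abs(coeffs[2])
    h1, h2 = np.angle(coeffs[1]), np.angle(coeffs[2])
    return p0, p1, p2, h1, h2
```

With this convention, p0 is simply the mean of the input series, so a ratio of two p0 values is a ratio of mean ETo over the period covered.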
Station - Raster Comparisons
Long Term Average Comparisons
DWR has supplied long term averages for about 122 stations. Our interest is to compare these data with the Spatial CIMIS raster long term averages. The raster long term average data exists in the table fft.raster_15avg_ed. There is one entry for every pixel, so we just need to extract the station pixels. We created a table for each station's associated pid from compare.station_xy, which combines the station_info with the CIMIS boundaries, so we can just use that.
Station Location differences
Note, however, that the lt_* data place some stations considerably far from the station_info locations reported by the et.water.ca.gov website. We are assuming the station_info is correct. These are the stations located more than 500 m from the position reported by et.water.ca.gov:
| station_id | longitude | latitude | diff (m) |
|---|---|---|---|
| 135 | -114.666 | 33.557 | 15431 |
| 196 | -122.144 | 38.685 | 11337 |
| 88 | -119.605 | 34.932 | 6388 |
| 84 | -121.311 | 39.271 | 2088 |
| 152 | -118.994 | 34.232 | 1407 |
| 114 | -121.29 | 36.359 | 1305 |
| 170 | -122.02 | 38.004 | 1264 |
| 194 | -120.851 | 37.719 | 911 |
| 136 | -116.154 | 33.516 | 868 |
| 175 | -114.726 | 33.389 | 863 |
| 74 | -116.973 | 33.09 | 758 |
| 56 | -120.761 | 37.093 | 752 |
| 79 | -122.421 | 38.549 | 698 |
| 62 | -117.222 | 33.49 | 691 |
| 77 | -122.41 | 38.434 | 614 |
| 90 | -120.479 | 41.433 | 589 |
| 200 | -116.258 | 33.746 | 553 |
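The location check above can be sketched in Python as follows. How the diff column was actually computed is not stated in this issue; this example assumes a great-circle (haversine) distance on a spherical Earth, and the function names and input dictionaries are hypothetical.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371000.0):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

def far_stations(lt_coords, info_coords, threshold_m=500.0):
    """List (station_id, distance_m) for stations whose two locations disagree.

    Both inputs map station_id -> (longitude, latitude). Only stations present
    in both are compared; results are sorted by distance, largest first.
    """
    result = []
    for station_id, (lon, lat) in lt_coords.items():
        if station_id not in info_coords:
            continue
        lon2, lat2 = info_coords[station_id]
        dist = haversine_m(lon, lat, lon2, lat2)
        if dist > threshold_m:
            result.append((station_id, dist))
    return sorted(result, key=lambda item: item[1], reverse=True)
```

A projected-coordinate distance (for example, in the raster's own projection) would give slightly different numbers, but the same stations should stand out at the 500 m threshold.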
We can then calculate the ratio of lt_p0/r_p0 to compare the DWR long term averages.
Contemporaneous Station Data Comparisons
When we are looking for biases in the station vs. raster estimations, we look at these data. One important table we have is the compare.ymd15 table. This compares the ETo estimations for s_eto and for r_eto for every 15-day time window in Spatial CIMIS history. So, for each 15-day time window, we calculate the average station and raster ETo for that window. You can think of this as a 15x reduction in the data to compare, since we only look at those average values. There are about 69K entries in this table, covering the overlap for each station and each 15-day window, so each entry is an average of 15 days, or sometimes fewer. The range of overlapping windows varies by station. The Station-Raster Dates and Count Google Sheet shows the starting and stopping dates for the comparisons, and how many window entries overlap.
Now we can take just the overlapping time windows from this ymd15 table and calculate our FFT transform parameters from those. So, note, for each raster location we are calculating special FFT parameters, specific to the time windows that overlap with the stations. That way, when we calculate a ratio, the ratio compares estimates from the same time period.
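A minimal Python sketch of the windowed averaging described above is shown here. The real compare.ymd15 table is presumably built in the database, and how its windows are aligned is not stated in this issue; this example assumes consecutive, non-overlapping 15-day blocks, and the function name and record layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

def window_averages(records, window_days=15):
    """Average station and raster ETo per station and per 15-day window.

    `records` is an iterable of (station_id, day, s_eto, r_eto), where `day` is
    a datetime.date and either value may be None. Only days where both values
    exist are used, so each average covers the same days for station and raster.
    Returns {(station_id, window_index): (s_avg, r_avg, n_days)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for station_id, day, s_eto, r_eto in records:
        if s_eto is None or r_eto is None:
            continue  # no overlap on this day
        # Assumed window definition: fixed blocks counted from ordinal day 0.
        key = (station_id, day.toordinal() // window_days)
        acc = sums[key]
        acc[0] += s_eto
        acc[1] += r_eto
        acc[2] += 1
    return {key: (s / n, r / n, n) for key, (s, r, n) in sums.items()}
```

The n_days value records how many days actually contributed, which corresponds to the "15 days, or sometimes fewer" caveat above.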
Combined Ratio Comparisons
The Long Term / Station / Raster Ratios tab in the Google Sheet shows a summary of the long term and station ratios. Note there are two estimates from the raster data: r_p0 is the long term data, and s_r_p0 is the raster value from the data that overlap the station information. The two ratios are then s_p0_ratio = s_p0/s_r_p0 and lt_p0_ratio = lt_p0/r_p0. The ratios are fairly similar, but there are some differences. In that sheet, the column station_overlap_yrs shows the length of the comparison overlap. It's been suggested that, for the station ratios, we only look at stations with an overlap of 5 years or more.
If you are interested in seeing the largest differences, you could compare these in two ways. You could look at the biggest differences in the p0 ratios by looking at | (s_p0 / s_r_p0) - 1 |, where the absolute value orders the stations by how far the ratio is from 1. If we are looking for a station-to-raster conversion, this ratio can be used. Or you could just look at the absolute value of the difference of s_p0 and s_r_p0, | s_p0 - s_r_p0 |. Here the values are equivalent to the average daily difference in ETo.
Since we plan to create a single multiplier for p0, we will look at the ratio. The tab Rapid Change in s_p0/r_p0 in the Google Sheet shows the stations that have the most rapid change in s_p0/r_p0 ratio in the images. Higher numbers mean more rapid changes from one station to another.
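The two orderings described above can be sketched in Python. The column names s_p0, s_r_p0, and station_overlap_yrs come from the sheet; the function name, the row layout, and the default overlap filter (mirroring the station_overlap_yrs > 4 condition used for the splines) are assumptions made for this example.

```python
def rank_differences(rows, overlap_gt=4):
    """Rank stations by ratio deviation and by absolute p0 difference.

    `rows` is a list of dicts with keys station_id, s_p0, s_r_p0 and
    station_overlap_yrs. Stations with overlap <= overlap_gt are skipped.
    Returns (by_ratio, by_diff), each a list of (station_id, value) sorted
    from largest to smallest.
    """
    kept = [r for r in rows if r["station_overlap_yrs"] > overlap_gt]
    by_ratio = sorted(
        ((r["station_id"], abs(r["s_p0"] / r["s_r_p0"] - 1)) for r in kept),
        key=lambda item: item[1],
        reverse=True,
    )
    by_diff = sorted(
        ((r["station_id"], abs(r["s_p0"] - r["s_r_p0"])) for r in kept),
        key=lambda item: item[1],
        reverse=True,
    )
    return by_ratio, by_diff
```

The two orderings can disagree: a station with low ETo can have a large ratio deviation but a small absolute difference, which is why both views are useful.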
Ratio Splines.
These ratios are then used as input to a 3-D spline parameterization, using GRASS's v.vol.rst. An example invocation looks like:
p0=${r}_s${s}_z${z}_t${t}_p0;
v.vol.rst --overwrite input=ratio wcolumn=${r}_p0_ratio \
cross_input=Z@2km maskmap=state@2km \
tension=${t} zscale=${z} smooth=${s} cross_output=${p0} \
where="${r}_p0_ratio is not null and station_overlap_yrs > 4";
A result of running a set of these splines is shown in the Splines Cloud directory.
The three parameters that are modified are:
tension, which affects the ability of a point to pull the interpolation toward it. Higher tensions allow for sharper bends in the fit. If you look at the t10 files, you can most easily see where the stations differ most from the rasters (where the ratio is farthest from 1).
zscale, which affects how much the elevation influences the spline. We keep this low, which makes this almost a 2-D fit.
smooth, which affects how far the spline can miss the input data. Higher smoothing allows the fit to not match the points exactly. This would, however, work against our desire for a matching layer, so the values are kept low.
Some of the parameters used result in an overshoot; that is, the spline cannot be made to fit the data without extrapolating beyond the bounds of the input data. This is an indication that the spline is probably not very reliable.
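One simple way to quantify the overshoot described above is to compare the range of the interpolated values with the range of the input ratios. The Python sketch below assumes the spline output has already been read into a flat list of values; the function name and inputs are hypothetical, and this is not necessarily how the overshoot was judged for the Splines directory.

```python
def overshoot(input_ratios, spline_values):
    """Return (below, above): how far the spline leaves the input data range.

    `below` is how much the smallest spline value falls under the smallest
    input ratio, and `above` is how much the largest spline value exceeds the
    largest input ratio. Both are 0.0 when the spline stays within bounds.
    """
    lo, hi = min(input_ratios), max(input_ratios)
    below = max(0.0, lo - min(spline_values))
    above = max(0.0, max(spline_values) - hi)
    return below, above
```

For example, overshoot(ratios, values) returning (0.0, 0.0) means the fit stayed inside the range of the station ratios for that combination of tension and smooth.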
You can see the data are pretty similar between the lt_ and s_ values. For s=0 you need to increase tension to 7 before you remove overshoot, and the result is a ratio that is probably a bit too blotchy. For s=0.02, you do get some overshoot at t=3, but the results are more believable.
Big Drivers for the Spline
Note there may be some indication of systematic changes west of the Central Valley, but they are not very clear. Note the LA stations show the biggest bend, but there are large bends up the west coast, and in NE CA (one station) as well.
In LA, the stations driving the spline are station_id=204, with a very high ratio of 1.2, near station_id=133, with a low ratio of 0.9.
In NE CA, it's just station_id=57, with a ratio of 1.15.
In the West it's more convoluted, but it involves station_id=109, which has a ratio of 1.005 but is surrounded by stations with a higher ratio, and then the group station_id=122,212,140,167, which are high, near station_id=166,42,70, which are low.
Ricardo suggested that we not include stations with comparison overlaps of less than 5 years. That would eliminate about 38 of the 159 stations we have. That seems like a pretty good idea, as these stations with little overlap can have large errors.