Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TimeSeries retrieval broken on latest CWMS schema master branch #622

Closed
adamkorynta opened this issue Apr 19, 2024 · 29 comments
Closed

TimeSeries retrieval broken on latest CWMS schema master branch #622

adamkorynta opened this issue Apr 19, 2024 · 29 comments
Labels
approved-W192HQ23F0232-task4 Only valid if set by MikeNeilson, DanielO, CharlesG priority:high timeseries

Comments

@adamkorynta
Copy link
Collaborator

Trying to retrieve numerical time series results in the following error:

19-Apr-2024 11:10:00.788 WARNING [http-nio-7001-exec-8] cwms.cda.ApiServlet.lambda$init$16 error on request[6812722450334899286]: /cwms-data/timeseries
	org.jooq.exception.DataException: SQL [with "valid"("tscode", "tsid", "office_id", "loc_part", "units", "interval", "parm_part") as (select "validts"."tscode" "tscode", "validts"."tsid" "tsid", "validts"."office_id" "office_id", "CWMS_20"."CWMS_UTIL"."SPLIT_TEXT"("validts"."tsid", ?, ?, ?) "loc_part", "CWMS_20"."CWMS_UTIL"."GET_DEFAULT_UNITS"("CWMS_20"."CWMS_TS"."GET_BASE_PARAMETER_ID"("CWMS_20"."CWMS_TS"."GET_TS_CODE"(?, "CWMS_20"."CWMS_UTIL"."GET_DB_OFFICE_ID"(?))), ?) "units", "CWMS_20"."CWMS_TS"."GET_TS_INTERVAL"("validts"."tsid") "interval", upper("CWMS_20"."CWMS_UTIL"."SPLIT_TEXT"("validts"."tsid", ?, ?, ?)) "parm_part" from (select "CWMS_20"."CWMS_TS"."GET_TS_CODE"(?, "CWMS_20"."CWMS_UTIL"."GET_DB_OFFICE_ID"(?)) "tscode", "CWMS_20"."CWMS_TS"."GET_TS_ID"(?, "CWMS_20"."CWMS_UTIL"."GET_DB_OFFICE_ID"(?)) "tsid", "CWMS_20"."CWMS_UTIL"."GET_DB_OFFICE_ID"(?) "office_id" from DUAL) "validts") select "valid"."tsid" "NAME", "valid"."office_id" "office_id", "valid"."units" "units", "valid"."interval" "interval", "valid"."loc_part" "loc_part", "valid"."parm_part" "parm_part", case "valid"."parm_part" when ? then "CWMS_20"."CWMS_LOC"."GET_VERTICAL_DATUM_INFO_F"("valid"."loc_part", "valid"."units", "valid"."office_id") else ? end "VERTICAL_DATUM", (select count(*) from (select DATE_TIME "DATE_TIME", VALUE "VALUE", QUALITY_CODE "QUALITY_CODE" from table(cwms_20.cwms_ts.retrieve_ts_out_tab("valid"."tsid","valid"."units",cwms_20.cwms_util.to_timestamp(?),cwms_20.cwms_util.to_timestamp(?),'UTC',?,?,?,?,?,cwms_20.cwms_util.to_timestamp(?),?,"valid"."office_id") ) retrieveTsTotal) "alias_43918362") "TOTAL", "CWMS_20"."AV_CWMS_TS_ID2"."INTERVAL_UTC_OFFSET", "CWMS_20"."AV_CWMS_TS_ID2"."TIME_ZONE_ID" from "valid" left outer join "CWMS_20"."AV_CWMS_TS_ID2" on ("CWMS_20"."AV_CWMS_TS_ID2"."DB_OFFICE_ID" = "valid"."office_id" and "CWMS_20"."AV_CWMS_TS_ID2"."TS_CODE" = "valid"."tscode" and "CWMS_20"."AV_CWMS_TS_ID2"."ALIASED_ITEM" is null)]; ORA-01878: specified field not found in datetime or interval
ORA-06512: at "CWMS_20.CWMS_TS", line 3553
ORA-06512: at line 1

		at org.jooq_3.18.7.ORACLE18C.debug(Unknown Source)
		at org.jooq_3.18.7.ORACLE18C.debug(Unknown Source)
		at org.jooq.impl.Tools.translate(Tools.java:3466)
		at org.jooq.impl.Tools.translate(Tools.java:3458)
		at org.jooq.impl.DefaultExecuteContext.sqlException(DefaultExecuteContext.java:802)
		at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:360)
		at org.jooq.impl.AbstractResultQuery.fetchLazy(AbstractResultQuery.java:301)
		at org.jooq.impl.AbstractResultQuery.fetchLazyNonAutoClosing(AbstractResultQuery.java:322)
		at org.jooq.impl.SelectImpl.fetchLazyNonAutoClosing(SelectImpl.java:2862)
		at org.jooq.impl.ResultQueryTrait.fetchOne(ResultQueryTrait.java:509)
		at org.jooq.impl.ResultQueryTrait.fetchOne(ResultQueryTrait.java:514)
		at cwms.cda.data.dao.TimeSeriesDaoImpl.getTimeseries(TimeSeriesDaoImpl.java:335)
		at cwms.cda.api.TimeSeriesController.getAll(TimeSeriesController.java:457)
@rma-rripken
Copy link
Collaborator

I wish it told you what the name of "specified field" was when it can't find it.

@MikeNeilson MikeNeilson added priority:medium We care, and will try and get to it soon. approved-W192HQ23F0232-task4 Only valid if set by MikeNeilson, DanielO, CharlesG labels May 7, 2024
@MikeNeilson
Copy link
Contributor

@adamkorynta
FYI, this is failing also:

curl -X 'GET' \
  'https://<host>/nwdm-data/timeseries?name=STON-Tailwater.Elev.Inst.1Day.0.Best-NWK&office=NWDM&unit=ft&begin=2024-03-06T00%3A00%3A00-0000&end=2024-03-15T00%3A00%3A00-0000&trim=true' \
  -H 'accept: application/json;version=2'

Which the following:

ORA-20023: INVALID_T_F_FLAG: "%1" - Must be either T or F.
ORA-06512: at "CWMS_20.CWMS_ERR", line 80
ORA-06512: at "CWMS_20.CWMS_UTIL", line 450
ORA-06512: at "CWMS_20.CWMS_TS", line 3194
ORA-06512: at "CWMS_20.CWMS_TS", line 3441
ORA-06512: at "CWMS_20.CWMS_TS", line 3491
ORA-06512: at "CWMS_20.CWMS_TS", line 3535
ORA-06512: at line 1

It's the latest released CDA + the HEC T7 24.x.y schema database.

@MikeNeilson MikeNeilson added priority:high and removed priority:medium We care, and will try and get to it soon. labels May 13, 2024
@rma-rripken
Copy link
Collaborator

rma-rripken commented May 13, 2024

I feel like something changed, either in jooq or the way we have it configured. We used to be able to pass null to those t/f args and the pl/sql default would get called. Now we have to pass a valid value. We saw that when finishing up the version work too.

@MikeNeilson
Copy link
Contributor

@rma-rripken or @adamkorynta Let me know if you want work on this, if not I need to do it, I think someone keeps pressuring Kevin to just randomly test CDA from CWMSVue/CAVI without regard to actually you know, checking to see if any updates have been made.

@rma-rripken rma-rripken self-assigned this May 20, 2024
@rma-rripken
Copy link
Collaborator

I'm on it, at least for a day or two

@rma-rripken
Copy link
Collaborator

Do we know when this first started happening? I see there is a 24.04.05 schema installer on nexus. It seems like that would trigger the problem but it doesn't from what I can see so far.

@MikeNeilson
Copy link
Contributor

It was with 24.04.05 on the HEC T7, on an older CDA, it's entirely plausible we fixed it already and I just need to make a new release.

So with develop there was no issue?

@adamkorynta
Copy link
Collaborator Author

I only started testing CDA against the mainline schema for the forecast updates, so I don't know any history beyond that. Good to know 24.04.05 isn't throwing the error though!

@rma-rripken
Copy link
Collaborator

I ran the TimeseriesControllerTestIT with registry.hecdev.net/cwms/schema_installer:latest-dev and then again with registry.hecdev.net/cwms/schema_installer:24.04.05. Neither had a problem so I'm not sure the issue has been fixed or if my test isn't in-depth enough. I'm going to look into that INVALID_T_F_FLAG error message. That is probably still present and if I can reproduce that then something isn't right in my env

@MikeNeilson
Copy link
Contributor

Interesting. This is the request that caused it (coming through hec-cwms-data-access, though I doubt that's the culprit:

https://<host>:8243/nwdm-data/timeseries?unit=SI&page-size=5000&timezone=UTC&name=3000C_1563.Precip.Inst.1Hour.0.Raw-DenverAlert&end=2024-05-02T08%3A00%3A00Z&office=NWDM&begin=2020-01-02T08%3A00%3A00Z

Did just notice the "unit=SI" but I thought we already covered that case, and pretty sure hec-cwms-data-access always use the application/json;version=2

@rma-rripken
Copy link
Collaborator

I'm able to duplicate the error. Even though you pointed it out I didn't appreciate the unit=SI thing until I was looking at my test case in more detail. I don't think that is the cause but I'm not sure if that should be an error or not. The parameter says it takes {EN, SI, Other} . I don't really know what Other does. Do users actually pass "Other" or is that suppose to mean that users can pass what they want? When I pass "SI" to the timeseries end-point I get back data in "in" so that part doesn't seem to be working. Was "Other" from legacy v1 pl/sql or is that a feature that is still supposed to work?

@MikeNeilson
Copy link
Contributor

Other would be the specific unit. So you can say: SI, EN , which would give you the default SI or EN units (for example meters or feet for elevation) or if today I decided I wanted a big number, I could say "given me this elevation in inches.

CWMSVue (through hec-cwms-data-access) is doing the reasonable thing of just using the unit family, so it does need to work that way, but an individual query can arbitrarily choose there units, the database will throw an error if the conversion doesn't make any sense (like requesting CFS for an elevation.)

@rma-rripken
Copy link
Collaborator

The issue duplicates if the begin/end dates span a daylight savings switch over. That cryptic message "specified field not found in datetime or interval" is an Oracle message that I think means something is asking for a datetime that isn't valid. I need to dig a little more to figure out exactly where in our code that is happening.

@JeremyDKellett
Copy link

JeremyDKellett commented May 21, 2024

I never understood the 'other' bit either. The view (cwms_v_loc, etc.) has unit_system which is tied to an API call for default display units of the office signed in. The use case is to support output like Storage in kaf for offices like NWD-MR stem dams vs ac-ft for offices like NAE.

Can the 'other' be just dropped ? Or is the underlying query tied to the dqu view and thus "other" could be a possible unit_id in there?

@rma-rripken
Copy link
Collaborator

I've made a new issue for updating the documentation and handling the EN/SI units part of this

@rma-rripken rma-rripken added approved-W912HQ22F0193 Approval label for the named contract. Only valid for contract billing if set by MikeNeilson. timeseries and removed approved-W912HQ22F0193 Approval label for the named contract. Only valid for contract billing if set by MikeNeilson. labels May 21, 2024
@rma-rripken
Copy link
Collaborator

Its this query in CWMS_TS

l_local_time_zone := cwms_loc.get_local_timezone(l_location_code);
l_query_str :=
'select cast(from_tz(cast(t.date_time as timestamp), ''UTC'') at time zone '':tz'' as :date_time_type) "DATE_TIME",
case
when value is nan then null
else value + :l_value_offset
end "VALUE",
cwms_ts.normalize_quality(nvl(quality_code, :unscreened)) "QUALITY_CODE"
from (
select date_time,
value,
quality_code
from av_tsv_dqu
where ts_code = :ts_code
and unit_id = :units
and start_date <= :reg_l_end
and end_date > :l_reg_start_time
and version_date = :version
and aliased_item is null
) v
right outer join
(select column_value as date_time
from table(cwms_ts.get_lrts_times_utc(:reg_start, :reg_end, :interval, :local_timezone))
) t
on v.date_time = t.date_time
order by t.date_time asc';
replace_strings;
cwms_util.check_dynamic_sql(l_query_str);

@rma-rripken
Copy link
Collaborator

The pl/sql above is not the cause. With some help from @rma-psmorris I was able to dig into it a little deeper. I think my IDE had stale pl/sql from the previous schema. Once I refreshed things the error stopped pointing there.

The error points at this line of code: https://bitbucket.hecdev.net/projects/CWMS/repos/cwms_database/browse/schema/src/cwms/cwms_ts_pkg_body.sql?at=refs%2Fheads%2Frelease%2F24.04.05#3556

"FETCH query_cursor INTO output_row"
query_cursor comes from "retrieve_ts_out" and output_row is a zts_rec_t

The cursor returned by retrieve_ts_out is the return value of a call to retrieve_ts_f.

 open l_crsr for
         select case p_date_time_type
                when 'DATE' then
                   nvl(q1.date_time, q2.date_time)
                when 'TIMESTAMP' then
                   cast(nvl(q1.date_time, q2.date_time) as timestamp)
                when 'TIMESTAMP WITH TIME ZONE' then
                   from_tz(cast(nvl(q1.date_time, q2.date_time) as timestamp), l_date_range.time_zone)
                end as date_time,
                value,
                nvl(quality_code, 0) as quality_code
            from (select date_time,
                         value,
                         quality_code
                    from table(l_ts_retrieved)
                 ) q1
                 full outer join
                 (select column_value as date_time
                    from table(l_reg_ts_times)
                 ) q2 on q2.date_time = q1.date_time
         order by 1;
      return l_crsr;
   end retrieve_ts_f;

@rma-rripken
Copy link
Collaborator

Running this query in Intellij SQL Console doesn't reproduce the error but running it in SQL Developer does.

declare
    l_cwms_ts_id_out varchar2(200);
    l_units_out      varchar2(200);
    l_time_zone_id   varchar2(200);
    l_start_time     date;
    l_end_time       date;
    l_crsr           sys_refcursor;
    l_dates          cwms_t_date_table;
    l_values         cwms_t_double_tab;
    l_quality_codes  cwms_t_number_tab;
begin
    l_start_time := cwms_20.cwms_util.to_timestamp(1615190400000);
    l_end_time := cwms_20.cwms_util.to_timestamp(1615795200000);
    --l_start_time := cwms_20.cwms_util.to_timestamp(1612771200000);
    --l_end_time := cwms_20.cwms_util.to_timestamp(1615190400000);

    dbms_output.put_line('Start time: ' || TO_CHAR(l_start_time, 'YYYY-MM-DD HH24:MI:SS'));
    dbms_output.put_line('End time: ' || TO_CHAR(l_end_time, 'YYYY-MM-DD HH24:MI:SS'));


    l_crsr := cwms_ts.retrieve_ts_f(l_cwms_ts_id_out, l_units_out, l_time_zone_id,
                                    '3000C_1563.Precip.Inst.1Hour.0.Raw-DenverAlert',
                                    l_start_time,
                                    l_end_time,
                                    'UTC',
                                    'DATE',
                                    'm',
                                    'SI',
                                    'F',
                                    'T',
                                    'T',
                                    'F',
                                    'F',
                                    null,
                                    'F',
                                    'SPK');
    dbms_output.put_line('l_cwms_ts_id_out: ' || l_cwms_ts_id_out);
    dbms_output.put_line('l_units_out: ' || l_units_out);
    dbms_output.put_line('l_time_zone_id: ' || l_time_zone_id);

    fetch l_crsr
        bulk collect
        into l_dates,
        l_values,
        l_quality_codes;
    for i in 1 .. l_values.count
        loop
            dbms_output.put_line('Value at index ' || i || ': ' || l_values(i));
        end loop;
    close l_crsr;

    dbms_output.put_line('done ');
end;
/

@rma-rripken
Copy link
Collaborator

Setting sessiontimezone to UTC fixes the error.

SET TIME ON
SELECT SYSDATE FROM DUAL;
SET ECHO ON
SET SERVEROUTPUT ON
BEGIN DBMS_OUTPUT.ENABLE(100000); END;
/
-- Check the database time zone
SELECT dbtimezone FROM dual;

-- Check the session time zone
SELECT sessiontimezone FROM dual;

-- Set the session time zone to UTC
ALTER SESSION SET TIME_ZONE = 'UTC';

-- Verify the session time zone is set correctly
SELECT sessiontimezone FROM dual;

-- Check the system time zone
SELECT systimestamp AT TIME ZONE 'UTC' FROM dual;

@MikeNeilson
Copy link
Contributor

I was about to say "ugh" but that actually makes sense. The database is in UTC and the host should be. But so should the session so maybe the connections aren't properly defaulting to such?

@rma-rripken
Copy link
Collaborator

What part of the schema changes to 24.04.05 made setting the sessiontimezone necessary? In that retreive_ts_f code I posted above we're in the "DATE" case btw. So the cursor is doing "nvl(q1.date_time, q2.date_time)" I haven't debugged into things further to figure out the exact types of q1.date_time or q2.date_time. Even though we pass "UTC" into retrieve_ts_f something isn't using it and is falling back on the sessiontimezone.

Is this just a CDA thing? Or something specific to using the packaged Docker container?

@rma-rripken
Copy link
Collaborator

We could add some sql to the connection pool config that sets the sessiontimezone on the connection before a connection is loaned out from the pool.

@MikeNeilson
Copy link
Contributor

Quite possibly the LRTS work as Adam just found a test that files because of a missing location. But always setting the session TZ to a known value is probably is a good idea, OpenDCS behaves that say and it's not a slow call.

@MikeNeilson
Copy link
Contributor

"FETCH query_cursor INTO output_row"
query_cursor comes from "retrieve_ts_out" and output_row is a zts_rec_t

Well, Mike P did just do a giant rewrite to that procedure for both LRTS and performance.

@rma-rripken
Copy link
Collaborator

@MikeNeilson Should I dig further into into zretrieve and try and find the bad cast? Or is this not a bug and just a mis-configuration? Must sessiontimezone always be utc? If it isn't then the other error messages aren't actually an error and I'm just using it wrong?

@MikeNeilson
Copy link
Contributor

Sessiontimezone should always be UTC, otherwise I think it messes with the date math we use.

Did just that fix it or is there just another issue? (sorry, I think I mixed things up within this particular issue.)

@rma-rripken
Copy link
Collaborator

I have a test case that fails with that same error message. If I set sessiontimezone then the test case passes with no code changes.

@MikeNeilson
Copy link
Contributor

They I say we say it's the session and that code is now just part of the connection creation and handing out.

@rma-rripken rma-rripken removed their assignment May 29, 2024
@rma-rripken
Copy link
Collaborator

#663 should have fixed this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved-W192HQ23F0232-task4 Only valid if set by MikeNeilson, DanielO, CharlesG priority:high timeseries
Projects
Development

No branches or pull requests

4 participants