You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We've been experiencing some occasional "Could not find element X in row" exceptions when mapping over queries. In each case where this happens, it's implausible that the column would be missing from that query.
The majority of queries are free from this issue, so it's been difficult to reproduce (the issue vanished after my first attempt at logging with four separate log statements, be it a coincidence or due to skewing a race condition).
At @penland365's suggestion, I added logging for result.completedCommand, result.columns and result.rows. I replaced calls to pgClient.query with a call to this method:
I've since been able to identify specific queries where this exceptions occurs, along with the associated result contents. Here's one request that resulted in an error. I've formatted it a bit and anonymized (e.g., the rows were on a single line but I've split it across lines for readability).
One such error is roc.postgresql.failures$ElementNotFoundFailure: Could not find element 'report_count in Row and the corresponding request and result is this:
- timestamp: 2016-06-29T02:17:24.471Z
- request: Request(
select * from question_reports where
topic_slug = '_f8507801-d9d8-4863-a6ac-499fd35efdaf' and lifecycle = 'new'
and report_type = 'inappropriate')
- result
- completedCommand: SELECT 0
- columns: [Lscala.Symbol;@460039e4
- rows: []
(Unfortunately, it looks like something is wrong with how I'm printing the columns.)
In the following query, ~900ms later, it appears that the result from the previous query is returned:
- timestamp: 2016-06-29T02:18:25.344Z
- request: Request(
SELECT COUNT(player_id) AS report_count
FROM player_reports WHERE player_id = '22222222')
- result
- completedCommand: SELECT 1
- columns: [Lscala.Symbol;@35247c44
- rows: [
Text('topic_slug,1043,_f8507801-d9d8-4863-a6ac-499fd35efdaf)
Text('reporter_id,1043,11111111)
Text('language,1043,en)
Text('reason,1043,other)
Text('message,1043,being a meanie)
Text('lifecycle,1043,new)Text('created,1114,2016-05-30 17:12:41.816803)
]
This is an internal, low-volume service, so the next query ~1 minute later was an INSERT query whose result had completedCommand as "SELECT 0" and no rows).
~2 minutes later, we finally get the response from the count query in the result of an INSERT query:
- timestamp: 2016-06-29T02:20:50.026Z
- req: Request(
insert into topic_reports
(topic_slug, reporter_id, reason, language, message)
values (
'_5c1068a1-5780-45ff-a7c6-01ee7e3887b8',
'333333333',
'boring_topic',
'en',
'I do not care for this topic'
))
- result:
- completedCommand: SELECT 1
- columns: [Lscala.Symbol;@6caec085
- res.rows: [Text('report_count,20,9)]
Looking through the logs, this kind of mixup appears to happen quite a bit. In many cases it's benign since we're not processing the result (e.g. for insert queries).
If you have any hunches about what might be causing this, I might be able to contribute a fix.
The text was updated successfully, but these errors were encountered:
We've been experiencing some occasional "Could not find element X in row" exceptions when mapping over queries. In each case where this happens, it's implausible that the column would be missing from that query.
The majority of queries are free from this issue, so it's been difficult to reproduce (the issue vanished after my first attempt at logging with four separate log statements, be it a coincidence or due to skewing a race condition).
At @penland365's suggestion, I added logging for
result.completedCommand
,result.columns
andresult.rows
. I replaced calls topgClient.query
with a call to this method:I've since been able to identify specific queries where this exceptions occurs, along with the associated result contents. Here's one request that resulted in an error. I've formatted it a bit and anonymized (e.g., the rows were on a single line but I've split it across lines for readability).
One such error is
roc.postgresql.failures$ElementNotFoundFailure: Could not find element 'report_count in Row
and the corresponding request and result is this:(Unfortunately, it looks like something is wrong with how I'm printing the columns.)
In the following query, ~900ms later, it appears that the result from the previous query is returned:
This is an internal, low-volume service, so the next query ~1 minute later was an INSERT query whose result had
completedCommand
as "SELECT 0" and no rows).~2 minutes later, we finally get the response from the count query in the result of an INSERT query:
Looking through the logs, this kind of mixup appears to happen quite a bit. In many cases it's benign since we're not processing the result (e.g. for insert queries).
If you have any hunches about what might be causing this, I might be able to contribute a fix.
The text was updated successfully, but these errors were encountered: