Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[new]: is_table_row_number_anomalous(fully_qualified_table_name) #31

Open
2 tasks done
unytics opened this issue Dec 7, 2022 · 2 comments
Open
2 tasks done

[new]: is_table_row_number_anomalous(fully_qualified_table_name) #31

unytics opened this issue Dec 7, 2022 · 2 comments
Labels
new-bigfunction Suggest a New BigFunction

Comments

@unytics
Copy link
Owner

unytics commented Dec 7, 2022

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Get table row count for the last 7 days (using time travelling feature of BigQuery) and check that current row count is not unusual compared to yesterday row count and regarding the evolution of the 7 latest days

Examples of (arguments, expected output) as they would appear in the documentation

one_public_table_name --> false

@unytics unytics added the new-bigfunction Suggest a New BigFunction label Dec 7, 2022
@shivam221098
Copy link
Contributor

I can't understand clearly. Does that mean comparing today's row count with yesterday's and today's row count with the day before yesterday and so on up to 7 days latest days?

For example - if today is 28 April 2023

then the comparison will be

Today previous is_equal
28-April-2023 27-April-2023 true
28-April-2023 26-April-2023 true
28-April-2023 25-April-2023 true
28-April-2023 24-April-2023 false
28-April-2023 23-April-2023 true
28-April-2023 22-April-2023 true
28-April-2023 21-April-2023 true

Is this need to be done or just have to compare it with 7 days old table? and output as true or false.

@unytics
Copy link
Owner Author

unytics commented Apr 28, 2023

I was thinking on:

  • getting table row count for today using INFORMATION_SCHEMA
  • getting table row count for latest 7 days using INFORMATION_SCHEMA and time travel
  • calling prophet bigfunction to forecast the row count of today given the row counts of the latest 7 days
  • return an anomaly score given the distance from the prediction and the real value. The score computation must be clever.

We can also avoid prophet and use a business rule.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new-bigfunction Suggest a New BigFunction
Projects
None yet
Development

No branches or pull requests

2 participants