Collect data on performance improvements to the dashboard #145

Open
shankari opened this issue Oct 20, 2024 · 61 comments · Fixed by #150

Comments

@shankari
Contributor

Plan discussed at meeting

We will use the following environments:

  • stage
  • ccebikes (large: 86k trips, 108 users)
  • smart commute (medium, long-term: 18k trips, 21 users)
  • STM community (small: 735 trips, 4 users)

We will access the environments at the following times:

Steps to perform:

  • Log in aka "cold boot" of the dashboard (initializes to "Overview")
  • Data page, UUIDs table, Trip table, Trajectories table
  • Maps page with no filters

We will collect data for one week for the baseline and one week for each change.
We will start with reverting the previous batching changes so that we can assess the impact before moving on.

Timeline:

@shankari
Contributor Author

To clarify, when I talk about "new PR for reversion", I expect the following:

  1. git revert b9b0c347a633a1f44bf697b4cd50e720a279ac09
  2. commit (generate SHA1)
  3. add timing (generate SHA2)
  4. we determine baseline
  5. then, we do git revert SHA2; commit
  6. git revert SHA1; commit
  7. then we can test the new code

@JGreenlee
Contributor

  1. master has been rolled back to Sep 2, before @TeachMeTW started working on the dashboard
  2. what used to be master is now called future
  3. @TeachMeTW is going to add timing to the new master. Once that is done, it is considered the "baseline".
  4. We will push the baseline to staging

We aim to have the baseline on staging and tested tonight. If good, we'll promote to the 3 prod environments we are testing on, and I will perform the first access time @ 8am MT tomorrow.

If there is any delay, we will push this back to 8am MT Wednesday.

@shankari
Contributor Author

I now see the float error again; we need to cherry-pick that change

ERROR:app_sidebar_collapsible:Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/dash/dash.py", line 1310, in dispatch
    ctx.run(
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/dash/_callback.py", line 442, in add_context
    output_value = func(*func_args, **func_kwargs)  # %% callback invoked %%
  File "/usr/src/app/app_sidebar_collapsible.py", line 318, in update_store_trips
    df, user_input_cols = query_confirmed_trips(start_date, end_date, timezone)
  File "/usr/src/app/utils/db_utils.py", line 183, in query_confirmed_trips
    df["data.primary_ble_sensed_mode"] = df.ble_sensed_summary.apply(get_max_mode_from_summary)
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/series.py", line 4771, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py", line 1123, in apply
    return self.apply_standard()
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py", line 1174, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2924, in pandas._libs.lib.map_infer
  File "/usr/src/app/utils/db_utils.py", line 179, in <lambda>
    get_max_mode_from_summary = lambda md: max(md["distance"], key=md["distance"].get) if len(md["distance"]) > 0 else "INVALID"
TypeError: 'float' object is not subscriptable
  • Trips trend does not show up
  • Trips table is still blank
  • Contrary to @TeachMeTW's experience, the UUID table is not blank
Screenshot 2024-10-21 at 9 22 39 PM
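
The `TypeError` above is consistent with some rows of `ble_sensed_summary` holding a float (for example, pandas `NaN` for trips without a summary) instead of a dict. A minimal sketch of a defensive version of the lambda follows. It is an assumption about the cause, not necessarily the change that was cherry-picked, and the sample rows are hypothetical.

```python
import pandas as pd

def get_max_mode_from_summary(md):
    """Return the mode with the largest distance, or "INVALID" if unavailable."""
    # Assumption: rows without a BLE summary hold NaN (a float), not a dict.
    if not isinstance(md, dict):
        return "INVALID"
    distances = md.get("distance")
    if not isinstance(distances, dict) or len(distances) == 0:
        return "INVALID"
    return max(distances, key=distances.get)

# Hypothetical rows: one valid summary, one NaN, one empty summary.
summaries = pd.Series([
    {"distance": {"CAR": 1200.0, "E_BIKE": 3400.0}},
    float("nan"),
    {"distance": {}},
])
primary_modes = summaries.apply(get_max_mode_from_summary)
```

The `isinstance` check runs before any subscripting, so a float row falls through to "INVALID" instead of raising.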

@shankari
Contributor Author

Re-opening since this is not actually done.

@shankari shankari reopened this Oct 23, 2024
@shankari
Contributor Author

While collecting data at 4pm MT (3pm PT) today, I ran into some blank tables, likely due to timeouts.

e.g.
Screenshot 2024-10-23 at 3 33 44 PM

Screenshot 2024-10-23 at 3 37 06 PM Screenshot 2024-10-23 at 3 38 02 PM
Object { message: "Callback error updating tabs-content.children", html: "<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body>\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n" }
[dash_renderer.v2_15_0m1729614367.min.js:2:95440](https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js)
    _o https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    ui https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    r https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    r https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    p https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    nt https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    zi https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
Screenshot 2024-10-23 at 3 39 02 PM

Not sure if we are recording that anywhere or can record it given that it is an error in the plotly framework

Screenshot 2024-10-23 at 3 39 35 PM
Object { message: "Callback error updating card-active-users.children", html: "<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body>\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n" }
[dash_renderer.v2_15_0m1729614367.min.js:2:95440](https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js)
    _o https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    ui https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    r https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    r https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    p https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    nt https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
    zi https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2

@TeachMeTW
Contributor

TeachMeTW commented Oct 23, 2024

Unlike @shankari's findings, STM, Ebike, Smart Commute, and Stage all loaded fine for me.

Tested: 4 PM MT - 4:47 PM MT

Load order speeds [STM, Smart Commute, Stage, ccebikes]

  • Initial Load
  • From sidebar click "Data"
  • Click "Trips" tab
  • Click "Demographics" tab
  • Click "Trajectories" tab
  • From sidebar click "Map"
  • Select "Density Heatmap"
  • Select "Unlabeled"

Everything functioned normally for me

@JGreenlee
Contributor

As I reported on Teams this morning:

On ccebikes, UUIDs and Trajectories both timed out after several minutes and failed to display

This held true for the 12pm and 4pm accesses as well. No timeouts on smart commute or stm community.

@shankari
Contributor Author

@JGreenlee what about on stage? Are you testing that periodically as well?

@JGreenlee
Contributor

I tested Stage just now. UUIDs eventually loaded. Trips, Trajectories, and Heatmap were empty – I don't think it's due to a timeout, I think it is due to a lack of data in the last week

Starting tomorrow I'll access Stage along with the others.

@TeachMeTW
Contributor

8 PM MT:

  • CCEbike did not load active users box
  • Smart Commute did not load uuids
  • Stage did not load uuids or trajectories or trips (might need to add date change to this test?)

@TeachMeTW
Contributor

12 AM MT:

  • CCEbike did not load active users box
  • Smart Commute now loads uuids
  • CCEbike long load times in trajectories
  • Stage loads uuids
  • Stage no trips or trajectories

@JGreenlee
Contributor

> On ccebikes, UUIDs and Trajectories both timed out after several minutes and failed to display
>
> This held true for the 12pm and 4pm accesses as well. No timeouts on smart commute or stm community.

> I tested Stage just now. UUIDs eventually loaded. Trips, Trajectories, and Heatmap were empty – I don't think it's due to a timeout, I think it is due to a lack of data in the last week
>
> Starting tomorrow I'll access Stage along with the others.

8am MT was consistent with my findings from yesterday.

The active users box is indeed missing on ccebikes; it was likely missing yesterday as well but I didn't notice.

@JGreenlee
Contributor

I'm going to make sure my staging phones are on and connected to Wi-Fi today so that we have at least a couple trips on stage to populate Trips, Trajectories and the map

@TeachMeTW
Contributor

Tested: 4 PM MT

  • Home page functioned normally for all
  • CCEbike UUIDs did NOT load
  • Stage still no trips or trajectories; seems not added yet
  • CCEbike slow on trajectories

@TeachMeTW
Contributor

@shankari I discussed the analysis with @JGreenlee and he suggested this workflow. What are your thoughts?

I recommend performing exploratory analysis + viz before removing outliers, so that you know the data you're working with and can see and identify the outliers.
 
You will have dozens of fine-grained measures for execution time of small tasks. I think you should try to extract a couple higher-level measures that are specifically related to the changes you made.
I suggest something like i) total time to load all figures on the overview and ii) time to load the UUIDs table
 
Then generate descriptives + charts for each high-level measure ("basic stuff" like mean execution time by week and dataset size).
You could generate these for the fine-grained measures too, for exploratory purposes, but they probably won't make it into the paper.
 
If I remember my stats, the main analysis you want to perform is probably a two-way repeated measures ANOVA. We are mostly interested in 2 IVs:
  • The week (week 1 = baseline; week 2 = first batch of improvements; week 3 = both batches of improvements)
  • Dataset size (small = stm community, medium = smart commute, large = ccebikes)
The DV is the execution time for the high-level tasks.
 
This will allow you to test for:
  • Main effect of week (Did performance change significantly as we rolled out improvements?)
  • Main effect of dataset size (Does performance significantly degrade with the size of the dataset?)
  • Interaction effect (Does the performance change differently depending on the size of the dataset? Highlight this for scalability.)

I don't see what a correlation matrix would be useful for.
I don't see a need for modeling or ML unless (i) you think the time of day has a significant effect that you want to model or (ii) you want a regression model to predict how the system would scale to even larger programs (larger than ccebikes). But I suggest trying to keep your analysis fairly small-scale.
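
As a rough illustration of the "descriptives by week and dataset size" step, here is a minimal pandas sketch. The column names (`week`, `dataset`, `uuids_load_s`) and all numbers are made up for illustration; they are not measurements from this study.

```python
import pandas as pd

# Hypothetical timing records: one row per access, per environment, per week.
# The numbers are invented purely to show the shape of the analysis.
records = pd.DataFrame({
    "week": ["baseline", "baseline", "baseline", "baseline",
             "batching", "batching", "batching", "batching"],
    "dataset": ["small", "small", "large", "large",
                "small", "small", "large", "large"],
    "uuids_load_s": [2.0, 4.0, 300.0, 400.0, 1.0, 3.0, 40.0, 60.0],
})

# Mean execution time for each (dataset, week) cell.
cell_means = records.pivot_table(
    index="dataset", columns="week", values="uuids_load_s", aggfunc="mean"
)

# Change from baseline per dataset. Very different values across rows would
# hint at a week x dataset interaction worth testing formally.
improvement = cell_means["baseline"] - cell_means["batching"]
```

A table like `cell_means` covers the two main effects (read along rows and columns), and `improvement` gives a first look at the interaction before running any ANOVA.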

@shankari
Contributor Author

I was a little late starting today because I forgot (I should put a reminder in my calendar). I started at closer to 4:30pm MT.

@shankari
Contributor Author

shankari commented Oct 24, 2024

@TeachMeTW @JGreenlee is your mentor, so I would take his advice 😄
Seriously, if there is something that JGreenlee is unclear about and needs my help, I am happy to step in.
But otherwise, I trust Jack's judgement, and am happy to review results once they are ready and provide feedback that way.

BTW, I completely agree on "no ML", EDA != ML
I would suggest starting with basic data viz (e.g. box plots) even before using "fancy" stats like ANOVA.
Ideally, we could show the difference visually between weeks/datasets
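
A box plot is driven by five numbers per group (min, quartiles, max), so those can be checked in a table before plotting. A small sketch with invented readings; the column names are assumptions:

```python
import pandas as pd

# Hypothetical readings (seconds) for one high-level task in two weeks.
readings = pd.DataFrame({
    "week": ["week 1"] * 5 + ["week 2"] * 5,
    "total_time_s": [10.0, 20.0, 30.0, 40.0, 50.0, 1.0, 2.0, 3.0, 4.0, 5.0],
})

# Five-number summary per week: the values a box plot would draw.
box_stats = (
    readings.groupby("week")["total_time_s"]
    .quantile([0.0, 0.25, 0.5, 0.75, 1.0])
    .unstack()
)
box_stats.columns = ["min", "q1", "median", "q3", "max"]
```

If the boxes for two weeks do not overlap, the difference is visible without any formal test.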

@TeachMeTW
Contributor

8 PM MT:

  • Home Page loaded normally albeit slow on ccebikes
  • CCEbike did not load uuids
  • Still no trajectories or trips on stage

@shankari
Contributor Author

> Still no trajectories or trips on stage

Interesting. I saw one trip and its associated trajectory on stage (presumably from Jack's phone)

@TeachMeTW
Contributor

TeachMeTW commented Oct 25, 2024

12 AM MT:

  • Home page loaded normally
  • CCEbike did not load uuids
  • Still no trajectories or trips on stage (maybe it hasn't updated on my end?)

@JGreenlee
Contributor

I saw my trip on staging yesterday and this morning.
Behavior is still fairly consistent on my end, except this morning I noticed that Trajectories on ccebikes loaded successfully instead of timing out.

@JGreenlee
Contributor

> Seriously, if there is something that JGreenlee is unclear about and needs my help, I am happy to step in. But otherwise, I trust Jack's judgement, and am happy to review results once they are ready and provide feedback that way.

@shankari For the record, I had suggested that @TeachMeTW get your advice in addition to mine since most of my experience with study design + statistical analysis has been in a different domain

@TeachMeTW
Contributor

4 PM MT:

  • Only STM loaded uuids
  • Trajectories for CCEbike taking 15+ min/permanently loading

@shankari
Contributor Author

I forgot today too and am just loading now. @TeachMeTW did you do this at 4pm PT or 4pm MT?

@TeachMeTW
Contributor

@shankari I started at 3:15 PST, did not finish until 4 PST; mainly ccebike not finishing loading at all

@TeachMeTW
Contributor

TeachMeTW commented Oct 26, 2024

~ Fixing ~

@TeachMeTW
Contributor

TeachMeTW commented Oct 29, 2024

3 MT:

  • CCEbike did not load UUIDs
  • All are slow except STM

@TeachMeTW
Contributor

Late comment but, 8 MT:

  • All Loaded Normally

@TeachMeTW
Contributor

12 AM MT:

  • CCEbike no uuid

@TeachMeTW
Contributor

6 PM MT:
Tested Update on Staging

  • Loading updates work
  • Home Page normal
  • UUIDs loaded chunks normally
  • Tested table page retention: Works
  • Trajectories switcher worked albeit slow
  • Maps Work

@JGreenlee
Contributor

I started the next phase of data collection this morning, repeating the same steps as before. I observed batch loading of UUIDs working. I waited for all UUIDs to finish loading before I proceeded to the "Trips" tab.

No 'active minutes' on ccebikes

@JGreenlee
Contributor

JGreenlee commented Oct 31, 2024

On ccebikes, I found that the UUIDs tab still took a long time to initialize – even though it now shows only 10 users at first. Then it loads 10 users every batch at a constant rate, so it took a while for all 111 users to be shown.

@TeachMeTW
Contributor

@shankari @JGreenlee

Given the current setup where only 10 users are loaded initially and subsequent batches load at a steady rate (10 users per 24000 ms), would it be possible to improve testing speed by only loading a subset of batches? This would reduce overall wait time during testing, although we may lose some data points and miss potential issues that might appear in later batches. Do you think this trade-off would be acceptable, or is it important to retain the full load for thorough testing?
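
To put numbers on the trade-off, here is a back-of-the-envelope sketch of the minimum wait implied by 10 users per 24000 ms. It assumes the first batch is shown at initialization and every later batch arrives exactly one interval after the previous one, so it is a lower bound. The function name is hypothetical.

```python
import math

def min_wait_ms(n_users, batch_size=10, interval_ms=24000, max_batches=None):
    """Lower bound on the wait (ms) after the first batch is shown."""
    n_batches = math.ceil(n_users / batch_size)
    if max_batches is not None:
        # Optionally stop early, e.g. when testing only a subset of batches.
        n_batches = min(n_batches, max_batches)
    # The first batch needs no extra wait; each later batch waits one interval.
    return max(n_batches - 1, 0) * interval_ms

full_ms = min_wait_ms(111)                    # 12 batches -> 11 intervals
subset_ms = min_wait_ms(111, max_batches=3)   # only the first 3 batches
```

With the 111 users reported for ccebikes, the full load takes at least 264 seconds (about 4.4 minutes) per access, while stopping after three batches would take at least 48 seconds.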

@TeachMeTW
Contributor

TeachMeTW commented Oct 31, 2024

4 PM MT:

  • STM Community loaded normally
  • Smart Commute and CCEbike loaded batches much later than the expected 24000 ms, but they eventually loaded.
  • Trajectories and switching keys take a long time

Dynamic Interval Loading might be something to explore
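
One possible reading of "dynamic interval loading" is to derive the next polling interval from how long the previous batch actually took, rather than using a fixed 24000 ms. The sketch below is only an illustration of that idea. The function name, `min_ms`, and `headroom` are hypothetical and not part of the dashboard code.

```python
def next_interval_ms(last_batch_ms, min_ms=2000, headroom=1.5):
    """Pick the next polling interval from the duration of the last batch.

    Fast batches shrink the interval toward min_ms; slow batches stretch it
    so that a new request is not fired while the previous one is still running.
    """
    proposed = int(last_batch_ms * headroom)
    return max(min_ms, proposed)

fast = next_interval_ms(1000)    # small dataset: clamp to the minimum
slow = next_interval_ms(30000)   # large dataset: wait longer than 24000 ms
```

Small programs would then load quickly, while large ones would avoid stacking requests.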

@TeachMeTW
Contributor

9 PM MT:

  • Everything works, slightly faster than last time

@TeachMeTW
Contributor

4 AM MT:

  • Same as last entry

@TeachMeTW
Contributor

TeachMeTW commented Nov 9, 2024

@shankari @JGreenlee CCEbike Graphs -- for round 1 of changes (NOT BASELINE)
image

image

image

@TeachMeTW
Contributor

More graphs:
image
image

@shankari
Contributor Author

Results do seem good. But I would like some explanation and narrative around this.
Are these results only for staging, or also for other environments? What is the explanation for them, etc?
I would suggest filling out the results section of your report with the charts and the associated narrative.
The goal is not to make visually appealing charts; the goal is to make it easier for people to understand your results

@shankari
Contributor Author

shankari commented Nov 11, 2024

Methodology:

  • block diagram of admin dashboard
  • instrumented the various sections
  • logged in at some cadence; collected data for a week
  • made improvements
  • logged in at another cadence; collected data for another week
  • compared before and after

Implementation and results:

  • found that the majority of the time was spent in these sections
  • improvement 1
    • we realized that the issue was that we were loading all the data at the same time
    • started batching
    • issues with implementing batching
    • improvement results
  • improvement 2
    • now max time was spent in ....
    • can precompute and cache
    • issues with implementing precomputation and caching
    • improvement results (TBD)
  • summary and overall results

@TeachMeTW
Contributor

@shankari until stage is reuploaded here are my insights:

Function, Execution Time (seconds)

Lowest 80%

data.name,data.reading
admin/data/update_dropdowns_trips/determine_hidden_columns_and_label,1.947862946675444e-06
admin/data/populate_datatable/check_dataframe_type,2.046613198896547e-06
admin/data/update_store_trajectories/prepare_store_data,2.060361875919625e-06
update_card_users/calculate_number_of_users,3.4012816479948702e-06
admin/home/update_card_trips/calculate_number_of_trips,3.9885911499605635e-06
admin/data/update_sub_tab/retrieve_and_process_data,3.998180697332525e-06
admin/home/generate_plot_trips_trend/convert_iso_to_date_only,4.4479934529036935e-06
admin/db_utils/clean_location_data/total_time,3.3919877825276354e-05
admin/db_utils/query_uuids/logging_and_query_initiation,8.000350707152393e-05
admin/data/render_content/handle_demographics_tab,0.00013015080367040355
admin/db_utils/query_demographics/organize_survey_keys,0.00015084072795919687
admin/home/generate_card/generate_card_layout,0.00016447972178438623
admin/home/find_last_get/convert_to_uuid_objects,0.00023495605481522425
admin/db_utils/query_trajectories/convert_date_range_to_timestamps,0.00023877198247646448
admin/db_utils/query_confirmed_trips/convert_date_range_to_timestamps,0.0003273542747809275
admin/home/generate_barplot/update_layout_with_title,0.00037205597217397103
admin/home/compute_trips_trend/extract_date,0.0005048959373316789
admin/home/get_number_of_active_users/calculate_active_users,0.0006949753012685548
admin/db_utils/query_confirmed_trips/rename_columns,0.0006985895775609866
admin/home/update_card_active_users/convert_to_dataframe,0.0007020320802453965
admin/home/generate_plot_sign_up_trend/convert_to_dataframe,0.0007046631796926748
admin/data/update_sub_tab/filter_columns,0.0007352904097691256
admin/home/compute_sign_up_trend/convert_to_datetime,0.0011078711812857723
admin/home/compute_trips_trend/group_by_and_calculate_counts,0.0019007130519995649
admin/data/update_sub_tab/convert_to_dataframe,0.001956743379448906
admin/db_utils/query_confirmed_trips/process_binary_and_named_columns,0.0026572991253141524
admin/db_utils/query_uuids/dataframe_processing,0.0028467976734427793
admin/db_utils/add_user_stats/retrieve_user_profile_data,0.003226581826238717
admin/home/compute_sign_up_trend/group_by_and_calculate_counts,0.0032378398798039703
admin/home/generate_plot_trips_trend/convert_to_dataframe,0.0032821491783935294
admin/db_utils/df_to_filtered_records/filter_dataframe_if_needed,0.003408925289243042
admin/home/compute_trips_trend/convert_to_datetime,0.003680147491926935
admin/db_utils/query_confirmed_trips/process_coordinates_and_modes,0.005306827750419394
admin/db_utils/query_confirmed_trips/expand_and_filter_user_inputs,0.009418276858057305
admin/db_utils/query_uuids/convert_query_result_to_dataframe,0.010353045917858253
admin/home/generate_card/total_time,0.011176629554714
admin/data/update_dropdowns_trips/total_time,0.012006994434029496
admin/db_utils/query_demographics/create_dataframes,0.015817682580615968
admin/home/update_card_active_users/generate_active_users_card,0.01920611160023818
admin/db_utils/query_demographics/process_dataframes,0.019244120024051795
admin/data/render_content/handle_uuids_tab,0.019926290027797222
admin/db_utils/query_confirmed_trips/humanize_distance_and_duration,0.022384994969278132
admin/home/update_card_trips/generate_trips_card,0.022472899088667043
update_card_users/generate_user_card,0.025483410393239044
admin/home/compute_sign_up_trend/total_time,0.02933595221123246
admin/data/update_sub_tab/populate_datatable,0.0318839150521247
admin/data/render_content/total_time,0.03323825279854329
admin/home/compute_trips_trend/total_time,0.035378578287908684
admin/data/populate_datatable/create_datatable,0.03774038002808034
admin/home/generate_barplot/generate_barplot_with_data,0.039363626396305536
admin/home/generate_plot_sign_up_trend/compute_sign_up_trend,0.0400892436870786
admin/home/generate_plot_trips_trend/compute_trips_trend,0.044780456077680116
admin/home/update_card_trips/total_time,0.046317080631406105
admin/db_utils/df_to_filtered_records/convert_to_dict_of_records,0.04962066936440856
admin/db_utils/query_uuids/total_time,0.04976970487803555
admin/data/render_content/handle_trips_tab,0.0507602873720316
admin/home/generate_barplot/initialize_empty_barplot,0.051887884061879025
update_card_users/total_time,0.0561006364000381
admin/data/populate_datatable/total_time,0.05679313991567768
admin/db_utils/add_user_stats/get_last_trip_timestamp,0.06127944804549133
admin/db_utils/add_user_stats/get_first_trip_timestamp,0.06184159415772293
admin/db_utils/add_user_stats/count_labeled_trips,0.06342866654713583
admin/db_utils/df_to_filtered_records/total_time,0.07345906148523829
admin/data/update_sub_tab/total_time,0.0753064348988725
admin/db_utils/query_demographics/query_data,0.08909172358757392
admin/home/generate_plot_trips_trend/generate_barplot,0.12236394326804578
admin/home/generate_barplot/total_time,0.1288078651732151
admin/home/generate_plot_sign_up_trend/generate_barplot,0.1545202970718507
admin/db_utils/query_trajectories/add_mode_string,0.1633187119957438
admin/db_utils/query_demographics/total_time,0.17353401138483432
admin/data/render_content/prepare_final_dataframe_and_return,0.2047745715041297
admin/home/generate_plot_trips_trend/total_time,0.21455150090016262
admin/home/generate_plot_sign_up_trend/total_time,0.2335722325064364
admin/data/update_store_trajectories/filter_records,0.4412800445715402
admin/db_utils/query_trajectories/process_dataframe_columns,0.5254711637183209
admin/db_utils/add_user_stats/count_total_trips,0.8359979697176627
admin/db_utils/query_confirmed_trips/retrieve_aggregate_time_series,2.383731212208139
admin/db_utils/query_confirmed_trips/total_time,2.399663755555122
admin/db_utils/add_user_stats/process_user,2.4734303609223955
admin/db_utils/add_user_stats/get_last_server_call_timestamp,2.890578399274938

Top 20%

data.name,data.reading
admin/home/update_card_active_users/total_time,380.5705834423147
admin/home/update_card_active_users/calculate_active_users,380.51765034232466
admin/home/get_number_of_active_users/total_time,380.50897508538844
admin/home/get_number_of_active_users/find_last_get_entries,380.49126240247415
admin/home/find_last_get/total_time,380.48344891932277
admin/home/find_last_get/query_timeseries_db,380.4540818235763
admin/data/render_content/handle_trajectories_tab,364.7788547482385
admin/data/update_store_trajectories/total_time,364.5321865298467
admin/data/update_store_trajectories/query_trajectories,364.06441769682766
admin/db_utils/query_trajectories/total_time,364.0559167172793
admin/data/render_content/total_time,352.24918736368954
admin/data/render_content/handle_uuids_tab,350.33841586602665
admin/db_utils/add_user_stats/total_time,350.3015870457756
admin/db_utils/query_trajectories/retrieve_entries,213.97728912887717
admin/db_utils/query_trajectories/convert_entries_to_dataframe,149.3421039039905
admin/db_utils/add_user_stats/process_batches,44.96146263926428
admin/db_utils/add_user_stats/processing_loop_stage,44.95224149511222
admin/db_utils/add_user_stats/get_last_trip_timestamp,44.94357279768172
admin/db_utils/add_user_stats/get_last_server_call_timestamp,23.636255698262932
admin/db_utils/add_user_stats/process_user,20.8464001387978
admin/db_utils/query_confirmed_trips/retrieve_aggregate_time_series,8.48705214093778
admin/db_utils/query_confirmed_trips/total_time,8.38841827082712
admin/db_utils/add_user_stats/count_total_trips,6.84013815707495

@TeachMeTW
Contributor

Function-wise:

Bottom 80%:

function,data.reading
add_user_stats,350.3015870457756
clean_location_data,3.3919877825276354e-05
compute_sign_up_trend,0.02933595221123246
compute_trips_trend,0.035378578287908684
df_to_filtered_records,0.07345906148523829
generate_barplot,0.1288078651732151
generate_card,0.011176629554714
generate_plot_sign_up_trend,0.2335722325064364
generate_plot_trips_trend,0.21455150090016262
populate_datatable,0.05679313991567768
query_confirmed_trips,4.53850465386655
query_demographics,0.17353401138483432
query_uuids,0.04976970487803555
render_content,332.586646515266
update_card_trips,0.046317080631406105
update_dropdowns_trips,0.012006994434029496
update_sub_tab,0.0753064348988725

Top 20%:

function,data.reading
find_last_get,380.48344891932277
get_number_of_active_users,380.50897508538844
query_trajectories,364.0559167172793
update_card_active_users,380.5705834423147
update_store_trajectories,364.5321865298467
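
For readers wondering how a per-function table like this can be derived from the `data.name` paths, here is a minimal sketch. It assumes the function is the second-to-last path component of each `.../total_time` entry and that repeated readings are averaged. The rows are a handful copied from the tables above, and the sketch does not attempt to reproduce the exact figures.

```python
import pandas as pd

# A few readings (seconds) copied from the tables above.
timings = pd.DataFrame({
    "data.name": [
        "admin/db_utils/add_user_stats/total_time",
        "admin/db_utils/query_uuids/total_time",
        "admin/db_utils/query_uuids/dataframe_processing",
        "admin/db_utils/query_confirmed_trips/total_time",
        "admin/db_utils/query_confirmed_trips/total_time",
    ],
    "data.reading": [
        350.3015870457756,
        0.04976970487803555,
        0.0028467976734427793,
        2.399663755555122,
        8.38841827082712,
    ],
})

# Keep only the per-function totals, then label each row with its function.
totals = timings[timings["data.name"].str.endswith("/total_time")].copy()
totals["function"] = totals["data.name"].str.split("/").str[-2]

# Assumption: repeated readings for the same function are averaged.
df_total_time_grouped = (
    totals.groupby("function", as_index=False)["data.reading"].mean()
)
```

Sub-step rows such as `dataframe_processing` are dropped, since each function's `total_time` already covers them.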

@shankari
Contributor Author

@TeachMeTW I suggested removing the bottom 80% of stats for the pipeline changes, not the admin dashboard. The admin dashboard is only launched when a user logs in, so the additional data storage is minimal. The pipeline runs every hour, so the additional data storage is substantial.

I am not opposed to (eventually) dropping the bottom 80% of timing stats in the admin dashboard, but that is not blocking the push to production of the pipeline timing changes.

@TeachMeTW
Contributor

@shankari please reupload the staging data snapshot; the one from Tuesday became corrupted

@shankari
Contributor Author

I also see add_user_stats,350.3015870457756 in the bottom 80% which looks wrong. What does the number there represent?

@TeachMeTW
Contributor

TeachMeTW commented Nov 15, 2024

> I also see add_user_stats,350.3015870457756 in the bottom 80% which looks wrong. What does the number there represent?

That number is the execution time in seconds. Maybe there weren't enough functions; this is what I did:

    # Compute the 80th percentile threshold
    threshold_80 = df_total_time_grouped['data.reading'].quantile(0.8)

    # Split into top 20% and bottom 80%
    df_top_20_total_time = df_total_time_grouped[df_total_time_grouped['data.reading'] > threshold_80]
    df_bottom_80_total_time = df_total_time_grouped[df_total_time_grouped['data.reading'] <= threshold_80]
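
Note that `quantile(0.8)` splits by the number of functions, not by how slow they are: it returns the value below which 80% of the functions fall. With the 22 per-function totals listed earlier (rounded below), the threshold lands between the 5th and 6th slowest functions, which would explain why `add_user_stats` at about 350 s ends up in the bottom 80%. The sketch below reproduces that. The 1-second absolute cutoff is an arbitrary assumption, shown only as one alternative.

```python
import pandas as pd

# Per-function totals from the "Function-wise" tables above, rounded (seconds).
readings = {
    "add_user_stats": 350.3016, "clean_location_data": 3.39e-05,
    "compute_sign_up_trend": 0.0293, "compute_trips_trend": 0.0354,
    "df_to_filtered_records": 0.0735, "generate_barplot": 0.1288,
    "generate_card": 0.0112, "generate_plot_sign_up_trend": 0.2336,
    "generate_plot_trips_trend": 0.2146, "populate_datatable": 0.0568,
    "query_confirmed_trips": 4.5385, "query_demographics": 0.1735,
    "query_uuids": 0.0498, "render_content": 332.5866,
    "update_card_trips": 0.0463, "update_dropdowns_trips": 0.0120,
    "update_sub_tab": 0.0753, "find_last_get": 380.4834,
    "get_number_of_active_users": 380.5090, "query_trajectories": 364.0559,
    "update_card_active_users": 380.5706, "update_store_trajectories": 364.5322,
}
df = pd.DataFrame({"function": list(readings),
                   "data.reading": list(readings.values())})

# Rank-based split, as in the snippet above.
threshold_80 = df["data.reading"].quantile(0.8)
df_top_20 = df[df["data.reading"] > threshold_80]

# Alternative (assumption): an absolute cutoff, e.g. anything over 1 second.
SLOW_CUTOFF_S = 1.0
df_slow = df[df["data.reading"] > SLOW_CUTOFF_S]
```

Here the rank-based split keeps 5 functions, while the absolute cutoff keeps 8, including `add_user_stats`, `render_content` and `query_confirmed_trips`.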

@TeachMeTW
Contributor

As for pipeline, these are the values (Not the most recent):

Bottom 80:

data.name,data.reading
ACCURACY_FILTERING,0.02578527937728916
CLEAN_RESAMPLING,2.146863155281558
CREATE_COMPOSITE_OBJECTS,1.7069802132425573
CREATE_CONFIRMED_OBJECTS,1.4718458523202962
EXPECTATION_POPULATION,0.088950674887729
GENERATE_STORE_AND_RANGE,0.0036104017272961075
JUMP_SMOOTHING,1.0057648629329274
LABEL_INFERENCE,0.18408866121307846
STORE_USER_STATS,0.3237537619999994
TRIP_SEGMENTATION/check_out_of_order_points,0.00028674999999989126
TRIP_SEGMENTATION/create_dist_filter,2.9649600000247746e-06
TRIP_SEGMENTATION/create_places_and_trips/create_new_place,0.00014887752303130057
TRIP_SEGMENTATION/create_places_and_trips/create_raw_trip,0.00015851584249614424
TRIP_SEGMENTATION/create_places_and_trips/get_last_place_entry,0.002671166571432642
TRIP_SEGMENTATION/create_places_and_trips/get_time_series,8.577142859828818e-06
TRIP_SEGMENTATION/create_places_and_trips/handle_untracked_period,0.0034603348518523224
TRIP_SEGMENTATION/create_places_and_trips/insert_last_place,0.0006448212857141604
TRIP_SEGMENTATION/create_places_and_trips/link_and_save,0.002157715270430808
TRIP_SEGMENTATION/create_places_and_trips/start_new_chain,0.0002463035714241256
TRIP_SEGMENTATION/create_time_filter,2.7318399999920473e-06
TRIP_SEGMENTATION/fetch_location_data,0.18090795849999997
TRIP_SEGMENTATION/get_filters_in_df,0.047898718239999916
TRIP_SEGMENTATION/get_time_range,0.0019015625000001757
TRIP_SEGMENTATION/get_time_range_for_segmentation,0.001661970040000007
TRIP_SEGMENTATION/get_time_series,1.3946793103429986e-05
TRIP_SEGMENTATION/handle_out_of_order_points,0.0008165149199999533
TRIP_SEGMENTATION/identify_active_filters,0.0005030420000000646
TRIP_SEGMENTATION/initialize_filters,4.625000000091362e-06
TRIP_SEGMENTATION/mark_segmentation_done,0.0014093750000014893
TRIP_SEGMENTATION/segment_into_trips/append_segmentation,0.0014314497241378363
TRIP_SEGMENTATION/segment_into_trips/calculations_per_iteration,0.005296046663291943
TRIP_SEGMENTATION/segment_into_trips/check_transitions_post_loop,0.0022789169999981596
TRIP_SEGMENTATION/segment_into_trips/continue_just_ended,0.0002355626958475048
TRIP_SEGMENTATION/segment_into_trips/filter_bogus_points,0.025727010249998017
TRIP_SEGMENTATION/segment_into_trips/find_last_valid_point,0.00013377017130620743
TRIP_SEGMENTATION/segment_into_trips/get_filtered_location,0.21980716699999991
TRIP_SEGMENTATION/segment_into_trips/get_last_trip_end_point,0.00016301416981153078
TRIP_SEGMENTATION/segment_into_trips/handle_final_trip_end,0.0024963339999999334
TRIP_SEGMENTATION/segment_into_trips/handle_trip_end,0.0003509445943395742
TRIP_SEGMENTATION/segment_into_trips/has_trip_ended,0.004189222965080378
TRIP_SEGMENTATION/segment_into_trips/initialize_last_ts_processed,1.8749999988187938e-06
TRIP_SEGMENTATION/segment_into_trips/mark_valid,0.0003279579999997395
TRIP_SEGMENTATION/segment_into_trips/post_loop,0.008723180666670771
TRIP_SEGMENTATION/segment_into_trips/process_row,0.004088960585417935
TRIP_SEGMENTATION/segment_into_trips/select_last10Points,3.91455750994308e-05
TRIP_SEGMENTATION/segment_into_trips/set_new_trip_start,0.00039244493103457136
TRIP_SEGMENTATION/segment_into_trips/set_new_trip_start_point,8.748888886442627e-07
TRIP_SEGMENTATION/segment_into_trips/set_valid_column,0.00031091600000010544
TRIP_SEGMENTATION/segment_into_trips_dist/append_segmentation,0.0013334490765433572
TRIP_SEGMENTATION/segment_into_trips_dist/find_last_valid_point,0.0001416281774779063
TRIP_SEGMENTATION/segment_into_trips_dist/get_transition_df,1.6218878335
TRIP_SEGMENTATION/segment_into_trips_dist/handle_final_trip_end,0.002880708499998441
TRIP_SEGMENTATION/segment_into_trips_dist/has_trip_ended,0.0006933613848144605
TRIP_SEGMENTATION/segment_into_trips_dist/process_row,0.003653419676263156
TRIP_SEGMENTATION/segment_into_trips_dist/select_last10Points,3.9679244972060673e-05
TRIP_SEGMENTATION/segment_into_trips_dist/set_new_trip_start,0.0003628507185185189
TRIP_SEGMENTATION/segment_into_trips_dist/set_valid_column,0.00034896849999971336
TRIP_SEGMENTATION/segment_into_trips_time/calculate_last10PointsDistances,0.0007601339955688976
TRIP_SEGMENTATION/segment_into_trips_time/calculate_last5MinTimes,0.0014257610955149005
TRIP_SEGMENTATION/segment_into_trips_time/calculate_last5MinsDistances,0.0016315710044803179
TRIP_SEGMENTATION/segment_into_trips_time/calculate_last5MinsPoints,0.0007967894185909164
TRIP_SEGMENTATION/segment_into_trips_time/filter_bogus_points,0.026154527666664745
TRIP_SEGMENTATION/segment_into_trips_time/find_last_valid_point,0.0002195419999999615
TRIP_SEGMENTATION/segment_into_trips_time/get_last_trip_end_point,0.0004832671014494053
TRIP_SEGMENTATION/segment_into_trips_time/has_trip_ended,0.006613862309123931
TRIP_SEGMENTATION/segment_into_trips_time/process_row,0.0001331676384539252
TRIP_SEGMENTATION/segment_into_trips_time/select_last10Points,4.438657852513701e-05
TRIP_SEGMENTATION/segment_into_trips_time/set_new_trip_start_before,7.739584210796873e-05
TRIP_SEGMENTATION/segment_into_trips_time/set_new_trip_start_else,2.1675398940299194e-06
TRIP_SEGMENTATION/segment_into_trips_time/set_valid_column,0.00032020799999976646
TRIP_SEGMENTATION/setup_filter_methods,2.6244999997704355e-06
USERCACHE,0.3603736498283744
USER_INPUT_MATCH_INCOMING,0.9195849730486005

Top 20%:

data.name,data.reading
MODE_INFERENCE,3.666587927518007
OUTPUT_GEN,24.278678695494197
SECTION_SEGMENTATION,5.911851689464339
TRIP_SEGMENTATION,42.91766042598784
TRIP_SEGMENTATION/create_places_and_trips,9.011653595800025
TRIP_SEGMENTATION/create_places_and_trips/loop_segmentation_points,6.382551351285717
TRIP_SEGMENTATION/get_data_df,14.46887485496
TRIP_SEGMENTATION/process_segmentation_points,3.9579038750000013
TRIP_SEGMENTATION/segment_into_trips,534.1603671744546
TRIP_SEGMENTATION/segment_into_trips/get_filtered_points_df,6.29462725
TRIP_SEGMENTATION/segment_into_trips/get_filtered_points_pre_ts_diff_df,22.4045843335
TRIP_SEGMENTATION/segment_into_trips/get_transition_df,4.042007972000001
TRIP_SEGMENTATION/segment_into_trips/loop,1799.1103744306668
TRIP_SEGMENTATION/segment_into_trips/loop_over_points,15.94234375
TRIP_SEGMENTATION/segment_into_trips_dist/get_filtered_location,2.6661164687500003
TRIP_SEGMENTATION/segment_into_trips_dist/loop_over_points,89.206788323
TRIP_SEGMENTATION/segment_into_trips_time/get_filtered_location,18.28722388525
TRIP_SEGMENTATION/segment_into_trips_time/get_transition_df,4.271482604250001
TRIP_SEGMENTATION/segment_single_filter,10.568774542
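
For reference, a split like the one above could be reproduced with a short script. This is a minimal sketch, assuming the readings are averaged per `data.name` and the names are then ranked by that mean. The function name `split_top_fraction` is hypothetical, and the aggregation actually used for these lists is not stated in this thread.

```python
from collections import defaultdict
from statistics import mean


def split_top_fraction(rows, top_fraction=0.2):
    """Split (name, reading) rows into (top, rest), ranked by mean reading per name."""
    by_name = defaultdict(list)
    for name, reading in rows:
        by_name[name].append(float(reading))
    # Rank names from slowest to fastest by their mean reading (assumed aggregate).
    ranked = sorted(
        ((name, mean(values)) for name, values in by_name.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    # Keep at least one entry in the top group.
    cutoff = max(1, round(len(ranked) * top_fraction))
    return ranked[:cutoff], ranked[cutoff:]


# A handful of rows copied from the lists above.
rows = [
    ("TRIP_SEGMENTATION/segment_into_trips/loop", 1799.1103744306668),
    ("TRIP_SEGMENTATION/segment_into_trips", 534.1603671744546),
    ("TRIP_SEGMENTATION/segment_into_trips_dist/get_transition_df", 1.6218878335),
    ("USER_INPUT_MATCH_INCOMING", 0.9195849730486005),
    ("USERCACHE", 0.3603736498283744),
]
top, rest = split_top_fraction(rows)
for name, reading in top:
    print(f"{name},{reading}")
```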

@shankari
Contributor Author

shankari commented Dec 22, 2024

After merging e-mission/e-mission-server#993 and e-mission/e-mission-server#1005 and #153, I logged in to staging.

@JGreenlee @TeachMeTW

And the initial "Overview" load was still pretty slow. But we were still loading the UUIDs in a batch when I went to the data table.

So presumably the "Overview" load didn't load all the UUIDs? Because if it did, we should just have cached and reused that data, right? So then what is the overview loading?

The functionality seems to be working correctly, and we can collect data next week, but I think that this can benefit from some cleanup.

@shankari
Contributor Author

shankari commented Dec 22, 2024

@JGreenlee @TeachMeTW as a suggestion, while cleaning up e-mission/e-mission-server#1005, you can also split the user stats into pipeline-dependent and pipeline-independent. The pipeline-dependent stats can stay as the last stage in the pipeline, and the pipeline-independent stats can be outside the early return. This will allow us to not waste time recomputing the pipeline-dependent stats when the pipeline wasn't run.

The current structure of run_intake_pipeline_for_user may make it hard to test the pipeline-independent code; feel free to refactor/restructure if that makes it cleaner.
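
The suggested split could look roughly like the sketch below. The signature, the `store` dict, and both `compute_*` helpers are hypothetical and are not taken from e-mission-server; only the idea (pipeline-independent stats outside the early return, pipeline-dependent stats as the last stage) comes from the comment above.

```python
def compute_pipeline_independent_stats(uuid):
    # Hypothetical placeholder for stats that can change without a pipeline run.
    return {"uuid": uuid, "kind": "pipeline-independent"}


def compute_pipeline_dependent_stats(uuid):
    # Hypothetical placeholder for stats derived from pipeline output.
    return {"uuid": uuid, "kind": "pipeline-dependent"}


def run_intake_pipeline_for_user(uuid, has_new_data, run_stages, store):
    """Sketch only: returns True if the pipeline stages were run."""
    # Pipeline-independent stats sit outside the early return,
    # so they are refreshed on every call.
    store["independent"] = compute_pipeline_independent_stats(uuid)

    if not has_new_data:
        # Early return: the pipeline-dependent stats are not recomputed.
        return False

    run_stages(uuid)
    # Pipeline-dependent stats stay as the last stage of the pipeline.
    store["dependent"] = compute_pipeline_dependent_stats(uuid)
    return True
```

Passing `run_stages` and `store` in as arguments is one way to make both code paths testable without running the full pipeline; other restructurings would work equally well.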

@shankari
Contributor Author

Deployed to the three production environments as well.
Happy testing and make sure to write down your testing time/results here.

@JGreenlee
Contributor

> And the initial "Overview" load was still pretty slow. But we were still loading the UUIDs in a batch when I went to the data table.
>
> So presumably the "Overview" load didn't load all the UUIDs? Because if it did, we should just have cached and reused that data, right? So then what is the overview loading?

I believe the Overview page does create the entire list of UUIDs.

On the Data page UUIDs table, we are loading stats for those UUIDs. We already have the full list of UUIDs by the time we get to the Data table, and we split that list into slices of 10 UUIDs and load the stats for one slice at a time.
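
The slicing described above can be sketched as follows. This is an illustration rather than the dashboard's actual code: `slices`, `load_stats_in_batches`, and `fake_load_stats` are hypothetical names, and the real stats lookup is replaced by a stand-in.

```python
def slices(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_stats_in_batches(uuids, load_stats_for, size=10):
    """Load stats one slice at a time, yielding the rows loaded so far."""
    loaded = []
    for batch in slices(uuids, size):
        loaded.extend(load_stats_for(batch))
        # A caller could re-render the table after each yield.
        yield list(loaded)


def fake_load_stats(batch):
    # Hypothetical stand-in for the per-UUID stats query.
    return [{"uuid": u, "total_trips": 0} for u in batch]


all_uuids = [f"uuid-{i}" for i in range(25)]
for rows_so_far in load_stats_in_batches(all_uuids, fake_load_stats):
    print(len(rows_so_far))
```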

@JGreenlee
Contributor

Accessed all 4 dashboards ~9am MT. Nothing appears significantly different than before.
The Overview is slow. UUIDs table batching works but each batch takes ~25 seconds.

The Trips and Trajectories tabs did not take as long as I remembered. Maybe it's simply because it's winter and there was less travel in the last week.

I didn't notice anything broken on any of the dashboards.

There are some recent trips on stage but no recent trips on stm-community.

@JGreenlee
Contributor

> UUIDs table batching works but each batch takes ~25 seconds.

This is likely happening faster computationally due to offloading the user stats computation to the pipeline, but we are not observing the effect because of the 20 second interval between batches.
I think we will need to look at stored stats/dashboard_time entries to see how much faster it is.
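
One way to make that comparison is to average the stored readings per timer name before and after the deployment time. This is a minimal sketch, assuming each entry has been flattened to a dict with `name`, `ts`, and `reading` keys; that shape, the function name, and the sample values are all made up for illustration and are not measured data.

```python
from collections import defaultdict
from statistics import mean


def mean_reading_by_period(entries, cutoff_ts):
    """Return {name: {"before": mean_or_None, "after": mean_or_None}}."""
    grouped = defaultdict(lambda: {"before": [], "after": []})
    for entry in entries:
        period = "before" if entry["ts"] < cutoff_ts else "after"
        grouped[entry["name"]][period].append(entry["reading"])
    return {
        name: {p: (mean(vals) if vals else None) for p, vals in periods.items()}
        for name, periods in grouped.items()
    }


# Illustrative values only; "example_timer" is not a real timer name.
entries = [
    {"name": "example_timer", "ts": 100, "reading": 4.0},
    {"name": "example_timer", "ts": 150, "reading": 6.0},
    {"name": "example_timer", "ts": 250, "reading": 2.0},
]
summary = mean_reading_by_period(entries, cutoff_ts=200)
print(summary)
```

Because the readings time the computation itself, a comparison like this is not affected by the interval between batches.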

@JGreenlee
Contributor

JGreenlee commented Jan 2, 2025

Loading the overview and trajectories seemed slower today across the board.
I also noticed that the "Active Users" box failed to load on ccebikes and smart-commute. I don't believe it has consistently been that way.

@TeachMeTW
Contributor

Same results for me as well

@TeachMeTW
Contributor

To add to this, today seems better for smart-commute, but ccebikes still fails on "Active Users".


3 participants