Collect data on performance improvements to the dashboard #145
Comments
to clarify, when I talk about "new PR for reversion", I expect the following
|
We aim to have the baseline on staging and tested tonight. If good, we'll promote to the 3 prod environments we are testing on, and I will perform the first access time @ 8am MT tomorrow. If any delay, we will push this back to 8am MT Wednesday. |
I now see the float error again; we need to cherry-pick that change
|
Re-opening since this is not actually done. |
Unlike @shankari's findings, STM, Ebike, Smart Commute, and Stage all loaded fine for me.
Tested: 4 PM MT - 4:47 PM MT
Load order speeds: [STM, Smart Commute, Stage, ccebikes]
- Initial load
- From sidebar, click "Data"
- Click "Trips" tab
- Click "Demographics" tab
- Click "Trajectories" tab
- From sidebar, click "Map"
- Select "Density Heatmap"
- Select "Unlabeled"
Everything functioned normally for me. |
As I reported on Teams this morning:
This held true for the 12pm and 4pm accesses as well. No timeouts on smart commute or stm community. |
@JGreenlee what about on stage? Are you testing that periodically as well? |
I tested Stage just now. UUIDs eventually loaded. Trips, Trajectories, and Heatmap were empty – I don't think it's due to a timeout; I think it is due to a lack of data in the last week. Starting tomorrow, I'll access Stage along with the others. |
8 PM MT:
|
12 AM MT:
|
8am MT was consistent with my findings from yesterday. The active users box is indeed missing on ccebikes; it was likely missing yesterday as well but I didn't notice. |
I'm going to make sure my staging phones are on and connected to Wi-Fi today so that we have at least a couple trips on |
Tested: 4 PM MT
|
@shankari I discussed the analysis with @JGreenlee, and he suggested this workflow. What are your thoughts?
|
I was a little late starting today because I forgot (I should put a reminder in my calendar). I started at closer to 4:30pm MT. |
@TeachMeTW @JGreenlee is your mentor, so I would take his advice 😄 BTW, I completely agree on "no ML", EDA != ML |
8 PM MT:
|
Interesting. I saw one trip and its associated trajectory on stage (presumably from Jack's phone) |
12 AM MT:
|
I saw my trip on staging yesterday and this morning. |
@shankari For the record, I had suggested that @TeachMeTW get your advice in addition to mine since most of my experience with study design + statistical analysis has been in a different domain |
4 PM MT:
|
I forgot today too and am just loading now. @TeachMeTW did you do this at 4pm PT or 4pm MT? |
@shankari I started at 3:15 PM PST and did not finish until 4 PM PST, mainly because ccebikes did not finish loading at all. |
~ Fixing ~ |
3 MT:
|
Late comment, but 8 MT:
|
12 AM MT:
|
6 PM MT:
|
I started the next phase of data collection this morning, repeating the same steps as before. I observed batch loading of UUIDs working. I waited for all UUIDs to finish loading before I proceeded to the "Trips" tab. No 'active minutes' on ccebikes |
On ccebikes, I found that the UUIDs tab still took a long time to initialize – even though it now shows only 10 users at first. Then it loads 10 users every batch at a constant rate, so it took a while for all 111 users to be shown. |
Given the current setup where only 10 users are loaded initially and subsequent batches load at a steady rate (10 users per 24000 ms), would it be possible to improve testing speed by only loading a subset of batches? This would reduce overall wait time during testing, although we may lose some data points and miss potential issues that might appear in later batches. Do you think this trade-off would be acceptable, or is it important to retain the full load for thorough testing? |
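For a rough sense of the trade-off, here is a minimal sketch that estimates how long the UUIDs table takes to fill and how many users a truncated run would cover. It uses the figures quoted above (111 users, batches of 10, one batch every 24000 ms). The function names and the `max_batches` parameter are hypothetical, and it assumes the first batch appears immediately, which ignores the slow initialization noted earlier.

```python
import math


def batches_needed(total_users, batch_size=10):
    """Number of batches required to show every user."""
    return math.ceil(total_users / batch_size)


def time_to_finish_ms(total_users, batch_size=10, interval_ms=24000, max_batches=None):
    """Estimated wait until the last loaded batch is shown.

    Assumption: the first batch is shown at t=0 and every later batch
    arrives one interval after the previous one.
    """
    n = batches_needed(total_users, batch_size)
    if max_batches is not None:
        n = min(n, max_batches)
    return max(n - 1, 0) * interval_ms


def users_covered(total_users, batch_size=10, max_batches=None):
    """How many users are visible once loading stops."""
    n = batches_needed(total_users, batch_size)
    if max_batches is not None:
        n = min(n, max_batches)
    return min(n * batch_size, total_users)


if __name__ == "__main__":
    full = time_to_finish_ms(111)
    partial = time_to_finish_ms(111, max_batches=3)
    print(f"all batches: {full / 1000:.0f} s, {users_covered(111)} users")
    print(f"first 3 batches: {partial / 1000:.0f} s, {users_covered(111, max_batches=3)} users")
```

Under these assumptions, stopping after three batches cuts the wait from 264 s to 48 s but covers only 30 of the 111 users, so any issue that appears only in later batches would go unobserved.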
4 PM MT:
Dynamic Interval Loading might be something to explore |
9 PM MT:
|
4 AM MT:
|
@shankari @JGreenlee CCEbike Graphs -- for round 1 of changes (NOT BASELINE) |
Results do seem good. But I would like some explanation and narrative around this. |
Methodology:
Implementation and results:
|
@shankari until stage is reuploaded, here are my insights (Function, Execution Time in seconds):
Lowest 80%:
Top 20%
|
Function-wise:
Bottom 80%:
Top 20%:
|
@TeachMeTW I suggested removing the bottom 80% of stats for the pipeline changes not the admin dashboard. The admin dashboard is only launched when a user logs in, so the additional data storage is minimal. The pipeline runs every hour, so the additional data storage is substantial. I am not opposed to (eventually) dropping the bottom 80% of timing stats in the admin dashboard, but that is not blocking the push to production of the pipeline timing changes. |
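One way to make the top 20% / bottom 80% split concrete is sketched below. It is only an illustration: it assumes the split is by function count after ranking on execution time, and the function names and timings are placeholders, not values from this thread.

```python
import math


def split_top_fraction(timings, top_fraction=0.2):
    """Split {function_name: seconds} into (top, rest) by execution time.

    Assumption: "top 20%" means the slowest 20% of functions by count,
    rounded up so that at least one function is kept when any exist.
    """
    ranked = sorted(timings.items(), key=lambda kv: kv[1], reverse=True)
    n_top = math.ceil(len(ranked) * top_fraction) if ranked else 0
    return dict(ranked[:n_top]), dict(ranked[n_top:])


if __name__ == "__main__":
    # Placeholder timings in seconds, for illustration only.
    example = {"func_a": 12.0, "func_b": 0.4, "func_c": 3.1, "func_d": 0.05, "func_e": 0.9}
    top, rest = split_top_fraction(example)
    print("top 20%:", top)
    print("bottom 80%:", rest)
```

If the split is instead meant to be by share of total execution time, the ranking step stays the same and only the cutoff rule changes.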
@shankari please reupload the staging data snapshot; the one from Tuesday became corrupted. |
I also see |
That number is execution time in seconds; maybe there weren't enough functions. This is what I did:
|
As for the pipeline, these are the values (not the most recent):
Bottom 80%:
Top 20%:
|
After merging e-mission/e-mission-server#993 and e-mission/e-mission-server#1005 and #153, I logged in to staging. And the initial "Overview" load was still pretty slow. But we were still loading the UUIDs in a batch when I went to the data table. So presumably the "Overview" load didn't load all the UUIDs? Because if it did, we should just have cached and reused that data, right? So then what is the overview loading? The functionality seems to be working correctly, and we can collect data next week, but I think that this can benefit from some cleanup. |
@JGreenlee @TeachMeTW as a suggestion, while cleaning up e-mission/e-mission-server#1005, you can also split the user stats into pipeline-dependent and pipeline-independent. The pipeline-dependent stats can stay as the last stage in the pipeline, and the pipeline-independent stats can be outside the early return. This will allow us to not waste time recomputing the pipeline-dependent stats when the pipeline wasn't run. The current structure of |
Deployed to the three production environments as well. |
I believe the Overview page does create the entire list of UUIDs. On the Data page UUIDs table, we are loading stats for those UUIDs. We already have the full list of UUIDs by the time we get to the Data table, and we split that list into slices of 10 UUIDs and load the stats for one slice at a time. |
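A minimal sketch of the slicing described above, assuming the full UUID list is already in memory. `load_stats` is a hypothetical callable standing in for whatever the dashboard uses to fetch stats for a group of users; it is not the actual dashboard function.

```python
def slices(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_stats_in_batches(uuids, load_stats, size=10):
    """Load stats one slice at a time, yielding the cumulative rows so far.

    `load_stats` is a hypothetical callable: it takes a list of UUIDs
    and returns one stats row per UUID.
    """
    loaded = []
    for batch in slices(uuids, size):
        loaded.extend(load_stats(batch))
        # Yield a copy so callers can render the table after each batch.
        yield list(loaded)


if __name__ == "__main__":
    uuids = [f"user-{i}" for i in range(25)]
    fake_loader = lambda batch: [{"uuid": u} for u in batch]
    for rows in load_stats_in_batches(uuids, fake_loader):
        print(f"{len(rows)} of {len(uuids)} users loaded")
```

Because the list itself is already available, only the per-user stats are fetched incrementally; the table can show partial results after every slice.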
Accessed all 4 dashboards ~9am MT. Nothing appears significantly different than before. The Trips and Trajectories tabs did not take as long as I remembered. Maybe it's simply because it's winter and there is less travel in the last week. I didn't notice anything broken on any of the dashboards. There are some recent trips on |
This is likely happening faster computationally due to offloading the user stats computation to the pipeline, but we are not observing the effect because of the 20 second interval between batches. |
Loading the overview and trajectories seemed slower today across the board. |
Same results for me as well |
To add to this, today seems better for Smart Commute, but ccebikes still fails on active users.
Plan discussed at meeting
We will use the following environments:
We will access the environments at the following times:
Steps to perform:
We will collect data for one week for the baseline and one week for each change.
We will start with reverting the previous batching changes so that we can assess the impact before moving on.
Timeline: