
🧹 ✈️ 👤 Pending cleanups #149

Open
1 of 4 tasks
Abby-Wheelis opened this issue Sep 7, 2024 · 14 comments

@Abby-Wheelis
Member

Abby-Wheelis commented Sep 7, 2024

Centralizing some cleanups that need to happen pending the unification of base mode colors:

  • finish unifying the labels
  • remove the csvs
  • filter by baseMode for "land trips" see this comment
  • consider displaying user-entered modes if they're common enough

At a high level, in a future fix, I think we should consider user labels if they represent a large enough proportion of the aggregate data. So if 40% of the data is "Other" and 90% of the "Other" is "Air", maybe we should just show "Air". But that is going to require a lot of judgement calls - what is "large enough proportion"? what if the text entered by users is subtly different? so let's punt on that for now.

user mode display comment
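
As a rough sketch of that idea (hypothetical names; the 0.9 threshold echoes the 90% example above, and none of this is the dashboard's actual implementation):

import pandas as pd

PROMOTE_THRESHOLD = 0.9  # share of "Other" that one user label must reach

def promote_common_user_label(modes: pd.Series, user_texts: pd.Series) -> pd.Series:
    # if one user-entered text dominates the "Other" trips, display it as
    # its own mode instead of lumping it into "Other"
    other_mask = modes == "Other"
    if not other_mask.any():
        return modes
    # fold trivially different entries together (case, stray spaces)
    normalized = user_texts[other_mask].str.strip().str.lower()
    shares = normalized.value_counts(normalize=True)
    if not shares.empty and shares.iloc[0] >= PROMOTE_THRESHOLD:
        top = shares.index[0]
        match = (normalized == top).reindex(modes.index, fill_value=False)
        promoted = modes.copy()
        promoted[match] = top.title()
        return promoted
    return modes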

@Abby-Wheelis
Member Author

what is "large enough proportion"?

My initial thought of what would be a "large enough" proportion of user-entered labels might be "if they're big enough to not get grouped into other anyways", which I suppose could result in a lot of bars that are just barely bigger than our cutoff (2%), so we would probably want to do some experimenting.

We could also use the cutoff for printing the numbers in the bars, which is currently 4%.
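
For concreteness, a minimal sketch of those cutoffs (hypothetical helper; 2% and 4% are the values mentioned above):

import pandas as pd

GROUP_CUTOFF = 0.02  # modes under 2% of trips fold into "Other"
PRINT_CUTOFF = 0.04  # bars under 4% don't get their number printed

def fold_small_modes(counts: pd.Series) -> pd.Series:
    # fold any mode whose share of trips is below GROUP_CUTOFF into "Other"
    shares = counts / counts.sum()
    small = shares < GROUP_CUTOFF
    folded = counts[~small].copy()
    folded["Other"] = folded.get("Other", 0) + counts[small].sum()
    return folded

counts = pd.Series({"Car": 500, "Walk": 300, "Bike": 30, "Skateboard": 3})
print(fold_small_modes(counts))  # "Skateboard" (~0.4%) folds into "Other"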

what if the text entered by users is subtly different?

We can easily take care of things like capitalization and spaces, but other factors like spelling or subtly different wording (roller skate vs roller blade) might be difficult to control for.
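
Since scaffolding.py already imports difflib, a hypothetical way to fold near-identical spellings together (the 0.8 cutoff is a guess, and as noted this can never be fully controlled):

import difflib

def canonicalize(label, seen):
    # handle the easy cases first: capitalization and extra spaces
    cleaned = " ".join(label.lower().split())
    # then reuse an already-seen spelling if it is close enough
    matches = difflib.get_close_matches(cleaned, seen, n=1, cutoff=0.8)
    if matches:
        return matches[0]
    seen.append(cleaned)
    return cleaned

seen = []
print([canonicalize(l, seen) for l in ["Roller Skate", "roller  skate", "roler skate", "roller blade"]])
# -> ['roller skate', 'roller skate', 'roller skate', 'roller blade']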

@Abby-Wheelis
Member Author

Additional cleanup required - the patches to prevent both "Other" and "OTHER" are messy.

@iantei
Contributor

iantei commented Sep 15, 2024

We do not have INVALID in e-mission-common base_modes.py's list of BASE_MODES. We will need to add INVALID to the list of BASE_MODES as we move to using BASE_MODES for mapping labels to the color palette.

@Abby-Wheelis
Member Author

We do not have INVALID in e-mission-common base_modes.py's list of BASE_MODES. We will need to add INVALID to the list of BASE_MODES as we move to using BASE_MODES for mapping labels to the color palette.

The patch added INVALID as a sensed mode, not as a base mode, so I'm not sure we need to add it to the base modes, unless we discover a need for an INVALID base mode as well as a sensed mode.

sensed_values = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN", "OTHER", "INVALID"]

@iantei
Contributor

iantei commented Sep 16, 2024

Yes, INVALID was added as a sensed mode. However, we are either mapping internal labels to the base modes, or using the base modes as-is with sensed_values. Therefore, we would need to add INVALID as a base mode.

colors_sensed = dict(zip(sensed_values, [BASE_MODES[x.upper()]['color'] for x in sensed_values]))
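
To illustrate the failure mode (stub data; the real BASE_MODES lives in e-mission-common, and these colors are placeholders):

# BASE_MODES[x.upper()] raises KeyError for any sensed value missing from
# the mapping, which is why "INVALID" breaks the line above. A guarded
# lookup keeps the chart rendering until the mapping is extended.
BASE_MODES = {
    "WALKING": {"color": "#1f77b4"},
    "BICYCLING": {"color": "#2ca02c"},
    "IN_VEHICLE": {"color": "#ff7f0e"},
    "AIR_OR_HSR": {"color": "#9467bd"},
    "UNKNOWN": {"color": "#7f7f7f"},
    "OTHER": {"color": "#8c564b"},
}
sensed_values = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN", "OTHER", "INVALID"]

colors_sensed = {
    mode: BASE_MODES.get(mode.upper(), {"color": "#000000"})["color"]
    for mode in sensed_values
}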

@Abby-Wheelis
Member Author

Additional cleanup coming out of #148 related to fleet and survey deployments:

  • consider combining survey_metrics and generic_metrics_sensed - there does not seem to be a reason to replace the sensed mode with the BLE sensed mode in two separate notebooks. I think we can add the new charts from survey_metrics to generic_metrics_sensed (since they do display sensed modes) and then not show them in the dashboard for label-based deployments, since those charts really just replace the ones in generic-metrics 148#discussion_r1766009942
  • stop notebooks that don't need to run for survey deployments from running, to avoid wasting time and resources (e.g. generic-metrics) 148#issuecomment-2360045768

@Abby-Wheelis
Member Author

More cleanup from #145

  • use server code to filter inferred confidence comment on #145
  • remove all use of "display modes" and just translate the modes when it's time to display them comment here
  • clean up labels as mentioned in this comment
  • investigate dropping NaN inferred values here

@iantei
Contributor

iantei commented Sep 25, 2024

  • Use server code to filter inferred confidence:

I am merging this for now, but note that there is a similar function in the server when we handle metrics to include trips with confidence > 90% as "confirmed". We do not and should not need to reinvent code.

https://github.com/e-mission/e-mission-server/blob/52adee205f686d87e167bd4b1d166098938870c6/emission/analysis/result/metrics/time_grouping.py#L134

Made some code changes:

ashrest2-41625s:em-public-dashboard ashrest2$ git diff viz_scripts/scaffolding.py 
diff --git a/viz_scripts/scaffolding.py b/viz_scripts/scaffolding.py
index 880c819..a20b13c 100644
--- a/viz_scripts/scaffolding.py
+++ b/viz_scripts/scaffolding.py
@@ -7,6 +7,7 @@ from collections import OrderedDict
 import difflib
 
 import emission.storage.timeseries.abstract_timeseries as esta
+import emission.storage.decorations.trip_queries as esdt
 import emission.storage.timeseries.tcquery as esttc
 import emission.core.wrapper.localdate as ecwl
 import emcommon.diary.base_modes as emcdb
@@ -239,8 +240,8 @@ async def load_viz_notebook_inferred_data(year, month, program, study_type, dyna
     # Access database
     tq = get_time_query(year, month)
     participant_ct_df = load_all_participant_trips(program, tq, include_test_users)
-    inferred_ct = filter_inferred_trips(participant_ct_df)
-    expanded_it = expand_inferredlabels(inferred_ct)
+    inferred_ct = esdt.has_final_labels_df(participant_ct_df)
+    expanded_it = esdt.expand_finallabels(inferred_ct)
     expanded_it = await map_trip_data(expanded_it, study_type, dynamic_labels, dic_re, dic_pur)

Result:

[screenshots comparing the old implementation and the server-code implementation]

There is a difference between my implementation of the inferred-labels trip bars and the one the server generates.

  1. First, the filtering logic:
  • Current:
    inferred_ct = mixed_trip_df[(mixed_trip_df['inferred_labels'].apply(lambda x: bool(x))) | (mixed_trip_df.user_input != {})]
    return inferred_ct
  • Server:
    def has_final_labels(confirmed_trip_data):
        return (confirmed_trip_data["user_input"] != {}
                or confirmed_trip_data["expectation"]["to_label"] == False)
  2. Second, the expand_finallabels logic:
  • Current:
    Check for user_inputs and add them to the list.
    If user_input == {}, check for inferred_labels whose max 'p' value is greater than confidence_threshold and add those to the list; otherwise add 'mode_confirm' = 'uncertain'.
    Convert this list into a data frame, append it to the original data frame, and filter out all the rows where mode_confirm == 'uncertain'.
  • Server:
    Filters trips with user_input into the list, then uses expectation.to_label to filter out the rest:
    high_confidence_no_userinput_df = labeled_ct[
        (labeled_ct.user_input == {}) & (expectation_expansion.to_label == False)
    ]

@shankari Do you recommend following the server approach for the Labeled and Inferred bars in the Stacked Bar chart?
I have used confidence_threshold to filter after confirming there is no user_input, which is different from the approach of using the trip's expectation column.

While executing the server approach, I encountered the issue below for cortezebikes, and will look into it further:

There are no ['analysis/confirmed_trip'] entries, and therefore no expectation column.

Details of the issue:

Running at 2024-09-25T04:56:02.857039+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-25T04:56:10.199455+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Traceback (most recent call last):
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 111, in <module>
    compute_for_date(month_year.month, month_year.year)
  File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 104, in compute_for_date
    nbclient.execute(new_nb)
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1314, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 165, in wrapped
    return loop.run_until_complete(inner)
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 709, in async_execute
    await self.async_execute_cell(
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 1062, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/nbclient/client.py", line 918, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
expanded_ct_inferred, file_suffix_inferred, quality_text_inferred, debug_df_inferred = await scaffolding.load_viz_notebook_inferred_data(year,
                                                                            month,
                                                                            program,
                                                                            study_type,
                                                                            dynamic_labels,
                                                                            dic_re,
                                                                            dic_pur=dic_pur,
                                                                            include_test_users=include_test_users)
------------------

----- stderr -----
DEBUG:root:In get_filter_query, returning query {'data.start_local_dt.year': {'$gte': 2023, '$lte': 2023}, 'data.start_local_dt.month': {'$gte': 11, '$lte': 11}}
----- stderr -----
DEBUG:root:curr_query = {'invalid': {'$exists': False}, '$or': [{'metadata.key': 'analysis/confirmed_trip'}], 'data.start_local_dt.year': {'$gte': 2023, '$lte': 2023}, 'data.start_local_dt.month': {'$gte': 11, '$lte': 11}}, sort_key = None
----- stderr -----
DEBUG:root:orig_ts_db_keys = [], analysis_ts_db_keys = ['analysis/confirmed_trip']
----- stderr -----
DEBUG:root:finished querying values for [], count = 0
----- stderr -----
DEBUG:root:In get_filter_query, returning query {'data.start_local_dt.year': {'$gte': 2023, '$lte': 2023}, 'data.start_local_dt.month': {'$gte': 11, '$lte': 11}}
----- stderr -----
DEBUG:root:finished querying values for ['analysis/confirmed_trip'], count = 0
----- stderr -----
DEBUG:root:orig_ts_db_matches = 0, analysis_ts_db_matches = 0
----- stderr -----
DEBUG:root:Found 0 results
----- stdout -----
Loaded all confirmed trips of length 0
------------------

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[7], line 1
----> 1 expanded_ct_inferred, file_suffix_inferred, quality_text_inferred, debug_df_inferred = await scaffolding.load_viz_notebook_inferred_data(year,
      2                                                                             month,
      3                                                                             program,
      4                                                                             study_type,
      5                                                                             dynamic_labels,
      6                                                                             dic_re,
      7                                                                             dic_pur=dic_pur,
      8                                                                             include_test_users=include_test_users)

File /usr/src/app/saved-notebooks/scaffolding.py:243, in load_viz_notebook_inferred_data(year, month, program, study_type, dynamic_labels, dic_re, dic_pur, include_test_users)
    241 tq = get_time_query(year, month)
    242 participant_ct_df = load_all_participant_trips(program, tq, include_test_users)
--> 243 inferred_ct = esdt.has_final_labels_df(participant_ct_df)
    244 print("Hellooooooooo")
    245 expanded_it = esdt.expand_finallabels(inferred_ct)

File /usr/src/app/emission/storage/decorations/trip_queries.py:343, in has_final_labels_df(df)
    340 def has_final_labels_df(df):
    341     # print(df.expectation)
    342     # print(pd.DataFrame(df.expectation.to_list(), index=df.index))
--> 343     to_list_series = pd.DataFrame(df.expectation.to_list(), index=df.index).to_label
    344     return df[(df.user_input != {})
    345             | (to_list_series == False)]

File ~/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/generic.py:5902, in NDFrame.__getattr__(self, name)
   5895 if (
   5896     name not in self._internal_names_set
   5897     and name not in self._metadata
   5898     and name not in self._accessors
   5899     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5900 ):
   5901     return self[name]
-> 5902 return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'expectation'

(emission) root@22f87f8030df:/usr/src/app/saved-notebooks# 

@iantei
Contributor

iantei commented Sep 27, 2024

Well, upon further investigation, I found the following:

userinput/expectations.py

# This is a placeholder. TODO: implement the real algorithm
def _get_expectation_for_trip(trip):
    raw_expectation = eace.get_expectation(trip)
    # For now, expect always labeling unless the config file specifies no labeling at all
    processed_expectation = not raw_expectation["type"] == "none"
    return {"to_label": processed_expectation}

Seemingly, the algorithm for expectation is not implemented yet, so it makes sense that it gives the same result for both the "Labeled" and "Labeled and Inferred Labels" bars.

@shankari
Contributor

@iantei

Seemingly, the algorithm for expectation is not implemented yet,

This is not correct. We do return a non-zero value, it is just not as sophisticated as we would like

While executing the server approach, I encountered the below issue for cortezebikes, will look more into it:

This is probably a data load error since if we have cleaned trips, we have confirmed trips

I have used confidence_threshold to filter out after knowing there's no user_input, that's different than the approach of using expectation column for the trip.

The functionality in the server code and your code is similar but not identical

As I said earlier

but note that there is a similar function in the server when we handle metrics to include trips with confidence > 90% as "confirmed"

The way in which the data is handled is similar but the filtering is different. You need to try handling the data in the same way but changing the filter at the end.
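
A minimal sketch of that suggestion, with hypothetical names (keep the server's dataframe handling, but swap the final expectation.to_label test for the confidence check):

import pandas as pd

CONFIDENCE_THRESHOLD = 0.9  # the 90% cutoff discussed in this thread

def max_p(inferred_labels):
    # highest p across a trip's inferred label tuples; [] -> 0
    return max((il.get("p", 0) for il in inferred_labels), default=0)

def has_confident_labels_df(df: pd.DataFrame) -> pd.DataFrame:
    # same shape as the server's has_final_labels_df, but the final filter
    # is confidence-based instead of expectation.to_label-based
    if len(df) == 0:
        return df
    confidence = df.inferred_labels.apply(max_p)
    return df[(df.user_input != {}) | (confidence >= CONFIDENCE_THRESHOLD)]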

@iantei
Contributor

iantei commented Sep 27, 2024

This is probably a data load error since if we have cleaned trips, we have confirmed trips

Yes, that's a data load error. I used the same has_final_labels_df(df) with an additional check at the top:

def has_final_labels_df(df):
    # guard: an empty dataframe has no 'expectation' column, so the
    # expansion below would raise AttributeError
    if len(df) == 0:
        return df
    to_list_series = pd.DataFrame(df.expectation.to_list(), index=df.index).to_label
    return df[(df.user_input != {})
            | (to_list_series == False)]

It worked fine.

@iantei
Contributor

iantei commented Sep 27, 2024

@shankari

  • Changed the confidence threshold to 90%.
  • Updated the filter at the end, with the use of dropna()

I have submitted PR #158 for review. Please let me know if you want anything different.

@shankari
Contributor

@iantei please discuss with @Abby-Wheelis when she returns on Monday. The 90% threshold is the filter. Changing the public dashboard to use the 90% threshold will not move the needle on inferred trips.

@Abby-Wheelis
Member Author

Observed in the dataset that all trips have expectation: {to_label: True}. This aligns with the comment in the server code, # For now, expect always labeling unless the config file specifies no labeling at all, under which all trips are expected to be labeled.

Observed in testExpandFinalLabelsPandasFunctionsNestedPostFilter that the sample data does not expect labels if the trip is already labeled by the user or if the p value of a set of inferred labels is >= 0.9. However, we have not found where in the code the expectation to label is set to false when the trip is labeled or a p value greater than 0.9 is present. Should we update the server code to change to_label to false once user input or reasonably confident inferred labels are present? Or did we overlook code where this already exists?
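
For illustration, a standalone sketch of what that flip could look like (field names follow this thread, not the actual server code):

def should_expect_label(trip, confidence_threshold=0.9):
    # no label expected once the user has already provided input...
    if trip.get("user_input", {}) != {}:
        return {"to_label": False}
    # ...or once some set of inferred labels is confident enough
    best_p = max((il.get("p", 0) for il in trip.get("inferred_labels", [])), default=0)
    return {"to_label": best_p < confidence_threshold}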

Found the relevant PR for expanding the labels here: #846
The commit which added the expectation to_label was made 3 years ago here

Using the confidence_threshold, as the phone does, and using the approach from the server seem like they should be separate approaches. Ideally we would use the server approach, but the hangup with expectation.to_label makes that complicated.
