Feature: Query authors and respondents #18 #30

joeflack4 · 2022-08-15T20:20:46Z

Updates

    Feature: Query authors and respondents #18
    - Add: New function implementing basic feature: create_report_users_and_roles()
    - Add: Documentation for feature to README.md

    Misc
    - Add: Codebook section at bottom of README.md documentation.
    - Add: Comment link to user roles GoogleSheet.
    - Update: Renamed 'report1' and 'report2' variable and function names to be more descriptive.
    - Update: Reorganized run()
    - Update: Fixed an incorrect type.
    - Update: .gitignore: Added *.pickle

joeflack4 · 2022-08-16T01:23:02Z

README.md

+#### `zulip_report2_thread_lengths.csv`
+TODO
+
+#### `zulip_report3_users.csv`


Added a "codebook" section to README.md. So far it only covers the most recent report; the rest of the outputs still need codebook entries. See: #31

joeflack4 · 2022-08-16T01:23:39Z

fhir_zulip_nlp/fhir_zulip_nlp.py

@@ -9,6 +9,8 @@
  3. The Zulip chat we're querying: https://chat.fhir.org/#
  4. Category keywords google sheet:
     https://docs.google.com/spreadsheets/d/1OB0CEAkOhVTN71uIhzCo_iNaiD1B6qLqL7uwil5O22Q/edit#gid=1136391153
+  5. User roles google sheet:


@rohaher Just FYI, Davera and I decided to open up a new tab on the google sheet, where she'll put "user" -> "HL7 organization role" mappings.

joeflack4 · 2022-08-16T01:24:19Z

fhir_zulip_nlp/fhir_zulip_nlp.py

@@ -46,11 +48,15 @@
    'zuliprc_path': os.path.join(ENV_DIR, '.zuliprc'),  # rc = "runtime config"
    'chat_stream_name': 'terminology',
    'num_messages_per_query': 1000,
-    'outpath_report1': os.path.join(PROJECT_DIR, 'zulip_report1_counts.csv'),
-    'outpath_report2': os.path.join(PROJECT_DIR, 'zulip_report2_thread_lengths.csv'),
+    'outpath_user_info': os.path.join(PROJECT_DIR, 'zulip_user_info.csv'),


As we add more and more outputs, I should probably think harder about how they are named and organized.

joeflack4 · 2022-08-16T01:25:42Z

fhir_zulip_nlp/fhir_zulip_nlp.py

    return df_report


+def create_report_users(df: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]:


@rohaher I just wanted to share with you the new function I made to handle this feature. I feel like it is really repetitive, though; not very proud of it. Due for a refactor at some point.
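One possible direction for that refactor, as a minimal sketch rather than the actual implementation: drive the repeated aggregation blocks from a single mapping of level name to grouping columns. The column names (`stream`, `category`, `keyword`, `user`, `role`), the `LEVELS` mapping, the sample rows, and `aggregate_participation()` are all hypothetical and are not taken from fhir_zulip_nlp.py.

```python
from typing import Dict, List

import pandas as pd

# Hypothetical aggregation levels: report name -> columns to group by.
LEVELS: Dict[str, List[str]] = {
    "keyword": ["stream", "category", "keyword"],
    "category": ["stream", "category"],
    "stream": ["stream"],
}


def aggregate_participation(
    df: pd.DataFrame, levels: Dict[str, List[str]] = LEVELS
) -> Dict[str, pd.DataFrame]:
    """Count distinct users per role at each aggregation level."""
    reports = {}
    for name, cols in levels.items():
        reports[name] = df.groupby(cols + ["role"], as_index=False).agg(
            n_users=("user", "nunique")
        )
    return reports


# Hypothetical input: one row per (user, role) participation record.
df = pd.DataFrame({
    "stream": ["terminology"] * 4,
    "category": ["code systems", "code systems", "mapping", "mapping"],
    "keyword": ["loinc", "snomed", "conceptmap", "conceptmap"],
    "user": ["ann", "bob", "ann", "cat"],
    "role": ["author", "respondent", "author", "respondent"],
})

reports = aggregate_participation(df)
print(reports["stream"])
```

Adding a new level would then mean adding one entry to `LEVELS` instead of copying another block.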

joeflack4 · 2022-08-16T01:26:20Z

fhir_zulip_nlp/fhir_zulip_nlp.py

+    user_participation_df.to_csv(CONFIG['outpath_raw_results_user_participation'], index=False)
+
+    # TODO: I really don't like how repetitive this is; even worse than previous block
+    # TODO: Have aggregated to keyword, category, and stream, but not to role agnostic of stream. Would be useful to add


Some other updates I'm thinking of making:

TODO: Counts are aggregated by keyword, category, and stream, but not by role agnostic of stream. It would be useful to add this once the streams feature is complete.
TODO: Also aggregate agnostic of role at every level (stream, category, keyword)? If so, the combined role could be called 'participant'.
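A rough sketch of what those two aggregations might look like, under assumed column names. The `participation` table, its columns (`stream`, `keyword`, `user`, `role`), and the sample rows are hypothetical; the real DataFrames in fhir_zulip_nlp.py may be shaped differently.

```python
import pandas as pd

# Hypothetical input: one row per participation record.
participation = pd.DataFrame({
    "stream": ["terminology", "terminology", "terminology",
               "implementers", "implementers"],
    "keyword": ["loinc"] * 5,
    "user": ["ann", "bob", "ann", "ann", "cat"],
    "role": ["author", "respondent", "respondent", "respondent", "author"],
})

# Role counts agnostic of stream: leave 'stream' out of the grouping keys.
by_role = participation.groupby(["keyword", "role"], as_index=False).agg(
    n_users=("user", "nunique")
)

# Counts agnostic of role: leave 'role' out of the grouping keys and label
# the combined group 'participant'. nunique() counts a user once even if
# they appear as both author and respondent.
participants = (
    participation.groupby(["stream", "keyword"], as_index=False)
    .agg(n_users=("user", "nunique"))
    .assign(role="participant")
)

print(by_role)
print(participants)
```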


joeflack4 · 2022-08-16T01:30:37Z

fhir_zulip_nlp/fhir_zulip_nlp.py

    return df_report


+def create_report_users(df: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]:
+    """Report: Users
+    # TODO: Bugfix: Major bug; respondent/author counts are not all correct. This is because (i) threads are being


@rohaher FYI: Major bugfix I need to do:

# TODO: Bugfix: Major bug; respondent/author counts are not all correct. This is because (i) threads are counted multiple times when multiple keywords match them, and (ii) we count *only* the messages that themselves match a keyword, not every message in each thread that has at least one matching message.

Related: #32
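A minimal sketch of one way the two causes could be addressed, not the fix that was actually merged. It assumes two hypothetical tables: `messages` (one row per message) and `matches` (one row per message/keyword match). It also assumes that a thread's author is whoever posted its earliest message and that every other poster in the thread is a respondent. All column names and `count_authors_and_respondents()` are made up for illustration.

```python
import pandas as pd

# Hypothetical input: every message in the stream.
messages = pd.DataFrame({
    "message_id": [1, 2, 3, 4, 5, 6],
    "thread_id": ["t1", "t1", "t1", "t2", "t2", "t3"],
    "user": ["ann", "bob", "ann", "cat", "ann", "bob"],
    "timestamp": [10, 11, 12, 20, 21, 30],
})

# Hypothetical input: keyword matches. Message 1 matches two keywords.
matches = pd.DataFrame({
    "message_id": [1, 1, 5],
    "keyword": ["loinc", "snomed", "loinc"],
})


def count_authors_and_respondents(
    messages: pd.DataFrame, matches: pd.DataFrame
) -> pd.DataFrame:
    # (i) Reduce matches to a unique set of thread ids, so a thread that
    # matches several keywords is only considered once.
    matched_ids = matches["message_id"].unique()
    matched_threads = messages.loc[
        messages["message_id"].isin(matched_ids), "thread_id"
    ].unique()

    # (ii) Take every message in those threads, not just the matching ones.
    in_scope = messages[messages["thread_id"].isin(matched_threads)]
    in_scope = in_scope.sort_values(["thread_id", "timestamp"])

    # Assumption: the author is the user who posted the earliest message.
    thread_author = in_scope.groupby("thread_id")["user"].transform("first")
    role = (in_scope["user"] == thread_author).map(
        {True: "author", False: "respondent"}
    )

    # Count each user at most once per thread.
    per_thread = in_scope.assign(role=role).drop_duplicates(["thread_id", "user"])
    return (
        per_thread.groupby(["user", "role"])
        .size()
        .unstack(fill_value=0)
        .reindex(columns=["author", "respondent"], fill_value=0)
    )


counts = count_authors_and_respondents(messages, matches)
print(counts)
```

In this toy data, thread `t1` matches two keywords but contributes to `ann`'s author count only once, and `bob` is counted as a respondent in `t1` even though his own message matched no keyword.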

joeflack4 self-assigned this Aug 15, 2022

joeflack4 added the "enhancement" (New feature or request) label Aug 15, 2022

joeflack4 linked an issue Aug 15, 2022 that may be closed by this pull request: Query users #18 (Open, 3 tasks)

joeflack4 force-pushed the joe branch 7 times, most recently from 5c83534 to c4ce0fb August 16, 2022 01:21

joeflack4 force-pushed the joe branch from c4ce0fb to 6909c96 August 16, 2022 01:29

joeflack4 merged commit 2a30506 into main Aug 16, 2022

joeflack4 deleted the joe branch August 16, 2022 01:31
