Query users #18

Open
1 of 3 tasks
joeflack4 opened this issue Jul 28, 2022 · 1 comment · Fixed by #30

joeflack4 commented Jul 28, 2022

Summary

We need to query users (authors and respondents).

Major features

  • 1. Users
  • 2. Roles
  • 3. "Profiling"

Steps

  1. Query list of messages (done)
  2. Get unique list of author email addresses
  3. Use user endpoint to get additional user information
     • Edit: joeflack4 2022/08/15: Looks like this part is not necessary. The message API returns user ID, name, and email. I think that's all we'll need right now, but we should ask.
  4. Roles
     4b. Profiles
  5. Output
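
Steps 1 to 3 can be sketched roughly as follows. This is only an illustration, not the project's code: it assumes each message is a dict carrying `sender_id`, `sender_full_name`, and `sender_email` keys (field names assumed here, matching the "user ID, name, and email" mentioned above), and `unique_users` is a hypothetical helper name.

```python
from typing import Dict, List


def unique_users(messages: List[dict]) -> List[dict]:
    """Return one record per distinct sender email, in first-seen order."""
    seen: Dict[str, dict] = {}
    for msg in messages:
        # Normalize so "Ada@example.org" and "ada@example.org" collapse together.
        email = msg["sender_email"].strip().lower()
        if email not in seen:
            seen[email] = {
                "user_id": msg["sender_id"],
                "name": msg["sender_full_name"],
                "email": email,
            }
    return list(seen.values())


# Made-up sample data, for illustration only.
messages = [
    {"sender_id": 1, "sender_full_name": "Ada", "sender_email": "ada@example.org"},
    {"sender_id": 2, "sender_full_name": "Bo", "sender_email": "bo@example.org"},
    {"sender_id": 1, "sender_full_name": "Ada", "sender_email": "Ada@example.org"},
]
users = unique_users(messages)
```

Keying on the lower-cased email keeps the list stable even if the same person appears with differently cased addresses; keying on the user ID instead would work just as well if IDs are reliable.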

Steps, expanded

4. Roles (@pending initial results)

@DaveraGabriel will give us a mapping list of {person name --> role} once we present her w/ the list of people. Davera will fill in this spreadsheet: https://docs.google.com/spreadsheets/d/1OB0CEAkOhVTN71uIhzCo_iNaiD1B6qLqL7uwil5O22Q/edit#gid=1504038457

There are different classes of roles.

List of roles:
i. FHIR Management Group / FHIR-I
ii. Terminology Services Management Group / Vocab WG
iii. IG Developer / WG member
iv. External Terminology steward /representative
v. FHIR (... other standard) implementer
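
Once the spreadsheet is filled in, applying the {person name --> role} mapping could look something like the sketch below. It assumes the sheet is exported as CSV with `name` and `role` columns; those column names, the `Unassigned` default, and both function names are hypothetical.

```python
import csv
import io
from typing import Dict, List


def load_role_map(csv_text: str) -> Dict[str, str]:
    """Build {person name: role} from CSV text, skipping rows with no role yet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["name"].strip(): row["role"].strip()
        for row in reader
        if row["role"].strip()
    }


def assign_roles(users: List[dict], role_map: Dict[str, str],
                 default: str = "Unassigned") -> List[dict]:
    """Return copies of the user records with a 'role' key added."""
    return [{**u, "role": role_map.get(u["name"], default)} for u in users]


# Made-up export: Bo has not been given a role yet, Cy is not in the sheet.
sheet_export = "name,role\nAda,IG Developer / WG member\nBo,\n"
role_map = load_role_map(sheet_export)
users_with_roles = assign_roles(
    [{"name": "Ada"}, {"name": "Bo"}, {"name": "Cy"}], role_map
)
```

People without a role fall back to the default rather than being dropped, so the list of who still needs a role stays visible in the output.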

4b. Profiles

Davera said 2022/09/18:

look at the users. what categories or keywords come up in their messages.
predominantly author (quantile based~) vs respondent

This would be like clustering based on information about the users.
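
One possible reading of "predominantly author (quantile based) vs respondent", as a rough sketch only: compute each user's share of messages that start a thread, then label users at or above the upper quartile of that share as predominantly authors. The choice of the upper quartile, the input shape, and the `profile_users` name are all assumptions, not something decided in this issue.

```python
from statistics import quantiles
from typing import Dict


def profile_users(counts: Dict[str, Dict[str, int]]) -> Dict[str, str]:
    """Label each user by their share of authored vs. responded messages."""
    shares = {}
    for user, c in counts.items():
        total = c["authored"] + c["responded"]
        if total:  # users with no messages cannot be profiled
            shares[user] = c["authored"] / total
    if len(shares) < 2:
        raise ValueError("need at least two active users to compute quantiles")
    # Upper quartile (75th percentile) of the author share across users.
    cutoff = quantiles(shares.values(), n=4, method="inclusive")[2]
    return {
        user: "predominantly author" if share >= cutoff else "predominantly respondent"
        for user, share in shares.items()
    }


# Made-up counts, for illustration only.
counts = {
    "Ada": {"authored": 8, "responded": 2},
    "Bo": {"authored": 1, "responded": 9},
    "Cy": {"authored": 3, "responded": 3},
    "Di": {"authored": 0, "responded": 5},
}
profiles = profile_users(counts)
```

A relative cutoff like this adapts to how the community actually behaves, but it always labels roughly the top quarter as authors; a fixed threshold (for example, more than half of messages authored) would be the simpler alternative.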

5. Output

@DaveraGabriel What kinda output would you like to see? A new CSV? What columns?
~i. Counts of individuals, grouped by [author, respondent]
ii. Counts of "types / roles" grouped by "authors / respondents"
Example CSV Fields:

  • type: individual, role
  • name: e.g. a person name or a role name
  • count~

Edit: joeflack4 2022/08/15: I came up with a better idea for how the output should look, I think.

Later we can do fancier, like JOIN on keywords for more information, disaggregate by author/respondent, etc.

Questions that we're interested in are "this type of role is interested / talking about 'x'"
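
A question like "this type of role is talking about 'x'" could be answered with a small aggregation along these lines. This is a sketch under assumptions: the input rows (`message_id`, `sender`, `keyword`), the sample keywords, and the `keyword_counts_by_role` name are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def keyword_counts_by_role(matches: Iterable[dict], role_by_name: Dict[str, str],
                           default: str = "Unassigned") -> Counter:
    """Count distinct (message, keyword) matches per (role, keyword) pair."""
    seen = set()
    counts: Counter = Counter()
    for m in matches:
        key: Tuple[int, str] = (m["message_id"], m["keyword"])
        if key in seen:  # ignore repeated rows for the same message and keyword
            continue
        seen.add(key)
        role = role_by_name.get(m["sender"], default)
        counts[(role, m["keyword"])] += 1
    return counts


# Made-up match rows, for illustration only.
matches = [
    {"message_id": 10, "sender": "Ada", "keyword": "terminology"},
    {"message_id": 10, "sender": "Ada", "keyword": "terminology"},  # duplicate row
    {"message_id": 11, "sender": "Bo", "keyword": "terminology"},
    {"message_id": 12, "sender": "Cy", "keyword": "CodeSystem"},
]
role_by_name = {"Ada": "IG Developer / WG member", "Bo": "IG Developer / WG member"}
role_keyword_counts = keyword_counts_by_role(matches, role_by_name)
```

The resulting counter can be written out as rows of role, keyword, and count, and later disaggregated by author/respondent if the input rows carry that flag.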

@joeflack4 joeflack4 mentioned this issue Aug 5, 2022
14 tasks
joeflack4 added a commit that referenced this issue Aug 15, 2022
- Implemented basic feature

Misc
- Add: Comment link to user roles GoogleSheet.
@joeflack4 joeflack4 linked a pull request Aug 15, 2022 that will close this issue
joeflack4 added a commit that referenced this issue Aug 15, 2022
- Implemented basic feature

Misc
- Add: Comment link to user roles GoogleSheet.
- Update: Renamed 'report1' and 'report2' variable and function names to be more descriptive.
joeflack4 added a commit that referenced this issue Aug 15, 2022
- Add: New function implementing basic feature: create_report_users_and_roles()

Misc
- Add: Comment link to user roles GoogleSheet.
- Update: Renamed 'report1' and 'report2' variable and function names to be more descriptive.
- Update: Reorganized run()
joeflack4 added a commit that referenced this issue Aug 15, 2022
- Add: New function implementing basic feature: create_report_users_and_roles()

Misc
- Add: Comment link to user roles GoogleSheet.
- Update: Renamed 'report1' and 'report2' variable and function names to be more descriptive.
- Update: Reorganized run()
- Update: Fixed an incorrect type.
joeflack4 added a commit that referenced this issue Aug 15, 2022
- Add: New function implementing basic feature: create_report_users_and_roles()

Misc
- Add: Comment link to user roles GoogleSheet.
- Update: Renamed 'report1' and 'report2' variable and function names to be more descriptive.
- Update: Reorganized run()
- Update: Fixed an incorrect type.
- Update: .gitignore: Added *.pickle
joeflack4 added a commit that referenced this issue Aug 16, 2022
- Add: New function implementing basic feature: create_report_users_and_roles()
- Add: Documentation for feature to README.md

Misc
- Add: Codebook section at bottom of README.md documentation.
- Add: Comment link to user roles GoogleSheet.
- Update: Renamed 'report1' and 'report2' variable and function names to be more descriptive.
- Update: Reorganized run()
- Update: Fixed an incorrect type.
- Update: .gitignore: Added *.pickle

joeflack4 commented Aug 16, 2022

Major bugfix I need to do:

    # TODO: Bugfix: Major bug; respondent/author counts are not all correct. This is because (i) threads are being
    #  counted multiple times when multiple keywords are matched against them, and (ii) we are *only* counting the
    #  messages that themselves match a keyword, rather than every message in any thread where at least one message
    #  matches.
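
A minimal sketch of how both parts of this bug could be avoided, not the fix that was actually committed: collect the set of threads that contain any matched message, then count every message in those threads exactly once. It assumes each message dict has `id`, `thread`, and `sender` keys (hypothetical names), that the lowest `id` in a thread is its first message, and that the sender of that first message is the thread's author.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def count_participation(all_messages: List[dict],
                        matched_ids: Iterable[int]) -> Dict[str, Dict[str, int]]:
    """Count author/respondent messages per sender across matched threads."""
    by_id = {m["id"]: m for m in all_messages}
    # (i) A set removes repeats when one message matches several keywords.
    matched_threads = {by_id[i]["thread"] for i in set(matched_ids) if i in by_id}

    # (ii) Gather every message of each matched thread, not only the matches.
    threads = defaultdict(list)
    for m in by_id.values():
        if m["thread"] in matched_threads:
            threads[m["thread"]].append(m)

    counts = defaultdict(lambda: {"author": 0, "respondent": 0})
    for msgs in threads.values():
        msgs.sort(key=lambda m: m["id"])
        for position, m in enumerate(msgs):
            kind = "author" if position == 0 else "respondent"
            counts[m["sender"]][kind] += 1
    return dict(counts)


# Made-up data: message 2 matched two keywords, message 3 matched one.
all_messages = [
    {"id": 1, "thread": "t1", "sender": "Ada"},
    {"id": 2, "thread": "t1", "sender": "Bo"},
    {"id": 3, "thread": "t1", "sender": "Bo"},
    {"id": 4, "thread": "t2", "sender": "Cy"},
]
participation = count_participation(all_messages, matched_ids=[2, 2, 3])
```

Here Ada is counted as the author of thread `t1` even though her message matched no keyword, and Bo's two replies are counted once each despite three match rows.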

Related: #32

joeflack4 added a commit that referenced this issue Aug 16, 2022
Feature: Query authors and respondents #18
@joeflack4 joeflack4 reopened this Aug 16, 2022
joeflack4 added a commit that referenced this issue Aug 29, 2022
- Bugfix: Duplicated counts at aggregated levels (category, stream), due to the same message potentially being matched by multiple keywords.
joeflack4 added a commit that referenced this issue Aug 29, 2022
- Bugfix: Duplicated counts at aggregated levels (category, stream), due to the same message potentially being matched by multiple keywords.
- Update: Stream name is now queried directly from `display_recipient` field, rather than mapping from hard-coded `stream_id`.
- Update: Fields role -> thread.role, count -> thread.count
joeflack4 added a commit that referenced this issue Aug 29, 2022
- Bugfix: Duplicated counts at aggregated levels (category, stream), due to the same message potentially being matched by multiple keywords.
- Update: Stream name is now queried directly from `display_recipient` field, rather than mapping from hard-coded `stream_id`.
- Update: Fields role -> thread.role, count -> thread.count
- Add: New output 'all messages' which is utilized by this feature.
- Update: Refactor to do with querying 'all messages' vs 'querying by keyword'.
joeflack4 added a commit that referenced this issue Aug 30, 2022
- Bugfix: Duplicated counts at aggregated levels (category, stream), due to the same message potentially being matched by multiple keywords.
- Update: Stream name is now queried directly from `display_recipient` field, rather than mapping from hard-coded `stream_id`.
- Update: Fields role -> thread.role, count -> thread.count
- Add: New output 'all messages' which is utilized by this feature.
- Update: Refactor to do with querying 'all messages' vs 'querying by keyword'.
@joeflack4 joeflack4 changed the title from "Query authors and respondents" to "Query users" Sep 20, 2022
@joeflack4 joeflack4 removed their assignment Sep 26, 2022
@joeflack4 joeflack4 assigned DaveraGabriel and unassigned rohaher Dec 28, 2022
3 participants