o365: Add Microsoft Reports data-stream #12138

Closed
kcreddy wants to merge 21 commits


@kcreddy kcreddy commented Dec 17, 2024

Proposed commit message

Adds the following Microsoft 365 Usage Reports to the Office 365 integration using the Microsoft Graph API.

  • Microsoft Teams User Activity User Detail: ref
  • Viva Engage Groups Activity Group Detail: ref
  • Office365 Groups Activity Group Detail: ref
  • SharePoint Site Usage Site Detail: ref
  • OneDrive Usage Account Detail: ref

Reference issue: #12054.

Other changes:

  1. Update docs to indicate the integration now supports both logs and metrics.
  2. Update kibana.version to 8.15 to utilise latest CEL macros.

Note

To reviewers:
The following decisions were made during this work:

  1. Make a generic reports data-stream instead of a data-stream per report. Users can configure which reports to fetch.
  2. The downloaded CSV files contain spaces inside field names, which are replaced with underscores. The field names otherwise remain unaltered to stay consistent with the raw data.
  3. System tests are removed in a commit because the config URL path doesn't support a regex. A wildcard pattern on the date is required to avoid failures during daily CI runs.
  4. The reason for adding transforms is documented in the README.
  5. The transform should ideally run once every 24h because the reports are only available once a day. However, because of the limit on a transform's maximum frequency, it is set to 1h.
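
The field-name handling in decision 2 can be illustrated with a short sketch. It is written in Python purely for illustration and is not the code used by the integration; the sample columns and values are assumptions made up for this example.

```python
import csv
import io


def normalize_header(name: str) -> str:
    """Replace spaces with underscores; leave everything else untouched."""
    return name.replace(" ", "_")


def read_report(csv_text: str) -> list:
    """Parse a CSV report and return rows keyed by normalized field names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [normalize_header(h) for h in next(reader)]
    return [dict(zip(header, row)) for row in reader]


# Hypothetical sample; a real report has many more columns.
SAMPLE = (
    "Report Refresh Date,User Principal Name,Team Chat Message Count\n"
    "2024-12-15,alice@example.com,7\n"
)
rows = read_report(SAMPLE)
```

Only the separator changes, so a field can still be traced back to its column in the raw CSV by swapping the underscores for spaces.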

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

Author's Checklist

  • Screenshots to PR and integration.
  • Update o365 documentation to include both logs and metrics.
  • Add transform.
  • Add data validation logic to CEL as performed by Microsoft.
  • Generic reports instead of multiple data-streams.

How to test this PR locally

Pipeline Tests:
eval "$(elastic-package stack shellinit)" && elastic-package test pipeline --generate -v --data-streams=reports

--- Test results for package: o365 - START ---
╭─────────┬─────────────┬───────────┬───────────────────────────────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE │ DATA STREAM │ TEST TYPE │ TEST NAME                                                             │ RESULT │ TIME ELAPSED │
├─────────┼─────────────┼───────────┼───────────────────────────────────────────────────────────────────────┼────────┼──────────────┤
│ o365    │ reports     │ pipeline  │ (ingest pipeline warnings test-office365-groups-activity-group.log)   │ PASS   │   270.8875ms │
│ o365    │ reports     │ pipeline  │ (ingest pipeline warnings test-onedrive-usage-account.log)            │ PASS   │ 294.903333ms │
│ o365    │ reports     │ pipeline  │ (ingest pipeline warnings test-sharepoint-site-usage-site.log)        │ PASS   │ 279.616917ms │
│ o365    │ reports     │ pipeline  │ (ingest pipeline warnings test-teams-user-activity-user.log)          │ PASS   │ 304.869084ms │
│ o365    │ reports     │ pipeline  │ (ingest pipeline warnings test-viva-engage-groups-activity-group.log) │ PASS   │ 353.329375ms │
│ o365    │ reports     │ pipeline  │ test-office365-groups-activity-group.log                              │ PASS   │  88.770792ms │
│ o365    │ reports     │ pipeline  │ test-onedrive-usage-account.log                                       │ PASS   │  59.335084ms │
│ o365    │ reports     │ pipeline  │ test-sharepoint-site-usage-site.log                                   │ PASS   │  59.270125ms │
│ o365    │ reports     │ pipeline  │ test-teams-user-activity-user.log                                     │ PASS   │     76.312ms │
│ o365    │ reports     │ pipeline  │ test-viva-engage-groups-activity-group.log                            │ PASS   │  63.989709ms │
╰─────────┴─────────────┴───────────┴───────────────────────────────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: o365 - END   ---
Done

Related issues

Screenshots

  • Screenshot 2024-12-27 at 7 43 53 PM
  • Screenshot 2024-12-27 at 7 41 52 PM
  • o365-viva-engage-groups-activity
  • o365-sharepoint-site-usage
  • o365-onedrive-usage
  • o365-groups-activity
  • o365-teams-user-activity

kcreddy self-assigned this on Dec 17, 2024
kcreddy added the enhancement, Integration:o365, and Team:Security-Service Integrations labels on Dec 17, 2024
andrewkroh added the dashboard label on Dec 17, 2024
@elastic-vault-github-plugin-prod

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

kcreddy changed the title from "o365: Add teams_user_activity_user_detail data-stream" to "o365: Add report data-stream" on Dec 27, 2024
kcreddy changed the title from "o365: Add report data-stream" to "o365: Add Microsoft Reports data-stream" on Dec 27, 2024
@@ -221,27 +184,3 @@ rules:
# 2 documents
body: |-
[{"ObjectId":"Sales","Id":"2af7bbf1-d5d8-5cb0-8aca-f4ad8a087594","CreationTime":"2020-02-28T09:42:45","UserKey":"100320009d6edf94","YammerNetworkId":5846122497,"Operation":"GroupCreation","ClientIP":"79.159.10.151:12345","ActorYammerUserId":36787265537,"UserType":0,"ResultStatus":"TRUE","RecordType":22,"Workload":"Yammer","Version":1,"GroupName":"Sales","OrganizationId":"0e1dddce-163e-4b0b-9e33-87ba56ac4655","UserId":"[email protected]","ActorUserId":"[email protected]"},{"CreationTime":"2020-02-28T09:39:20","ActorUserId":"[email protected]","ObjectId":"Company group","UserKey":"100320009d292e16","Id":"3f3e7f1c-84c1-55fc-9bb2-c8b8563eae06","ActorYammerUserId":36085768193,"ClientIP":"[fdfd::555]:12346","UserId":"[email protected]","Operation":"GroupCreation","ResultStatus":"TRUE","UserType":0,"Workload":"Yammer","Version":1,"OrganizationId":"0e1dddce-163e-4b0b-9e33-87ba56ac4655","YammerNetworkId":5846122497,"RecordType":22,"GroupName":"Company group"}]
- path: /reports/getTeamsUserActivityUserDetail(date=.*

Review comment (kcreddy, Contributor Author):

Removed system tests as the regex is unsupported in path.

Reply (Contributor):

I think regular expressions can be used, but with different syntax.

The elastic/stream README says:

path: the path to match. It can use gorilla/mux parameters patterns

The mux doc says:

Paths can have variables. They are defined using the format {name} or {name:pattern}. If a regular expression pattern is not defined, the matched variable will be anything until the next slash. For example:

r := mux.NewRouter()
r.HandleFunc("/products/{key}", ProductHandler)
r.HandleFunc("/articles/{category}/", ArticlesCategoryHandler)
r.HandleFunc("/articles/{category}/{id:[0-9]+}", ArticleHandler)
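
Applied to the mock path in this PR, the `{name:pattern}` syntax suggests a route such as `/reports/getTeamsUserActivityUserDetail(date={date:[0-9-]+})`. That exact path is an assumption and has not been verified against elastic/stream. The Python sketch below only approximates the matching, assuming literal parts are matched verbatim and each variable becomes a named group; it is not gorilla/mux's implementation and does not handle nested braces inside a pattern.

```python
import re

# Matches {name} or {name:pattern}; patterns containing braces are not supported here.
VAR = re.compile(r"\{(\w+)(?::([^{}]+))?\}")


def template_to_regex(template: str) -> "re.Pattern":
    """Turn a mux-style path template into an anchored regular expression."""
    parts = []
    pos = 0
    for m in VAR.finditer(template):
        # Literal text between variables is escaped so "(" and ")" match themselves.
        parts.append(re.escape(template[pos:m.start()]))
        name = m.group(1)
        # Without an explicit pattern, a variable matches anything up to the next slash.
        pattern = m.group(2) or "[^/]+"
        parts.append(f"(?P<{name}>{pattern})")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")


# Hypothetical route for the mock server; the date variable replaces the ".*" wildcard.
ROUTE = "/reports/getTeamsUserActivityUserDetail(date={date:[0-9-]+})"
route_re = template_to_regex(ROUTE)
match = route_re.match("/reports/getTeamsUserActivityUserDetail(date=2024-12-15)")
```

With a route like this the date in the request no longer has to be fixed in the test config, which is the property the daily CI runs need.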

kcreddy marked this pull request as ready for review on December 27, 2024 15:31
kcreddy requested a review from a team as a code owner on December 27, 2024 15:31
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@elasticmachine

💚 Build Succeeded

History

cc @kcreddy

kcreddy marked this pull request as draft on January 2, 2025 11:26

kcreddy commented Jan 2, 2025

Moving to draft as it needs PM clarification how we want to split the data (across integrations/data-streams).

chrisberkhout (Contributor) left a comment:

Partial review with a few comments. Read the README and looked at the overall structure.

Quoting kcreddy: "Moving to draft as it needs PM clarification how we want to split the data (across integrations/data-streams)."

I like it as it is. I'm not sure that splitting reports into separate data streams would help in any way, especially since you can already configure which reports you want and the transforms create separate destination indices for each. For the de-duplication, transforms always seem like a heavy-handed solution, but they might be the best solution here.
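
The de-duplication mentioned here amounts to keeping only the newest document per entity. A minimal Python sketch of that idea follows; the field names (`user`, `report_refresh_date`, `messages`) are hypothetical and are not the keys configured in the PR's `transform.yml`.

```python
from typing import Iterable


def latest_per_key(docs: Iterable[dict], key: str, sort_field: str) -> dict:
    """Keep only the newest document for each distinct value of `key`."""
    latest = {}
    for doc in docs:
        current = latest.get(doc[key])
        # ISO-8601 date strings order correctly when compared as plain strings.
        if current is None or doc[sort_field] > current[sort_field]:
            latest[doc[key]] = doc
    return latest


# Hypothetical documents: the same user appears in two consecutive daily reports.
docs = [
    {"user": "alice@example.com", "report_refresh_date": "2024-12-14", "messages": 3},
    {"user": "bob@example.com", "report_refresh_date": "2024-12-14", "messages": 1},
    {"user": "alice@example.com", "report_refresh_date": "2024-12-15", "messages": 7},
]
deduplicated = latest_per_key(docs, key="user", sort_field="report_refresh_date")
```

Here `deduplicated` holds one entry per user, with the later of the two documents retained for the user that appears twice.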

Following Microsoft 365 usage reports can be collected by Microsoft Office 365 integration.

| Report | API |
|------------------|:-------:|

Review comment (Contributor):

Suggested change
- |------------------|:-------:|
+ |------------------|-------|

This is a very nice table to have in the README!

Looks better with both columns left-aligned:

image

Review comment (Contributor):

If this file is broken out into separate files for each report, then each transform can use one of those files without changes. Currently they're all different:

find -wholename '*/reports/*/fields.yml' -or -wholename '*/transform/*/fields.yml' | sort | xargs md5sum
377cac0e355ca11527037e46271238bd  ./data_stream/reports/fields/fields.yml
5b233259d6615b83105cf4053896b768  ./elasticsearch/transform/latest_office365_groups_activity_group/fields/fields.yml
78c15f9ad25f60975de6ffd5196e971d  ./elasticsearch/transform/latest_onedrive_usage_account/fields/fields.yml
a2f7e13dbfca735b495940306347750f  ./elasticsearch/transform/latest_sharepoint_site_usage_site/fields/fields.yml
a65fa2cf69f1a8aa619815fddb0a1871  ./elasticsearch/transform/latest_teams_user_activity_user/fields/fields.yml
8106e9736f725f106b863c4b233de965  ./elasticsearch/transform/latest_viva_engage_groups_activity_group/fields/fields.yml

Reply (kcreddy, Contributor Author, Jan 3, 2025):

Thanks Chris. I like that method, will incorporate it.


As the latest data is available in destination indices, the source data-stream backed indices are purged based on ILM policy `metrics-o365.reports-default_policy`.

| o365.reports.metadata.name | Source filter | Source indices | Destination filter | Destination indices | Destination alias |

Review comment (Contributor):

Another very helpful table.

I'd say Source indices: metrics-o365.reports-* rather than .ds-metrics-o365.reports-*.

metrics-o365.reports-* is the value used in source.index in transform.yml.


kcreddy commented Jan 3, 2025

Quoting chrisberkhout: "I like it as it is. I'm not sure that splitting reports into separate data streams would help in any way, especially since you can already configure which reports you want and the transforms create separate destination indices for each."

The real concern is whether to add the reports to the Microsoft Office 365 integration or to create multiple integrations, i.e., one integration per Microsoft 365 entity (such as OneDrive), so that customers can onboard entity-specific integrations.


kcreddy commented Jan 8, 2025

Closing in favour of #12256

kcreddy closed this on Jan 8, 2025
4 participants