Policy automations: run script #17129

dherder · 2024-02-23T18:22:50Z

Goal

User story
As a Fleet user,
I want a policy failure in Fleet to trigger a script run on a host
so that I can run scripts on many hosts w/o having to use a third-party automation tool (ex. Tines).

Similar to "Policy automations: install software" (#19551), except now we're triggering script runs.

Context

Product designer: @marko-lisica

Changes

Product

UI changes: Figma
CLI usage changes: No CLI changes
YAML file changes: Add new run_script parameter to policies YAML.
REST API changes: Add API docs for script execution on policy failure #22395
Permissions changes: Only users with the admin role in Fleet (global and team) can edit global policy automations. Team maintainers and admins can edit team-level policy automations.
- PR to permissions table is here: Docs: Note permissions distinction between global policy automations and software install (#19551) and script execution (#17129) policy automations #23447
Reference documentation changes: Update API YAML reference docs. Reference docs PR is here: Reference doc and guide updates: Policy automations: run script (#17129) #23300
Feature guide changes: Guide is here: https://fleetdm.com/guides/policy-automation-run-script#basic-article
- Mention that policy counts reset when new script is specified. This was added in a PR here: Reference doc and guide updates: Policy automations: run script (#17129) #23300
- @noahtalerman: UI redirect for new guide is in a PR here: 22117 Policy based run script guide #22471
Changes to paid features or tiers: Available to Fleet Premium users only.
- @noahtalerman: Let's update the guide that "Device remediation" points to (remediation) to link to guides for automatically run scripts and install software. We can frame these features (paid only) as device remediation.
- PR is here: Deploy software: update pricing page and guides #23329

Engineering

Database schema migrations: TODO
Load testing: TODO

ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

Requires load testing: TODO
Risk level: Low / High TODO
Risk description: TODO

Manual testing steps

Migration

Starting with a script and policy created in <= 4.57.x works for these automation workflows

Regression avoidance

Manual script execution works
Manual script execution errors when the same script is already queued
Software install automation works (@jacobshandling and I QA'd this one for 4.57 if you need pointers), on both team and no-team

UI

Script automation is available for teams, including No Team
Script automation is not available for global policies
Script automation dialog allows adding/changing/removing scripts from team-specific policies (global-inherited policies should not be shown)
Scripts error on deletion attempt if they are associated with a policy, with useful error text
Scripts can be deleted if they are removed from a policy automation
Adding or changing a script automation for a policy clears that policy's stats/host statuses
Removing a script automation for a policy does not clear that policy's stats/host statuses
Changing a policy's name does not clear that policy's status/host statuses

Policy automation execution

* Known issue: No author on upcoming/past script run activity (fix incoming, pending product confirmation)

PowerShell scripts work on Windows
shell scripts work on macOS
zsh scripts work
shell scripts work on Linux
Pending activity visible for script run once queued
Script run activity shows in Past once executed
Manual script run fails when a policy failure has queued the same script

No-ops

GitOps

Known issue: Non-functional on no-team due to path mismatch (so test on a different team; fix incoming as part of GitOps script path fix)

Succeeds in setting up (confirm via UI) with correct YAML in team

controls:
    scripts:
        - path: ../path/to/script.sh
policies:
    - # normal policy
      run_script:
          path: ../path/to/script.sh

Succeeds when policy is defined in its own file, in a directory at a different nesting level than the team file

Changing existing configuration

If policy automation is dropped from YAML, it's dropped on-apply to the server
If policy automation is dropped and script is dropped from YAML, application is successful (script is deleted, policy automation is removed, no fkey issues)
If script contents change but path does not, script is updated in-place but policy is not reset
If script path changes (need to change in both controls and run_script), policy status/hosts are reset
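
The reset rules in the UI and GitOps checklists above can be summarized as a single comparison. This is a minimal sketch, not Fleet's implementation: the function name is hypothetical, and it assumes a script automation is identified by its path (the UI may well compare script IDs instead).

```python
from typing import Optional


def should_reset_policy_stats(old_path: Optional[str], new_path: Optional[str]) -> bool:
    """Decide whether a policy's stats/host statuses are cleared.

    Hypothetical helper mirroring the QA checklist:
    - adding or changing the script resets the policy
    - removing the script does not reset it
    - editing script contents at the same path does not reset it
    """
    # No automation after the change (removed, or never set): nothing to reset.
    if new_path is None:
        return False
    # Reset only when the policy now points at a different script.
    return new_path != old_path
```

For example, `should_reset_policy_stats("a.sh", "a.sh")` is false even if the contents of `a.sh` changed, which matches the "updated in-place but policy is not reset" case.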

Validation errors

Fails when attempted on global
Fails when script not found at path
Fails when script isn't also specified for the team
Fails on malformed YAML (e.g. missing value on path property)
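
As a rough illustration of the four validation failures listed above, here is a sketch that checks an already-parsed policy entry. It is an assumption-laden example, not the GitOps code path: `validate_run_script`, its parameters, and the error strings are all hypothetical, and only the `run_script` and `path` keys come from the YAML in this issue.

```python
from typing import List, Set


def validate_run_script(
    policy: dict,
    team_script_paths: Set[str],
    files_on_disk: Set[str],
    is_global: bool,
) -> List[str]:
    """Return validation errors for a policy's run_script setting (hypothetical)."""
    # Policies without a run_script key have no automation to validate.
    if "run_script" not in policy:
        return []
    # Script automations are only available on teams (including No Team).
    if is_global:
        return ["run_script is not supported on global policies"]

    run_script = policy["run_script"]
    # "path:" with no value parses to None, so treat it as malformed.
    path = run_script.get("path") if isinstance(run_script, dict) else None
    if not path:
        return ["run_script.path is missing or empty"]

    errors = []
    if path not in files_on_disk:
        errors.append(f"script not found at {path}")
    if path not in team_script_paths:
        errors.append(f"script {path} is not listed under the team's controls.scripts")
    return errors
```

A caller would pass the set of paths under the team's `controls.scripts` and the set of files that actually exist, then fail the apply if the returned list is non-empty.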

Confirmation

Engineer (@____): Added comment to user story confirming successful completion of QA.
QA (@____): Added comment to user story confirming successful completion of QA.

noahtalerman · 2024-02-27T14:57:37Z

I would like to execute a script automatically when a policy fails instead of triggering a webhook.

@dherder we'll get to this but I think there's an iteration or two before we build it.

Currently, the customer can consume the failing policies webhook in Tines and execute a script using the Fleet API, right?

I think the first iteration will be sending a webhook per host that includes all of that host's failing policies. I think this simplifies the Tines story. The Tines story becomes this:

Receive new webhook that includes a specific host's failing policies
Loop through policies and take remediation action specific to each failing policy (via script or some other tool)
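
The two steps above could look roughly like this on the automation side. This is only a sketch: the payload shape (`host.hostname`, `failing_policies[].name`), the policy name, and the script name are assumptions for illustration, not the actual webhook format.

```python
from typing import Callable, Dict, List

# A remediation takes a hostname and returns a description of what it did.
Remediation = Callable[[str], str]


def run_script(script_name: str) -> Remediation:
    """Build a remediation that would run the named script on a host."""
    def action(hostname: str) -> str:
        # A real automation would call a script-execution API here.
        return f"run {script_name} on {hostname}"
    return action


# Hypothetical mapping from policy name to its remediation.
REMEDIATIONS: Dict[str, Remediation] = {
    "Disk encryption enabled": run_script("enable-disk-encryption.sh"),
}


def handle_host_webhook(payload: dict) -> List[str]:
    """Loop through one host's failing policies and remediate each known one."""
    hostname = payload["host"]["hostname"]
    actions = []
    for policy in payload["failing_policies"]:
        remediation = REMEDIATIONS.get(policy["name"])
        if remediation is None:
            continue  # no remediation configured for this policy
        actions.append(remediation(hostname))
    return actions
```

Policies without a configured remediation are skipped rather than treated as errors, so new policies can be added without breaking the handler.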

dherder · 2024-02-29T22:57:28Z

@noahtalerman would also be good to get a Fleet desktop notification on failed policies similar to #16264

noahtalerman · 2024-03-01T15:13:18Z

would also be good to get a Fleet desktop notification on failed policies

@dherder the current plan is to solve the problem of notifying the end user by getting in their calendar: #17230

dherder · 2024-03-07T19:52:21Z

@noahtalerman I see the calendar remediation as a separate issue. It works great when you want an end user to do a thing like update an app or perform an OS update. Where it doesn't work so great is if you want the remediation to be "execute a root level script", where if the user is a standard user, they just simply wouldn't be able to do it.

noahtalerman · 2024-03-11T20:38:47Z

Where it doesn't work so great is if you want the remediation to be "execute a root level script", where if the user is a standard user, they just simply wouldn't be able to do it.

@dherder I think the first iteration of "Fleet in your calendar" will address this.

The high level flow of the feature:

IT admin chooses which policies trigger calendar events
Calendar event is created when end user fails at least one of these policies
Webhook is fired when the calendar event starts
Automation tool (ex. Tines) receives the webhook and runs auto-remediation (ex. script)

Check out the user story for more details on the flow: #17230

What do you think?

Also, we didn't have room for this "Auto remediation of policy failure" story in the current design sprint (4.48).

nonpunctual · 2024-05-02T21:07:54Z

@noahtalerman it still does not solve the problem of 3rd party solution integration that is a blocker for some of our current customers but especially prospective customers.

The expectation is that if Fleet has the script server-side & Fleet has a policy to check for a client state or attribute, that it would also have a way of executing the script on a policy failure without 3rd party integration required.

Couldn't Fleet just send the policy failure webhook to its own API endpoint for executing a script? Is there a technical concern like load on server due to script execution? Thanks.

cc @dherder @willmayhone88 @spokanemac @ksatter @pacamaster

dherder · 2024-05-02T21:16:30Z

@noahtalerman I presented the option of remediation through 3rd party automation tools today (IT buying scenario) and the feedback was that it would be a blocker to move forward with Fleet.

noahtalerman · 2024-05-07T20:53:41Z

Couldn't Fleet just send the policy failure webhook to its own API endpoint for executing a script? Is there a technical concern like load on server due to script execution? Thanks.

@nonpunctual no technical concern that I know of. It's just a matter of priorities/timing. Let's chat about it at feature fest!

noahtalerman · 2024-10-28T18:11:46Z

Permissions changes: Only users with the admin role in Fleet (global and team) can edit policy automations. This is already documented in the permissions guide here: https://fleetdm.com/guides/role-based-access

Hey @iansltx when you get the chance, can you please sanity check me here?

noahtalerman · 2024-10-28T18:20:30Z

Changes to paid features or tiers: Available to Fleet Premium users only. Updating fleetdm.com/pricing is still TODO

Let's update the guide that "Device remediation" points to (remediation) to link to guides for automatically run scripts and install software:

We can frame these features (paid only) as device remediation.

@noahtalerman

…) (#23300)
- Update guides to reflect use case: automatically run scripts and install software
- @noahtalerman: I removed top image from "Automatically run scripts" b/c I think it looked rushed/unexpected
- Update "execute" language to "run" and add "manual" language
- Clarify when a policy's host counts are reset
- Clarify support for policy automations: team v. default (global) v. no team
- Update `software.packages` example to best practice: separate file
- Inline is supported for backwards compatibility
- Remove `policies` and `controls` call outs about "No team." This info is covered in the starter files in fleetdm/gitops. For an example, see `teams/no-team.yml` here: https://github.com/fleetdm/fleet-gitops/blob/main/teams/no-team.yml

noahtalerman · 2024-10-29T13:24:32Z

Permissions changes: Only users with the admin role in Fleet (global and team) can edit policy automations. This is already documented in the permissions guide here: https://fleetdm.com/guides/role-based-access

Hey @iansltx just giving you another ping! Can you please sanity check me here?

This is what we have documented in the permissions guide: https://fleetdm.com/guides/role-based-access#user-permissions

iansltx · 2024-10-29T16:22:28Z

@noahtalerman Re: permissions, as implemented in the API the team-specific policy automations (software install, script run) only require policy write permissions, so they're available to Maintainers as well as Admins and GitOps. My guess is that global automations are only available to admins, and that's what the existing permissions line item is referencing.

If we need to tighten down permissions for scripts/software it's doable, and could land in 4.59.0 if needed, but that would be a change from 4.57/4.58, and I'm not sure what the UI enforces here.

noahtalerman · 2024-10-29T20:09:50Z

available to Maintainers as well as Admins and GitOps

@iansltx ah, ok. I think no need to update the permissions in the code. We just want the documentation to be accurate.

UPDATE: @noahtalerman: I opened a draft PR here: #23433

When you get the chance, can you please take a pass at a PR to the permissions guide? https://fleetdm.com/guides/role-based-access

More context here: #17129 (comment)

noahtalerman · 2024-10-31T13:25:00Z

in the API the team-specific policy automations (software install, script run) only require policy write permissions, so they're available to Maintainers as well as Admins and GitOps. My guess is that global automations are only available to admins, and that's what the existing permissions line item is referencing.

@iansltx when you get the chance can you please double check that these^ are the current permissions? I opened up a draft PR to the permissions table here: #23433

I'm not sure what the UI enforces here.

@RachelElysia are the permissions mentioned above also enforced in the UI?

RachelElysia · 2024-10-31T14:20:04Z

@noahtalerman

According to the code for the UI: For policy automations dropdown on the policy page, the user has to be a global admin or a team admin, and they need to be viewing a team policy table with at least one team policy shown on the UI table. The UI button for managing automations for policies is hidden for maintainers.

Just logged in as a team maintainer and confirmed Policy Automations dropdown is hidden for maintainers.

iansltx · 2024-10-31T15:17:34Z

So, given the above, we have the API enforcing looser permissions than the UI. Do we want to:

Tighten the API up (in which case docs stay the same)
Allow maintainers to edit team-specific policy automations in the UI (in which case we should have a new line item in docs for software install/script run policies as their permissions are distinct from global automations like webhooks and calendars)
Do nothing (what should docs say in this case?)

iansltx · 2024-10-31T15:51:25Z

Per design review just now, we're taking the second option of the above.

Action items (all on me):

Verify that global automations are indeed limited to Admin or above (if I'm wrong here and global automations work for maintainers, the next two items will look different)
Add a frontend bug for mismatched permissions (FE should show install/script automations to maintainers)
Update "Team maintainers can manage policy automations" (#23433) with an additional line item for install/script automations (Maintainer or above permissions), and remove the new maintainer permission note on the existing policy automations line item

Self-assigning this until the above are done.

iansltx · 2024-11-01T00:07:21Z

Confirmed that global automations are admin-or-above; modifications to global automations hit the global config endpoint, which is controlled by the app_config.write permission, which is gated to admin or gitops.

Docs update incoming.
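
The permission split discussed in this thread (global automations such as webhooks and calendars gated to admin or GitOps, install/script automations open to anyone with policy write access) can be sketched as a small lookup. The enum and function names are hypothetical, and the sketch ignores the global-role versus team-role distinction; it only illustrates the split, not the server's authorization code.

```python
from enum import Enum


class Role(Enum):
    OBSERVER = "observer"
    MAINTAINER = "maintainer"
    ADMIN = "admin"
    GITOPS = "gitops"


class Automation(Enum):
    WEBHOOK = "webhook"
    CALENDAR = "calendar"
    SOFTWARE_INSTALL = "software_install"
    RUN_SCRIPT = "run_script"


# Team-specific automations only require policy write permissions.
POLICY_WRITE_AUTOMATIONS = {Automation.SOFTWARE_INSTALL, Automation.RUN_SCRIPT}
POLICY_WRITE_ROLES = {Role.MAINTAINER, Role.ADMIN, Role.GITOPS}
# Other automations go through the config endpoint (admin or GitOps only).
CONFIG_WRITE_ROLES = {Role.ADMIN, Role.GITOPS}


def can_edit_automation(role: Role, automation: Automation) -> bool:
    """Return whether a role may edit the given kind of policy automation."""
    if automation in POLICY_WRITE_AUTOMATIONS:
        return role in POLICY_WRITE_ROLES
    return role in CONFIG_WRITE_ROLES
```

Under this model a maintainer can edit a run-script automation but not a webhook automation, which is the distinction the docs need to call out.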

iansltx · 2024-11-01T00:11:44Z

Going to set this up as a new PR to clean up the approval flow (and since the content of the PR is going to wind up quite different from the original docs change).

iansltx · 2024-11-01T00:28:48Z

RBAC docs PR is up: #23447

iansltx · 2024-11-01T00:37:00Z

#23448 created for matching UI permissions with API permissions. Reassigning this ticket back to @noahtalerman for continuation of confirmation and celebration.

noahtalerman · 2024-11-05T14:24:13Z

@Patagonia121 @pintomi1989 @zayhanlon @ambrusps @phtardif1 @AnthonySnyder8 heads up that this user story was shipped in 4.58 🚀

Here's the guide.

(we wait to close the issue until reference docs are updated and guide is published)

fleet-release · 2024-11-05T14:24:18Z

Script triggers rise,
Like sun on distant hosts gleams,
Effortless, we thrive.

dherder added :product Product Design department (shows up on 🦢 Drafting board) ~feature fest Will be reviewed at next Feature Fest customer-rosner labels Feb 23, 2024

noahtalerman removed the :product Product Design department (shows up on 🦢 Drafting board) label Feb 27, 2024

noahtalerman assigned dherder Mar 7, 2024

noahtalerman unassigned dherder Mar 11, 2024

noahtalerman added prospect-konrad and removed ~feature fest Will be reviewed at next Feature Fest labels Mar 11, 2024

noahtalerman mentioned this issue Mar 14, 2024

Auto-remediate policies with scripts. #17591

Closed

dherder added the ~feature fest Will be reviewed at next Feature Fest label Apr 1, 2024

noahtalerman assigned dherder Apr 18, 2024

pintomi1989 added the customer-knopfia label Apr 18, 2024

noahtalerman unassigned dherder Apr 19, 2024

noahtalerman removed the ~feature fest Will be reviewed at next Feature Fest label Apr 19, 2024

pintomi1989 added the ~csa Issue was created by or deemed important by the Customer Solutions Architect. label Apr 23, 2024

nonpunctual added customer-flacourtia ~feature fest Will be reviewed at next Feature Fest labels May 2, 2024

nonpunctual changed the title ~~Auto remediation of policy failure~~ Auto remediation (script execution) on policy failure May 2, 2024

nonpunctual added the customer-flavia label May 2, 2024

dherder added the prospect-themis label May 7, 2024

noahtalerman assigned nonpunctual May 9, 2024

dherder added the ~sc Request is a requirement in a presales opportunity label May 9, 2024

noahtalerman assigned iansltx and unassigned noahtalerman Oct 30, 2024

noahtalerman added a commit that referenced this issue Oct 31, 2024

Permissions for polic automations

78d0d1c

More context here: #17129 (comment)

noahtalerman mentioned this issue Oct 31, 2024

Team maintainers can manage policy automations #23433

Closed

noahtalerman assigned noahtalerman and unassigned iansltx Oct 31, 2024

iansltx assigned iansltx and unassigned noahtalerman Oct 31, 2024

iansltx added a commit that referenced this issue Nov 1, 2024

Note permissions distinction between global policy automations and so…

0018d61

…ftware install (#19551) and script execution (#17129) policy automations

iansltx mentioned this issue Nov 1, 2024

Maintainers should be able to apply install/script automations from the UI #23448

Open

iansltx assigned noahtalerman and unassigned iansltx Nov 1, 2024

iansltx added a commit that referenced this issue Nov 4, 2024

Docs: Note permissions distinction between global policy automations …

1d0ab56

…and software install (#19551) and script execution (#17129) policy automations (#23447) Co-authored-by: Noah Talerman <[email protected]>

noahtalerman closed this as completed Nov 5, 2024
