Global Row Level Security Support for Presto #20572

Open
hmadison opened this issue Aug 15, 2023 · 9 comments

@hmadison
Contributor

Expected Behavior or Use Case

Presto should have an interface which allows end users to inject predicates to provide "Row Level Security".

Presto Component, Service, or Connector

To the best of my understanding, this would be a coordinator change.

Possible Implementation

We could update the ConnectionManager to install additional logical plan optimizers which would be provided by a new SPI.

    interface SecurityPlanProvider {
        /**
         * Provides a {@link ConnectorPlanOptimizer} which is able to modify the query on the coordinator
         * before it is submitted to any worker node. This allows for any additional filtering based on the user
         * session to be added.
         *
         * @param targetConnectorId The identifier of the connector which the optimizer would be applied to.
         * @return A logical plan optimizer for the given connector, or empty if none applies.
         */
        Optional<ConnectorPlanOptimizer> getSecurityPlanOptimizer(ConnectorId targetConnectorId);
    }

The provided security optimizers would be installed before any connector-provided optimizers in the ConnectionManager, so that the connector-provided optimizers can work on any additional input that the “security optimizer” adds.
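To make the ordering concrete, here is a minimal sketch of the idea. It deliberately does not use the real Presto types: a "plan" is a plain string, `SketchOptimizer` and `SketchSecurityPlanProvider` are hypothetical stand-ins for `ConnectorPlanOptimizer` and the proposed SPI, and the connector is identified by a name rather than a `ConnectorId`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.UnaryOperator;

// Hypothetical stand-in: a "plan" is just a string and an optimizer rewrites it.
// The real ConnectorPlanOptimizer works on plan node trees and has a different signature.
interface SketchOptimizer extends UnaryOperator<String> {}

// Hypothetical stand-in for the proposed SPI, keyed by a plain connector name.
interface SketchSecurityPlanProvider {
    Optional<SketchOptimizer> getSecurityPlanOptimizer(String connectorName);
}

final class OptimizerChain {
    private OptimizerChain() {}

    // Security optimizer first, then the connector's own optimizers, so the
    // latter see whatever predicate the former added.
    static List<SketchOptimizer> build(
            String connectorName,
            SketchSecurityPlanProvider security,
            List<SketchOptimizer> connectorOptimizers) {
        List<SketchOptimizer> chain = new ArrayList<>();
        security.getSecurityPlanOptimizer(connectorName).ifPresent(chain::add);
        chain.addAll(connectorOptimizers);
        return chain;
    }

    // Applies every optimizer in order to the given plan.
    static String apply(List<SketchOptimizer> chain, String plan) {
        for (SketchOptimizer optimizer : chain) {
            plan = optimizer.apply(plan);
        }
        return plan;
    }
}
```

Connectors for which the provider returns an empty `Optional` keep their existing optimizer list unchanged.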

Context

“Row level security” has a number of open, stale or closed issues tracking its implementation inside of Presto:

While there has been work in this area, there isn’t a holistic approach which covers an entire deployment.

Unless you are using Hive and are able to take advantage of the Apache Ranger support, the “best” interface to cover a “Row Level Security” use case is ConnectorPlanOptimizer. The downside of this approach is that the only way to add a new ConnectorPlanOptimizer which injects the additional clauses to subset the data being queried is to provide a completely new connector to Presto.

While this approach would work for new connectors, it doesn’t offer the best user experience when attempting to make use of this interface for existing connectors. The only three methods I’m aware of are to use “ByteBuddy” style class rewrites, to wrap the existing connector, or to maintain a fork of the connector code.
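As an illustration of the "wrap the existing connector" option, a decorator could look roughly like the sketch below. `SketchConnector` is a hypothetical, heavily reduced interface, not the Presto `Connector` SPI, and plans are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical, heavily reduced connector interface used only for this sketch.
interface SketchConnector {
    String getName();

    List<UnaryOperator<String>> getPlanOptimizers();
}

// Decorator: delegates to the wrapped connector and prepends one security optimizer.
final class SecuredConnector implements SketchConnector {
    private final SketchConnector delegate;
    private final UnaryOperator<String> securityOptimizer;

    SecuredConnector(SketchConnector delegate, UnaryOperator<String> securityOptimizer) {
        this.delegate = delegate;
        this.securityOptimizer = securityOptimizer;
    }

    @Override
    public String getName() {
        return delegate.getName();
    }

    @Override
    public List<UnaryOperator<String>> getPlanOptimizers() {
        List<UnaryOperator<String>> optimizers = new ArrayList<>();
        optimizers.add(securityOptimizer);
        optimizers.addAll(delegate.getPlanOptimizers());
        return optimizers;
    }
}
```

A real wrapper would have to forward every other connector method as well, and would need to be repeated for each connector in the deployment.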

@tdcmeehan
Contributor

Is there an example of any current system that could be leveraged to implement this row level security?

@imjalpreet
Member

There is another PR that was based on how Trino has implemented Row Filtering: #16955

They introduced the relevant callbacks in the AccessControl layer, since this is a feature usually supported by different security systems, e.g. Ranger and AWS Lake Formation. These systems provide a way to define policy rules which include row filters or column masks. At a high level, these callbacks are used to fetch the relevant filter expressions from the security system, and the filters are then added to the query plan. We can probably discuss further whether we want to use a similar design or not.
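A rough sketch of that flow, under loud assumptions: `SketchRowFilterAccessControl` and its `getRowFilter` method are hypothetical names, not the actual Presto or Trino AccessControl signatures, the policy store is an in-memory map standing in for an external system such as Ranger, and the "plan rewrite" is reduced to string substitution on a table reference.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical callback; not the actual Presto or Trino AccessControl signature.
interface SketchRowFilterAccessControl {
    Optional<String> getRowFilter(String user, String tableName);
}

// Toy policy store standing in for an external security system.
final class MapBackedPolicy implements SketchRowFilterAccessControl {
    private final Map<String, String> filtersByUserAndTable;

    MapBackedPolicy(Map<String, String> filtersByUserAndTable) {
        this.filtersByUserAndTable = filtersByUserAndTable;
    }

    @Override
    public Optional<String> getRowFilter(String user, String tableName) {
        // Policies are keyed as "user/table" in this sketch.
        return Optional.ofNullable(filtersByUserAndTable.get(user + "/" + tableName));
    }
}

final class RowFilterRewriter {
    private RowFilterRewriter() {}

    // Replaces a table reference with a filtered subquery when a policy applies;
    // otherwise the table reference is returned unchanged.
    static String tableReference(
            SketchRowFilterAccessControl accessControl, String user, String tableName) {
        return accessControl.getRowFilter(user, tableName)
                .map(filter -> "(SELECT * FROM " + tableName + " WHERE " + filter + ")")
                .orElse(tableName);
    }
}
```

The point of the design is that the filter comes from the security system per user and table, so no connector needs to know about it.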

@hmadison
Contributor Author

> Is there an example of any current system that could be leveraged to implement this row level security?

The only one that I'm aware of would be Apache Ranger's integration with Hive/Hivesql/Hadoop.

> There is another PR that was based on how Trino has implemented Row Filtering: #16955

If we want to use this work as the foundation of row level security, I don't have any objections to it. What is missing from that pull request to get it into a mergeable state?

@imjalpreet
Member

> If we want to use this work as the foundation of row level security, I don't have any objections to it. What is missing from that pull request to get it into a mergeable state?

I guess the only thing would be to rebase on latest master and review the PR to see if we want any changes.

@tdcmeehan
Contributor

I think the general approach is sound: this is a cross-cutting concern, and the approach above gives it a dedicated SPI so it doesn't need to be rebuilt for every connector.

@tdcmeehan
Contributor

@hmadison can you update your design to have a more focused SPI in a similar vein as what was attempted in #16955?

tdcmeehan moved this from 🆕 Unprioritized to 📋 Prioritized Backlog in Security on Aug 18, 2023
@hmadison
Contributor Author

Is the following similar to what you have in mind?

interface RowFilteringExpressionProvider {
    /**
     * Provides a {@link PlanNode} which is inserted into the where clause for any place in the provided query which accesses the table.
     *
     * @param session The session of the user running the query, which the restriction is based on.
     * @param table The table to provide the filter for.
     * @return A plan fragment to attach to the where clause of the query, or empty if no filter applies.
     */
    Optional<PlanNode> getRowFilter(ConnectorSession session, Table table);
}
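To show how such a provider might be consumed, here is a small sketch with stand-in types. It assumes a row is a `Map`, the filter is a `java.util.function.Predicate` rather than a `PlanNode`, and the session and table are reduced to a user name and a table name. `SketchRowFilterProvider` and `FilteredScan` are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for the proposed SPI, using plain Java types.
interface SketchRowFilterProvider {
    Optional<Predicate<Map<String, Object>>> getRowFilter(String user, String tableName);
}

final class FilteredScan {
    private FilteredScan() {}

    // ANDs the query's own predicate with the security filter, if one exists,
    // and returns only the rows that satisfy the combined predicate.
    static List<Map<String, Object>> scan(
            SketchRowFilterProvider provider,
            String user,
            String tableName,
            List<Map<String, Object>> rows,
            Predicate<Map<String, Object>> userPredicate) {
        Predicate<Map<String, Object>> effective = provider.getRowFilter(user, tableName)
                .map(securityFilter -> userPredicate.and(securityFilter))
                .orElse(userPredicate);
        return rows.stream().filter(effective).collect(Collectors.toList());
    }
}
```

Because the security filter is combined with the query's own predicate using AND, a user cannot widen the result set by changing the query.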

@JaneHang9305

@hmadison, our team is also planning to add Row Level Security support for Presto. We would like to understand what the plan is for this issue. Is anyone actively working on it?

@hmadison
Contributor Author

hmadison commented Jan 28, 2024

@JaneHang9305 I’ve ended up with a connector-specific solution for my use cases. While I would love to do something more general, I’m not sure when I’ll have the time.

Projects
Status: 📋 Prioritized Backlog
Development

No branches or pull requests

4 participants