postgres-xl support for 'DISTRIBUTE BY' #248

Open
mahald opened this issue Oct 9, 2017 · 4 comments
Labels
enhancement New feature or request good first issue Good for newcomers


mahald commented Oct 9, 2017

Hi,

I wonder whether it's possible to use Postgres-XL with this EF driver. There would need to be some way to specify the
'DISTRIBUTE BY' clause on the table.

CREATE TABLE disttab (col1 int, col2 int, col3 text) DISTRIBUTE BY HASH(col1);
\d+ disttab
CREATE TABLE repltab (col1 int, col2 int) DISTRIBUTE BY REPLICATION;
\d+ repltab

I assume it is not yet possible. Would it be possible to add this as a new feature?

Member

roji commented Oct 10, 2017

It's actually quite easy to add this capability - see #199 for a similar example if you want to submit a PR.

You can also easily add this to your migrations by manually editing the migration code and adding the DISTRIBUTE BY clause.
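As a rough, language-neutral illustration of that manual edit (not the EF Core migrations API), the change amounts to appending the clause to the generated CREATE TABLE statement before its terminating semicolon. The helper name `add_distribute_by` is hypothetical and only exists for this sketch.

```python
def add_distribute_by(create_table_sql: str, clause: str) -> str:
    """Append a distribution clause to a single CREATE TABLE statement.

    Hypothetical helper for illustration; it assumes the input is one
    statement, optionally ending with a semicolon.
    """
    statement = create_table_sql.rstrip()
    if statement.endswith(";"):
        # Drop the terminator so the clause lands inside the statement.
        statement = statement[:-1].rstrip()
    return f"{statement} {clause};"


generated = "CREATE TABLE disttab (col1 int, col2 int, col3 text);"
edited = add_distribute_by(generated, "DISTRIBUTE BY HASH(col1)")
print(edited)
```

In an actual migration the same effect would come from hand-writing the SQL for that table, since the clause has to be part of the CREATE TABLE statement itself.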

@roji roji added enhancement New feature or request good first issue Good for newcomers labels Jun 10, 2018
@roji roji added this to the Backlog milestone Jun 10, 2018

Jmorjsm commented Feb 6, 2021

I'm interested in picking this up.
@roji what are your thoughts on generic naming for the annotations here? I ask because I've been looking for a similar solution for Greenplum, and this is obviously a feature in multiple distributed PostgreSQL DBs.

Member

roji commented Feb 6, 2021

@Jmorjsm great. I'm not familiar with this feature (or postgres-xl/greenplum). Are you saying multiple distributed PostgreSQL DBs have the exact same command syntax, as a sort of standard? I'd be wary of assuming cross-database compatibility here - even if there's similarity between the different DBs, I'm guessing there are syntax differences as well. A comparison with some links to the different database docs would be useful.

Otherwise, and to be on the safe side, I'd go with an entity type builder extension such as PostgresXLDistributeBy, to make sure there's no confusion about this being applicable to regular PostgreSQL.

There's also the question of what DISTRIBUTE BY accepts as a parameter, and modeling that well. You can write up a quick proposal (nothing too formal), or if you'd prefer to submit a PR directly, that's fine too.
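One possible way to think about the modeling question, sketched in Python rather than as an EF Core extension: treat the strategy as an enum and the column as an argument that only some strategies accept. The strategy names are taken from the Postgres-XL `DISTRIBUTE BY` grammar quoted further down this thread; the class and function names are hypothetical, and the form that omits HASH/MODULO is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class XlDistribution(Enum):
    """Strategies from the Postgres-XL DISTRIBUTE BY form (names assumed from the quoted grammar)."""
    REPLICATION = "REPLICATION"
    ROUNDROBIN = "ROUNDROBIN"
    HASH = "HASH"
    MODULO = "MODULO"


# Only these strategies take a column argument.
_COLUMN_STRATEGIES = {XlDistribution.HASH, XlDistribution.MODULO}


def quote_ident(name: str) -> str:
    """Double-quote an identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


@dataclass(frozen=True)
class DistributeBy:
    strategy: XlDistribution
    column: Optional[str] = None

    def __post_init__(self) -> None:
        needs_column = self.strategy in _COLUMN_STRATEGIES
        if needs_column and not self.column:
            raise ValueError(f"{self.strategy.value} requires a column")
        if not needs_column and self.column:
            raise ValueError(f"{self.strategy.value} takes no column")

    def to_sql(self) -> str:
        """Render the clause to append to CREATE TABLE."""
        if self.strategy in _COLUMN_STRATEGIES:
            return f"DISTRIBUTE BY {self.strategy.value} ({quote_ident(self.column)})"
        return f"DISTRIBUTE BY {self.strategy.value}"


print(DistributeBy(XlDistribution.HASH, "col1").to_sql())
print(DistributeBy(XlDistribution.REPLICATION).to_sql())
```

Validating at construction time means an invalid combination, such as REPLICATION with a column, fails when the model is built rather than when the DDL runs.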


Jmorjsm commented Feb 13, 2021

Turns out postgres-xl has a few more distribution options than Greenplum, with distribution styles and more distribution strategies.
From the postgres-xl documentation:

[ 
  DISTRIBUTE BY { REPLICATION | ROUNDROBIN | { [HASH | MODULO ] ( column_name ) } } |
  DISTRIBUTED { { BY ( column_name ) } | { RANDOMLY } |
  DISTSTYLE { EVEN | KEY | ALL } DISTKEY ( column_name )
]

I have implemented all of the above in #1697.
There is some overlap in the syntax supported by both postgres-xl and greenplum, namely when using DISTRIBUTED BY (column_name) or DISTRIBUTED RANDOMLY, but all other strategies have different syntax. Fortunately in my use case with Greenplum, this is sufficient.
From the Greenplum documentation:

[ DISTRIBUTED BY (column [opclass], [ ... ] ) 
       | DISTRIBUTED RANDOMLY | DISTRIBUTED REPLICATED ]
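To make the overlap concrete, here is a small Python sketch that transcribes the clause shapes from the two grammars above and intersects them. It is only a comparison aid under stated assumptions: the Postgres-XL DISTSTYLE/DISTKEY form and Greenplum's multi-column and opclass variants are left out, and the names `SHARED_FORMS` and `portable_clause` are hypothetical.

```python
from typing import Optional

# Clause shapes transcribed from the grammars quoted in this thread.
POSTGRES_XL_FORMS = frozenset({
    "DISTRIBUTE BY REPLICATION",
    "DISTRIBUTE BY ROUNDROBIN",
    "DISTRIBUTE BY HASH (column)",
    "DISTRIBUTE BY MODULO (column)",
    "DISTRIBUTED BY (column)",
    "DISTRIBUTED RANDOMLY",
})

GREENPLUM_FORMS = frozenset({
    "DISTRIBUTED BY (column)",
    "DISTRIBUTED RANDOMLY",
    "DISTRIBUTED REPLICATED",
})

# Forms that appear in both quoted grammars.
SHARED_FORMS = POSTGRES_XL_FORMS & GREENPLUM_FORMS


def portable_clause(column: Optional[str] = None) -> str:
    """Return a clause in one of the shared forms.

    With a column, distribute by that column; without one, distribute randomly.
    """
    if column is None:
        return "DISTRIBUTED RANDOMLY"
    quoted = '"' + column.replace('"', '""') + '"'
    return f"DISTRIBUTED BY ({quoted})"


print(sorted(SHARED_FORMS))
print(portable_clause("col1"))
print(portable_clause())
```

Anything outside the shared set, such as replication, needs a per-database spelling, which is the argument for database-specific builder names.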

@roji roji modified the milestones: Backlog, 6.0.0 Feb 15, 2021
@roji roji modified the milestones: 6.0.0, 7.0.0 Oct 9, 2021
@roji roji modified the milestones: 7.0.0, 8.0.0 Oct 15, 2022
@roji roji modified the milestones: 8.0.0, Backlog Nov 13, 2023