Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[views] Error when inserting new records in junction tables. #297

Closed
Wild-Soul opened this issue Jun 15, 2022 · 6 comments
Closed

[views] Error when inserting new records in junction tables. #297

Wild-Soul opened this issue Jun 15, 2022 · 6 comments

Comments

@Wild-Soul
Copy link

PGSync version: master

Postgres version: 14

Elasticsearch version: 7+

Redis version: latest

Python version: 3.7

Problem Description:
Hi,
I'm trying to sync a materialized view in postgres to ES.
Followed the comments in this issue. And was able to get the initial sync.

But when linking books to new categories, I'm getting an error (I've pasted the entire trace block below)

Here's the screenshot of schema :
Screenshot 2022-06-15 at 3 04 39 PM

Queries to create sample data :

create table book (
	id int primary key,
	name varchar(40)
);

create table book_category (
	book_id int,
	category_id int
);
alter table book_category add constraint fk_book_id foreign key(book_id) references book(id);
alter table book_category add constraint fk_cat_id foreign key(category_id) references cat(id);

create table cat (
	id int primary key,
	value varchar(40)
)

-- create some books.
insert into book(id, name) values(1,'book-1');
insert into book(id, name) values(2, 'book-2');
insert into book(id, name) values(3, 'book-3');

-- create some categories.
insert into cat(id, value) values(1, 'adventure');
insert into cat(id, value) values(2, 'action');
insert into cat(id, value) values(3, 'sci-fi');

-- create relation between book and categories.
insert into book_category (book_id, category_id) values (1, 1);
insert into book_category (book_id, category_id) values (1, 2);
insert into book_category (book_id, category_id) values (2, 1);

-- create a materialized view of this (book + category).
create materialized view mvw_books as (
	with get_categories as (
		select
			b.id, b."name",
			array_agg(c.id) as category
			from book b
			left join book_category bc
			on b.id=bc.book_id
			left join cat c on c.id=bc.category_id
			group by b.id, b."name"
	)
	select * from get_categories
);

select * from mvw_books;


-- This is the update part, that generates error.
insert into book_category (book_id, category_id) values (3, 1);

Schema :

[
    {
        "database": "postgres",
        "index": "book_data",
        "nodes": {
            "table": "mvw_books",
            "base_tables": ["book", "book_category", "cat"],
            "columns": [
                "id",
                "name",
                "category"
            ],
            "primary_key": ["id"]
        }
    }
]

Error Message (if any):

2022-06-15 09:29:02.229:ERROR:pgsync.sync: Primary keys ['id'] not subset of payload data dict_keys(['book_id', 'category_id']) for table public.mvw_books
NoneType: None
2022-06-15 09:29:02.326:ERROR:pgsync.elastichelper: Exception No active exception to reraise
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py", line 142, in bulk
    raise_on_error=raise_on_error,
  File "/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py", line 197, in _bulk
    ignore_status=ignore_status,
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 484, in parallel_bulk
    actions, chunk_size, max_chunk_bytes, client.transport.serializer
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 140, in _helper_reraises_exception
    raise ex
  File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 292, in _guarded_task_generation
    for i, x in enumerate(iterable):
  File "/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 155, in _chunk_actions
    for action, data in actions:
  File "/usr/local/lib/python3.7/site-packages/pgsync/sync.py", line 791, in _payloads
    raise
RuntimeError: No active exception to reraise
Exception in poll_redis() for thread Thread-16: No active exception to reraise

I tried by adding book_id and category_id, but it gave a different error. I think my schema.json file is wrong, but don't know what can be done to fix it.

@Wild-Soul Wild-Soul changed the title [views] Inserting new records in junction table is giving error. [views] Error when inserting new records in junction tables. Jun 15, 2022
@Wild-Soul
Copy link
Author

Hi @toluaina ,
Is there any documentation for views ?

@toluaina
Copy link
Owner

I'm afraid I don't have much documentation on views yet. Can you try with 2 tables?
Otherwise I will take a look at this shortly.

@Wild-Soul
Copy link
Author

Wild-Soul commented Jun 17, 2022

@toluaina I tried by changing a column name in book_category table from book_id to id.
The same update query worked fine, whereas it was giving that error before.
insert query -> insert into book_category (book_id, category_id) values (3, 1);

So I'm guessing (can't say for sure), that pgsync is expecting the primary_key list to be same across all the base tables. Since changing the book_id to id worked.

@toluaina
Copy link
Owner

  • Sorry for the delay.
  • So the reason for this is because the book_category does not have a primary_key defined.
  • All relations need to have a primary_key otherwise you run into exceptions like this.
  • I will add a validation check to ensure all base tables contain a primary_key.

You can add it manually with this, them make sure you re-run bootstrap

ALTER TABLE book_category ADD id INTEGER;
ALTER TABLE book_category ADD PRIMARY KEY(id); 
bootstrap -c examples/book_view/schema.json -t
bootstrap -c examples/book_view/schema.json 

@Wild-Soul
Copy link
Author

Hi @toluaina ,
Thanks for helping me out.... will test it out ASAP.

@Wild-Soul
Copy link
Author

Hi @toluaina ,
It works after adding an auto increment PK in book_category. Closing this issue.

Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants