Dynamo DB seems to be corrupting shared-local-instance.db file #105

Open
AugustoQueiroz opened this issue Apr 29, 2021 · 10 comments

Comments

@AugustoQueiroz

I'm not sure that the problem is 100% here, but from the searching I did on the web + testing I tried locally, it seems that at least some of it is here. So, the problem:

When using dynamodb-admin, PutItem and UpdateItem operations on dynamodb-local are broken: they create duplicates of the item in question (for PutItem, when the item being added already exists; for UpdateItem, always). This has happened to me and to other people I work with, on different systems and different machines (details below), both when we update an item directly through dynamodb-admin (i.e., add an item, open it, change the JSON, and save) and when the update is done through code (when testing our API locally). Whenever an update is attempted, a duplicate of the item is created (with the updated info), and then dynamodb-local breaks, because it now has two items sharing the same partition and sort keys. Deleting the item (at least through dynamodb-admin) is impossible, as is purging the table (because deleting is impossible), so the only options are to delete the table or to delete the entire shared-local-instance.db and repopulate it. We haven't actually tried putting the same item twice ourselves, but that problem seems to be usually reported together with the update problem I just described.
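For anyone who wants to confirm the symptom on their own table, here is a minimal sketch of a duplicate-key check. It assumes the items have already been read (for example with a Scan) and converted to plain dicts; the attribute names `pk` and `sk` and the sample data are hypothetical, not taken from the affected database.

```python
from collections import Counter


def find_duplicate_keys(items, hash_key, range_key=None):
    """Return {primary_key_tuple: count} for keys that occur more than once.

    `items` is a list of plain dicts (e.g. already-deserialized Scan results).
    """
    def key_of(item):
        if range_key is None:
            return (item[hash_key],)
        return (item[hash_key], item[range_key])

    counts = Counter(key_of(item) for item in items)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical data shaped like the broken state described above:
# two items sharing the same partition and sort keys.
items = [
    {"pk": "123", "sk": "abc", "myVal": 4},
    {"pk": "123", "sk": "abc", "myVal": 5},
    {"pk": "456", "sk": "def", "myVal": 1},
]

print(find_duplicate_keys(items, "pk", "sk"))  # {('123', 'abc'): 2}
```

A healthy table should always give an empty result, since DynamoDB primary keys are unique by definition.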

At first I assumed this was a problem with the dynamodb-local instance itself, but I found a thread where people described the same problem. The answer there was that it usually shows up when a third-party viewer (such as dynamodb-admin) is used to manage the database, which corrupts the .db file and causes the hash and range keys to be lost. To verify this, I removed dynamodb-admin from my workflow and did everything through the command line. After that, sequences of API calls that previously failed (they would first update an item and then try to read the same item, which by then had become inaccessible and would crash dynamodb-local) started working, with updates applied correctly. This suggests that the answer provided in the thread is correct, at least to some extent.

Some more technical detailing:

  • I'm running on Arch Linux, whereas two of my colleagues have had the same problem on macOS
  • I'm running dynamodb-admin version 4.0.1 installed through npm
  • The dynamodb-local docker instance was set up using this guide from AWS
@rchl
Collaborator

rchl commented May 12, 2021

I've tried on the database I have handy and couldn't reproduce. I think this could depend on the structure of the database. If you have exact steps, including db structure then please share.

@AugustoQueiroz
Author

I hope this helps:

This was a rather simple database, with only 5 tables, most of them having only hash and range keys and no other indexes, except for one, which had a global secondary index with hash and range keys.

All the tables were created (locally) through DynamoDB admin, so the "workflow" was to run docker-compose up with the dynamodb-local instance, then jump into dynamodb-admin and create the tables. One of these tables was populated through a JS script, but that was only added later, and after the problem was already known to happen. The other ones were left empty and would only be populated through the testing of the API.

Writing to the database was done through js, using aws-sdk. Writes worked a-ok, and a first update worked ok as well, but as soon as the API tried reading an item that had been updated, dynamodb-local would crash. This also happened if we tried to update an item manually, as I described.

Since we stopped using dynamodb-admin we haven't had such a problem, despite not changing anything other than that (the code is the same, and the docker instance is the same as well).

Do you think there might be somewhere I can find logs for either dynamodb-admin or dynamodb-local? This way I might provide more/better info.

Other than that, I hope this helps.

@rion18

rion18 commented Mar 31, 2022

I have this issue as well. Any UpdateItem operation duplicates the item, and then the db just stops working. I can replicate it with a single table.

  MyOwnTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: MyOwnTable
      TableClass: STANDARD
      AttributeDefinitions:
        - AttributeName: another_id
          AttributeType: S
        - AttributeName: this_id
          AttributeType: S
      KeySchema:
        - AttributeName: another_id
          KeyType: HASH
        - AttributeName: this_id
          KeyType: RANGE
      BillingMode: PAY_PER_REQUEST

If you use dynamodb-admin to go to this table, create the following item.

{
  "another_id": "123",
  "this_id": "abc",
  "myVal": 4
}

Click the Save button. Reopen the same item and then change "myVal" to 5.
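The same steps can also be driven from code instead of the UI. The sketch below only builds the request parameters for the table, the initial item, and the update, in the low-level DynamoDB attribute-value format; the function names are hypothetical helpers, and nothing here talks to a database.

```python
TABLE = "MyOwnTable"
KEY = {"another_id": {"S": "123"}, "this_id": {"S": "abc"}}


def create_table_params():
    """Parameters mirroring the table definition above."""
    return {
        "TableName": TABLE,
        "AttributeDefinitions": [
            {"AttributeName": "another_id", "AttributeType": "S"},
            {"AttributeName": "this_id", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "another_id", "KeyType": "HASH"},
            {"AttributeName": "this_id", "KeyType": "RANGE"},
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


def put_item_params(value):
    """PutItem request for the sample item, with myVal set to `value`."""
    item = dict(KEY)
    item["myVal"] = {"N": str(value)}  # numbers are sent as strings
    return {"TableName": TABLE, "Item": item}


def update_item_params(value):
    """UpdateItem request that changes only myVal on the existing item."""
    return {
        "TableName": TABLE,
        "Key": dict(KEY),
        "UpdateExpression": "SET #v = :v",
        "ExpressionAttributeNames": {"#v": "myVal"},
        "ExpressionAttributeValues": {":v": {"N": str(value)}},
    }


print(put_item_params(4))
print(update_item_params(5))
```

Assuming boto3 is installed and a local instance is listening (for example on `http://localhost:8000`, which is an assumption about your setup), these dicts could be passed as `client.create_table(**create_table_params())`, `client.put_item(**put_item_params(4))`, and `client.update_item(**update_item_params(5))`. On an affected instance, a Scan afterwards would be expected to show two items with the same key.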

The UI doesn't react, and the log fills up with:

Mar 31, 2022 3:22:02 AM com.almworks.sqlite4java.Internal log
WARNING: [sqlite] SQLiteDBAccess$14@1991783d: job exception
com.amazonaws.services.dynamodbv2.local.shared.exceptions.LocalDBAccessException: Given key conditions were not unique. Returned: **details on the fields you just changed**
	at com.amazonaws.services.dynamodbv2.local.shared.access.LocalDBUtils.ldAccessFail(LocalDBUtils.java:799)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.SQLiteDBAccessJob.getRecordInternal(SQLiteDBAccessJob.java:224)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.SQLiteDBAccess$14.doWork(SQLiteDBAccess.java:1555)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.SQLiteDBAccess$14.doWork(SQLiteDBAccess.java:1551)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.AmazonDynamoDBOfflineSQLiteJob.job(AmazonDynamoDBOfflineSQLiteJob.java:117)
	at com.almworks.sqlite4java.SQLiteJob.execute(SQLiteJob.java:372)
	at com.almworks.sqlite4java.SQLiteQueue.executeJob(SQLiteQueue.java:534)
	at com.almworks.sqlite4java.SQLiteQueue.queueFunction(SQLiteQueue.java:667)
	at com.almworks.sqlite4java.SQLiteQueue.runQueue(SQLiteQueue.java:623)
	at com.almworks.sqlite4java.SQLiteQueue.access$000(SQLiteQueue.java:77)
	at com.almworks.sqlite4java.SQLiteQueue$1.run(SQLiteQueue.java:205)
	at java.lang.Thread.run(Thread.java:748)

and finally, we can see that there are two items on the table. Clicking on them triggers more of these logs.
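If you capture the dynamodb-local output to a file, a small helper like the one below can tell you whether you are hitting this particular failure. This is only a sketch: the function name is hypothetical, and the sample text is a shortened version of the log above.

```python
# Message text taken from the LocalDBAccessException in the log above.
NOT_UNIQUE = "Given key conditions were not unique"


def count_not_unique_errors(log_text):
    """Count log lines that report the non-unique key condition."""
    return sum(1 for line in log_text.splitlines() if NOT_UNIQUE in line)


# Shortened sample based on the log output quoted in this comment.
sample = "\n".join([
    "WARNING: [sqlite] SQLiteDBAccess$14@1991783d: job exception",
    "com.amazonaws.services.dynamodbv2.local.shared.exceptions."
    "LocalDBAccessException: Given key conditions were not unique.",
    "\tat com.almworks.sqlite4java.SQLiteJob.execute(SQLiteJob.java:372)",
])

print(count_not_unique_errors(sample))  # 1
```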

@rion18

rion18 commented Mar 31, 2022

I just found out something interesting. I downloaded NoSQL Workbench from Amazon and tried it again. I get the same thing. I think we can safely assume that this doesn't depend on the UI used with the DynamoDB local instance, but rather on the way the instance is set up. I'm currently using https://www.serverless.com/plugins/serverless-dynamodb-local, which COULD be the culprit. This plugin uses https://www.npmjs.com/package/dynamodb-localhost as an underlying dependency, and this one uses the official DynamoDBLocal.jar file... Interesting 🤔

@rion18

rion18 commented Mar 31, 2022

I added noStart:true for serverless-dynamodb-local, and decided to stand up my own Dockerized Local DynamoDB. It works. It does seem there's an issue with how serverless-dynamodb-local or dynamodb-localhost work.

@AugustoQueiroz By any chance, were you using any of those two libraries?

@rion18

rion18 commented Mar 31, 2022

I think I managed to isolate the culprit to the option: -optimizeDbBeforeStartup...
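To make the difference concrete, here is a rough sketch of assembling the DynamoDB Local launch command with that option off by default. The function name, jar location, and library path are assumptions about a typical download layout; the flags themselves (`-port`, `-sharedDb`, `-dbPath`, `-optimizeDbBeforeStartup`) are documented DynamoDB Local options.

```python
def dynamodb_local_command(db_path=None, shared_db=True,
                           optimize_before_startup=False, port=8000):
    """Build the argument list for starting DynamoDBLocal.jar.

    Paths are assumptions: the jar and DynamoDBLocal_lib are expected
    to be in the current working directory.
    """
    cmd = [
        "java",
        "-Djava.library.path=./DynamoDBLocal_lib",
        "-jar", "DynamoDBLocal.jar",
        "-port", str(port),
    ]
    if shared_db:
        cmd.append("-sharedDb")
    if db_path is not None:
        cmd += ["-dbPath", db_path]
    if optimize_before_startup:
        # DynamoDB Local requires -dbPath together with this option.
        if db_path is None:
            raise ValueError("-optimizeDbBeforeStartup requires a db_path")
        cmd.append("-optimizeDbBeforeStartup")
    return cmd


# Launch command without the suspected option.
print(" ".join(dynamodb_local_command(db_path="./data")))
```

The resulting list can be handed to `subprocess.run` as is. Keeping the option as an explicit, default-off parameter makes it easy to flip it on and compare behaviour on a throwaway database.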

@thisishantzz

Removing that option worked for me.

@nir581

nir581 commented Nov 14, 2022

Maybe it is related to saving data written with different DynamoDB ConversionSchema settings (V1, V2, V2_COMPATIBLE). This can potentially corrupt the data.

@dminhhoang26

Yes, UpdateItem creates a new record and causes a duplicated key for me. It also occurs on both my macOS and WSL Ubuntu setups.

@kookster

kookster commented Sep 26, 2024

Same issue: UpdateItem was creating duplicates until I removed the -optimizeDbBeforeStartup option when starting dynamodb-local. Since I removed that, no dupes.
