Designing the Data Layer for Efficiency #20
emperorjm
started this conversation in
Smart Contract Development
Replies: 1 comment
-
Thanks for this content, @emperorjm. The suggested approach is very much the same as what I just wrote in the Governance forum regarding the "high gas fees" topic (read the post).
-
Background:
When discussing data organization, those of us rooted in traditional software development naturally think of Relational Databases. We're familiar with tables, primary keys, defining relationships, and the relentless pursuit of data normalization. Contrasting this is the blockchain realm, where data predominantly lives in Key-Value stores. This setup belongs to the NoSQL family, but it is closer to tools like Redis, Cassandra, or Memcached than to document databases.
The tech community today often regards relational structures as complex and outdated. However, in blockchain dapp design, there's no one-size-fits-all solution. Let's delve deeper.
Example:
Let’s look at an example. Suppose we have a multi-library system that tracks all the books in each library.
Relational Database Approach:
Here's a typical relational structure:
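A minimal sketch of such a structure, written as Rust structs that stand in for tables. The table and field names (`Library`, `Location`, `Book`, `location_id`, `library_id`) are hypothetical and chosen only for illustration:

```rust
// Hypothetical normalized schema: each struct stands in for one table.
#[derive(Debug, Clone, PartialEq)]
struct Location {
    id: u32, // primary key
    city: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Library {
    id: u32, // primary key
    name: String,
    location_id: u32, // foreign key -> Location.id
}

#[derive(Debug, Clone, PartialEq)]
struct Book {
    id: u32,         // primary key
    library_id: u32, // foreign key -> Library.id
    title: String,
    author: String,
}

/// Rough equivalent of `SELECT * FROM books WHERE library_id = ?`.
fn books_in_library(books: &[Book], library_id: u32) -> Vec<&Book> {
    books.iter().filter(|b| b.library_id == library_id).collect()
}
```

Each fact lives in exactly one place: renaming a library touches one `Library` row and no `Book` rows.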
NoSQL (Document-Oriented) Approach:
With NoSQL databases, such as MongoDB, the data presentation is denormalized, offering flexibility. We might consider two collections: one for libraries (with locations) and another for books. However, let's explore a unified library collection approach:
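As a rough sketch of that unified collection, the document below nests the location and the books inside the library itself. The type and field names (`LibraryDoc`, `location`, `books`) are assumptions for illustration, and a plain Rust struct stands in for a MongoDB document:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Book {
    title: String,
    author: String,
}

/// One denormalized document per library: the location and every book
/// are embedded instead of being referenced by id.
#[derive(Debug, Clone, PartialEq)]
struct LibraryDoc {
    name: String,
    location: String,
    books: Vec<Book>,
}

/// Looks a book up by title by scanning the embedded list.
fn find_book<'a>(doc: &'a LibraryDoc, title: &str) -> Option<&'a Book> {
    doc.books.iter().find(|b| b.title == title)
}
```

Reading a library with all its books is a single fetch, at the cost of the document growing with every book added.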
Key-Value Database Approach:
What’s the ideal structure for a key-value database, like what we have in CosmWasm? Should we follow the traditional normalized structure, or should we simply put everything together, similar to what Document databases recommend (MongoDB, n.d.)? One might be tempted to do something like the following:
Past experiences suggest that this structure may be inefficient for large data sets. For example, modifying just one book out of a million would require loading all one million book entries, leading to increased gas usage. Search operations could also become sluggish.
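To make the cost concrete, here is a toy model of the "everything under one key" layout. It uses an in-memory `BTreeMap` rather than real contract storage, and the `entries_loaded` counter is a hypothetical stand-in for gas, not a CosmWasm metric:

```rust
use std::collections::BTreeMap;

/// Toy key-value store: one key per library, and the value holds every
/// book title in that library.
struct BlobStore {
    data: BTreeMap<String, Vec<String>>,
    entries_loaded: usize, // hypothetical proxy for gas spent on reads
}

impl BlobStore {
    fn new() -> Self {
        BlobStore { data: BTreeMap::new(), entries_loaded: 0 }
    }

    fn save(&mut self, library: &str, books: Vec<String>) {
        self.data.insert(library.to_string(), books);
    }

    /// Renames a single book, but has to load and rewrite the whole value.
    fn rename_book(&mut self, library: &str, old: &str, new: &str) -> bool {
        let mut books = match self.data.get(library) {
            Some(b) => b.clone(), // simulates deserializing the full record
            None => return false,
        };
        self.entries_loaded += books.len();
        let found = match books.iter_mut().find(|t| t.as_str() == old) {
            Some(t) => {
                *t = new.to_string();
                true
            }
            None => false,
        };
        self.data.insert(library.to_string(), books); // full rewrite
        found
    }
}
```

In this model, renaming one title in a library of 1,000 books adds 1,000 to `entries_loaded`, and the figure grows linearly with the size of the library.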
Questions:
Gas Consumption and Efficiency: Pulling extensive records in scenarios with vast datasets could inflate gas costs. If we have a library with a million books, fetching that single record to reach one book means loading and deserializing all of them. In Big O terms, each search or update is O(n) in the number of books held in the record, and updating every book one at a time would approach O(n^2).
Relational Mimicry in Key-Value Store: To update attributes like the library's name or location, it's inefficient to pull records with redundant data. Perhaps we should lean on our relational instincts and try to normalize the data to increase efficiency.
Something like the following could suffice:
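One possible shape, sketched with an in-memory `BTreeMap` standing in for contract storage. The key layout (`lib/{id}` and `book/{library_id}/{book_id}`) and all function names are assumptions for illustration, not an existing CosmWasm API:

```rust
use std::collections::BTreeMap;

/// Toy key-value store with one entry per record, addressed by composite keys.
struct Store {
    data: BTreeMap<String, String>,
}

// Ids are zero-padded so that lexicographic key order matches numeric order.
fn library_key(id: u32) -> String {
    format!("lib/{:010}", id)
}

fn book_key(library_id: u32, book_id: u32) -> String {
    format!("book/{:010}/{:010}", library_id, book_id)
}

impl Store {
    fn new() -> Self {
        Store { data: BTreeMap::new() }
    }

    /// Writes only the library record; no book entry is touched.
    fn set_library(&mut self, id: u32, name: &str) {
        self.data.insert(library_key(id), name.to_string());
    }

    /// Writes exactly one book entry, however many books exist.
    fn set_book(&mut self, library_id: u32, book_id: u32, title: &str) {
        self.data.insert(book_key(library_id, book_id), title.to_string());
    }

    fn book(&self, library_id: u32, book_id: u32) -> Option<&String> {
        self.data.get(&book_key(library_id, book_id))
    }

    /// Lists a library's books with a prefix scan over the ordered keys.
    fn books_in(&self, library_id: u32) -> Vec<&String> {
        let prefix = format!("book/{:010}/", library_id);
        self.data
            .range(prefix.clone()..)
            .take_while(|(k, _)| k.starts_with(&prefix))
            .map(|(_, v)| v)
            .collect()
    }
}
```

Updating a single book or the library's name is now one read and one write, and listing a library's books is a range scan over its key prefix rather than a load of one giant record.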
Design Philosophy: Do you think the design principles applied in traditional databases can or should be directly translated into blockchain data design, or do we need to reconsider our foundational assumptions?
Scalability Concerns: How do you see the trade-offs between data normalization in relational databases and the flatter structures in key-value stores evolving as the volume of data on the blockchain grows?
Performance Metrics: Beyond gas consumption, what other performance metrics should we consider crucial when designing data structures for blockchain?
Key-value stores generally lack the transactional integrity and query capabilities of relational databases, so you may need to implement additional logic to handle those aspects if they are necessary for your application. Our next discussion will touch on the effective use of Indexes.
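As one example of that additional logic, the store itself will not enforce a foreign key, so the check has to live in application code. This is a minimal sketch with hypothetical names (`Store`, `StoreError`, `add_book`), again using in-memory maps in place of contract storage:

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum StoreError {
    UnknownLibrary(u32),
}

struct Store {
    libraries: BTreeMap<u32, String>,
    books: BTreeMap<(u32, u32), String>, // keyed by (library_id, book_id)
}

impl Store {
    fn new() -> Self {
        Store { libraries: BTreeMap::new(), books: BTreeMap::new() }
    }

    fn add_library(&mut self, id: u32, name: &str) {
        self.libraries.insert(id, name.to_string());
    }

    /// Manual referential-integrity check: refuse books for unknown libraries.
    fn add_book(&mut self, library_id: u32, book_id: u32, title: &str) -> Result<(), StoreError> {
        if !self.libraries.contains_key(&library_id) {
            return Err(StoreError::UnknownLibrary(library_id));
        }
        self.books.insert((library_id, book_id), title.to_string());
        Ok(())
    }
}
```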
References
MongoDB. (n.d.). Document Database - NoSQL. [online] Available at: https://www.mongodb.com/document-databases#:~:text=Document%20databases%20have%20the%20following [Accessed 18 Sep. 2023].