[D1]: move to rows read/written (cloudflare#10412)

* d1: move to rows read/written
* Apply suggestions from code review

Co-authored-by: Maddy <[email protected]>

---------

Co-authored-by: Maddy <[email protected]>
Laurry-gee · Aug 18, 2023 · 541a672
1 parent 1c23818
commit 541a672

Showing 5 changed files with 53 additions and 17 deletions.
diff --git a/content/d1/changelog.md b/content/d1/changelog.md
@@ -7,6 +7,25 @@ rss: file
 
 # Changelog
 
+## 2023-08-19
+
+### Row count now returned per query
+
+D1 now returns a count of `rows_written` and `rows_read` for every query executed, allowing you to assess the cost of a query for both [pricing](/d1/platform/pricing/) and [index optimization](/d1/learning/using-indexes/) purposes.
+
+The `meta` object returned in [D1's Client API](/d1/platform/client-api/) contains a total count of the rows read (`rows_read`) and rows written (`rows_written`) by that query. For example, a query that performs a full table scan (for instance, `SELECT * FROM users`) of a table with 5000 rows would return a `rows_read` value of `5000`:
+
+```json
+"meta": {
+  "duration": 0.20472300052642825,
+  "size_after": 45137920,
+  "rows_read": 5000,
+  "rows_written": 0
+}
+```
+
+Refer to [D1 pricing documentation](/d1/platform/pricing/) to understand how reads and writes are measured. D1 remains free to use during the alpha period.
+
 ## 2023-08-09
 
 ### Bind D1 from the Cloudflare dashboard

diff --git a/content/d1/learning/using-indexes.md b/content/d1/learning/using-indexes.md
@@ -137,7 +137,7 @@ Use `DROP INDEX` to remove an index. Dropped indexes cannot be restored.
 
 Take note of the following considerations when creating indexes:
 
-* Indexes are not always a free performance boost. You should create indexes only on columns that reflect your most-queried columns. Indexes themselves need to be maintained. When you write to an indexed column, the database needs to write to the table and the index.
+* Indexes are not always a free performance boost. You should create indexes only on columns that reflect your most-queried columns. Indexes themselves need to be maintained. When you write to an indexed column, the database needs to write to the table and the index. The performance benefit of an index and reduction in rows read will, in nearly all cases, offset this additional write.
 * You cannot create indexes that reference other tables or use non-deterministic functions, since the index would not be stable.
 * Indexes cannot be updated. To add or remove a column from an index, [remove](#removing-indexes) the index and then [create a new index](#create-an-index) with the new columns.
 * Indexes contribute to the overall storage required by your database: an index is effectively a table itself.
diff --git a/content/d1/platform/client-api.md b/content/d1/platform/client-api.md
@@ -74,6 +74,8 @@ The methods `stmt.run()`, `stmt.all()` and `db.batch()` return a typed `D1Result
   success: boolean, // true if the operation was successful, false otherwise
   meta: {
     duration: number, // duration of the operation in milliseconds
+    rows_read: number, // the number of rows read (scanned) by this query
+    rows_written: number // the number of rows written by this query
   }
 }
 ```
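
As a rough illustration of how the two new `meta` fields in the hunk above might be consumed, the sketch below totals rows read and written across several query results. The `QueryResult` and `RowTotals` interfaces and the `sumRowCounts` helper are hypothetical names introduced for this example; only `success`, `duration`, `rows_read`, and `rows_written` come from the documented result shape.

```typescript
// Hypothetical, trimmed-down result shape: only the fields shown in the
// diff above are assumed to exist.
interface QueryMeta {
  duration: number; // duration of the operation in milliseconds
  rows_read: number; // rows read (scanned) by the query
  rows_written: number; // rows written by the query
}

interface QueryResult {
  success: boolean;
  meta: QueryMeta;
}

// Hypothetical accumulator type for this example.
interface RowTotals {
  rowsRead: number;
  rowsWritten: number;
}

// Sum rows read and rows written over a list of results, for example
// results collected from several statements in one request.
function sumRowCounts(results: QueryResult[]): RowTotals {
  return results.reduce<RowTotals>(
    (totals, result) => ({
      rowsRead: totals.rowsRead + result.meta.rows_read,
      rowsWritten: totals.rowsWritten + result.meta.rows_written,
    }),
    { rowsRead: 0, rowsWritten: 0 },
  );
}
```

Logging these totals per request is one way to spot queries that scan far more rows than they return, which is the situation an index is meant to address.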

diff --git a/content/d1/platform/pricing.md b/content/d1/platform/pricing.md
@@ -9,10 +9,11 @@ title: Pricing
 While in public Alpha, D1 is currently free to use on all [Workers plans](/workers/platform/pricing/#workers). Refer to the [our recent announcement](https://blog.cloudflare.com/d1-turning-it-up-to-11/)) for more information.
 {{</Aside>}}
 
-D1's billing is based on:
+D1 bills based on:
 
-* *What you use*: queries you issue against D1 will consume read units and/or write units depending on the volume of data read (scanned) or written.
-* *Scale-to-zero*: You are not billed for "hours" or "capacity units": if you are not issuing queries against your database, you are only billed for storage above the included limits of your plan when your database is not in use.
+* **Usage**: Queries you issue against D1 will count as rows read, rows written, or both (for transactions or batches).
+* **Scale-to-zero**: You are not billed for "hours" or "capacity units". If you are not issuing queries against your database, you are not billed for compute.
+* **Storage**: You are only billed for storage above the included [limits](/d1/platform/limits/) of your plan.
 
 ## Billing metrics 
 
@@ -50,7 +51,18 @@ For [Workers Paid tier](/workers/platform/pricing/#workers) users, we intend to
 
 * How can I estimate my (eventual) bill?
 
-We'll be adding analytics for read units, write units and storage at both the account level and per-database, so you can both track overall usage and assess which database(s) are contributing to your usage ahead of enabling any billing.
+Every query returns a `meta` object that contains a total count of the rows read (`rows_read`) and rows written (`rows_written`) by that query. For example, a query that performs a full table scan (for instance, `SELECT * FROM users`) from a table with 5000 rows would return a `rows_read` value of `5000`:
+
+```json
+"meta": {
+  "duration": 0.20472300052642825,
+  "size_after": 45137920,
+  "rows_read": 5000,
+  "rows_written": 0
+}
+```
+
+These counts are also available in the D1 [Cloudflare dashboard](https://dash.cloudflare.com) and the [analytics API](/d1/platform/metrics-analytics/), allowing you to attribute read and write volumes to specific databases, time periods, or both.
 
 * Does D1 charge for data transfer / egress?
 
@@ -64,9 +76,9 @@ D1 itself does not charge for additional compute. Workers querying D1 and comput
 
 Yes: any queries you issue against your database, including `INSERT`ing existing data into a new database, table scans (`SELECT * FROM table`), or creating indexes count as either reads or writes.
 
-* Can I use an index to reduce the number of read units consumed?
+* Can I use an index to reduce the number of rows read by a query?
 
-Yes! [Creating indexes](/d1/learning/using-indexes/) for your most queried tables and filtered columns can reduce how much data is scanned and improve query performance at the same time. If you have a read-heavy workload (most common), this can be particularly advantageous. Note that writing to columns referenced in an index will add at least one (1) additional write unit to account for updating the index, but this is typically offset by the reduction in read units consumed due to the benefits of an index.
+Yes, you can use an index to reduce the number of rows read by a query. [Creating indexes](/d1/learning/using-indexes/) for your most queried tables and filtered columns reduces how much data is scanned and improves query performance at the same time. If you have a read-heavy workload (most common), this can be particularly advantageous. Writing to columns referenced in an index will add at least one (1) additional row written to account for updating the index, but this is typically offset by the reduction in rows read due to the benefits of an index.
 
 * Does a freshly created database, and/or an empty table with no rows, contribute to my storage?
 

diff --git a/content/workers/_partials/_d1-pricing.md b/content/workers/_partials/_d1-pricing.md
@@ -6,20 +6,23 @@ _build:
 ---
 
 {{<Aside type="note">}}
-The alpha [currently limits](/d1/platform/limits/) maximum database size to 100 MB and allows a total of 10 databases across all [Workers plans](/workers/platform/pricing/#workers). Pricing below is not yet final.
+The alpha [currently limits](/d1/platform/limits/) maximum database size to 500 MB and allows a total of 10 databases across all [Workers plans](/workers/platform/pricing/#workers). Pricing below is not yet final.
 {{</Aside>}}
 
 |                                 | [Workers Free](/workers/platform/pricing/#workers) | [Workers Paid](/workers/platform/pricing/#workers)                 |
 | ------------------------------- | -------------------------------------------------- | ------------------------------------------------------------------ |
-| Read units (per 4KB scanned)    | 5 million / day                                    | First 25 billion / month included  + $0.001 / million units |
-| Write units (per 1KB written)   | 100,000 / day                                      | First 50 million / month included + $1.00 / million units |
+| Rows read                       | 5 million / day                                    | First 25 billion / month included  + $0.001 / million rows |
+| Rows written                    | 100,000 / day                                      | First 50 million / month included + $1.00 / million rows |
 | Storage (per GB stored)         | 1GB (total)                                        | First 5GB included + $0.75 / GB-mo |
 
-Notes:
 
-1. Read units measure how much data a query reads (scans), in units of 4 KB. For example, if you have a table with 5000 rows, with each row ~200 bytes, and run a `SELECT * FROM table`, your query would scan (5000 rows * 0.2KB / 4KB read unit) 1000 KB in total, or 250 read units.
-2. Write units measure how much data was written to a D1 database, in 1KB units. An `INSERT` of a single row of 1900 bytes — a userID, name, email address and comments field, for example — would count as two (2) write units (2KB).
-3. Both read and write units are rounded up to the nearest whole unit. A query that reads 1,000 rows of approximately 90 bytes (`1000*.009 / 4`), would consume 23 read units.
-4. Storage is based on gigabytes stored per month, and is based on the sum of all databases in your account. Tables and indexes both count towards storage consumed.
-5. Free limits reset daily at 00:00 UTC. Monthly included limits reset based on your monthly subscription renewal date, which is determined by the day you first subscribed.
-6. There are no data transfer (egress) or throughput (bandwidth) charges for data accessed from D1.
+
+### Definitions
+1. Rows read measure how many rows a query reads (scans), regardless of the size of each row. For example, if you have a table with 5000 rows and run a `SELECT * FROM table` as a full table scan, this would count as 5,000 rows read. A query that filters on an [unindexed column](/d1/learning/using-indexes/) may return fewer rows to your Worker, but is still required to read (scan) more rows to determine which subset to return.
+2. Rows written measure how many rows were written to a D1 database. A query that `INSERT`s 10 rows into a `users` table would count as 10 rows written.
+3. Row size or the number of columns in a row does not impact how rows are counted. A row that is 1 KB and a row that is 100 KB both count as one row.
+4. Defining [indexes](/d1/learning/using-indexes/) on your table(s) reduces the number of rows read by a query when filtering on that indexed field. For example, if the `users` table has an index on a timestamp column `created_at`, the query `SELECT * FROM users WHERE created_at > ?1` would only need to read a subset of the table.
+5. Indexes will add an additional written row when writes include the indexed column, as there are two rows written: one to the table itself, and one to the index. The performance benefit of an index and reduction in rows read will, in nearly all cases, offset this additional write.
+6. Storage is based on gigabytes stored per month, and is based on the sum of all databases in your account. Tables and indexes both count towards storage consumed.
+7. Free limits reset daily at 00:00 UTC. Monthly included limits reset based on your monthly subscription renewal date, which is determined by the day you first subscribed.
+8. There are no data transfer (egress) or throughput (bandwidth) charges for data accessed from D1.
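
To make the Workers Paid column of the table above concrete, here is a minimal sketch that estimates monthly overage from row counts and storage. It assumes the figures in the table (which the docs note are not yet final) and a simple model where only usage above the included amounts is charged. `MonthlyUsage`, `PAID_PLAN`, and `estimateMonthlyOverageUsd` are hypothetical names for illustration, not part of any Cloudflare API.

```typescript
// Hypothetical input type: usage totals for one billing month.
interface MonthlyUsage {
  rowsRead: number;
  rowsWritten: number;
  storageGb: number;
}

// Figures copied from the Workers Paid column above; pricing is not yet final.
const PAID_PLAN = {
  includedRowsRead: 25_000_000_000, // first 25 billion rows read / month
  includedRowsWritten: 50_000_000, // first 50 million rows written / month
  includedStorageGb: 5, // first 5 GB of storage
  usdPerMillionRowsRead: 0.001,
  usdPerMillionRowsWritten: 1.0,
  usdPerGbMonth: 0.75,
};

// Assumption: only usage above the included amount is billed.
function estimateMonthlyOverageUsd(usage: MonthlyUsage): number {
  const over = (used: number, included: number): number =>
    Math.max(0, used - included);

  const readCost =
    (over(usage.rowsRead, PAID_PLAN.includedRowsRead) / 1_000_000) *
    PAID_PLAN.usdPerMillionRowsRead;
  const writeCost =
    (over(usage.rowsWritten, PAID_PLAN.includedRowsWritten) / 1_000_000) *
    PAID_PLAN.usdPerMillionRowsWritten;
  const storageCost =
    over(usage.storageGb, PAID_PLAN.includedStorageGb) * PAID_PLAN.usdPerGbMonth;

  return readCost + writeCost + storageCost;
}
```

Under these assumptions, 30 billion rows read, 60 million rows written, and 7 GB stored would come to roughly $16.50: $5 for the 5 billion extra rows read, $10 for the 10 million extra rows written, and $1.50 for the 2 GB of extra storage.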