
misc typo/format fixes
aeluce committed Dec 27, 2024
1 parent 59f07d9 commit e0d55e1
Showing 13 changed files with 62 additions and 57 deletions.
4 changes: 2 additions & 2 deletions site/docs/concepts/advanced/evolutions.md
@@ -53,12 +53,12 @@ When you attempt to publish a breaking change to a collection in the Flow web ap

Click the **Apply** button to trigger an evolution and update all necessary specification to keep your Data Flow functioning. Then, review and publish your draft.

-If you enabled [AutoDiscover](../captures.md#autodiscover) on a capture, any breaking changes that it introduces will trigger an automatic schema evolution, so long as you selected the **Breaking change re-versions collections** option(`evolveIncompatibleCollections`).
+If you enabled [AutoDiscover](../captures.md#autodiscover) on a capture, any breaking changes that it introduces will trigger an automatic schema evolution, so long as you selected the **Breaking change re-versions collections** option (`evolveIncompatibleCollections`).

## What do schema evolutions do?

The schema evolution feature is available in the Flow web app when you're editing pre-existing Flow entities.
-It notices when one of your edit would cause other components of the Data Flow to fail, alerts you, and gives you the option to automatically update the specs of these components to prevent failure.
+It notices when one of your edits would cause other components of the Data Flow to fail, alerts you, and gives you the option to automatically update the specs of these components to prevent failure.

In other words, evolutions happen in the *draft* state. Whenever you edit, you create a draft.
Evolutions add to the draft so that when it is published and updates the active data flow, operations can continue seamlessly.
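
For reference, the capture-side setting is part of the `autoDiscover` stanza. A minimal sketch, assuming a hypothetical capture name and connector image:

```yaml
captures:
  acmeCo/example/source-postgres:
    autoDiscover:
      # The "Breaking change re-versions collections" option in the web app.
      evolveIncompatibleCollections: true
    endpoint:
      connector:
        image: ghcr.io/estuary/source-postgres:dev
        config: config.yaml
    bindings: []
```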
2 changes: 1 addition & 1 deletion site/docs/concepts/collections.md
@@ -332,7 +332,7 @@ If desired, a derivation could re-key the collection
on `[/userId, /name]` to materialize the various `/name`s seen for a `/userId`.

This property makes keys less lossy than they might otherwise appear,
-and it is generally good practice to chose a key that reflects how
+and it is generally good practice to choose a key that reflects how
you wish to _query_ a collection, rather than an exhaustive key
that's certain to be unique for every document.
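
As a sketch of this practice, here is a collection keyed for per-user queries (names and schema file are illustrative); a derivation could then re-key the same documents on `[/userId, /name]` as described above:

```yaml
collections:
  acmeCo/example/users:
    schema: users.schema.yaml
    # Key on how you wish to query: one document per user.
    key: [/userId]
```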

2 changes: 1 addition & 1 deletion site/docs/concepts/connectors.md
@@ -219,7 +219,7 @@ sops:
```
You then use this `config.yaml` within your Flow specification.
-The Flow runtime knows that this document is protected by `sops`
+The Flow runtime knows that this document is protected by `sops`,
will continue to store it in its protected form,
and will attempt a decryption only when invoking a connector on your behalf.
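
For illustration, a `sops`-protected `config.yaml` looks roughly like this, with encrypted values abbreviated and the metadata block trimmed:

```yaml
host: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]
password: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]
sops:
  version: 3.8.1
  lastmodified: "2024-12-27T00:00:00Z"
  mac: ENC[AES256_GCM,data:...,type:str]
  # ...key-provider entries (kms, gcp_kms, age, etc.)...
```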

7 changes: 4 additions & 3 deletions site/docs/concepts/derivations.md
@@ -218,8 +218,8 @@ into JSON arrays or objects and embeds them into the mapped document:
`{"greeting": "hello", "items": [1, "two", 3]}`.
If parsing fails, the raw string is used instead.

-If you would like to select all columns of the input collection,
-rather than `select *`, use `select JSON($flow_document)`, e.g.
+If you would like to select all columns of the input collection,
+rather than `select *`, use `select JSON($flow_document)`, e.g.
`select JSON($flow_document) where $status = open;`.

As a special case, if your query selects a _single_ column
@@ -608,6 +608,7 @@ Flow read delays are very efficient and scale better
than managing very large numbers of fine-grain timers.

[See Grouped Windows of Transfers for an example using a read delay](#grouped-windows-of-transfers)

[Learn more from the Citi Bike "idle bikes" example](https://github.com/estuary/flow/blob/master/examples/citi-bike/idle-bikes.flow.yaml)
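
A read delay is declared on the transform that reads the source collection. A minimal SQLite sketch, assuming hypothetical collection names, schema file, and delay value:

```yaml
collections:
  acmeCo/example/delayed-events:
    schema: events.schema.yaml
    key: [/id]
    derive:
      using:
        sqlite: {}
      transforms:
        - name: delayedEvents
          source: acmeCo/example/events
          readDelay: "2h"   # documents are processed two hours after publication
          lambda: select $id, $value;
```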

### Read priority
@@ -639,7 +640,7 @@ For SQLite derivations,
the entire SQLite database is the internal state of the task.
TypeScript derivations can use in-memory states with a
recovery and checkpoint mechanism.
-Estuary intends to offer an additional mechanisms for
+Estuary intends to offer additional mechanisms for
automatic internal state snapshot and recovery in the future.

The exact nature of internal task states varies,
2 changes: 1 addition & 1 deletion site/docs/concepts/import.md
@@ -3,7 +3,7 @@ sidebar_position: 7
---
# Imports

-When you work on a draft Data Flow [using `flowctl draft`](../concepts/flowctl.md#working-with-drafts),
+When you work on a draft Data Flow [using `flowctl draft`](../guides/flowctl/edit-draft-from-webapp.md),
your Flow specifications may be spread across multiple files.
For example, you may have multiple **materializations** that read from collections defined in separate files,
or you could store a **derivation** separately from its **tests**.
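
The top-level file stitches the others together with an `import` section. A minimal sketch with illustrative paths:

```yaml
# flow.yaml
import:
  - marketing/materializations.flow.yaml
  - sales/derivations.flow.yaml
```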
31 changes: 17 additions & 14 deletions site/docs/concepts/materialization.md
@@ -26,7 +26,7 @@ You define and configure materializations in **Flow specifications**.
Materializations use real-time [connectors](./connectors.md) to connect to many endpoint types.

When you use a materialization connector in the Flow web app,
-flow helps you configure it through the **discovery** workflow.
+Flow helps you configure it through the **discovery** workflow.

To begin discovery, you tell Flow the connector you'd like to use, basic information about the endpoint,
and the collection(s) you'd like to materialize there.
@@ -67,7 +67,7 @@ materializations:
# Name of the collection to be read.
# Required.
name: acmeCo/example/collection
-# Lower bound date-time for documents which should be processed.
+# Lower bound date-time for documents which should be processed.
# Source collection documents published before this date-time are filtered.
# `notBefore` is *only* a filter. Updating its value will not cause Flow
# to re-process documents that have already been read.
@@ -93,11 +93,11 @@ materializations:
# Priority applied to documents processed by this binding.
# When all bindings are of equal priority, documents are processed
# in order of their associated publishing time.
-#
+#
# However, when one binding has a higher priority than others,
# then *all* ready documents are processed through the binding
# before *any* documents of other bindings are processed.
-#
+#
# Optional. Default: 0, integer >= 0
priority: 0

@@ -362,24 +362,27 @@ field implemented. Consult the individual connector documentation for details.
### How It Works

1. **Source Capture Level:**
-  - If the source capture provides a schema or namespace, it will be used as the default schema for all bindings in
-  - the materialization.

+  If the source capture provides a schema or namespace, it will be used as the default schema for all bindings in the materialization.

2. **Manual Overrides:**
-  - You can still manually configure schema names for each binding, overriding the default schema if needed.

+  You can still manually configure schema names for each binding, overriding the default schema if needed.

3. **Materialization-Level Configuration:**
-  - The default schema name can be set at the materialization level, ensuring that all new captures within that
-  - materialization automatically inherit the default schema name.

+  The default schema name can be set at the materialization level, ensuring that all new captures within that materialization automatically inherit the default schema name.

### Configuration Steps

1. **Set Default Schema at Source Capture Level:**
-  - When defining your source capture, specify the schema or namespace. If no schema is provided, Estuary Flow will
-  - automatically assign a default schema.

+  When defining your source capture, specify the schema or namespace. If no schema is provided, Estuary Flow will automatically assign a default schema.

2. **Override Schema at Binding Level:**
-  - For any binding, you can manually override the default schema by specifying a different schema name.

+  For any binding, you can manually override the default schema by specifying a different schema name.

3. **Set Default Schema at Materialization Level:**
-  - During the materialization configuration, set a default schema name for all captures within the materialization.

+  During the materialization configuration, set a default schema name for all captures within the materialization.
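
Concretely, these settings live in the connector's endpoint and resource configuration. A sketch for a hypothetical warehouse materialization; the exact field names vary by connector, so treat this as illustrative and consult the connector's documentation:

```yaml
materializations:
  acmeCo/example/to-warehouse:
    endpoint:
      connector:
        image: ghcr.io/estuary/materialize-snowflake:dev
        config:
          # ...credentials and connection details...
          schema: PUBLIC        # assumed: default schema for all bindings
    bindings:
      - source: acmeCo/example/collection
        resource:
          table: example_table
          schema: ANALYTICS     # assumed: manual override for this binding
```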
4 changes: 2 additions & 2 deletions site/docs/concepts/schemas.md
@@ -45,7 +45,7 @@ Flow can usually generate suitable JSON schemas on your behalf.

For systems like relational databases, Flow will typically generate a complete JSON schema by introspecting the table definition.

-For systems that store unstructured data, Flow will typically generate a very minimal schema, and will rely on schema inferrence to fill in the details. See [continuous schema inferenece](#continuous-schema-inference) for more information.
+For systems that store unstructured data, Flow will typically generate a very minimal schema, and will rely on schema inference to fill in the details. See [continuous schema inference](#continuous-schema-inference) for more information.

### Translations

@@ -72,7 +72,7 @@ Schema inference is also used to provide translations into other schema flavors:
### Annotations

The JSON Schema standard introduces the concept of
-[annotations](http://json-schema.org/understanding-json-schema/reference/generic.html#annotations),
+[annotations](https://json-schema.org/understanding-json-schema/reference/annotations),
which are keywords that attach metadata to a location within a validated JSON document.
For example, `title` and `description` can be used to annotate a schema with its meaning:
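
A minimal illustration:

```yaml
type: object
properties:
  duration:
    type: integer
    title: Ride duration
    description: Elapsed time of the ride, in seconds.
```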

2 changes: 1 addition & 1 deletion site/docs/concepts/storage-mappings.md
@@ -22,7 +22,7 @@ Flow tasks — captures, derivations, and materializations — use recovery logs
Recovery logs are an opaque binary log, but may contain user data.

The recovery logs of a task are always prefixed by `recovery/`,
-so a task named `acmeCo/produce-TNT` would have a recovery log called `recovery/acmeCo/roduce-TNT`
+so a task named `acmeCo/produce-TNT` would have a recovery log called `recovery/acmeCo/produce-TNT`

Flow prunes data from recovery logs once it is no longer required.

18 changes: 9 additions & 9 deletions site/docs/guides/flowctl/edit-draft-from-webapp.md
@@ -41,13 +41,13 @@ Drafts aren't currently visible in the Flow web app, but you can get a list with

2. Run `flowctl draft list`

-flowctl outputs a table of all the drafts to which you have access, from oldest to newest.
+flowctl outputs a table of all the drafts to which you have access, from oldest to newest.

3. Use the name and timestamp to find the draft you're looking for.

-Each draft has an **ID**, and most have a name in the **Details** column. Note the **# of Specs** column.
-For drafts created in the web app, materialization drafts will always contain one specification.
-A number higher than 1 indicates a capture with its associated collections.
+Each draft has an **ID**, and most have a name in the **Details** column. Note the **# of Specs** column.
+For drafts created in the web app, materialization drafts will always contain one specification.
+A number higher than 1 indicates a capture with its associated collections.

4. Copy the draft ID.

@@ -57,10 +57,10 @@ Drafts aren't currently visible in the Flow web app, but you can get a list with

7. Browse the source files.

-The source files and their directory structure will look slightly different depending on the draft.
-Regardless, there will always be a top-level file called `flow.yaml` that *imports* all other YAML files,
-which you'll find in a subdirectory named for your catalog prefix.
-These, in turn, contain the specifications you'll want to edit.
+The source files and their directory structure will look slightly different depending on the draft.
+Regardless, there will always be a top-level file called `flow.yaml` that *imports* all other YAML files,
+which you'll find in a subdirectory named for your catalog prefix.
+These, in turn, contain the specifications you'll want to edit.

## Edit the draft and publish

Expand All @@ -76,7 +76,7 @@ Next, you'll make changes to the specification(s), test, and publish the draft.

3. When you're done, sync the local work to the global draft: `flowctl draft author --source flow.yaml`.

-Specifying the top-level `flow.yaml` file as the source ensures that all entities in the draft are imported.
+Specifying the top-level `flow.yaml` file as the source ensures that all entities in the draft are imported.

4. Publish the draft: `flowctl draft publish`
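
Taken together, the command-line loop looks roughly like this; the `select` and `develop` steps fall in the elided portion of this guide, so they're shown here as assumptions:

```shell
flowctl draft list                        # find the draft ID
flowctl draft select --id <draft-id>      # assumed: activate the draft
flowctl draft develop                     # assumed: pull its source files locally
# ...edit the specifications...
flowctl draft author --source flow.yaml   # sync local work to the global draft
flowctl draft publish                     # publish the draft
```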

12 changes: 6 additions & 6 deletions site/docs/guides/flowctl/edit-specification-locally.md
@@ -79,7 +79,7 @@ Using these names, you'll identify and pull the relevant specifications for edit

* Pull a group of specifications by prefix or type filter, for example: `flowctl catalog pull-specs --prefix myOrg/marketing --collections`

-The source files are written to your current working directory.
+The source files are written to your current working directory.

4. Browse the source files.

@@ -106,15 +106,15 @@ Next, you'll complete your edits, test that they were performed correctly, and r
3. When you're done, you can test your changes:
`flowctl catalog test --source flow.yaml`

-You'll almost always use the top-level `flow.yaml` file as the source here because it imports all other Flow specifications
-in your working directory.
+You'll almost always use the top-level `flow.yaml` file as the source here because it imports all other Flow specifications
+in your working directory.

-Once the test has passed, you can publish your specifications.
+Once the test has passed, you can publish your specifications.

4. Re-publish all the specifications you pulled: `flowctl catalog publish --source flow.yaml`

-Again you'll almost always want to use the top-level `flow.yaml` file. If you want to publish only certain specifications,
-you can provide a path to a different file.
+Again you'll almost always want to use the top-level `flow.yaml` file. If you want to publish only certain specifications,
+you can provide a path to a different file.

5. Return to the web app or use `flowctl catalog list` to check the status of the entities you just published.
Their publication time will be updated to reflect the work you just did.
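
The full edit cycle, using only commands shown in this guide:

```shell
flowctl catalog pull-specs --prefix myOrg/marketing --collections   # pull specs to edit
# ...edit the generated source files...
flowctl catalog test --source flow.yaml       # test your changes
flowctl catalog publish --source flow.yaml    # re-publish the specifications
flowctl catalog list                          # confirm the new publication time
```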
2 changes: 1 addition & 1 deletion site/docs/guides/schema-evolution.md
@@ -173,7 +173,7 @@ Regardless of whether the field is materialized or not, it must still pass schem

Database and data warehouse materializations tend to be somewhat restrictive about changing column types. They typically only allow dropping `NOT NULL` constraints. This means that you can safely change a schema to make a required field optional, or to add `null` as a possible type, and the materialization will continue to work normally. Most other types of changes will require materializing into a new table.
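
For example, this kind of relaxation is typically safe, sketched in JSON schema terms:

```yaml
# Before: `age` is required and must be an integer.
required: [id, age]
properties:
  age: {type: integer}

# After: `age` is optional and nullable, equivalent to dropping a NOT NULL
# constraint, which most connectors accept without a new table.
required: [id]
properties:
  age: {type: [integer, "null"]}
```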

-The best way to find out whether a change is acceptable to a given connector is to run test or attempt to re-publish. Failed attempts to publish won't affect any tasks that are already running.
+The best way to find out whether a change is acceptable to a given connector is to run a test or attempt to re-publish. Failed attempts to publish won't affect any tasks that are already running.

**Web app workflow**

30 changes: 15 additions & 15 deletions site/docs/guides/system-specific-dataflows/s3-to-snowflake.md
@@ -52,7 +52,7 @@ credentials provided by your Estuary account manager.

3. Find the **Amazon S3** tile and click **Capture**.

-A form appears with the properties required for an S3 capture.
+A form appears with the properties required for an S3 capture.

4. Type a name for your capture.

@@ -69,23 +69,23 @@ credentials provided by your Estuary account manager.

* **Prefix**: You might organize your S3 bucket using [prefixes](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html), which emulate a directory structure. To capture *only* from a specific prefix, add it here.

-* **Match Keys**: Filters to apply to the objects in the S3 bucket. If provided, only data whose absolute path matches the filter will be captured. For example, `*\.json` will only capture JSON file.
+* **Match Keys**: Filters to apply to the objects in the S3 bucket. If provided, only data whose absolute path matches the filter will be captured. For example, `*\.json` will only capture JSON files.

See the S3 connector documentation for information on [advanced fields](../../reference/Connectors/capture-connectors/amazon-s3.md#endpoint) and [parser settings](../../reference/Connectors/capture-connectors/amazon-s3.md#advanced-parsing-cloud-storage-data). (You're unlikely to need these for most use cases.)

6. Click **Next**.

-Flow uses the provided configuration to initiate a connection to S3.
+Flow uses the provided configuration to initiate a connection to S3.

-It generates a permissive schema and details of the Flow collection that will store the data from S3.
+It generates a permissive schema and details of the Flow collection that will store the data from S3.

-You'll have the chance to tighten up each collection's JSON schema later, when you materialize to Snowflake.
+You'll have the chance to tighten up each collection's JSON schema later, when you materialize to Snowflake.

7. Click **Save and publish**.

-You'll see a notification when the capture publishes successfully.
+You'll see a notification when the capture publishes successfully.

-The data currently in your S3 bucket has been captured, and future updates to it will be captured continuously.
+The data currently in your S3 bucket has been captured, and future updates to it will be captured continuously.

8. Click **Materialize Collections** to continue.

@@ -95,7 +95,7 @@ Next, you'll add a Snowflake materialization to connect the captured data to its

1. Locate the **Snowflake** tile and click **Materialization**.

-A form appears with the properties required for a Snowflake materialization.
+A form appears with the properties required for a Snowflake materialization.

2. Choose a unique name for your materialization like you did when naming your capture; for example, `acmeCo/mySnowflakeMaterialization`.

@@ -112,12 +112,12 @@ Next, you'll add a Snowflake materialization to connect the captured data to its

4. Click **Next**.

-Flow uses the provided configuration to initiate a connection to Snowflake.
+Flow uses the provided configuration to initiate a connection to Snowflake.

-You'll be notified if there's an error. In that case, fix the configuration form or Snowflake setup as needed and click **Next** to try again.
+You'll be notified if there's an error. In that case, fix the configuration form or Snowflake setup as needed and click **Next** to try again.

-Once the connection is successful, the Endpoint Config collapses and the **Source Collections** browser becomes prominent.
-It shows the collection you captured previously, which will be mapped to a Snowflake table.
+Once the connection is successful, the Endpoint Config collapses and the **Source Collections** browser becomes prominent.
+It shows the collection you captured previously, which will be mapped to a Snowflake table.

5. In the **Collection Selector**, optionally change the name in the **Table** field.

@@ -127,9 +127,9 @@ Next, you'll add a Snowflake materialization to connect the captured data to its

7. Apply a stricter schema to the collection for the materialization.

-S3 has a flat data structure.
-To materialize this data effectively to Snowflake, you should apply a schema that can translate to a table structure.
-Flow's **Schema Inference** tool can help.
+S3 has a flat data structure.
+To materialize this data effectively to Snowflake, you should apply a schema that can translate to a table structure.
+Flow's **Schema Inference** tool can help.

1. In the **Source Collections** browser, click the collection's **Collection** tab.

3 changes: 2 additions & 1 deletion site/docs/guides/transform_data_using_typescript.md
@@ -273,7 +273,8 @@ You can use `flowctl` to quickly verify your derivation before publishing it. Us

As you can see, the output format matches the defined schema. The last step would be to publish your derivation to Flow, which you can also do using `flowctl`.

-:::warning Publishing the derivation will initialize the transformation on the live, real-time Wikipedia stream, make sure to delete it after completing the tutorial.
+:::warning
+Publishing the derivation will initialize the transformation on the live, real-time Wikipedia stream, make sure to delete it after completing the tutorial.
:::

```shell
