Document Cosmos changes in 9.0 #4812

Open
wants to merge 1 commit into
base: main

Conversation

@roji roji commented Sep 22, 2024

This is a pretty huge PR which takes care of the Cosmos documentation for 9.0. There's tons of new content and reorganization of existing content - some reviewing from everyone would be appreciated to catch issues etc.

  • Added notes for the Cosmos breaking changes and moved them into a separate Cosmos section.
  • Added documentation for partition keys, pagination, vector search, FromSql, and lots of other stuff.
  • Completed the Cosmos what's new section, reorganized and cut down the existing content there; linked to the proper docs page for more details.
  • Moved querying docs out of index.md to their own "querying" page, merged the function mapping page into it.
  • Added links from function mappings to the Cosmos docs for each function.
  • Moved modeling docs out of index.md to their own "modeling" page.

Closes #4808
Part of #4805 (new function translations for Cosmos)

@roji roji requested a review from a team September 22, 2024 11:58
@roji roji force-pushed the Cosmos branch 3 times, most recently from 612025f to 04b50eb Compare September 22, 2024 13:43
@roji roji force-pushed the Cosmos branch 4 times, most recently from 31269c1 to e25d9bc Compare September 22, 2024 14:08
@roji roji marked this pull request as ready for review September 22, 2024 14:09
@SamMonoRT

@Pilchie - fyi

Pilchie commented Sep 22, 2024

Also tagging @jcocchi and @kirankumarkolli here.


In Azure Cosmos DB, JSON documents are stored in containers. Unlike tables in relational databases, Cosmos DB containers can contain documents with different shapes - a container does not impose a uniform schema on its documents. However, various configuration options are defined at the container level, and therefore affect all documents contained within it. See the [Cosmos DB documentation on containers](/azure/cosmos-db/resource-model) for more information.

By default, EF maps all entity types to the same container; this is usually a good default in terms of performance and pricing. The default container is named after the your .NET context type (`OrderContext` in this case). To change the default container name, use <xref:Microsoft.EntityFrameworkCore.CosmosModelBuilderExtensions.HasDefaultContainer%2A>:

Suggested change
By default, EF maps all entity types to the same container; this is usually a good default in terms of performance and pricing. The default container is named after the your .NET context type (`OrderContext` in this case). To change the default container name, use <xref:Microsoft.EntityFrameworkCore.CosmosModelBuilderExtensions.HasDefaultContainer%2A>:
By default, EF maps all entity types to the same container; this is usually a good default in terms of performance and pricing. The default container is named after the .NET context type (`OrderContext` in this case). To change the default container name, use <xref:Microsoft.EntityFrameworkCore.CosmosModelBuilderExtensions.HasDefaultContainer%2A>:
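
For anyone following the review, here is a minimal sketch of the configuration this paragraph leads into. The `Order` type and the container names `Store` and `Orders` are hypothetical, and provider configuration (`UseCosmos`) is omitted:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity type, for illustration only.
public class Order
{
    public int Id { get; set; }
    public string? PartitionKey { get; set; }
}

public class OrderContext : DbContext
{
    public DbSet<Order> Orders => Set<Order>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Without this call, the default container would be named "OrderContext".
        modelBuilder.HasDefaultContainer("Store");

        // Optionally map a single entity type to a container of its own.
        modelBuilder.Entity<Order>().ToContainer("Orders");
    }
}
```

Entity types without an explicit `ToContainer` call keep using the default container.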


Developers coming to Cosmos DB from other database sometimes expect the key (`Id`) property to be generated automatically. For example, on SQL Server, EF configures numeric key properties to be IDENTITY columns, where auto-incrementing values are generated in the database. In contrast, Cosmos DB does not support automatic generation of properties, and so key properties must be explicitly set. Inserting an entity type with an unset key property will simply insert the CLR default value for that property (e.g. 0 for `int`), and a second insert will fail; EF issues a warning if you attempt to do this.

If you'd like to have a GUID as your key property, you can configure EF to generate random values at the client:

Suggested change
If you'd like to have a GUID as your key property, you can configure EF to generate random values at the client:
If you'd like to have a GUID as your key property, you can configure EF to generate unique values at the client:
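
A minimal sketch of what that client-side generation could look like, assuming a hypothetical `Session` entity with a `Guid` key; provider configuration is omitted:

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.ValueGeneration;

// Hypothetical entity type, for illustration only.
public class Session
{
    public Guid Id { get; set; }
    public string? Username { get; set; }
}

public class SessionContext : DbContext
{
    public DbSet<Session> Sessions => Set<Session>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Ask EF to generate a GUID on the client when Id is left unset on insert.
        modelBuilder.Entity<Session>()
            .Property(s => s.Id)
            .HasValueGenerator<GuidValueGenerator>();
    }
}
```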


This is similar, but allows EF to use efficient [point reads](xref:core/providers/cosmos/querying#point-reads) in more scenarios. If you need to insert a discriminator into the `id` property, consider inserting the root discriminator for better performance.

## Partition keys

Move these two sections up and mention that if the primary key is discovered by convention, then the partition key properties will be added to it


## Time-to-live

Entity types in the Azure Cosmos DB model can now be configured with a default time-to-live. For example:

Suggested change
Entity types in the Azure Cosmos DB model can now be configured with a default time-to-live. For example:
Entity types in the Azure Cosmos DB model can be configured with a default time-to-live. For example:
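
As a sketch of the configuration being described, assuming a hypothetical `Session` entity and a one-hour lifetime chosen purely for illustration:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity type, for illustration only.
public class Session
{
    public string Id { get; set; } = null!;
    public string? Username { get; set; }
}

public class SessionContext : DbContext
{
    public DbSet<Session> Sessions => Set<Session>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // The argument is a number of seconds; 3600 is an arbitrary example value.
        modelBuilder.Entity<Session>()
            .HasDefaultTimeToLive(3600);
    }
}
```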

-->
[!code-csharp[BookEntity](../../../../samples/core/Miscellaneous/NewInEFCore6.Cosmos/CosmosPrimitiveTypesSample.cs?name=BookEntity)]

Both the list and the dictionary can be populated and inserted into the database in the normal way:

Prevent the names from being localized:

Suggested change
Both the list and the dictionary can be populated and inserted into the database in the normal way:
The `IList` and the `IDictionary` can be populated and persisted to the database:

@@ -70,374 +73,10 @@ The Azure Cosmos DB provider for EF Core has multiple overloads of the [UseCosmo
| Account endpoint and token | `UseCosmos<DbContext>(accountEndpoint, tokenCredential, databaseName)` | [Resource tokens](/azure/cosmos-db/secure-access-to-data#primary-keys) |

In the IMPORTANT above explicitly call out that RBAC is the recommended mechanism
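
For context, a sketch of the token-credential overload from the table row above, which is the overload an RBAC setup would use. The endpoint and database name are placeholders, and `DefaultAzureCredential` (from the `Azure.Identity` package) is only one possible credential type, assumed here for illustration:

```csharp
using Azure.Identity;
using Microsoft.EntityFrameworkCore;

public class OrderContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Placeholder endpoint and database name; substitute real values.
        optionsBuilder.UseCosmos(
            accountEndpoint: "https://<your-account>.documents.azure.com:443/",
            tokenCredential: new DefaultAzureCredential(),
            databaseName: "OrdersDB");
    }
}
```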

@@ -17,6 +17,8 @@ Common EF Core patterns that either do not apply, or are a pit-of-failure, when
- Loading graphs of related entities from different documents is not supported. Document databases are not designed to perform joins across many documents; doing so would be very inefficient. Instead, it is more common to denormalize data so that everything needed is in one, or a small number, of documents. However, there are some forms of cross-document relationships that could be handled--see [Limited Include support for Cosmos](https://github.com/dotnet/efcore/issues/16920#issuecomment-989721078).

Update date above


In these logs, we notice the following:

* The first two comparisons - on `TenantId` and `UserId` - have been lifted out, and appear in the ReadNext's "Partition" rather than in the `WHERE` clause; this means that query will only execute on the subpartitions for those values.

Suggested change
* The first two comparisons - on `TenantId` and `UserId` - have been lifted out, and appear in the ReadNext's "Partition" rather than in the `WHERE` clause; this means that query will only execute on the subpartitions for those values.
* The first two comparisons - on `TenantId` and `UserId` - have been lifted out and appear in the `ReadNext` "Partition" rather than in the `WHERE` clause; this means that query will only execute on the subpartitions for those values.

* `SessionId` is also part of the hierarchical partition key, but instead of an equality comparison, it uses a greater-than operator (`>`), and therefore cannot be lifted out. It is part of the `WHERE` clause like any regular property.
* `Username` is a regular property - not part of the partition key - and therefore remains in the `WHERE` clause as well.

Note that even though not all three partition key properties are provided, hierarchical partition keys still allow targeting only the subpartitions which correspond to the first two properties. While this isn't as efficient as targeting a single partition (as identified by all three properties), it's still much more efficient than targeting all partitions.

Suggested change
Note that even though not all three partition key properties are provided, hierarchical partition keys still allow targeting only the subpartitions which correspond to the first two properties. While this isn't as efficient as targeting a single partition (as identified by all three properties), it's still much more efficient than targeting all partitions.
Note that even though some of the partition key values are not provided, hierarchical partition keys still allow targeting only the subpartitions which correspond to the first two properties. While this isn't as efficient as targeting a single partition (as identified by all three properties), it's still much more efficient than targeting all partitions.
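
The kind of model and query these bullets discuss can be sketched roughly as follows. The `UserSession` type, its property types, and the `SessionQueries` helper are assumptions for illustration; the actual sample in the docs may differ:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity type; property types are assumed.
public class UserSession
{
    public string Id { get; set; } = null!;
    public string TenantId { get; set; } = null!;
    public string UserId { get; set; } = null!;
    public int SessionId { get; set; }
    public string Username { get; set; } = null!;
}

public class SessionContext : DbContext
{
    public DbSet<UserSession> Sessions => Set<UserSession>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
        // Three-level hierarchical partition key: TenantId > UserId > SessionId.
        => modelBuilder.Entity<UserSession>()
            .HasPartitionKey(s => new { s.TenantId, s.UserId, s.SessionId });
}

public static class SessionQueries
{
    public static Task<List<UserSession>> GetSessionsAsync(
        SessionContext context, string tenantId, string userId)
        => context.Sessions
            .Where(s => s.TenantId == tenantId   // equality on the first key level
                && s.UserId == userId            // equality on the second key level
                && s.SessionId > 0               // range comparison, stays in WHERE
                && s.Username.Contains("a"))     // regular property, stays in WHERE
            .ToListAsync();
}
```

Only the leading equality comparisons can narrow the query to a subpartition; the range comparison on `SessionId` is evaluated as an ordinary filter.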

## Pagination

> [!NOTE]
> This feature was introduced in EF Core 9.0.

@AndriySvyryd AndriySvyryd Sep 24, 2024

Should mention here that it's currently experimental as we want to hear feedback before we finalize the API shape

.OrderBy(s => s.Id)
.ToPageAsync(pageSize: 10, continuationToken: null);

string continuationToken = page.ContinuationToken;

Suggested change
string continuationToken = page.ContinuationToken;
string continuationToken = firstPage.ContinuationToken;

}
```

We execute the same query, but this time we pass in the continuation token received from the first execution; this instructs Cosmos DB to continue the query where it left off, and fetch the next 10 items. This method of paginating is extremely efficient and cost-effective compared to using `Skip` and `Take`.

Consider mentioning that continuationToken will be null if there are no more results
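
To illustrate the point about the token being null at the end, here is a sketch of a loop that reads pages until no continuation token comes back. The `Session` type and the `PagingExample` helper are hypothetical, and the `EF9102` suppression assumes that is the diagnostic ID the experimental paging API raises:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity type and context, for illustration only.
public class Session
{
    public Guid Id { get; set; }
    public string? Username { get; set; }
}

public class SessionContext : DbContext
{
    public DbSet<Session> Sessions => Set<Session>();
}

public static class PagingExample
{
#pragma warning disable EF9102 // assumed ID of the experimental paging diagnostic
    public static async Task<List<Session>> ReadAllPagesAsync(SessionContext context)
    {
        var results = new List<Session>();
        string? continuationToken = null; // null requests the first page

        do
        {
            var page = await context.Sessions
                .OrderBy(s => s.Id)
                .ToPageAsync(pageSize: 10, continuationToken: continuationToken);

            results.AddRange(page.Values);

            // A null token here means there are no more results.
            continuationToken = page.ContinuationToken;
        }
        while (continuationToken is not null);

        return results;
    }
#pragma warning restore EF9102
}
```

In a web application the token would more typically be handed back to the client and passed in with the next request, rather than looped over in one call.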

To learn more about pagination in Cosmos DB, [see this page](/azure/cosmos-db/nosql/query/pagination).

> [!NOTE]
> Cosmos DB does not support backwards pagination.

@AndriySvyryd AndriySvyryd Sep 24, 2024

Consider just making this part of a limitations list.

Another limitation is that it doesn't tell you how many pages are available.


## `FindAsync`

[`FindAsync`](xref:core/change-tracking/entity-entries#find-and-findasync) is a useful API for getting a an entity by its primary key, and avoiding a database roundtrip when the entity has already been loaded and is tracked by the context.

Suggested change
[`FindAsync`](xref:core/change-tracking/entity-entries#find-and-findasync) is a useful API for getting a an entity by its primary key, and avoiding a database roundtrip when the entity has already been loaded and is tracked by the context.
[`FindAsync`](xref:core/change-tracking/entity-entries#find-and-findasync) is a useful API for getting an entity by its primary key and avoiding a database roundtrip when the entity has already been loaded and is tracked by the context.


[`FindAsync`](xref:core/change-tracking/entity-entries#find-and-findasync) is a useful API for getting a an entity by its primary key, and avoiding a database roundtrip when the entity has already been loaded and is tracked by the context.

Usually, the primary key of an entity type consists e.g. of an `Id` property. When using the EF Cosmos DB provider, the primary key also contains the partition key properties; this is the case since Cosmos DB allows different partitions to contain documents with the same JSON `id` property, and so only the combined `id` and partition key uniquely identify a single document in a container:

Suggested change
Usually, the primary key of an entity type consists e.g. of an `Id` property. When using the EF Cosmos DB provider, the primary key also contains the partition key properties; this is the case since Cosmos DB allows different partitions to contain documents with the same JSON `id` property, and so only the combined `id` and partition key uniquely identify a single document in a container:
When using the EF Cosmos DB provider, the primary key contains the partition key properties in addition to the property mapped to the JSON `id` property; this is the case since Cosmos DB allows different partitions to contain documents with the same JSON `id` property, and so only the combined `id` and partition key uniquely identify a single document in a container:
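
A sketch of what this means for callers, assuming a hypothetical `Session` entity partitioned by `TenantId` and a primary key ordered as `Id` followed by the partition key property:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity type, for illustration only.
public class Session
{
    public Guid Id { get; set; }
    public string TenantId { get; set; } = null!;
}

public class SessionContext : DbContext
{
    public DbSet<Session> Sessions => Set<Session>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<Session>().HasPartitionKey(s => s.TenantId);
}

public static class FindExample
{
    public static async Task<Session?> LoadAsync(
        SessionContext context, Guid id, string tenantId)
        // Both values are needed: the id alone does not identify a document.
        // The order (Id, then partition key) is an assumption about the key layout.
        => await context.Sessions.FindAsync(id, tenantId);
}
```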

> [!WARNING]
> Azure Cosmos DB vector search is currently in preview. As a result, using EF's vector search APIs will generate an "experimental API" warning (`EF9103`) which must be suppressed. The APIs and capabilities may change in breaking ways in the future.
Azure Cosmos DB now offers preview support for vector similarity search. Vector search is a fundamental part of some application types, include AI, semantic search and others. The Cosmos DB support for vector search allows storing your data and vectors, and performing your queries in a single database, which can considerably simplify your architecture and remove the need for an additional, dedicated vector database solution in your stack. To learn more about Cosmos DB vector search, [see the documentation](/azure/cosmos-db/nosql/vector-search).

Suggested change
Azure Cosmos DB now offers preview support for vector similarity search. Vector search is a fundamental part of some application types, include AI, semantic search and others. The Cosmos DB support for vector search allows storing your data and vectors, and performing your queries in a single database, which can considerably simplify your architecture and remove the need for an additional, dedicated vector database solution in your stack. To learn more about Cosmos DB vector search, [see the documentation](/azure/cosmos-db/nosql/vector-search).
Azure Cosmos DB now offers preview support for vector similarity search. Vector search is a fundamental part of some application types, include AI, semantic search and others. The Cosmos DB support for vector search allows storing data and vectors, and performing queries in a single database, which can considerably simplify the architecture and remove the need for an additional, dedicated vector database solution. To learn more about Cosmos DB vector search, [see the documentation](/azure/cosmos-db/nosql/vector-search).
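
For orientation, a rough sketch of how the EF 9.0 preview APIs for vector search might be used. The `Blog` type, the 1536 dimension count, and the index type are illustrative assumptions, and since the feature is in preview these API names may well change:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos; // DistanceFunction, VectorIndexType
using Microsoft.EntityFrameworkCore;

#pragma warning disable EF9103 // experimental vector search APIs

// Hypothetical entity type, for illustration only.
public class Blog
{
    public string Id { get; set; } = null!;
    public float[] Vector { get; set; } = null!;
}

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<Blog>(b =>
        {
            // Dimension count and distance function are example values.
            b.Property(x => x.Vector).IsVector(DistanceFunction.Cosine, dimensions: 1536);
            b.HasIndex(x => x.Vector).ForVectors(VectorIndexType.Flat);
        });
}

public static class VectorSearchExample
{
    public static Task<List<Blog>> FindClosestAsync(BlogContext context, float[] inputVector)
        => context.Blogs
            // Order by distance to the input vector and keep the five nearest.
            .OrderBy(b => EF.Functions.VectorDistance(b.Vector, inputVector))
            .Take(5)
            .ToListAsync();
}
```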

@@ -7,6 +7,9 @@ uid: core/providers/cosmos/index
---
# EF Core Azure Cosmos DB Provider

> [!WARNING]
> Extensive work has gone into the Cosmos DB provider in 9.0. In order to improve the provider, a number of high-impact breaking changes had to be made; if you are upgrading an existing application, please read the [breaking changes section](xref:core/what-is-new/ef-core-9.0/breaking-changes#cosmos-breaking-changes) carefully.

This database provider allows Entity Framework Core to be used with Azure Cosmos DB. The provider is maintained as part of the [Entity Framework Core Project](https://github.com/dotnet/efcore).

It is strongly recommended to familiarize yourself with the [Azure Cosmos DB documentation](/azure/cosmos-db/introduction) before reading this section.

Add a note below, for EnsureCreatedAsync:

Azure Cosmos DB SDK does not support RBAC for management plane operations in Azure Cosmos DB. Use Azure Management API instead of EnsureCreatedAsync with RBAC.

Successfully merging this pull request may close these issues.

Document IncludeRootDiscriminatorInJsonId breaking change
4 participants