Query Design and Tips

High level overview of query

For the best performance, always run the application on Windows x64 so that the native ServiceInterop can be used.
- Visual Studio defaults new projects to Any CPU, and an Any CPU project can easily switch to x86. It is recommended to set the project to x64 to prevent it from switching to x86.
Use FeedIterator.HasMoreResults to loop over the results and drain the entire query.
The SDK does not support a minimum item count.
- Code should handle any page size from 0 to the max item count.
- The number of items in a page can and will change without notice.
Empty pages are expected for queries, and can appear at any time.
- Empty pages are exposed to the user because they provide more opportunities to cancel the query and make it clear that the query is performing multiple network calls.
- An empty page can show up in an existing workload after a physical partition is split in Cosmos DB. The first partition now has 0 results, which causes the empty page.
- An empty page can also be the result of the backend preempting the query. This means the query is taking more than some fixed amount of time on the backend to retrieve documents, so the backend preempts it and returns a continuation that lets the query keep making progress.
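
The drain loop described above can be sketched as follows. This is a minimal illustration, not the SDK's prescribed pattern: the class and method names (QueryDrain, DrainAsync) are hypothetical, and the caller is assumed to supply an already initialized Container. The point is that the loop is driven only by HasMoreResults, so a page with 0 items is handled like any other page.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class QueryDrain // hypothetical helper, not part of the SDK
{
    // Reads pages until the iterator reports no more results.
    public static async Task<List<T>> DrainAsync<T>(
        Container container,
        QueryDefinition query,
        CancellationToken cancellationToken = default)
    {
        var results = new List<T>();

        using (FeedIterator<T> iterator = container.GetItemQueryIterator<T>(query))
        {
            // Loop on HasMoreResults, never on the size of the last page.
            while (iterator.HasMoreResults)
            {
                // Each page boundary is also a chance to observe cancellation.
                FeedResponse<T> page = await iterator.ReadNextAsync(cancellationToken);

                // page.Count can be anything from 0 to the max item count;
                // adding an empty page is a no-op.
                results.AddRange(page);
            }
        }

        return results;
    }
}
```

Stopping as soon as a page comes back empty would end the query early, which is why the sketch never inspects page.Count to decide whether to continue.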

ServiceInterop.dll is included in the SDK NuGet package.
It provides the ability to parse and optimize the query locally.
Local query parsing avoids a network call, which reduces latency.
It is only supported on Windows x64. All other platforms go to the gateway to get the optimized query plan.
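
One way to catch an accidental x86 or non-Windows deployment is a startup check of the platform preconditions named above. This is a hedged sketch: the class and method names are hypothetical, and the check only confirms that the process is a Windows x64 process. It does not confirm that the SDK actually loaded ServiceInterop.dll.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ServiceInteropPreconditions // hypothetical helper
{
    // True when the current process runs on Windows as an x64 process,
    // which is the only platform the text lists as supported.
    public static bool IsWindowsX64Process()
    {
        return RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            && RuntimeInformation.ProcessArchitecture == Architecture.X64;
    }

    // Logs a warning at startup instead of failing, since queries still
    // work through the gateway query plan on other platforms.
    public static void WarnIfUnsupported()
    {
        if (!IsWindowsX64Process())
        {
            Console.WriteLine(
                "Not a Windows x64 process: query plans will be fetched from the gateway.");
        }
    }
}
```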

It is strongly recommended to read about how Cosmos DB scales horizontally.

A query scoped to a single partition key is not impacted by container scaling.
A query can be scoped by restricting the partition key in the SQL query, for example with a WHERE clause, or by passing the partition key through the request options.
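
Both ways of restricting a query to one partition key can be sketched as below. The names here are assumptions for illustration only: the helper class, the method names, and the partition key path /tenantId are hypothetical and should be replaced with the container's real partition key path.

```csharp
using Microsoft.Azure.Cosmos;

public static class SinglePartitionQueries // hypothetical helper
{
    // Option 1: pass the partition key through the request options.
    public static FeedIterator<T> ByRequestOptions<T>(
        Container container, string tenantId)
    {
        var options = new QueryRequestOptions
        {
            PartitionKey = new PartitionKey(tenantId),
        };

        return container.GetItemQueryIterator<T>(
            new QueryDefinition("SELECT * FROM c"),
            requestOptions: options);
    }

    // Option 2: restrict the partition key in the SQL text with a WHERE clause.
    // Assumes the container is partitioned on /tenantId.
    public static FeedIterator<T> ByWhereClause<T>(
        Container container, string tenantId)
    {
        QueryDefinition query =
            new QueryDefinition("SELECT * FROM c WHERE c.tenantId = @tenantId")
                .WithParameter("@tenantId", tenantId);

        return container.GetItemQueryIterator<T>(query);
    }
}
```

Either iterator is still drained with a HasMoreResults loop, and the caller is responsible for disposing it.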

For a cross-partition query, the SDK visits every physical partition in Cosmos DB and runs the query against that physical partition.
- Depending on the query, the SDK may have to aggregate results locally. For example, a distinct query gets a distinct list from each partition, but the SDK has to remove duplicates between partitions.
Each ReadNextAsync() call represents a single page request from Cosmos DB.
The SDK can buffer items to reduce latency. The user can configure the number of items that get buffered and the concurrency of the requests.
- Increasing the concurrency or the buffered item count can significantly impact the load on the machine and the RU cost. Too much load on a machine can cause additional latency and transport exceptions.
A cross-partition query is impacted by container scaling. The larger the container, the more physical partitions it contains.
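
The buffering and concurrency settings mentioned above are exposed on QueryRequestOptions. The sketch below shows one way to set them and to track the RU cost per page. The helper name and the specific values (100, 4, 400) are illustrative assumptions, not recommendations; suitable values depend on the workload and the machine.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class CrossPartitionQueries // hypothetical helper
{
    public static async Task<(List<T> Items, double RequestCharge)> QueryAsync<T>(
        Container container,
        QueryDefinition query,
        CancellationToken cancellationToken = default)
    {
        var options = new QueryRequestOptions
        {
            MaxItemCount = 100,         // upper bound on items per page (illustrative)
            MaxConcurrency = 4,         // concurrent requests (illustrative)
            MaxBufferedItemCount = 400, // items the SDK may buffer client side (illustrative)
        };

        var items = new List<T>();
        double requestCharge = 0;

        using (FeedIterator<T> iterator =
            container.GetItemQueryIterator<T>(query, requestOptions: options))
        {
            while (iterator.HasMoreResults)
            {
                FeedResponse<T> page = await iterator.ReadNextAsync(cancellationToken);

                // Sum the charge of every page, including empty ones.
                requestCharge += page.RequestCharge;
                items.AddRange(page);
            }
        }

        return (items, requestCharge);
    }
}
```

Recording the total request charge alongside the results makes it easier to see the RU cost of raising MaxConcurrency or MaxBufferedItemCount before doing so in production.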