Generate en docs
Milvus-doc-bot authored and Milvus-doc-bot committed Nov 11, 2024
1 parent c0abf00 commit e3b41b7
Showing 1 changed file with 6 additions and 11 deletions.
17 changes: 6 additions & 11 deletions localization/v2.4.x/site/en/faq/product_faq.md
@@ -25,11 +25,6 @@ title: Product FAQ
<p>Zilliz, the company behind Milvus, also offers a fully managed cloud version of the platform for those that don’t want to build and maintain their own distributed instance. <a href="https://zilliz.com/cloud">Zilliz Cloud</a> automatically maintains data reliability and allows users to pay only for what they use.</p>
<h4 id="Does-Milvus-support-non-x86-architectures" class="common-anchor-header">Does Milvus support non-x86 architectures?</h4><p>Milvus cannot be installed or run on non-x86 platforms.</p>
<p>Your CPU must support one of the following instruction sets to run Milvus: SSE4.2, AVX, AVX2, AVX512. These are all x86-dedicated SIMD instruction sets.</p>
<h4 id="What-is-the-maximum-dataset-size-Milvus-can-handle" class="common-anchor-header">What is the maximum dataset size Milvus can handle?</h4><p>Theoretically, the maximum dataset size Milvus can handle is determined by the hardware it is run on, specifically system memory and storage:</p>
<ul>
<li>Milvus loads all specified collections and partitions into memory before running queries. Therefore, memory size determines the maximum amount of data Milvus can query.</li>
<li>When new entities and collection-related schema (currently only MinIO is supported for data persistence) are added to Milvus, system storage determines the maximum allowable size of inserted data.</li>
</ul>
<h4 id="Where-does-Milvus-store-data" class="common-anchor-header">Where does Milvus store data?</h4><p>Milvus deals with two types of data, inserted data and metadata.</p>
<p>Inserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental logs. Milvus supports multiple object storage backends, including <a href="https://min.io/">MinIO</a>, <a href="https://aws.amazon.com/s3/?nc1=h_ls">AWS S3</a>, <a href="https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes">Google Cloud Storage</a> (GCS), <a href="https://azure.microsoft.com/en-us/products/storage/blobs">Azure Blob Storage</a>, <a href="https://www.alibabacloud.com/product/object-storage-service">Alibaba Cloud OSS</a>, and <a href="https://www.tencentcloud.com/products/cos">Tencent Cloud Object Storage</a> (COS).</p>
<p>Metadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd.</p>
@@ -41,17 +36,17 @@ title: Product FAQ
<h4 id="What-is-the-maximum-length-of-self-defined-entity-primary-keys" class="common-anchor-header">What is the maximum length of self-defined entity primary keys?</h4><p>Entity primary keys must be non-negative 64-bit integers.</p>
<h4 id="What-is-the-maximum-amount-of-data-that-can-be-added-per-insert-operation" class="common-anchor-header">What is the maximum amount of data that can be added per insert operation?</h4><p>An insert operation must not exceed 1,024 MB in size. This is a limit imposed by gRPC.</p>
<h4 id="Does-collection-size-impact-query-performance-when-searching-in-a-specific-partition" class="common-anchor-header">Does collection size impact query performance when searching in a specific partition?</h4><p>No. If partitions for a search are specified, Milvus searches the specified partitions only.</p>
<h4 id="Does-Milvus-load-the-entire-collection-when-partitions-are-specified-for-a-search" class="common-anchor-header">Does Milvus load the entire collection when partitions are specified for a search?</h4><p>No. Milvus has varied behavior. Data must be loaded to memory before searching.</p>
<h4 id="Does-Milvus-need-to-load-the-entire-collection-when-partitions-are-specified-for-a-search" class="common-anchor-header">Does Milvus need to load the entire collection when partitions are specified for a search?</h4><p>It depends on what data is needed for the search. Every partition that could appear in the search results must be loaded before searching.</p>
<ul>
<li>If you know which partitions your data are located in, call <code translate="no">load_partition()</code> to load the intended partition(s) <em>then</em> specify partition(s) in the <code translate="no">search()</code> method call.</li>
<li>If you do not know the exact partitions, call <code translate="no">load_collection()</code> before calling <code translate="no">search()</code>.</li>
<li>If you fail to load collections or partitions before searching, Milvus returns an error.</li>
<li>For example, if you only want to search specific partition(s), you don’t need to load them all. Call <code translate="no">load_partition()</code> to load the intended partition(s), <em>then</em> specify the partition(s) in the <code translate="no">search()</code> method call.</li>
<li>If you want to search all partitions, call <code translate="no">load_collection()</code> to load the whole collection including all partitions.</li>
<li>If you fail to load the collection or specific partition(s) before searching, Milvus will return an error.</li>
</ul>
<h4 id="Can-indexes-be-created-after-inserting-vectors" class="common-anchor-header">Can indexes be created after inserting vectors?</h4><p>Yes. If an index has been built for a collection by <code translate="no">create_index()</code> before, Milvus will automatically build an index for subsequently inserted vectors. However, Milvus does not build an index until the newly inserted vectors fill an entire segment and the newly created index file is separate from the previous one.</p>
<h4 id="How-are-the-FLAT-and-IVFFLAT-indexes-different" class="common-anchor-header">How are the FLAT and IVF_FLAT indexes different?</h4><p>The IVF_FLAT index divides vector space into nlist clusters. At the default nlist value of 16,384, Milvus compares the distances between the target vector and the centroids of all 16,384 clusters to return the nprobe nearest clusters. Milvus then compares the distances between the target vector and the vectors in the selected clusters to get the nearest vectors. Unlike IVF_FLAT, FLAT directly compares the distances between the target vector and every other vector.</p>
<p>When the total number of vectors approximately equals nlist, there is little difference between IVF_FLAT and FLAT in terms of calculation requirements and search performance. However, as the number of vectors exceeds nlist by a factor of two or more, IVF_FLAT begins to demonstrate performance advantages.</p>
<p>See <a href="/docs/index.md">Vector Index</a> for more information.</p>
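<p>The two-stage IVF_FLAT search described above can be sketched in plain Python. This is a minimal illustration, not Milvus code: the helper names (<code translate="no">flat_search</code>, <code translate="no">build_clusters</code>, <code translate="no">ivf_flat_search</code>) are hypothetical, the centroids are randomly sampled vectors standing in for trained cluster centers, and the small nlist and nprobe values are chosen only to keep the example fast.</p>

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(query, vectors, top_k):
    """FLAT: compare the query against every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: l2(query, vectors[i]))
    return order[:top_k]


def build_clusters(vectors, centroids):
    """Assign each vector index to the cluster of its nearest centroid."""
    clusters = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        clusters[nearest].append(i)
    return clusters


def ivf_flat_search(query, vectors, centroids, clusters, nprobe, top_k):
    """IVF_FLAT sketch: pick the nprobe nearest clusters, then scan only those."""
    # Stage 1: compare the query with every centroid and keep the nprobe closest.
    probed = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    # Stage 2: brute-force comparison restricted to vectors in the probed clusters.
    candidates = [i for c in probed for i in clusters[c]]
    candidates.sort(key=lambda i: l2(query, vectors[i]))
    return candidates[:top_k]


random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(2000)]
nlist = 16  # illustrative value, far smaller than the default quoted above
centroids = random.sample(vectors, nlist)  # assumption: stand-in for trained centroids
clusters = build_clusters(vectors, centroids)
query = [random.random() for _ in range(8)]

exact = flat_search(query, vectors, top_k=5)
approx = ivf_flat_search(query, vectors, centroids, clusters, nprobe=4, top_k=5)
full = ivf_flat_search(query, vectors, centroids, clusters, nprobe=nlist, top_k=5)
```

<p>In this sketch, probing every cluster (nprobe equal to nlist) scans all vectors and so matches the FLAT result, while a smaller nprobe scans fewer vectors at the cost of possibly missing some true nearest neighbors.</p>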
<h4 id="How-does-Milvus-flush-data" class="common-anchor-header">How does Milvus flush data?</h4><p>Milvus returns success when inserted data are loaded to the message queue. However, the data are not yet flushed to the disk. Then Milvus’ data node writes the data in the message queue to persistent storage as incremental logs. If <code translate="no">flush()</code> is called, the data node is forced to write all data in the message queue to persistent storage immediately.</p>
<h4 id="How-does-Milvus-flush-data" class="common-anchor-header">How does Milvus flush data?</h4><p>Milvus returns success when inserted data are ingested to the message queue. However, the data are not yet flushed to the disk. Then Milvus’ data node writes the data in the message queue to persistent storage as incremental logs. If <code translate="no">flush()</code> is called, the data node is forced to write all data in the message queue to persistent storage immediately.</p>
<h4 id="What-is-normalization-Why-is-normalization-needed" class="common-anchor-header">What is normalization? Why is normalization needed?</h4><p>Normalization refers to the process of converting a vector so that its norm equals 1. If inner product is used to calculate vector similarity, vectors must be normalized. After normalization, inner product equals cosine similarity.</p>
<p>See <a href="https://en.wikipedia.org/wiki/Unit_vector">Wikipedia</a> for more information.</p>
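<p>As a small illustration of the statement above, the following sketch normalizes two vectors and checks that their inner product matches the cosine similarity of the original vectors. The function names and sample values are hypothetical and chosen only for this example.</p>

```python
import math


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector so that its L2 norm equals 1."""
    norm = math.sqrt(inner_product(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a = [3.0, 4.0]
b = [1.0, 2.0]
a_unit, b_unit = normalize(a), normalize(b)

# After normalization, the inner product and the cosine similarity agree
# up to floating-point rounding.
ip_after_normalization = inner_product(a_unit, b_unit)
cosine_before_normalization = cosine_similarity(a, b)
```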
<h4 id="Why-do-Euclidean-distance-L2-and-inner-product-IP-return-different-results" class="common-anchor-header">Why do Euclidean distance (L2) and inner product (IP) return different results?</h4><p>For normalized vectors, Euclidean distance (L2) and inner product (IP) produce the same ranking of results, so the two metrics are mathematically equivalent for search. If these similarity metrics return different results, check to see if your vectors are normalized.</p>
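<p>The relationship between the two metrics follows from expanding the squared Euclidean distance. The symbols a and b below are introduced only for this derivation and denote two normalized vectors.</p>

```latex
\begin{aligned}
\lVert a - b \rVert^2 &= \lVert a \rVert^2 + \lVert b \rVert^2 - 2\, a \cdot b \\
                      &= 2 - 2\, a \cdot b
                      \qquad \text{when } \lVert a \rVert = \lVert b \rVert = 1
\end{aligned}
```

<p>For unit vectors, a smaller L2 distance therefore always corresponds to a larger inner product, so both metrics order the results identically. If either vector is not normalized, the norm terms no longer cancel to a constant and the two orderings can differ.</p>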
@@ -67,7 +62,7 @@ title: Product FAQ
<h4 id="Does-Milvus-support-Apple-M1-CPU" class="common-anchor-header">Does Milvus support Apple M1 CPU?</h4><p>The current Milvus release does not support the Apple M1 CPU directly. After Milvus 2.3, Milvus provides Docker images for the ARM64 architecture.</p>
<h4 id="What-data-types-does-Milvus-support-on-the-primary-key-field" class="common-anchor-header">What data types does Milvus support on the primary key field?</h4><p>In the current release, Milvus supports both INT64 and string.</p>
<h4 id="Is-Milvus-scalable" class="common-anchor-header">Is Milvus scalable?</h4><p>Yes. You can deploy a Milvus cluster with multiple nodes via Helm Chart on Kubernetes. Refer to the <a href="/docs/scaleout.md">Scale Guide</a> for instructions.</p>
<h4 id="Does-the-query-perform-in-memory-What-are-incremental-data-and-historical-data" class="common-anchor-header">Does the query perform in memory? What are incremental data and historical data?</h4><p>Yes. When a query request comes, Milvus searches both incremental data and historical data by loading them into memory. Incremental data are in the growing segments, which are buffered in memory before they reach the threshold to be persisted in storage engine, while historical data are from the sealed segments that are stored in the object storage. Incremental data and historical data together constitute the whole dataset to search.</p>
<h4 id="What-are-growing-segment-and-sealed-segment" class="common-anchor-header">What are growing segments and sealed segments?</h4><p>When a search request arrives, Milvus searches both incremental data and historical data. Incremental data are recent updates stored in growing segments, which are buffered in memory until they reach the threshold at which they are persisted to object storage and a more efficient index is built for them. Historical data are older updates stored in sealed segments, which have already been persisted to object storage. Together, incremental data and historical data constitute the whole dataset for search. This design makes any data ingested into Milvus instantly searchable. For Milvus Distributed, more complex factors determine when a newly ingested record can appear in search results. For details, see <a href="https://milvus.io/docs/consistency.md">consistency levels</a>.</p>
<h4 id="Is-Milvus-available-for-concurrent-search" class="common-anchor-header">Is Milvus available for concurrent search?</h4><p>Yes. For queries on the same collection, Milvus concurrently searches the incremental and historical data. However, queries on different collections are conducted in series. Because the historical data can be extremely large, searches on the historical data are relatively more time-consuming and are essentially performed in series.</p>
<h4 id="Why-does-the-data-in-MinIO-remain-after-the-corresponding-collection-is-dropped" class="common-anchor-header">Why does the data in MinIO remain after the corresponding collection is dropped?</h4><p>Data in MinIO is designed to remain for a certain period of time for the convenience of data rollback.</p>
<h4 id="Does-Milvus-support-message-engines-other-than-Pulsar" class="common-anchor-header">Does Milvus support message engines other than Pulsar?</h4><p>Yes. Kafka is supported in Milvus 2.1.0.</p>