diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc
index e4ca650fb..7bbef0027 100644
--- a/docs/modules/ROOT/nav.adoc
+++ b/docs/modules/ROOT/nav.adoc
@@ -54,7 +54,7 @@
 ** xref:ingest:overview.adoc[]
 ** xref:computing:distributed-computing.adoc[]
 ** xref:query:overview.adoc[]
-* Best Practices
+* xref:cluster-performance:best-practices.adoc[]
 ** xref:capacity-planning.adoc[]
 ** xref:cluster-performance:performance-tips.adoc[]
 ** xref:cluster-performance:back-pressure.adoc[]
@@ -62,6 +62,8 @@
 ** xref:cluster-performance:aws-deployments.adoc[]
 ** xref:cluster-performance:threading.adoc[]
 ** xref:cluster-performance:near-cache.adoc[]
+** xref:cluster-performance:imap-bulk-read-operations.adoc[]
+** xref:cluster-performance:data-affinity.adoc[]
 include::architecture:partial$nav.adoc[]
 * Member/Client Discovery
 ** xref:clusters:discovery-mechanisms.adoc[]
@@ -153,12 +155,10 @@ include::secure-cluster:partial$nav.adoc[]
 include::fault-tolerance:partial$nav.adoc[]
 include::cp-subsystem:partial$nav.adoc[]
 * xref:storage:high-density-memory.adoc[]
-include::tiered-storage:partial$nav.adoc[]
-* xref:cluster-performance:thread-per-core-tpc.adoc[]
 include::data-connections:partial$nav.adoc[]
 include::wan:partial$nav.adoc[]
-include::tiered-storage:partial$nav.adoc[]
 * xref:cluster-performance:thread-per-core-tpc.adoc[]
+include::tiered-storage:partial$nav.adoc[]
 * xref:extending-hazelcast:extending-hazelcast.adoc[]
 ** xref:extending-hazelcast:operationparker.adoc[]
 ** xref:extending-hazelcast:discovery-spi.adoc[]
diff --git a/docs/modules/cluster-performance/pages/best-practices.adoc b/docs/modules/cluster-performance/pages/best-practices.adoc
index 52e870f81..98929c938 100644
--- a/docs/modules/cluster-performance/pages/best-practices.adoc
+++ b/docs/modules/cluster-performance/pages/best-practices.adoc
@@ -10,3 +10,5 @@ Learn more about best practices and Hazelcast recommendations:
 * xref:cluster-performance:aws-deployments.adoc[]
 * xref:cluster-performance:threading.adoc[]
 * xref:cluster-performance:near-cache.adoc[]
+* xref:cluster-performance:imap-bulk-read-operations.adoc[]
+* xref:cluster-performance:data-affinity.adoc[]
diff --git a/docs/modules/cluster-performance/pages/imap-bulk-read-operations.adoc b/docs/modules/cluster-performance/pages/imap-bulk-read-operations.adoc
new file mode 100644
index 000000000..e04eb2188
--- /dev/null
+++ b/docs/modules/cluster-performance/pages/imap-bulk-read-operations.adoc
@@ -0,0 +1,105 @@
+= IMap bulk read operations
+:description: Learn about best practices for IMap bulk read operations.
+
+[[bulk-read-operations]]
+
+To protect your cluster and application from running out of memory
+(OOM), follow these best practices and consider using the described
+alternatives to IMap bulk read operations.
+
+It's critical to avoid an Out of Memory Error (OOME) because its impact
+can be severe. Hazelcast strives to protect your data, but
+an OOME can lead to a loss of cluster availability. This can result
+in increased operation latencies due to triggered migrations. From
+your application's perspective, an OOME could also cause a system
+crash.
+
+Some IMap API calls are particularly risky in this regard.
+Methods like `IMap#entrySet()` and `IMap#values()` can trigger an OOME, depending
+on the size of your map and the available memory on each member.
+To mitigate this risk, follow these best practices.
+
+== Plan capacity
+Proper capacity planning is crucial for providing
+sufficient system resources to the Hazelcast cluster. This
+involves estimating and validating the cluster's capacity
+(memory, CPU, disk, etc.) to determine the best practices
+that help the cluster achieve optimal performance.
+
+For more information, see xref:ROOT:capacity-planning.adoc[].
+
+== Limit query result size
+Limiting the query result size helps prevent the adverse effects of bulk data reads from methods such as:
+
+[source,java]
+----
+Set<Map.Entry<K, V>> entrySet();
+Set<Map.Entry<K, V>> entrySet(Predicate<K, V> predicate);
+----
+For more information, see xref:data-structures:preventing-out-of-memory.adoc#configuring-query-result-size[Configuring query result size].
+
+== Use Iterator
+The Iterator fetches data in batches, which keeps heap
+utilization consistent. The relevant methods in the IMap API include:
+
+[source,java]
+----
+Iterator<Map.Entry<K, V>> iterator();
+Iterator<Map.Entry<K, V>> iterator(int fetchSize);
+----
+This example shows how to use the Iterator API:
+[source,java]
+----
+IMap<Integer, Integer> testMap = instance.getMap("test");
+for (int i = 0; i < 1_000; i++) {
+    testMap.set(i, i);
+}
+
+// The default fetch size is 100 elements
+Iterator<Map.Entry<Integer, Integer>> iterator = testMap.iterator();
+while (iterator.hasNext()) {
+    Map.Entry<Integer, Integer> next = iterator.next();
+    System.err.println(next);
+}
+----
+
+
+== Use PartitionPredicate
+You can reduce memory overhead during bulk operations by filtering with *PartitionPredicate*, which restricts a query to the partition that owns a given key.
+
+For more information, see xref:query:predicate-overview.adoc#filtering-with-partition-predicate[PartitionPredicate].
+
+== Use Entry Processor
+In some scenarios, reversing the traditional approach can be
+more effective. Instead of fetching all data to the local
+application for processing, you can send operations directly to
+the data. This _in-place_ processing method saves both time and
+resources; *Entry Processor* is an excellent tool for this purpose.
+
+For more information, see xref:data-structures:entry-processor.adoc[].
+
+== Use SQL service
+SQL is designed for distributed computing use cases. SQL query results
+are paged, which makes SQL a good tool for fetching data in bulk.
+
+The following example shows a replacement for `IMap#values()`:
+
+[source,java]
+----
+String MAP_NAME = "...";
+HazelcastInstance client = HazelcastClient.newHazelcastClient();
+// Create a SQL mapping for the IMap (the key and value formats must match the map's actual types)
+client.getSql().execute("CREATE MAPPING " + MAP_NAME + " (__key INT, this VARCHAR) TYPE IMap OPTIONS ('keyFormat'='int', 'valueFormat'='varchar')");
+// Run a query to replace IMap#values()
+SqlResult result = client.getSql().execute("SELECT this FROM " + MAP_NAME);
+// Process the data page by page
+for (SqlRow row : result) {
+    /* do your processing */
+}
+----
+
+IMPORTANT: You must have Jet enabled to use the SQL service.
+
+For more information, see xref:query:sql-overview.adoc[].
+
+