2113 using compact incomplete on a library with dynamic schema with a named index can result in an unreadable index #2116

G-D-Petrov · 2025-01-13T12:17:38Z

Reference Issues/PRs

Fixes #2113

What does this implement or fix?

Any other comments?

Checklist

Checklist for code changes...

Have you updated the relevant docstrings, documentation and copyright notice?
Is this contribution tested against all ArcticDB's features?
Do all exceptions introduced raise appropriate error messages?
Are API changes highlighted in the PR description?
Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?


G-D-Petrov · 2025-01-14T09:04:38Z

cpp/arcticdb/version/schema_checks.hpp

+    auto new_df_field_index_count = new_df_descriptor.index().type() == IndexDescriptor::Type::EMPTY ? 0 : new_df_descriptor.index().field_count();
+
+    // If either index is empty, we consider them to match
+    if (df_in_store_index_field_count == 0 || new_df_field_index_count == 0) {


This check accommodates the existing behavior around empty DataFrames and Series, both of which have essentially empty indexes, even though for a Series the index type is RowCount, I think.
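
To illustrate the rule described above, here is a minimal, self-contained sketch. It is not the code from this PR: `IndexType`, `SketchIndex`, `effective_field_count` and `index_names_match_sketch` are hypothetical stand-ins for the real descriptor types, and comparing names field by field once both indexes are non-empty is an assumption about what a name check would do.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an index descriptor; not ArcticDB's real type.
enum class IndexType { EMPTY, ROWCOUNT, TIMESTAMP };

struct SketchIndex {
    IndexType type;
    std::vector<std::string> field_names;  // names of the index fields, if any
};

// An EMPTY index counts as zero fields regardless of what it stores.
inline std::size_t effective_field_count(const SketchIndex& index) {
    return index.type == IndexType::EMPTY ? 0 : index.field_names.size();
}

inline bool index_names_match_sketch(const SketchIndex& in_store, const SketchIndex& incoming) {
    const std::size_t stored_count = effective_field_count(in_store);
    const std::size_t incoming_count = effective_field_count(incoming);

    // If either index is empty, we consider them to match.
    if (stored_count == 0 || incoming_count == 0) {
        return true;
    }
    // Assumption: otherwise the field counts and every name must agree.
    if (stored_count != incoming_count) {
        return false;
    }
    for (std::size_t i = 0; i < stored_count; ++i) {
        if (in_store.field_names[i] != incoming.field_names[i]) {
            return false;
        }
    }
    return true;
}
```

In this sketch a RowCount index with no fields behaves like an empty one, which mirrors the Series case mentioned above.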

G-D-Petrov · 2025-01-14T09:08:57Z

The benchmarks report a performance degradation on the FinalizeStagedData benchmarks: roughly 35% longer run time and 45-60% higher peak memory.

| Change | Before [`d71a0bb`] `<v5.2.0rc0~1>` | After [`c1b389e`] | Ratio | Benchmark (Parameter) |
|--------|------------------------------------|-------------------|-------|-----------------------|
| + | 904M | 1.45G | 1.6 | finalize_staged_data.FinalizeStagedData.peakmem_finalize_staged_data(1000) |
| + | 1.88G | 2.73G | 1.45 | finalize_staged_data.FinalizeStagedData.peakmem_finalize_staged_data(2000) |
| + | 1.71±0s | 2.33±0s | 1.36 | finalize_staged_data.FinalizeStagedData.time_finalize_staged_data(1000) |
| + | 3.49±0s | 4.66±0s | 1.34 | finalize_staged_data.FinalizeStagedData.time_finalize_staged_data(2000) |

I think that this is due to the new check over all of the segments to make sure that the index names are the same, which was not done before.

IGNORE THIS: The latest commit fixes this - 300ae92

vasil-pashov · 2025-01-15T10:39:14Z

cpp/arcticdb/version/schema_checks.hpp

@@ -86,6 +86,31 @@ inline void check_normalization_index_match(
    }
 }

+inline bool index_names_match(


Can we add a schema_checks.cpp and put this there?

vasil-pashov · 2025-01-15T10:49:30Z

I think it's worth adding a test for sort_and_finalize_staged_data, similar to the one for finalize_staged_data, because the code paths are slightly different.

G-D-Petrov added 2 commits January 10, 2025 17:55

Add named index tests

029c84f

Add index name matching checks to schema validation

c1b389e

G-D-Petrov requested review from alexowens90, willdealtry and poodlewars as code owners January 13, 2025 12:17

G-D-Petrov added 2 commits January 13, 2025 18:31

Update index name matching logic and adjust StreamDescriptorMismatch …

ac26237

…imports

Refactor index name matching logic to improve readability and maintai…

4f8cd7e

…nability

Check the index names in finalize staged data on demand

300ae92
