-
Fluss is a very good streaming storage, but I encountered some questions during my research.

On-stream analytics: I loaded different volumes of data into Fluss and then used Flink SQL to compare queries. I found that once the data reached billions of rows, query speed dropped significantly, although point queries were still excellent.

CREATE TABLE orders (
order_id BIGINT PRIMARY KEY NOT ENFORCED,
user_id BIGINT,
product_id BIGINT,
order_amount DECIMAL(10, 2),
create_time TIMESTAMP_LTZ(3)
) WITH (
'bucket.num' = '4',
'table.datalake.enabled' = 'true'
);
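A rough sketch of the kind of comparison queries meant here, assuming the $lake suffix that Fluss documents for reading only the tiered Paimon data (the exact queries are illustrative, not the actual benchmark):

-- Union read: the plain table name combines the Paimon snapshot with the
-- rows that still live only in Fluss (requires 'table.datalake.enabled' = 'true').
SELECT user_id, SUM(order_amount) AS total_amount
FROM orders
GROUP BY user_id;

-- Lake-only read: the $lake suffix (assumed here) queries just the tiered
-- Paimon data and skips the Fluss incremental part.
SELECT user_id, SUM(order_amount) AS total_amount
FROM `orders$lake`
GROUP BY user_id;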
In particular, when the data reaches hundreds of millions of rows, a union read takes roughly twice as long as querying the Paimon table alone. My questions about all of the above are:
The storage cost: I compared the storage cost of the above orders table. Fluss takes up a lot of local disk.
-
Hi @JohnZp, thanks for the detailed testing! I will answer the questions below:
The incremental part is the changelog part from the lake table snapshot time until now. Log and changelog are stored on local disk (and tiered to remote storage if configured), so it is not stored in memory. The current implementation of union read is very basic, with many optimizations planned in future versions. Currently, for a simple count() of a union read on the incremental part, it needs to read all the incremental changelog data into the query engine. This is inefficient and can be optimized to …
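For illustration, a minimal sketch of the kind of query this applies to, using only the orders table defined earlier; the decomposition in the comments is just the conceptual split described above, not a Fluss API:

-- A union-read aggregation over the datalake-enabled orders table.
-- Conceptually: result = aggregate over the Paimon snapshot
--                      + aggregate over the Fluss changelog since that snapshot
-- (today the changelog part is shipped row by row to the query engine).
SELECT COUNT(*) FROM orders;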
We didn't plan a specific version for this feature, but contributions are very welcome at any time!
The disk cost looks strange. Could you share the cluster configuration of Fluss?
Yes, in the future, the Zero Disk Architecture can use remote storage to replace local disks and make TabletServers stateless.
It depends. The throughput limitation will change from "disk bandwidth" to "network bandwidth" and the throughput provided by the remote storage. And latency is expected to increase.
-
I'd like to add some comments about union read. It also depends on how many rows remain in Fluss that still need to be read: the more rows remain in Fluss, the slower the read usually is. What's more, if it's a primary key table, union read has to sort-merge the rows from Fluss and Paimon, which can be time-consuming. We haven't done much optimization on this part yet. But of course, we can...
-
@JohnZp thank you for the provided configuration and disk usage. I think there are some reasons. It seems the average log size is much larger than Kafka's. Three possible reasons (a rough disk estimate is sketched after this list):

1. All the old log segments (except the active segment) should have been tiered to remote storage. But by default, the local log needs to retain 2 segments (so each bucket consumes <= 2GB), see …
2. We don't support storage-compute separation or tiered storage for the kv store currently, so we have to store all kv data locally. Storage-compute separation is also on Fluss's roadmap, and there are many promising projects (e.g., ForstDB, SlateDB) to help achieve this goal.
3. In the future, we can optimize the lifecycle of the changelog; currently, it is retained for 7 days by default. But changelogs before a specific …
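As a rough back-of-the-envelope sketch against the orders table above (hedged: replication and index overhead are ignored, the 2-segments-per-bucket figure is taken from point 1, and 'table.log.ttl' is assumed to be the table option controlling log/changelog retention; please verify the exact key and format in the Fluss configuration docs):

-- Local disk footprint, per the defaults discussed above:
--   local log  <= bucket.num * 2 retained segments (<= 2GB per bucket) = 4 * 2GB = ~8GB
--   kv store    = full primary-key state, always stored locally for now
--   changelog   = retained for 7 days by default before it expires
-- Shortening log/changelog retention is one lever on local disk cost, e.g.:
CREATE TABLE orders (
order_id BIGINT PRIMARY KEY NOT ENFORCED,
user_id BIGINT,
product_id BIGINT,
order_amount DECIMAL(10, 2),
create_time TIMESTAMP_LTZ(3)
) WITH (
'bucket.num' = '4',
'table.datalake.enabled' = 'true',
'table.log.ttl' = '1d'  -- assumed option name and duration format; verify in the docs
);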