Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modules/zstd: Add Sequence Executor #1295

Closed
wants to merge 31 commits into from

Conversation

rw1nkler
Copy link
Contributor

@rw1nkler rw1nkler commented Feb 8, 2024

This PR adds Proc that performs ZSTD Sequence Execution as described in rfc8878#section-3.1.1.5. The Proc is responsible for interpreting packets with Literals and Sequences obtained from the Procs that decode RAW, RLE, or Compressed blocks of data.

NOTE: this is based on #1214 , please ignore commits from that branch when reviewing.
This is part of #1211

rw1nkler and others added 30 commits January 15, 2024 15:43
This commit adds a DSLX Buffer library that provides the Buffer struct,
and helper functions that can be used to operate on it. The Buffer
is meant to be a storage for data coming from the channel. It acts like
a FIFO, allowing data of any length to be put in or popped out of it.
Provided DSLX tests verify the correct behaviour of the library.

Internal-tag: [#50221]
Signed-off-by: Robert Winkler <[email protected]>
This commit adds a simple test that shows, how one can use the Buffer
struct inside a Proc.

Internal-tag: [#50221]
Signed-off-by: Robert Winkler <[email protected]>
This commit adds the library with functions for parsing a magic number and
tests that verify its correctness.

Internal-tag: [#50221]
Signed-off-by: Robert Winkler <[email protected]>
This commit adds the library with functions for parsing a frame header.
The provided tests verify the correcness of the library.

Internal-tag: [#49967]
Co-authored-by: Roman Dobrodii <[email protected]>
Co-authored-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
Required for expected_status inference in C++ tests for ZSTD decoder
components

Internal-tag: [#53465]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#50967]
Signed-off-by: Robert Winkler <[email protected]>
This commit adds a binary that calls decoding to generate data and loads
it into a vector of bytes.

Internal-tag: [#50967]
Signed-off-by: Robert Winkler <[email protected]>
Internal-tag: [#50967]
Co-authored-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#51343]
Signed-off-by: Robert Winkler <[email protected]>
Internal-tag: [#51343]
Signed-off-by: Robert Winkler <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
Adds RleBlockDecoder responsible for decoding Blocks
of RLE_Block Block_Type as specified in RFC 8878, paragraph 3.1.1.2.2.
https://datatracker.ietf.org/doc/html/rfc8878#section-3.1.1.2.2

RleBlockDecoder communicates through BlockDataPacket channels.
It reuses existing RunLengthDecoder block which is interfaced through
two seprate procs:

 * RleDataPacker
 * BatchPacker

Which are responsible for converting input data into format accepted by
RLE decoder and for gathering RLE decoder output symbols into batches
which are then send out through BlockDataPacket.

Internal-tag: [#51473]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#51343]
Signed-off-by: Robert Winkler <[email protected]>
This commit adds DecoderMux Proc, which collects data from specialized
Raw, RLE, and Compressed Block decoders and re-sends them in the correct order.

Internal-tag: [#51343]
Signed-off-by: Robert Winkler <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
This DSLX proc responsibility is to dispatch encoded blocks to a correct
decoder: RAW, RLE, COMPRESSED.

It tracks and assigns block IDs.
The ID counter is reset on the frame's last block on the last data packet.

Internal-tag: [#51736]
Co-authored-by: Robert Winkler <[email protected]>
Signed-off-by: Maciej Dudek <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
This adds a decoder of block data. It decodes block header and
demuxes remaining input data into one of specific block decoders
depending on the type of the parsed block. Then it muxes outputs
from those decoders into single output channel.

Internal-tag: [#51873]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#53329]
Signed-off-by: Pawel Czarnecki <[email protected]>
Internal-tag: [#52954]
Signed-off-by: Pawel Czarnecki <[email protected]>
This commit marks SimultaneousReadWriteBehavior enum and
num_partitions function as public to allow for creating
simpler tests that interact with RAM models.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>
Internal-tag: [#54705]
Signed-off-by: Robert Winkler <[email protected]>
This commit adds RAM printer block usefull for debugging
HistoryBuffer inside SequenceExecutor.

Internal-tag: [#54705]
Signed-off-by: Robert Winkler <[email protected]>
Add Proc responsible for handling ZSTD Sequence Execution step,
which is described in:
https://datatracker.ietf.org/doc/html/rfc8878#name-sequence-execution

Internal-tag: [#54705]
Signed-off-by: Robert Winkler <[email protected]>
@rw1nkler
Copy link
Contributor Author

@hongted Here is a short description of the problem with looped connections:

SequenceExecutor is an example of Proc for which we observed problems during the Proc inlining step. During that step the tool shows a cycle in logic:

SUBCOMMAND: # //xls/modules/zstd:sequence_executor_verilog [action 'Optimizing IR file: bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.ir', configuration: 3d46003a2740b6027ad117c5193a361cf22b9845aa40b9d298dd905c9f83a639, execution platform: @local_config_platform//:host]                          
(cd /home/rwinkler/.cache/bazel/_bazel_rwinkler/d2cf6e52152dcda37d34f6b381369325/execroot/com_google_xls && \                                                 
  exec env - \                         
  /bin/bash -c 'bazel-out/k8-opt-exec-2B5CBBC6/bin/xls/tools/opt_main bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.ir  --inline_procs=true --top=__sequence_executor__SequenceExecutorZstd__SequenceExecutor_0__64_0_0_0_13_8192_65536_next 2>&1 >bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.opt.ir | tee bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.opt.log')
F0214 13:46:30.257837       3 node_iterator.cc:142] Check failed: f_->Accept(&cycle_checker) is OK (INTERNAL: Cycle detected: [status__1 -> concat.9782 -> and.9755 -> not.9735 -> and.9716 -> packet_valid -> ram_packet_valid -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_receive -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_receive_data -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_data_out -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_data_in -> output_data -> concat.9238 -> acc__10 -> converted__16 -> input_resp__1 -> input__1 -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_data -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_fired -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_activation_out -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_not_stalled -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_stall -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_valid -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_send_fired__1 -> do_read__1 -> mask__39 -> array_index.10144 -> read_reqs__1 -> concat.10093 -> and.9877 -> eq.9839 -> status__1])  

Proc inlining step is a consequence of the SequenceExecutor architecture, in which communication with the RAMs takes place in multiple Procs:
seq_exec
Since it is impossible to rewrite the RAM interface "partially" (for example just read channels for a Proc that is responsible for handling reading from RAM) we had to inline the SequenceExecutor. In our case one of the inner Procs sends data to a input channel of SequenceExecutor. This loop is, as we believe, the source of problems. In normal scenario this kind of connection would be treated as FIFO. We wonder if this behavior is preserved during the proc inlining step for the looped channels.

lpawelcz pushed a commit to antmicro/xls that referenced this pull request Feb 21, 2024
google#1295

modules/zstd: Add SequenceExecutor

This commit adds SequenceExecutor block, that is currently capable of
handling only Literal Copy commands.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

examples/ram: Export internal RAM API to other modules

This commit marks SimultaneousReadWriteBehavior enum and
num_partitions function as public to allow for creating
simpler tests that interact with RAM models.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

modules/zstd: Add consts and types used by SequenceExecutor to common

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

modules/zstd: Add RAM printer debugging block

This commit adds RAM printer block usefull for debugging
SequenceExecutor.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

moduels/zstd/common: Specify decoder output format

Internal-tag: [#52954]
Signed-off-by: Pawel Czarnecki <[email protected]>

modules/zstd/sequence_executor: fix codegen

Internal-tag: [#52954]
Signed-off-by: Pawel Czarnecki <[email protected]>
@hongted
Copy link
Collaborator

hongted commented Feb 21, 2024

@hongted Here is a short description of the problem with looped connections:

SequenceExecutor is an example of Proc for which we observed problems during the Proc inlining step. During that step the tool shows a cycle in logic:

SUBCOMMAND: # //xls/modules/zstd:sequence_executor_verilog [action 'Optimizing IR file: bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.ir', configuration: 3d46003a2740b6027ad117c5193a361cf22b9845aa40b9d298dd905c9f83a639, execution platform: @local_config_platform//:host]                          
(cd /home/rwinkler/.cache/bazel/_bazel_rwinkler/d2cf6e52152dcda37d34f6b381369325/execroot/com_google_xls && \                                                 
  exec env - \                         
  /bin/bash -c 'bazel-out/k8-opt-exec-2B5CBBC6/bin/xls/tools/opt_main bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.ir  --inline_procs=true --top=__sequence_executor__SequenceExecutorZstd__SequenceExecutor_0__64_0_0_0_13_8192_65536_next 2>&1 >bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.opt.ir | tee bazel-out/k8-fastbuild/bin/xls/modules/zstd/sequence_executor_verilog.opt.log')
F0214 13:46:30.257837       3 node_iterator.cc:142] Check failed: f_->Accept(&cycle_checker) is OK (INTERNAL: Cycle detected: [status__1 -> concat.9782 -> and.9755 -> not.9735 -> and.9716 -> packet_valid -> ram_packet_valid -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_receive -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_receive_data -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_data_out -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_877_54_877_88_data_in -> output_data -> concat.9238 -> acc__10 -> converted__16 -> input_resp__1 -> input__1 -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_data -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_fired -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_activation_out -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_not_stalled -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_receive_stall -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_valid -> SequenceExecutorZstd__SequenceExecutor_chandecl_xls_modules_zstd_sequence_executor_x_876_52_876_84_send_fired__1 -> do_read__1 -> mask__39 -> array_index.10144 -> read_reqs__1 -> concat.10093 -> and.9877 -> eq.9839 -> status__1])  

Proc inlining step is a consequence of the SequenceExecutor architecture, in which communication with the RAMs takes place in multiple Procs: seq_exec Since it is impossible to rewrite the RAM interface "partially" (for example just read channels for a Proc that is responsible for handling reading from RAM) we had to inline the SequenceExecutor. In our case one of the inner Procs sends data to a input channel of SequenceExecutor. This loop is, as we believe, the source of problems. In normal scenario this kind of connection would be treated as FIFO. We wonder if this behavior is preserved during the proc inlining step for the looped channels.

Yes, loopback channels should not be inlined (#1018). Instead, any loopback channel should be externalized outside the proc inlining scope. They can still be given to codegen and an extern FIFO will be instantiated.

@rw1nkler
Copy link
Contributor Author

Thank you for confirmation. Is there any chance that #1018 will be resolved in the near future?

lpawelcz pushed a commit to antmicro/xls that referenced this pull request Mar 7, 2024
google#1295

modules/zstd: Add SequenceExecutor

This commit adds SequenceExecutor block, that is currently capable of
handling only Literal Copy commands.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

examples/ram: Export internal RAM API to other modules

This commit marks SimultaneousReadWriteBehavior enum and
num_partitions function as public to allow for creating
simpler tests that interact with RAM models.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

modules/zstd: Add consts and types used by SequenceExecutor to common

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

modules/zstd: Add RAM printer debugging block

This commit adds RAM printer block usefull for debugging
SequenceExecutor.

Internal-tag: [#53241]
Signed-off-by: Robert Winkler <[email protected]>

moduels/zstd/common: Specify decoder output format

Internal-tag: [#52954]
Signed-off-by: Pawel Czarnecki <[email protected]>

modules/zstd/sequence_executor: fix codegen

Internal-tag: [#52954]
Signed-off-by: Pawel Czarnecki <[email protected]>
@cdleary cdleary added the app Application level functionality (examples, uses of XLS stack) label Mar 27, 2024
@proppy
Copy link
Member

proppy commented Mar 29, 2024

should we close this and focus on reviewing #1315 ?

@rw1nkler
Copy link
Contributor Author

Closing, as the review will happen in #1315

@rw1nkler rw1nkler closed this Mar 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app Application level functionality (examples, uses of XLS stack)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants