feat: Add INCR privileged instructions #734

Open: wants to merge 14 commits into base: develop

Conversation

@Nashtare (Collaborator) commented Oct 18, 2024

Add a series of 4 privileged INCR instructions (INCR1, INCR2, INCR3 and INCR4) that increment the Nth element of the stack by 1 in place (i.e. no PUSH / POP).
This is particularly helpful for incrementing accumulators, which previously required SWAPN PUSH 1 ADD SWAPN and now only requires INCRN.
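
To make the saving concrete, here is a minimal Rust sketch on a toy stack. It assumes that INCRn targets the n-th element counted from the top (INCR1 being the top itself), that SWAPk follows the usual EVM convention of exchanging the top with the element at depth k + 1 (so reaching the n-th element takes SWAP(n-1)), and that words are u64 with wrapping addition rather than 256-bit values. The Stack type and its methods are hypothetical and are not taken from the kernel or from the evm_arithmetization crate.

```rust
/// Toy stack of machine words; the last element of `items` is the top.
/// Assumption: real EVM words are 256-bit, `u64` keeps the sketch short.
#[derive(Clone, Debug, PartialEq)]
pub struct Stack {
    items: Vec<u64>,
}

impl Stack {
    pub fn new(items: Vec<u64>) -> Self {
        Self { items }
    }

    /// Index into `items` of the element at 1-based `depth` from the top.
    fn index_at(&self, depth: usize) -> usize {
        assert!(depth >= 1 && depth <= self.items.len(), "stack underflow");
        self.items.len() - depth
    }

    pub fn push(&mut self, v: u64) {
        self.items.push(v);
    }

    pub fn pop(&mut self) -> u64 {
        self.items.pop().expect("stack underflow")
    }

    /// EVM-style SWAPk: exchange the top with the element at depth k + 1.
    pub fn swap(&mut self, k: usize) {
        let top = self.index_at(1);
        let other = self.index_at(k + 1);
        self.items.swap(top, other);
    }

    /// ADD: pop two words and push their wrapping sum.
    pub fn add(&mut self) {
        let a = self.pop();
        let b = self.pop();
        self.push(a.wrapping_add(b));
    }

    /// Hypothetical INCRn: add 1 to the element at depth n, in place.
    pub fn incr(&mut self, n: usize) {
        let i = self.index_at(n);
        self.items[i] = self.items[i].wrapping_add(1);
    }
}

/// The four-instruction sequence that INCRn replaces, for n >= 2.
pub fn incr_via_swap(stack: &mut Stack, n: usize) {
    assert!(n >= 2, "the top itself needs no swap");
    stack.swap(n - 1);
    stack.push(1);
    stack.add();
    stack.swap(n - 1);
}
```

On the stack [10, 20, 30, 40] (top last), both incr(3) and incr_via_swap(.., 3) leave [10, 21, 30, 40]; the first does it in one instruction instead of four.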

A DECR variant would have a smaller overall impact, but it may be worth evaluating: we could add it at no cost by combining it with the INCR CPU column.

Removes 4% to 5% of CPU cycles on mainnet blocks.

Total CPU columns for vanilla type1: 86

MemBefore new initial size:

  • vanilla type1: 63199
  • type2: 62691

@Nashtare added the performance label Oct 18, 2024
@Nashtare added this to the Performance Tuning milestone Oct 18, 2024
@Nashtare self-assigned this Oct 18, 2024
@github-actions bot added the crate: evm_arithmetization and specs labels Oct 18, 2024
scripts/prove_stdio.sh (outdated, resolved)
Comment on lines -23 to -34
# Circuit sizes only matter in non test_only mode.
if ! [[ $8 == "test_only" ]]; then
    export ARITHMETIC_CIRCUIT_SIZE="16..21"
    export BYTE_PACKING_CIRCUIT_SIZE="8..21"
    export CPU_CIRCUIT_SIZE="8..21"
    export KECCAK_CIRCUIT_SIZE="4..20"
    export KECCAK_SPONGE_CIRCUIT_SIZE="8..17"
    export LOGIC_CIRCUIT_SIZE="4..21"
    export MEMORY_CIRCUIT_SIZE="17..24"
    export MEMORY_BEFORE_CIRCUIT_SIZE="16..23"
    export MEMORY_AFTER_CIRCUIT_SIZE="7..23"
fi
Collaborator Author

These match the default values in .env.

@muursh (Contributor) left a comment

Very nice

@hratoanina (Contributor) left a comment

Looks good, but there are some issues with the constraints. We should also be able to get rid of the memory operations for INCR1.

evm_arithmetization/src/cpu/incr.rs (outdated, resolved)
evm_arithmetization/src/cpu/incr.rs (outdated, resolved)
evm_arithmetization/src/cpu/kernel/asm/core/exception.asm (outdated, resolved)
@@ -597,6 +599,39 @@ pub(crate) fn generate_swap<F: RichField, T: Transition<F>>(
    Ok(())
}

pub(crate) fn generate_incr<F: RichField, T: Transition<F>>(
Contributor

I'm actually not sure why this works. For INCR2-4 there's no problem, but for INCR1 we are reading (and writing) the stack at address stack_len - 1. There is no guarantee that the current top of the stack has been written to memory, so I'm surprised the reads don't sometimes return a wrong value.
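
The concern can be illustrated with a toy model. The sketch below assumes a CPU that keeps the top of the stack in a register and only spills it to memory when a new value is pushed over it; the Cpu struct, its fields and its methods are hypothetical and only meant to show how a read at stack_len - 1 could return stale data, not how generate_incr is actually written.

```rust
/// Toy CPU state. Assumption: the top of the stack lives in a register
/// (`top`), and only the elements below it are guaranteed to be in `memory`.
pub struct Cpu {
    pub stack_len: usize,
    pub top: u64,
    /// Stack segment of memory, indexed by stack address.
    pub memory: Vec<u64>,
}

impl Cpu {
    pub fn new() -> Self {
        Self { stack_len: 0, top: 0, memory: vec![0; 16] }
    }

    /// Push: spill the old top to memory, keep the new top in the register only.
    pub fn push(&mut self, v: u64) {
        if self.stack_len > 0 {
            self.memory[self.stack_len - 1] = self.top;
        }
        self.top = v;
        self.stack_len += 1;
    }

    /// INCR1 performed through memory at address stack_len - 1.
    /// The cell may never have been written, so the result can be stale.
    pub fn incr1_via_memory(&mut self) -> u64 {
        let addr = self.stack_len - 1;
        self.memory[addr] = self.memory[addr].wrapping_add(1);
        self.memory[addr]
    }

    /// INCR1 performed directly on the register holding the top of the stack.
    pub fn incr1_via_register(&mut self) -> u64 {
        self.top = self.top.wrapping_add(1);
        self.top
    }
}
```

In this model, after pushing 5 and then 7, memory holds 5 at address 0 but nothing meaningful at address 1, so the memory-based variant increments a stale cell while the register-based variant sees the real top.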

Contributor

Some constraints seem to be missing, and some seem to be unneeded.

The current set of constraints works as intended for INCR2-4, but for INCR1 we are not checking that the output channel is equal to the next top of the stack.
Moreover, the value read in the input channel is not constrained to match the current top of the stack (the tests pass, so it seems to hold in practice, but that sounds like coincidence to me).

I think the clean way to do it is to filter all of the current constraints with lv.opcode_bits[0] (adding a new one to make sure that the top of the stack doesn't change), and to handle INCR1 separately with filter 1 - lv.opcode_bits[0] (you can even disable the memory channels to save some memory rows).

@Nashtare (Collaborator, Author), Nov 11, 2024

> handle INCR1 separately with filter 1 - lv.opcode_bits[0]

This would catch INCR3 too, which wouldn't work. If I use both bits I'll get a degree-4 constraint, but I can introduce helper columns. I'll look into it.
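
The objection can be checked with a small truth table. The Rust sketch below assumes that the two low opcode bits encode n - 1 for INCRn (which is consistent with 1 - opcode_bits[0] being nonzero for both INCR1 and INCR3), and that constraints must stay within degree 3; the function names are hypothetical and plain integers stand in for field elements.

```rust
/// Low two opcode bits (b0, b1) for INCRn.
/// Assumption: the bits encode n - 1 = b0 + 2 * b1.
pub fn opcode_bits(n: u8) -> (i64, i64) {
    assert!((1..=4).contains(&n));
    let v = (n - 1) as i64;
    (v & 1, (v >> 1) & 1)
}

/// Filter proposed in review: 1 - opcode_bits[0].
pub fn filter_one_bit(n: u8) -> i64 {
    let (b0, _) = opcode_bits(n);
    1 - b0
}

/// Filter using both bits: (1 - b0) * (1 - b1), nonzero only for INCR1.
pub fn filter_two_bits(n: u8) -> i64 {
    let (b0, b1) = opcode_bits(n);
    (1 - b0) * (1 - b1)
}

/// Degree of `op_flag * filter * constraint`, where the operation flag has
/// degree 1 and the other two factors have the given degrees.
pub fn total_degree(filter_degree: usize, constraint_degree: usize) -> usize {
    1 + filter_degree + constraint_degree
}
```

With this encoding, the one-bit filter evaluates to 1 for INCR1 and INCR3, while the two-bit product isolates INCR1 but brings a linear constraint to degree 4. Storing the product in a helper column (degree 1) brings it back to degree 3, at the cost of one extra constraint tying the helper column to the bits.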

@LindaGuiga (Contributor) left a comment

I have the same concerns as Hamy regarding generate_incr and constraints for INCR1. For INCR2-4, it looks good to me besides some nits.

evm_arithmetization/src/cpu/incr.rs (outdated, resolved)
evm_arithmetization/src/cpu/incr.rs (outdated, resolved)
@Nashtare Nashtare mentioned this pull request Nov 4, 2024
@Nashtare (Collaborator, Author)

@hratoanina @LindaGuiga I've removed the memory ops for INCR1 and moved the extra checks on the previous/new stack tops into an extra CTL against the arithmetic table. Let me know if I forgot anything.

@Nashtare Nashtare requested review from hratoanina and LindaGuiga and removed request for hratoanina November 14, 2024 13:39
@hratoanina (Contributor) left a comment

Looks good, except for the general column constraint!


// Constrain the helper column
yield_constr.constraint(
    base_filter
Contributor

I think it's rather base_filter * (lv.general.incr().is_not_incr1 - lv.opcode_bits[0] * lv.opcode_bits[1]).

Contributor

A possible optimization to avoid introducing an extra CTL just for INCR1 is to keep the same arithmetic CTL for all cases, with memory channels 1 and 2. Then you add INCR1-specific constraints to check that lv.stack_top == mem_channel[1] and nv.stack_top == mem_channel[2].

It requires using the value columns of some disabled channels, but I couldn't find anything in the code that would prevent it.
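
As a rough illustration of this suggestion, the Rust sketch below models the two INCR1-specific checks over plain integers. It assumes that the shared arithmetic CTL enforces mem_channel[2] = mem_channel[1] + 1 on the value columns, and that an is_incr1 selector is available; the Row struct, the selector and both functions are hypothetical and do not reflect the actual column layout or limb decomposition.

```rust
/// Toy row holding only the columns needed here, as plain integers
/// (assumption: real columns are field elements split into limbs).
pub struct Row {
    pub stack_top: i128,
    /// Value columns of the memory channels, read even if a channel is disabled.
    pub mem_channel: [i128; 3],
}

/// Relation the shared arithmetic CTL is assumed to enforce for every INCRn:
/// the value in channel 2 is the value in channel 1 plus one.
pub fn ctl_increment(lv: &Row) -> i128 {
    lv.mem_channel[2] - (lv.mem_channel[1] + 1)
}

/// INCR1-specific constraints; every entry must be zero on a valid row pair.
/// `is_incr1` is a hypothetical selector equal to 1 on INCR1 rows, 0 otherwise.
pub fn incr1_constraints(is_incr1: i128, lv: &Row, nv: &Row) -> [i128; 2] {
    [
        // The input value must be the current top of the stack.
        is_incr1 * (lv.stack_top - lv.mem_channel[1]),
        // The output value must become the next top of the stack.
        is_incr1 * (nv.stack_top - lv.mem_channel[2]),
    ]
}
```

With a current top of 41, channel values 41 and 42, and a next top of 42, all three expressions evaluate to zero; leaving the next top at 41 makes the second INCR1 constraint nonzero, so the row pair would be rejected.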

Labels
crate: evm_arithmetization (Anything related to the evm_arithmetization crate.), performance (Performance improvement related changes), specs
Projects
Status: In Progress
4 participants