diff --git a/docs/arithmetization/cpulogic.tex b/docs/arithmetization/cpulogic.tex index 318e2db48..933149960 100644 --- a/docs/arithmetization/cpulogic.tex +++ b/docs/arithmetization/cpulogic.tex @@ -2,7 +2,7 @@ \section{CPU logic} \label{cpulogic} The CPU is in charge of coordinating the different STARKs, proving the correct execution of the instructions it reads and guaranteeing -that the final state of the EVM corresponds to the starting state after executing the input transaction. All design choices were made +that the final state of the EVM corresponds to the starting state after executing the input transactions. All design choices were made to make sure these properties can be adequately translated into constraints of degree at most 3 while minimizing the size of the different table traces (number of columns and number of rows). @@ -11,29 +11,49 @@ \section{CPU logic} \subsection{Kernel} The kernel is in charge of the proving logic. This section aims at providing a high level overview of this logic. For details about any specific part of the logic, one can consult the various ``asm'' files in the \href{https://github.com/0xPolygonZero/plonky2/tree/main/evm/src/cpu/kernel}{``kernel'' folder}. -We prove one transaction at a time. These proofs can later be aggregated recursively to prove a block. Proof aggregation is however not in the scope of this section. Here, we assume that we have an initial state of the EVM, and we wish to prove that a single transaction was correctly executed, leading to a correct update of the state. +We prove a batch of transactions, split into segments. These proofs can later be aggregated recursively to prove a block. Proof aggregation is however not in the scope of this section. Here, we assume that we have an initial state of the EVM, and we wish to prove that a batch of contiguous transactions was correctly executed, leading to a correct update of the state. -Since we process one transaction at a time, a few intermediary values need to be provided by the prover. Indeed, to prove that the registers in the EVM state are correctly updated, we need to have access to their initial values. When aggregating proofs, we can also constrain those values to match from one transaction to the next. Let us consider the example of the transaction number. Let $n$ be the number of transactions executed so far in the current block. If the current proof is not a dummy one (we are indeed executing a transaction), then the transaction number should be updated: $n := n+1$. Otherwise, the number remains unchanged. We can easily constrain this update. When aggregating the previous transaction proof ($lhs$) with the current one ($rhs$), we also need to check that the output transaction number of $lhs$ is the same as the input transaction number of $rhs$. +Since we process transactions and not entire blocks, a few intermediary values need to be provided by the prover. Indeed, to prove that the registers in the EVM state are correctly updated, we need to have access to their initial values. When aggregating proofs, we can also constrain those values to match from one batch to the next. Let us consider the example of the transaction number. Let $n$ be the number of transactions executed so far in the current block. If the current proof is not a dummy one (we are indeed executing a batch of transactions), then the transaction number should be updated: $n := n+k$ with $k$ the number of transactions in the batch. Otherwise, the number remains unchanged. We can easily constrain this update. When aggregating the previous transaction batch proof ($lhs$) with the current one ($rhs$), we also need to check that the output transaction number of $lhs$ is the same as the input transaction number of $rhs$. Those prover provided values are stored in memory prior to entering the kernel, and are used in the kernel to assert correct updates. The list of prover provided values necessary to the kernel is the following: \begin{enumerate} - \item the previous transaction number: $t_n$, - \item the gas used before executing the current transaction: $g\_u_0$, - \item the gas used after executing the current transaction: $g\_u_1$, - \item the state, transaction and receipts MPTs before executing the current transaction: $\texttt{tries}_0$, - \item the hash of all MPTs before executing the current transaction: $\texttt{digests}_0$, - \item the hash of all MPTs after executing the current transaction: $\texttt{digests}_1$, - \item the RLP encoding of the transaction. + \item the number of the last transaction executed: $t_n$, + \item the gas used before executing the current transactions: $g\_u_0$, + \item the gas used after executing the current transactions: $g\_u_1$, + \item the state, transaction and receipts MPTs before executing the current transactions: $\texttt{tries}_0$, + \item the hash of all MPTs before executing the current transactions: $\texttt{digests}_0$, + \item the hash of all MPTs after executing the current transactions: $\texttt{digests}_1$, + \item the RLP encoding of the transactions. \end{enumerate} -\paragraph*{Initialization:} The first step consists in initializing: +\paragraph*{Segment handling:} +An execution run is split into one or more segments. To ensure continuity, the first cycles of a segment are used to "load" segment data from the previous segment, and the last cycles to +"save" segment data for the next segment. The number of CPU cycles of a segment is bounded by \texttt{MAX\_CPU\_CYCLES}, which can be tweaked for best performance. The segment data values are: \begin{itemize} - \item The shift table: it maps the number of bit shifts $s$ with its shifted value $1 << s$. Note that $0 \leq s \leq 255$. - \item The initial MPTs: the initial state, transaction and receipt tries $\texttt{tries}_0$ are loaded from memory and hashed. The hashes are then compared to $\texttt{digests}\_0$. + \item the stack length, + \item the stack top, + \item the context, + \item the \texttt{is\_kernel} flag, + \item the gas used, + \item the program counter. +\end{itemize} +These values are stored as global metadata, and are loaded from (resp. written to) memory at the beginning (resp. at the end) of a segment. They are propagated +between proofs as public values. + +The initial memory of the first segment is fixed and contains: +\begin{itemize} + \item the kernel code, + \item the shift table. +\end{itemize} + +\paragraph*{Initialization:} The first step of a run consists in initializing: +\begin{itemize} + \item The initial transaction and receipt tries $\texttt{tries}_0$ are loaded from memory. The transaction and the receipt tries are hashed and the hashes are then compared to $\texttt{digests}\_0$. +For efficiency, the initial state trie will be hashed for verification at the end of the run. \item We load the transaction number $t\_n$ and the current gas used $g\_u_0$ from memory. \end{itemize} -If no transaction is provided, we can halt after this initialization. Otherwise, we start processing the transaction. The transaction is provided as its RLP encoding. We can deduce the various transaction fields (such as its type or the transfer value) from its encoding. Based on this, the kernel updates the state trie by executing the transaction. Processing the transaction also includes updating the transactions MPT with the transaction at hand. +We start processing the transactions (if any) sequentially, provided in RLP encoded format. The processing of the transaction returns a boolean ``success'' that indicates whether the transaction was executed successfully, along with the leftover gas. @@ -44,8 +64,9 @@ \subsection{Kernel} Finally, once the three MPTs have been updated, we need to carry out final checks: \begin{itemize} \item the gas used after the execution is equal to $g\_u_1$, - \item the new transaction number is $n+1$ if there was a transaction, - \item the three MPTs are hashed and checked against $\texttt{digests}_1$. + \item the new transaction number is $n + k$ with $k$ the number of processed transactions, + \item the initial state MPT is hashed and checked against $\texttt{digests}_0$. + \item the initial state MPT is updated to reflect the processed transactions, then the three final MPTs are hashed and checked against $\texttt{digests}_1$. \end{itemize} Once those final checks are performed, the program halts. @@ -283,3 +304,57 @@ \subsection{Exceptions} push once at most), and that the faulty instruction is pushing. If the exception is not raised, stack constraints ensure that a stack length of 1025 in user mode will fail the proof. \end{enumerate} + +\subsection{Linked lists} + +Individual account information are contained in the state and the storage MPTs. However, accessing and modifying MPT data requires heavy trie +traversal, insertion and deletion functions. To alleviate these costs, during an execution run, we store all account information in linked list structures +and only modify the state trie at the end of the run. + +Our linked list construction guarantees these properties: +\begin{itemize} + \item A linked list is cyclic. The last element's successor is the first element. + \item A linked list is always sorted by a certain index, which can be one or more fields of an element. + \item The last element of a linked list is MAX, whose index is always higher than any possible index value. + \item An index cannot appear twice in the linked list. +\end{itemize} + +These properties allows us to efficiently modify the list. + +\paragraph*{Search} +To search a node given its index, we provide via \texttt{PROVER\_INPUT} a pointer to its predecessor $p$. We first check that $p$'s index is strictly lower than +the node index, if not, the provided pointer is invalid. Then, we check $s$, $p$'s successor. If $s$'s index is equal to the node index, we found the node. +If $s$'s index is lower than the node index, then the provided $p$ was invalid. If $s$'s index is greater than the node index, then the node doesn't exist. + +\paragraph*{Insertion} +To insert a node given its index, we provide via \texttt{PROVER\_INPUT} a pointer to its predecessor $p$. We first check that $p$'s index is strictly lower than +the node index, if not, the provided pointer is invalid. Then, we check $s$, $p$'s successor, and make sure that $s$ is strictly greater than the node index. +We create a new node, and make it $p$'s successor; then we make $s$ the new node's successor. + +\paragraph*{Deletion} +To delete a node given its index, we provide via \texttt{PROVER\_INPUT} a pointer to its predecessor $p$. We check that $p$'s successor is equal to the node index; if not +either $p$ is invalid or the node doesn't exist. Then we set $p$'s successor to the node's successor. To indicate that the node is now deleted and to make sure that it's +never accessed again, we set its next pointer to MAX. + +We maintain two linked lists: one for the state accounts and one for the storage slots. + +\subsubsection*{Account linked list} + +An account node is made of four memory cells: +\begin{itemize} + \item The account key (the hash of the account address). This is the index of the node. + \item A pointer to the account payload, in segment \texttt{@TrieData}. + \item A pointer to the initial account payload, in segment \texttt{@TrieData}. This is the value of the account at the beginning of the execution, before processing any transaction. This payload never changes. + \item A pointer to the next node (which points to the next node's account key). +\end{itemize} + +\subsubsection*{Storage linked list} + +A storage node is made of five memory cells: +\begin{itemize} + \item The account key (the hash of the account address). + \item The slot key (the hash of the slot). Nodes are indexed by \texttt{(account\_key, slot\_key)}. + \item The slot value. + \item The initial slot value. This is the value of the account at the beginning of the execution, before processing any transaction. It never changes. + \item A pointer to the next node (which points to the next node's account key). +\end{itemize} \ No newline at end of file diff --git a/docs/arithmetization/tables.tex b/docs/arithmetization/tables.tex index 43b45eb58..3094c462a 100644 --- a/docs/arithmetization/tables.tex +++ b/docs/arithmetization/tables.tex @@ -6,5 +6,6 @@ \section{Tables} \input{tables/byte-packing} \input{tables/logic} \input{tables/memory} +\input{tables/mem-continuations} \input{tables/keccak-f} \input{tables/keccak-sponge} diff --git a/docs/arithmetization/tables/cpu.tex b/docs/arithmetization/tables/cpu.tex index 590b95668..ecbf68267 100644 --- a/docs/arithmetization/tables/cpu.tex +++ b/docs/arithmetization/tables/cpu.tex @@ -21,6 +21,16 @@ \subsubsection{CPU flow} executing user code (transaction or contract code). In a non-zero user context, syscalls may be executed, which are specific instructions written in the kernel. They don't change the context but change the code context, which is where the instructions are read from. +\paragraph*{Continuations} + +A full run of the zkEVM consists in initializing the zkEVM with the input state, executing a certain number of transactions, and then validating the output state. +However, for performance reasons, a run is split in multiple segments of at most \texttt{MAX\_CPU\_CYCLES} cycles, which can be proven individually. Continuations ensure that the segments are part of the +same run and guarantees that the state at the end of a segment is equal to the state at the beginning of the next. + +The state to propagate from one segment to another contains some of the zkEVM registers plus the current memory. These registers +are stored in memory as dedicated global metadata, and the memory to propagate is stored in two STARK tables: \texttt{MemBefore} and \texttt{MemAfter}. To check the +consistency of the memory, the Merkle cap of the previous \texttt{MemAfter} is compared to the Merkle cap of the next \texttt{MemBefore}. + \subsubsection{CPU columns} \paragraph*{Registers:} \begin{itemize} @@ -72,4 +82,6 @@ \subsubsection{CPU columns} See \ref{stackhandling} for more details. \label{push_general_view} \item \texttt{Push}: \texttt{is\_not\_kernel} is used to skip range-checking the output of a PUSH operation when we are in privileged mode, as the kernel code is known and trusted. + \item \texttt{Context pruning}: When \texttt{SET\_CONTEXT} is called to return to a parent context, this makes the current context stale. The kernel indicates it +by setting one general column to 1. For more details about context pruning, see \ref{context-pruning}. \end{itemize} diff --git a/docs/arithmetization/tables/mem-continuations.tex b/docs/arithmetization/tables/mem-continuations.tex new file mode 100644 index 000000000..9aee737a6 --- /dev/null +++ b/docs/arithmetization/tables/mem-continuations.tex @@ -0,0 +1,15 @@ +\subsection{Memory continuations} +\label{mem-continuations} + +The MemBefore (resp. MemAfter) table holds the content of the memory before (resp. after) the execution of the current segment. +For consistency, the MemAfter trace of a segment must be identical to the MemAfter trace of the next segment. +Each row of these tables contains: + +\begin{enumerate} + \item $a$, the memory cell address, + \item $v$, the initial value of the cell. +\end{enumerate} +The tables should be ordered by $(a, \tau)$. Since they only hold values, there are no constraints between the rows. + +A CTL copies all of the MemBefore values in the memory trace as reads, at timestamp $\tau = 0$. +Another CTL copies the final values from memory to MemAfter. For more details on which values are propagated, consult \ref{final-memory}. \ No newline at end of file diff --git a/docs/arithmetization/tables/memory.tex b/docs/arithmetization/tables/memory.tex index d6f3267c3..6eef3bab1 100644 --- a/docs/arithmetization/tables/memory.tex +++ b/docs/arithmetization/tables/memory.tex @@ -68,7 +68,7 @@ \subsubsection{Timestamps} Note that it doesn't mean that all memory operations have unique timestamps. There are two exceptions: \begin{itemize} - \item Before the CPU cycles, we write some global metadata in memory. These extra operations are done at timestamp $\tau = 0$. + \item Before the CPU cycles, we preinitialize the memory with the flashed state stored in the MemBefore table and we write some global metadata. These operations are done at timestamp $\tau = 0$. \item Some tables other than CPU can generate memory operations, like KeccakSponge. When this happens, these operations all have the timestamp of the CPU row of the instruction which invoked the table (for KeccakSponge, KECCAK\_GENERAL). \end{itemize} @@ -77,11 +77,22 @@ \subsubsection{Memory initialization} By default, all memory is zero-initialized. However, to save numerous writes, we allow some specific segments to be initialized with arbitrary values. \begin{itemize} - \item The read-only kernel code (in segment 0, context 0) is initialized with its correct values. It's checked by hashing the segment and verifying -that the hash value matches a verifier-provided one. - \item The code segment (segment 0) in other contexts is initialized with externally-provided account code, then checked against the account code hash. -If the code is meant to be executed, there is a soundness concern: if the code is malformed and ends with an incomplete PUSH, then the missing bytes must + \item The code segment (segment 0) is either part of the initial memory for the kernel (context 0), or is initialized with externally-provided account code, then checked against the account code hash. +In non-zero contexts, if the code is meant to be executed, there is a soundness concern: if the code is malformed and ends with an incomplete PUSH, then the missing bytes must be 0 accordingly to the Ethereum specs. To prevent the issue, we manually write 33 zeros (at most 32 bytes for the PUSH argument, and an extra one for the post-PUSH PC value). \item The ``TrieData'' segment is initialized with the input tries. The stored tries are hashed and checked against the provided initial hash. Note that the length of the segment and the pointers -- within the ``TrieData'' segment -- for the three tries are provided as prover inputs. The length is then checked against a value computed when hashing the tries. \end{itemize} + +\subsubsection{Final memory} +\label{final-memory} + +The final value of each cell of the memory must be propagated to the MemAfter table. Since memory operations are ordered by address and by timestamps, this is +easy to do: the last value of an address is the value of the last row touching this address. In other words, we propagate values of rows before the address changes. + +\paragraph*{Context pruning} +\label{context-pruning} + +We can observe that whenever we return from a context (e.g. with a RETURN opcode, from an exception...), we will never access it again and all its memory is now stale. +We make use of this fact to prune stale contexts and exclude them from MemAfter. + diff --git a/docs/arithmetization/zkevm.pdf b/docs/arithmetization/zkevm.pdf index d1986f70d..ee815d59a 100644 Binary files a/docs/arithmetization/zkevm.pdf and b/docs/arithmetization/zkevm.pdf differ diff --git a/evm_arithmetization/src/cpu/kernel/asm/mpt/linked_list/linked_list.asm b/evm_arithmetization/src/cpu/kernel/asm/mpt/linked_list/linked_list.asm index ce7b9659e..b422a5d36 100644 --- a/evm_arithmetization/src/cpu/kernel/asm/mpt/linked_list/linked_list.asm +++ b/evm_arithmetization/src/cpu/kernel/asm/mpt/linked_list/linked_list.asm @@ -24,8 +24,8 @@ /// The values at the respective positions are: /// - 0: The account key /// - 1: The slot key -/// - 2: A ptr to the payload (the stored value) -/// - 3: A ptr to the initial payload. +/// - 2: The slot value. +/// - 3: The initial slot value. /// - 4: A ptr (in segment @SEGMENT_ACCOUNTS_LINKED_LIST) to the next node in the list. %macro store_initial_accounts