vertexcodec: Optimize encoding selection of zero groups
When checking whether a byte group can be encoded as zero, we need to check 16 bytes; to reduce branch mispredictions we can load the byte group into two 64-bit registers and test their bitwise OR against zero. This results in slightly suboptimal codegen for gcc, but is optimal for clang/MSVC.

The same function can also be used to determine whether a given vertex block can use zero encoding as a control mode. When zero encoding is selected, this scans the bytes faster and does not rely on auto-vectorization, which sometimes generates rather poor code for this case.

This change makes encoding ~5-10% faster depending on the data.
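
A minimal sketch of the 16-byte zero check described above, for illustration only. The names `isZeroGroup16` and `isZeroBlock` are hypothetical and are not taken from this commit, and the sketch assumes the block size is a multiple of 16 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true if all 16 bytes of a group are zero.
// The group is loaded as two 64-bit words and tested with a single
// comparison instead of a per-byte loop with 16 potential branches.
static bool isZeroGroup16(const unsigned char* data)
{
    uint64_t lo, hi;
    std::memcpy(&lo, data, sizeof(lo));     // memcpy avoids unaligned/aliasing issues
    std::memcpy(&hi, data + 8, sizeof(hi)); // and compiles to a plain load on common targets
    return (lo | hi) == 0;
}

// Hypothetical helper: returns true if every 16-byte group in a block is zero,
// i.e. the whole block could use a zero encoding.
// Assumption: size is a multiple of 16.
static bool isZeroBlock(const unsigned char* data, size_t size)
{
    assert(size % 16 == 0);

    for (size_t i = 0; i < size; i += 16)
        if (!isZeroGroup16(data + i))
            return false;

    return true;
}
```

The OR of the two words is zero only if every bit in both words is zero, so a single compare covers all 16 bytes regardless of byte order.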