Commit

Updated BF16 comment.
FinnWilkinson committed Nov 6, 2024
1 parent 1811569 commit 9b13a5c
Showing 1 changed file with 6 additions and 9 deletions.
15 changes: 6 additions & 9 deletions src/lib/arch/aarch64/Instruction_execute.cc
@@ -533,10 +533,9 @@ void Instruction::execute() {
 float zn1, zn2, zm1, zm2;
 // Horrible hack in order to convert bf16 (currently stored in a
 // uint16_t) into a float.
-// Each bf16 is copied into the least significant 16-bits of each
-// float variable.
-// Need to re-interpret each float destination as a uint16_t* inside
-// the memcpy so that the least-significant bits can be accessed.
+// Each bf16 is copied into the most significant 16-bits of each
+// float variable; given IEEE FP32 and BF16 have the same width
+// exponent and one sign bit.
 memcpy((uint16_t*)&zn1 + 1, &zn[2 * i], 2);
 memcpy((uint16_t*)&zn2 + 1, &zn[2 * i + 1], 2);
 memcpy((uint16_t*)&zm1 + 1, &zm[2 * zmIndex], 2);
@@ -2260,11 +2259,9 @@ void Instruction::execute() {
 float zn1, zn2, zm1, zm2;
 // Horrible hack in order to convert bf16 (currently stored in a
 // uint16_t) into a float.
-// Each bf16 is copied into the least significant 16-bits of each
-// float variable.
-// Need to re-interpret each float destination as a uint16_t*
-// inside the memcpy so that the least-significant bits can be
-// accessed.
+// Each bf16 is copied into the most significant 16-bits of each
+// float variable; given IEEE FP32 and BF16 have the same width
+// exponent and one sign bit.
 memcpy((uint16_t*)&zn1 + 1, &zn[2 * row], 2);
 memcpy((uint16_t*)&zn2 + 1, &zn[2 * row + 1], 2);
 memcpy((uint16_t*)&zm1 + 1, &zm[2 * col], 2);
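
The updated comment describes widening a bf16 bit pattern into the upper half of an FP32 encoding. For comparison, here is a minimal sketch of the same conversion written as a standalone helper. The name `bf16ToFloat` is hypothetical and is not part of this commit or the repository. Unlike the `memcpy` into `(uint16_t*)&zn1 + 1`, which reaches the most significant 16 bits only on a little-endian host and leaves the low 16 bits of the float unwritten, this version builds the full 32-bit pattern first and so does not depend on byte order.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not in the repository): widen a raw bf16 bit
// pattern, stored in a uint16_t, to an IEEE FP32 value.
inline float bf16ToFloat(uint16_t bits) {
  // bf16 has the same sign bit and 8-bit exponent as FP32, so its 16 bits
  // are exactly the top 16 bits of the equivalent FP32 encoding. Shifting
  // left by 16 places them there and zeroes the low 16 mantissa bits.
  uint32_t widened = static_cast<uint32_t>(bits) << 16;

  float out;
  static_assert(sizeof(out) == sizeof(widened), "float must be 32 bits wide");
  // memcpy is the well-defined way to reinterpret the bit pattern.
  std::memcpy(&out, &widened, sizeof(out));
  return out;
}
```

With a helper like this, a call such as `zn1 = bf16ToFloat(zn[2 * i]);` would stand in for each `memcpy` line, assuming `zn` holds `uint16_t` elements as the comment in the diff suggests.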