Banjo Kazooie checksum routine
Update: A complete comparison of the disassembly in all 4 games is here: https://github.com/bryc/rare-n64-chksm/wiki/Checksum-code-disassembly. So this old page has repetitive info.
8025C164: Load Byte at Address: S0 (First)
8025C164: LBU T8, 0x0000 (S0) ; T8 = bytes[i], S0=address to read byte
8025C168: LW T5, 0x004C (SP) ; T5 = (mem_DWORD & 0xFFFFFFFF);
8025C16C: ANDI T9, S1, 0x000F ; T9 = S1 & 0x000F;
8025C170: SLLV T0, T8, T9 ; T0 = T8 << T9;
8025C174: LW T4, 0x0048 (SP) ; T4 = (mem_DWORD >> 32);
8025C178: ADDU T7, T0, T5 ; T7 = T0 + T5;
8025C17C: SRA T2, T0, 0x1F ; T2 = T0 >> 0x1F; (arithmetic shift: T2 is the sign of T0 extended to a full word)
8025C180: SLTU AT, T7, T5 ; if(T7 < T5){AT = 1;}else{AT = 0;}
8025C184: ADDU T6, AT, T2 ; T6 = AT + T2;
8025C188: ADDU T6, T6, T4 ; T6 = T6 + T4;
8025C18C: SW T6, 0x0048 (SP) ; upper 32 bits of mem_DWORD = T6;
8025C190: SW T7, 0x004C (SP) ; lower 32 bits of mem_DWORD = T7; so mem_DWORD = (T6 << 32) + T7;
8025C194: JAL 0x8025C29C ; Jump and set RA = 0x8025C19C (next instruction after OR)
8025C198: OR A0, S2, R0 ; A0 = S2 | R0
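The block above is a 64-bit add done with 32-bit registers: the shifted byte is added to the low word, and the carry (plus the sign word from SRA) goes into the high word. A minimal Python sketch of that step, for illustration only; the function name `add_shifted_byte` and its argument names are made up here and are not from the game:

```python
MASK32 = 0xFFFFFFFF

def add_shifted_byte(hi, lo, byte, s1):
    """Add (byte << (s1 & 15)) to a 64-bit value kept as two 32-bit words.

    hi/lo stand in for the words at SP+0x48 and SP+0x4C (hypothetical names).
    """
    t0 = (byte << (s1 & 0x000F)) & MASK32   # ANDI + SLLV: 32-bit shift result
    t2 = MASK32 if t0 & 0x80000000 else 0   # SRA by 31: sign word (0 for any 8-bit byte)
    t7 = (t0 + lo) & MASK32                 # ADDU: new low word, wraps at 32 bits
    at = 1 if t7 < lo else 0                # SLTU: carry out of the low word
    t6 = (at + t2 + hi) & MASK32            # two ADDUs: new high word
    return t6, t7

# Low word overflows, so the carry lands in the high word.
hi, lo = add_shifted_byte(0x00000000, 0xFFFFFFFF, 0x01, 0)
print(hex(hi), hex(lo))
```

Since a byte shifted left by at most 15 never sets bit 31, the SRA result is always zero in practice, and the whole step behaves like `mem_DWORD += byte << (S1 & 0xF)`.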
8025C29C: Meat of the algorithm (Bit shifting, XOR etc)
8025C29C: LD A3, 0x0000 (A0) ; A3 = mem_DWORD;
8025C2A0: DSLL32 A2, A3, 0x1F ; A2 = A3 << (0x1F + 32);
8025C2A4: DSLL A1, A3, 0x1F ; A1 = A3 << 0x1F;
8025C2A8: DSRL A2, A2, 0x1F ; A2 = A2 >> 0x1F;
8025C2AC: DSRL32 A1, A1, 0x0 ; A1 = A1 >> (0x00 + 32);
8025C2B0: DSLL32 A3, A3, 0xC ; A3 = A3 << (0x0C + 32);
8025C2B4: OR A2, A2, A1 ; A2 = A2 | A1;
8025C2B8: DSRL32 A3, A3, 0x0 ; A3 = A3 >> (0x00 + 32);
8025C2BC: XOR A2, A2, A3 ; A2 = A2 ^ A3;
8025C2C0: DSRL A3, A2, 0x14 ; A3 = A2 >> 0x14;
8025C2C4: ANDI A3, A3, 0x0FFF ; A3 = A3 & 0x0FFF;
8025C2C8: XOR A3, A3, A2 ; A3 = A3 ^ A2;
8025C2CC: DSLL32 V0, A3, 0x0 ; V0 = A3 << (0x00 + 32);
8025C2D0: SD A3, 0x0000 (A0) ; mem_DWORD = A3;
8025C2D4: JR RA ; (Jump to RA 0x8025C19C)
8025C2D8: DSRA32 V0, V0, 0x0 ; V0 = V0 >> (0x00 + 32); (JR delay slot, arithmetic shift: V0 = lower 32 bits of A3, sign-extended)
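To make the shifts easier to follow, here is a rough instruction-by-instruction transcription of the routine above in Python. It is only a sketch: the name `scramble` is made up, registers are modelled as unsigned 64-bit integers, and the return value is given as the plain low 32 bits (the hardware sign-extends them into V0).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def scramble(state):
    """Return (new_state, v0) for one call of the routine at 0x8025C29C."""
    a3 = state                          # LD     A3, 0(A0)
    a2 = (a3 << 63) & MASK64            # DSLL32 A2, A3, 0x1F
    a1 = (a3 << 31) & MASK64            # DSLL   A1, A3, 0x1F
    a2 = a2 >> 31                       # DSRL   A2, A2, 0x1F  (bit 0 ends up at bit 32)
    a1 = a1 >> 32                       # DSRL32 A1, A1, 0     (bits 1..32 move to 0..31)
    a3 = (a3 << 44) & MASK64            # DSLL32 A3, A3, 0xC
    a2 = a2 | a1                        # OR     A2, A2, A1
    a3 = a3 >> 32                       # DSRL32 A3, A3, 0     (low 20 bits, shifted left by 12)
    a2 = a2 ^ a3                        # XOR    A2, A2, A3
    a3 = (a2 >> 20) & 0x0FFF            # DSRL + ANDI
    a3 = a3 ^ a2                        # XOR    A3, A3, A2
    v0 = a3 & 0xFFFFFFFF                # DSLL32 / DSRA32 pair keeps the low word
    return a3, v0                       # SD A3, 0(A0) stores the new state

new_state, v0 = scramble(1)
print(hex(new_state), hex(v0))
```

Note that the new state never uses more than 33 bits: bit 32 comes from the old bit 0, and everything else lives in the low word.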
8025C19C: Jump back to increase counters, then loop until S0 == S5. (First)
8025C19C: ADDIU S0, S0, 0x0001 ; (S0 = S0 + 0x0001) or (i++)
8025C1A0: ADDIU S1, S1, 0x0007 ; S1 = S1 + 0x0007;
8025C1A4: BNE S0, S5, 0x8025C164 ; Loop until S0 == S5. OR: for as long as (S0 < S5) / (i < sizeof(bytes))
8025C1A8: XOR S3, S3, V0 ; S3 = S3 ^ V0; (branch delay slot, so this runs on every iteration)
Update S0 and S5 for the 2nd loop, then continue on to the 2nd loop.
8025C1AC: LW A3, 0x0058 (SP) ; load start address? SP + 0x58
8025C1B0: ADDIU S0, S5, 0xFFFF ; S0 = S5 - 1 (start address = stop address minus 1)
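8025C1B4: SLTU AT, S0, A3 ; if (S0 < A3) { AT = 1; } else { AT = 0; }
8025C1B8: BNEZ AT, 0x8025C20C ; if S0 < A3 there is nothing to read: skip the 2nd loop (0x8025C20C is the instruction right after it)
8025C1BC: ADDIU S2, SP, 0x0048 ; S2 = SP + 0x48 (delay slot; S2 probably already holds this value, so S2 = S2)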
8025C1C0: ADDIU S5, A3, 0xFFFF ; S5 = A3 - 1 (stop address = start address minus 1)
8025C1C4: Load Byte at Address: S0 (Second)
8025C1C4: LBU T1, 0x0000 (S0) ; T1 = bytes[i];
8025C1C8: LW T3, 0x004C (SP) ; T3 = (mem_DWORD & 0xFFFFFFFF);
8025C1CC: ANDI T8, S1, 0x000F ; T8 = S1 & 0x000F;
8025C1D0: SLLV T9, T1, T8 ; T9 = T1 << T8;
8025C1D4: LW T2, 0x0048 (SP) ; T2 = (mem_DWORD >> 32);
8025C1D8: ADDU T5, T9, T3 ; T5 = T9 + T3;
8025C1DC: SRA T0, T9, 0x1F ; T0 = T9 >> 0x1F;
8025C1E0: SLTU AT, T5, T3 ; if (T5 < T3) { AT = 1; } else { AT = 0; }
8025C1E4: ADDU T4, AT, T0 ; T4 = AT + T0;
8025C1E8: ADDU T4, T4, T2 ; T4 = T4 + T2;
8025C1EC: SW T4, 0x0048 (SP) ; upper 32 bits of mem_DWORD = T4;
8025C1F0: SW T5, 0x004C (SP) ; lower 32 bits of mem_DWORD = T5; so mem_DWORD = (T4 << 32) + T5;
8025C1F4: JAL 0x8025C29C ; Jump and set RA = 0x8025C1FC (next instruction after OR)
8025C1F8: OR A0, S2, R0 ; A0 = S2 | R0
8025C29C: Same algorithm as before
8025C29C: LD A3, 0x0000 (A0) ; A3 = mem_DWORD;
8025C2A0: DSLL32 A2, A3, 0x1F ; A2 = A3 << (0x1F + 32);
8025C2A4: DSLL A1, A3, 0x1F ; A1 = A3 << 0x1F;
8025C2A8: DSRL A2, A2, 0x1F ; A2 = A2 >> 0x1F;
8025C2AC: DSRL32 A1, A1, 0x0 ; A1 = A1 >> (0x00 + 32);
8025C2B0: DSLL32 A3, A3, 0xC ; A3 = A3 << (0x0C + 32);
8025C2B4: OR A2, A2, A1 ; A2 = A2 | A1;
8025C2B8: DSRL32 A3, A3, 0x0 ; A3 = A3 >> (0x00 + 32);
8025C2BC: XOR A2, A2, A3 ; A2 = A2 ^ A3;
8025C2C0: DSRL A3, A2, 0x14 ; A3 = A2 >> 0x14;
8025C2C4: ANDI A3, A3, 0x0FFF ; A3 = A3 & 0x0FFF;
8025C2C8: XOR A3, A3, A2 ; A3 = A3 ^ A2;
8025C2CC: DSLL32 V0, A3, 0x0 ; V0 = A3 << (0x00 + 32);
8025C2D0: SD A3, 0x0000 (A0) ; mem_DWORD = A3;
8025C2D4: JR RA ; (Jump to RA 0x8025C1FC)
8025C2D8: DSRA32 V0, V0, 0x0 ; V0 = V0 >> (0x00 + 32); (JR delay slot, arithmetic shift: V0 = lower 32 bits of A3, sign-extended)
8025C1FC: Jump back to increase/decrease counters, then loop until S0 == S5. (Second)
8025C1FC: ADDIU S0, S0, 0xFFFF ; S0 = S0 - 1 (the immediate 0xFFFF is sign-extended to -1), or (i--)
8025C200: ADDIU S1, S1, 0x0003 ; S1 = S1 + 0x0003;
8025C204: BNE S0, S5, 0x8025C1C4 ; Loop until S0 == S5. OR: for as long as (S0 > S5) / (i >= 0)
8025C208: XOR S4, S4, V0 ; S4 = S4 ^ V0; (branch delay slot, so this runs on every iteration)
S3 and S4 should contain the component checksums.
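Putting the two loops together, the whole thing can be sketched in Python roughly as below. This is a reading of the disassembly on this page, not verified game code: the function names are made up, the scrambling step is written in a compact form equivalent to the routine at 0x8025C29C, and the starting values of mem_DWORD, S1, S3 and S4 are not shown in this listing, so they are left as parameters (the zero defaults are placeholders, not known values).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def _scramble(state):
    # Compact form of the routine at 0x8025C29C.
    x = ((state & 1) << 32) | ((state >> 1) & MASK32)
    x ^= (state << 12) & MASK32
    x ^= (x >> 20) & 0x0FFF
    return x

def component_checksums(data, state, s1=0, s3=0, s4=0):
    """Return (s3, s4) after the forward and backward passes over data."""
    # 1st loop: first byte to last byte, S1 += 7, results XORed into S3.
    for byte in data:
        state = (state + (byte << (s1 & 0x0F))) & MASK64
        state = _scramble(state)
        s3 ^= state & MASK32
        s1 += 7
    # 2nd loop: last byte back to first byte, S1 += 3, results XORed into S4.
    # mem_DWORD and S1 carry over from the 1st loop.
    for byte in reversed(data):
        state = (state + (byte << (s1 & 0x0F))) & MASK64
        state = _scramble(state)
        s4 ^= state & MASK32
        s1 += 3
    return s3, s4

# Placeholder starting state of 0, purely to show the call shape.
s3, s4 = component_checksums(bytes([0x01]), state=0)
print(hex(s3), hex(s4))
```

The Python loops simply do nothing for empty input, whereas the assembly only has an explicit empty-range check before the 2nd loop.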
The routine then immediately saves them to memory and restores the saved registers:
checksum_save:
lw t6,96(sp)
sw s3,0(t6) ; GoldenEye and Banjo-Tooie use
sw s4,4(t6) ; s3 and s4 raw as a u64 value
lw ra,44(sp)
lw s5,40(sp)
lw s4,36(sp)
lw s3,32(sp)
lw s2,28(sp)
lw s1,24(sp)
lw s0,20(sp)
jr ra
addiu sp,sp,88
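For the "raw as a u64" case mentioned in the comments, the two stores put s3 at offset 0 and s4 at offset 4. On a big-endian machine like the N64 that makes s3 the upper half of the 64-bit value. A small Python illustration (the function name is made up):

```python
import struct

def pack_raw_u64(s3, s4):
    """Combine s3 (offset 0) and s4 (offset 4) the way a big-endian u64 read would."""
    raw = struct.pack(">II", s3 & 0xFFFFFFFF, s4 & 0xFFFFFFFF)  # two 32-bit stores
    return struct.unpack(">Q", raw)[0]                          # one 64-bit load

print(hex(pack_raw_u64(0x11223344, 0x55667788)))
```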
Checksum verification happens in a different area of memory, which is also where the final XOR is located:
8033C040: SW V1, 0x0000 (A1) ; Store V1 (the stored checksum?) to the address in A1
8033C044: LW T7, 0x0028 (SP) ; Load S3 to T7?
8033C048: LW T6, 0x002C (SP) ; Load S4 to T6?
8033C04C: SW RA, 0x0014 (SP) ; Save return address?
8033C050: XOR T8, T6, T7 ; T8 = T6 ^ T7; (the final XOR of the two component checksums)
8033C054: BEQ V1, T8, 0x8033C068 ; Branch if EQual: if (V1 == T8), go to 0x8033C068.
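In other words, Banjo-Kazooie's final checksum appears to be a single 32-bit word, the XOR of the two components, which is then compared against the stored value. A tiny Python sketch of that check, assuming T6/T7 really do hold S3/S4 as guessed above; the function names are illustrative only:

```python
def final_checksum(s3, s4):
    """XOR of the two 32-bit component checksums (XOR T8, T6, T7)."""
    return (s3 ^ s4) & 0xFFFFFFFF

def checksum_matches(stored, s3, s4):
    """Mirror of the BEQ: True when the stored checksum equals the computed one."""
    return (stored & 0xFFFFFFFF) == final_checksum(s3, s4)

print(checksum_matches(0xFFFF0000, 0x0F0F0F0F, 0xF0F00F0F))
```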