Skip to content

Latest commit

 

History

History
152 lines (112 loc) · 6.48 KB

todo.md

File metadata and controls

152 lines (112 loc) · 6.48 KB

### The Grand Plan

  • figure out how to implement two-phase fetch/execute cycle...

  • get compiled output running on FPGA...

  • get rid of f4pga/yosys stuff...in favour of vivado...

  • use simple f4pga build CLI with flow.json & get rid of all makefiles etc?

  • get it running on the real FPGA again, just using LEDs as output (need to resolve some differences in hardware first??)

  • update readme including info on how to run hardware test benches...

  • hook up screen

  • the worlds worst text editor

    • add inverting blinking cursor
    • enable arrow keys
    • enable backspace
  • allow negative literal numbers (so that e.g. you can say let foo = -32768 or let foo = 0b1111000000000000)

  • allow break; in loop

  • allow continue; in loop

  • add hardware modulo

  • implement other games

    • pong
    • mandelbrot
    • tetris
    • game of life
    • asteroids
    • snake
    • game selector / home screen
  • get game of life running on real screen

  • hook up keyboard and get other games running using keyboard

  • link layer (ethernet)

  • internet layer (ip)

  • transport layer (tcp)

  • app layer (http)

  • review possible optimisations of pointer usage / pruning stuff

How call graph / pointer usage analysis works

Besides SP which is different because it's kind of global across all stack frames, there are 4 pointers that can be used by any stack frame:

LCL ARG THIS THAT

Conceptually, each stack frame has its own version of each of these. For the current stack frame, these are stored at memory addresses 2-5. When calling a function, the values for the current frame are saved on the stack, and then restored again when the callee returns.

However, we make some optimisations to avoid unnecessary work here. We record which pointers are used by which functions. This analysis is done at the level of the vm code.

A function is considered to have directly used the LCL pointer if it contains any commands involving pushing or popping locals.

Similarly, a function is considered to have directly used the ARG pointer if it contains any commands involving pushing or popping arguments.

(TODO - is this flawed? Since there may be other commands also which involve ARG...namely the call command?? Or is that OK, since it only writes to ARG, and never reads?? Do we need to add some notion of reading vs writing to our pointer usage analysis? Even if the current approach is sound, this may allow for further optimisations in terms of program size.)

A function is considered to have directly used THIS if any command pushes or pops to POINTER 0, or to THIS.

A function is considered to have directly used THAT if any command pushes or pops to POINTER 1, or to THAT.

How calling a function works

  • push the arguments
  • push return address
  • push the pointers from the current frame that need saving
  • set ARG pointer (if in "pointers"...)
  • set LCL pointer (if in "pointers"...)
  • jump to subroutine address

("pointers" is pointers used directly by the function, plus the ones that we'll need to restore when returning from the function?)

How returning from a function works

  • stash return value into R7
  • pop locals
  • work through saved caller pointers on stack frame, check whether they can simply be popped, or whether we need to restore them
  • stash return address in R8
  • place return value from R7
  • jump to return address

TODO

  • fix bug in call graph analysis to unblock game of life

get_next_state itself doesn't use the ARG pointer... so we think we're safe in not restoring it when we return from get_next_state. but we DO set the ARG pointer when CALLING get_next_state. so we need to take this into account?

optimisations

  • don't waste an instruction on zeroes and ones - add new VM command set e.g. set local 1 0, where the first argument is the offset, as usual, and the second arg is the value, which is restricted to 1, 0 or -1, so that it can be set in a single instruction without needing any pushes or pops

  • use set instruction whereever possible, e.g. for let i = 0

  • I think I'm shuffling stuff around r7 or r8 unnecessarily sometimes?

  • eliminate unnecessary pushing/popping of return values

  • drawRectangle

  • create web IO implementation for web debugger?

  • make code panels draggable/collapsible?

  • do compiling in browser with webapp, using wasm-bindgen to get result with sourcemap - then can use types generated by wasm-bindgen - no need for ts_rs?

  • read docs on wasm-bindgen https://rustwasm.github.io/docs/wasm-bindgen/examples/index.html

  • "realtime play" - will require windowing code-panels to improve perf

  • show contextual jack node

  • include linting and tsc in test suite

  • void return values - don't push zero just to pop it again

perf/code-size optimisations

  • identical code folding - automatic extraction of subroutines
  • use an even smaller font e.g. this 5x5 px one https://www.dafont.com/5x5.font (and caps only?)
  • peephole optimisation of vm code

emulator enhancements

  • report stack overflows etc
  • add stepping ability
    • step line of jack code
    • step line of vm code
    • step asm instruction

refactoring

  • make error handling and reporting more consistent in parsers
  • maybe get rid of clap and parse cli args myself

TODO

  • write full stdlib
  • implement vector module
  • add SCREEN and GLYPHS variable for use in jack code
  • allow use of e.g. var int[4] foo; to declare fixed-length arrays to be allocated in static section, or on stack. this could make the code in Memory.jack much neater
  • figure out limits of current algo for two's complement multiplication - is there a simple failing example for a small negative number?
  • booth's algo? or...read this: https://pages.cs.wisc.edu/~markhill/cs354/Fall2008/beyond354/int.mult.html ?
  • check arg count equals param count? might be difficult - would need to look across classes sometimes...

jack extras

  • for loops
  • pointers
  • rudimentary typechecking, arity checking etc? might be tricky...would need to allow some coercions - e.g. obj to int for Memory.dealloc, array to obj for constructors.
  • break/continue

hardware

  • add timer?
  • add another hardware register?
  • bitshift - this would be really helpful - to find places for using it, search code for division, multiplication, pow2...
  • integer multiplication / division?
  • floating point math?
  • ethernet...
  • graphics: fix flickering by assigning a "don't draw" register which programs can use to flag when frame buffer is in an inconsistent state, and which the computer will read to decide whether or not to actually refresh the screen. will need to figure out how to make this work on the fpga too!