Counting Immutable Beans #82

Open
andorp opened this issue Mar 8, 2020 · 8 comments

andorp commented Mar 8, 2020

Implementation of the Counting Immutable Beans (CIB) approach for the GRIN compiler.

Summary

CIB works by instrumenting the original program. Instrumentation inserts four new instructions into the syntax, which fall into two categories:

  • reference counter instructions:
    • inc
    • dec
  • heap location reuse:
    • reset
    • reuse

In the CIB approach every heap location has a reference counter associated with it. Inc increments the counter of the location, and transitively increments those of all referred locations.
Dec decrements the counter of the location, and transitively decrements those of all referred locations.

Reset, reuse:

From the CIB paper:

let y = reset x.

If x is shared, then y is set to the special pointer value BOX.
If x is not shared, then reset decrements the reference counters of the components of x, and y is set to the heap location associated with x.

let z = reuse y in ctor_i w.

If y is BOX, reuse allocates a fresh heap location for the constructor.
If y is not BOX, the runtime reuses the heap location for storing the constructor.
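As an illustration (a sketch only, with illustrative names, written in the style of the paper's IR): a map over a list can reset the scrutinized cons cell and reuse it for the result cell, so that no new allocation happens when the input list is unshared:

map f xs = case xs of
  nil -> ret xs
  (cons x rest) ->
    let w  = reset xs;              -- BOX if xs is shared, otherwise the cell of xs
    let y  = apply f x;
    let ys = map f rest;
    let r  = reuse w in cons y ys;  -- allocates a fresh cell only if w is BOX
    ret r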

Applying the same idea to GRIN:

Differences: while Lean's IR puts every variable on the heap, GRIN treats variables as registers, and only a subset of these registers is associated with heap locations. A register is associated with a heap location if its type is Loc. This means the GRIN implementation of the CIB approach needs a type environment that tells which variables can be affected by the CIB operations.

In GRIN:

  • The CIB instrumentation should happen after the optimization steps.
  • A special interpreter should be implemented that handles the CIB instructions.
  • It should probably have its own LLVM code generator and an LLVM-implemented runtime, preferably as a plugin for the existing one.

We need to add four new instructions (a Haskell sketch of the syntax extension follows the list):

  • x <- reset y; where y is a heap location and x is a special heap location, which can also be BOX.
  • z <- reuse x y; where x is a special heap location created by reset, and y is a Node value.
  • z <- inc x; where x is a heap location; it transitively increments the reference counters of the reachable locations. Cycle detection is needed. The increment operation returns unit.
  • z <- dec x; where x is a heap location; it transitively decrements the reference counters of the reachable locations. Cycle detection is needed. When a counter reaches zero, the runtime must deallocate the location. The decrement operation returns unit.
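A minimal Haskell sketch of these four instructions as a syntax extension (the CibF name is taken from the implementation steps below; the Name and Val field types stand in for GRIN's variable names and node values and are otherwise illustrative):

{-# LANGUAGE DeriveFunctor #-}

-- Sketch only: the four CIB instructions as a base functor, to be
-- composed with the existing expression functor as ExprF :+: CibF.
data CibF r
  = Reset Name        -- x <- reset y   (y is a heap location)
  | Reuse Name Val    -- z <- reuse x y (x comes from reset, y is a node value)
  | Inc   Name        -- z <- inc x     (returns unit)
  | Dec   Name        -- z <- dec x     (returns unit)
  deriving Functor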

GRIN nodes store primitive values, but the runtime distinguishes location values from primitive values; it can therefore compute the transitive closure of the reachability relation of a location and manipulate the corresponding reference counters.

Each of the four instructions needs to be implemented in the GRIN runtime/interpreter.
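As a sketch of the interpreter side, here is what a transitive dec with cycle detection could look like (purely illustrative: the Loc, Cell and Heap types are assumptions, not GRIN's actual runtime representation):

import qualified Data.Map as Map
import qualified Data.Set as Set

type Loc  = Int
data Cell = Cell Int [Loc]        -- reference counter and referred locations
type Heap = Map.Map Loc Cell

-- Transitively decrement the counters reachable from a location, freeing
-- cells whose counter drops to zero; the visited set provides cycle detection.
dec :: Loc -> Heap -> Heap
dec root heap0 = snd (go (Set.empty, heap0) root)
  where
    go (visited, heap) loc
      | loc `Set.member` visited = (visited, heap)
      | otherwise =
          let visited' = Set.insert loc visited
          in case Map.lookup loc heap of
               Nothing -> (visited', heap)
               Just (Cell n ps)
                 | n <= 1    -> foldl go (visited', Map.delete loc heap) ps
                 | otherwise -> foldl go (visited', Map.insert loc (Cell (n - 1) ps) heap) ps

inc would be analogous, incrementing the counters and never deallocating.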

In the original paper, constructors can be reused only if they have the same arity. In GRIN, however, the runtime has to allocate heap cells based on the type of the heap location. This means every heap location can have its own arity, and reusing a heap location is only possible if the new node does not exceed the arity of that location. Otherwise a new cell needs to be allocated with the maximum arity.

The most important change is the reuse transformation. It rewrites certain instances of the store operation into the reuse operation.

Before:
x <- store y;

After:
z <- reset w;
...
x <- reuse z y;

In this case we should reuse the heap location associated with w only if w can accommodate all the possible values of x, that is, max-arity(w) >= max-arity(x). While Lean's approach uses the arity of the constructors in the case alternatives, we can use the abstract information covering all possible runs.
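A small sketch of that decision (illustrative: maxArity is assumed to be derived from the abstract heap produced by the HeapPointsTo analysis):

-- Reuse the location bound to w for x only if every node x may hold fits
-- into w's cell; otherwise the instrumentation keeps the original store.
shouldReuse :: (Name -> Int) -> Name -> Name -> Bool
shouldReuse maxArity w x = maxArity w >= maxArity x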

Implementation steps:

  1. Import the abstracting definitional interpreters from the mini-grin repo
  2. Change the implementation to use base functors instead of the Expr datatype
  3. Implement reference statistics with the new interpreter, as a warm-up exercise
  4. Implement the CIB program instrumentation for GRIN, producing an ExprF :+: CibF AST (see the sketch after this list)
  5. Implement interpreter for CIB extended GRIN program
  6. Extra: Implement LLVM codegen plugin for CIB instructions
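For step 4, a sketch of the functor coproduct implied by ExprF :+: CibF, in the data-types-à-la-carte style (only the :+: operator name comes from the step; ExprF is GRIN's existing base functor and CibF is the extension sketched above):

{-# LANGUAGE DeriveFunctor, TypeOperators #-}

-- An instrumented expression layer is either an ordinary GRIN layer (ExprF)
-- or a CIB instruction layer (CibF).
data (f :+: g) r = InL (f r) | InR (g r)
  deriving Functor

newtype Fix f = Fix (f (Fix f))

-- The AST produced by the CIB instrumentation pass.
type CibExpr = Fix (ExprF :+: CibF)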
andorp self-assigned this on Mar 8, 2020
andorp added the proposal label on Mar 8, 2020

bjorn3 commented Mar 8, 2020

Reference counting may have a higher total overhead than garbage collection. I don't know if re-using allocations saves enough time to make its overhead smaller than garbage collection's. I think it would be really useful to benchmark CIB against GC once it is implemented.


andorp commented Mar 8, 2020

There are some preliminary results in the Lean paper: https://arxiv.org/pdf/1908.05647.pdf
(screenshot of the preliminary results from the paper)


andorp commented Mar 8, 2020

About comparing GC with CIB: we can't do that measurement yet, as we only have a simple runtime implemented in C and LLVM, whose only form of memory management is allocation.

I agree that it would be nice to have something like a GC. What stops us? Mainly that our focus is currently not on the LLVM and runtime implementation. Why? We are changing the AST to accommodate Datalog as a possible way of implementing different kinds of HeapPointsTo analysis. The current proof-of-concept implementation can't handle programs bigger than 70k AST nodes.

The project is based around the idea that whole-program analysis of GRIN is equivalent to modern pointer analysis problems. Of course we can opt out of whole-program analysis, run the analysis at procedure or module level as other compilers do, and generate LLVM code from that level of optimized code. But our personal goal is to apply whole-program analysis to industrial-sized projects.

I think a simple GC-supported GRIN runtime implemented in LLVM would be a nice master's thesis for someone who is interested in such a topic. Another bachelor's or master's thesis could be the implementation of the CIB runtime in LLVM for GRIN.

Please shout if you know anybody who would be interested in doing such a thing. :)

andorp changed the title from Discuss proposal to Counting Immutable Beans on Apr 25, 2020
@Avi-D-coder

I have been researching, and slowly building, an FP-oriented concurrent copying GC (Sundial design document) for Rust. One of the project goals is to enable its use as the foundation of FP language runtimes, and I suspect in the long run it would be a good fit. It requires both the direct and the transitive points-to type information in order to achieve pauseless concurrent collection. While Sundial will support vtable/runtime polymorphism, that will come at a higher cost. My interpretation of the docs folder is that GRIN's points-to analysis provides this type info?

On a related note, is update always done with release ordering? Is a more efficient C11-style memory model planned?


savuori commented Mar 27, 2022

On the other hand, reference counting seems to be a prerequisite for the in-place mutation optimization that Koka (FBIP) and Roc use. That could be a really interesting optimization as well.


andorp commented Mar 27, 2022

I discontinued this development for the sole reason that I don't have enough time and resources in my life right now; it would require more than I can allocate to it. I am more than happy to discuss my learnings with anyone who would like to pick this up.

@GunpowderGuy

(quoting @Avi-D-coder's comment above about the Sundial GC)

Would GRIN use your GC by compiling to Rust code that uses its API?

@Avi-D-coder

@GunpowderGuy That was not what I was thinking, but sadly the research was never finished, so nothing can use it.
