Skip to content
This repository has been archived by the owner on Nov 29, 2020. It is now read-only.

WIP - 24 lazy val optimization for pure code #41

Closed
wants to merge 10 commits into from
Closed

WIP - 24 lazy val optimization for pure code #41

wants to merge 10 commits into from

Conversation

heyrutvik
Copy link
Contributor

Related: #24

Component will generate single flag for entire class/object. For each lazy value, flag value (with which we perform and operation in if condition) will be dynamically computed using bitwise operation 1 << n.

class LazyFoo {
  lazy val foo: Int = 1
  lazy val bar: Double = 2.0
  lazy val baz: Boolean = false
}

object LazyFoo {
  lazy val qux: String = "scalaz"
}

will transform to

class LazyFoo {
  private[this] var $lazyflag$1: Int = 0;

  private[this] var foo$value$0: Int = _;
  def foo: Int = {
    if (LazyFoo.this.$lazyflag$1.&(1).!=(0))
      {
        LazyFoo.this.$lazyflag$1_=(LazyFoo.this.$lazyflag$1.|(1));
        LazyFoo.this.foo$value$0_=(1)
      };
    LazyFoo.this.foo$value$0
  };
  
  private[this] var bar$value$1: Double = _;
  def bar: Double = {
    if (LazyFoo.this.$lazyflag$1.&(2).!=(0))
      {
        LazyFoo.this.$lazyflag$1_=(LazyFoo.this.$lazyflag$1.|(2));
        LazyFoo.this.bar$value$1_=(2.0)
      };
    LazyFoo.this.bar$value$1
  };
  
  private[this] var baz$value$2: Boolean = _
  def baz: Boolean = {
    if (LazyFoo.this.$lazyflag$1.&(4).!=(0))
      {
        LazyFoo.this.$lazyflag$1_=(LazyFoo.this.$lazyflag$1.|(4));
        LazyFoo.this.baz$value$2_=(false)
      };
    LazyFoo.this.baz$value$2
  }
}
object LazyFoo {
  private[this] var $lazyflag$2: Int = 0;
  
  private[this] var qux$value$0: String = _
  def qux: String = {
    if (LazyFoo.this.$lazyflag$2.&(1).!=(0))
      {
        LazyFoo.this.$lazyflag$2_=(LazyFoo.this.$lazyflag$2.|(1));
        LazyFoo.this.qux$value$0_=("scalaz")
      };
    LazyFoo.this.qux$value$0
  }
}


if (containsLazyVals) {
// create var flag for whole block
val flagName = freshTermName("$lazyflag$")(currentFreshNameCreator)
Copy link
Contributor

@sir-wabbit sir-wabbit Jul 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there are more than 32 lazy vals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.. I haven't considered that.
How about Long? But then we will stuck with 64 lazy vals.
What do you suggest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generate ((n + 31) / 32) Ints.

cd.mods,
cd.name,
cd.tparams,
processBody(cd.symbol.enclClass, tmpl))
Copy link
Contributor

@sir-wabbit sir-wabbit Jul 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a test to see what happens with override lazy val, override val overriding lazy val and lazy vals in traits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will test it.. If it won't work, will make it work :)

@heyrutvik heyrutvik changed the title 24 lazy val optimization for pure code WIP - 24 lazy val optimization for pure code Jul 27, 2018
@fommil
Copy link

fommil commented Jul 27, 2018

the var should be @volatile (and probably @transient) ... unless you didn't want to support that to avoid the perf hit?

BTW incase you haven't seen it: https://github.com/runarorama/latr

Also you may as well use Long if you're doing bitwise ops on it, since modern JVMs are 64bit and use 64 bits to represent Int anyway...

@heyrutvik
Copy link
Contributor Author

@fommil
I think we are only take account for pure code as described by @alexknvl in #24. So, I haven't added volatile annotation.
re: Long. Cool.. thanks for suggestion. I will change Ints to Longs.

@fommil
Copy link

fommil commented Jul 27, 2018

@heyrutvik it doesn't matter if it's pure or not: @volatile is about visibility between threads. Or do you not care about duplicated work across threads?

@heyrutvik
Copy link
Contributor Author

Yes @fommil, I know @volatile is about memory visibility but I choose not use it deliberately because issue says,

Replace all lazy vals with the cheapest version of https://github.com/runarorama/latr (@Lazify).

Which doesn't use @volatiles.

If the code is pure, re-evaluating a thunk twice is not a problem and the probability of that happening is minuscule anyway.

So, yes I guess I do not care about duplicated work across threads.

If you think we should not be doing this, then I suggest you to talk about it with @alexknvl. :)

@Blaisorblade
Copy link

Or do you not care about duplicated work across threads?

This is useful, but pure initializers don't guarantee thread safety. Duplicated work doesn't affect correctness (only performance), but that doesn't mean that "non-thread-safe" lazy vals are thread-safe simply if the initializer is pure. For the semantics you want, I think it helps if fields reachable for the lazy val are final, which they are for Scala val.

Concretely: If two threads race initializing x, defined as lazy val x: List[T] = ..., and thread A wins, the contents of x might only be visible to thread A and not thread B — you'll need a barrier to publish them across CPUs. That's because of scala/collection-strawman#50.

Indeed, "thread safety for pure initializers" is not what SIP-20 promises:

This version is faster than all other versions on benchmarks but does not correctly handle safe publication in the case of multiple threads. It can be used in applications that utilize multiple threads if some other means of safe publication is used instead.

Understanding that sentence assumes proficiency with the Java Memory Model, and just making the initializer pure doesn't help.

@sir-wabbit
Copy link
Contributor

Having safe publication is quite important when dealing with List[A] or nested lazy vals (since lazy involves mutability), and @volatile is incredibly cheap in the single-threaded case. So I think it is a good idea to add it by default.

@fommil
Copy link

fommil commented Jul 27, 2018

Actually I agree with the decision to not use @volatile but I think it should be documented because people like me will wonder if it is intentional or not 😄

@sir-wabbit
Copy link
Contributor

sir-wabbit commented Jul 27, 2018

What if instead we use this encoding:

lazy val foo: T = init

// Casts and the type of foo$value can be specialized
// depending on how much we know about T.
private[this] var foo$value: AnyRef = null;
def foo: T = {
  if (foo$value eq null) {
    val value: T = init
    foo$value = value.asInstanceOf[AnyRef]
    value.asInstanceOf[T]
  } else foo$value.asInstanceOf[T]
};

It will incur a single unboxing on access in the case of where T is primitive.

EDIT: This one should take care of safe-publishing (?):

protected[this] @volatile var $publish: Int = 0 // can be shared across all lazy vals
private[this] var foo$value: AnyRef = null;
def foo: T = {
  if (foo$value eq null) {
    val value: T = init
    foo$value = value.asInstanceOf[AnyRef]
    $publish = 0 // write-barrier
    value.asInstanceOf[T]
  } else foo$value.asInstanceOf[T]
};

@sir-wabbit
Copy link
Contributor

sir-wabbit commented Jul 27, 2018

To reduce memory allocation, I propose we put

public class Fence {
  private static volatile int value = 0;
  public static void write() { value = 0; }
  public static void read() { int x = value; }
  public static void readWrite() { value += 1; }
}

into the library that comes with the plugin and then do

private[this] var foo$value: AnyRef = null;
def foo: T = {
  if (foo$value eq null) {
    val value: T = init
    foo$value = value.asInstanceOf[AnyRef]
    Fence.write()
    value.asInstanceOf[T]
  } else foo$value.asInstanceOf[T]
};

EDIT: Need to test that this actually works.

@sir-wabbit
Copy link
Contributor

sir-wabbit commented Jul 27, 2018

After thinking a bit more about this issue, I am no longer sure that we actually want a memory barrier. I don't have expectations that non-immutable data-structures will be safely published across different threads even when I use val. Scala 2.13 will use memory fencing in ListBuffer that should prevent any problems with safe-publishing List[A]: scala/scala#6425

private[this] var foo$value: AnyRef = null;
def foo: T = {
  if (foo$value eq null) {
    val value: T = init
    // Write-fence here will ensure that value is published
    // before foo$value is changed, but do we actually need it
    // if everything is properly immutable?
    foo$value = value.asInstanceOf[AnyRef]
    value.asInstanceOf[T]
  } else foo$value.asInstanceOf[T]
};

@sir-wabbit
Copy link
Contributor

Tested that at least on x86 a write barrier using a volatile int works.

import scala.collection.mutable.ListBuffer

object App {
  def main(args: Array[String]): Unit = {
    new X().runTest()
  }
}

class Foo {
  var var_ : List[String] = null
  def reset: Unit = { var_ = null }
  def get: List[String] = {
    val current = var_
    if (current eq null) {
      val lb = new ListBuffer[String]
      lb.+=("")
      lb.+=("")
      val result = lb.toList

      // Fence.write()
      // without the fence, you'll see 
      // 285676610937800 Thread: 53 size is 1
      // every once in a while
      var_ = result
      result
    } else current
  }
}

class X extends Runnable {
  var x: Foo = new Foo

  def run() {
    val threadId = Thread.currentThread().getId
    println(s"Thread $threadId started")
    while (true) {
      val size = x.get.size
      if (size != 2) println(f"${System.nanoTime()}%10d Thread: $threadId%2d size is $size%d")
      // Thread.sleep(1)
    }
  }

  def runTest(): Unit = {
    Seq.fill(50)(new Thread(this)).foreach(_.start())

    println("Starting mutation")
    while (true) {
      x.reset
    }
  }
}

@heyrutvik
Copy link
Contributor Author

heyrutvik commented Jul 28, 2018

@alexknvl re: #41 (comment)
I think we are expecting and encouraging the use of immutable values so memory barriers are off the table.

👍 for private[this] var foo$value: AnyRef = null; def foo: T = ... encoding for lazy val foo.

@Blaisorblade
Copy link

As Alex pointed out, lazy vals are pure but implemented with mutation, so a barrier is needed there.
If you don't want it, it's fine, just don't describe the result as thread safe for pure code, since it isn't. Users who care about multiple threads can say so with a @volatile lazy val.

@sir-wabbit
Copy link
Contributor

sir-wabbit commented Jul 28, 2018

@Blaisorblade It is thread-safe if all of your code avoids mutation. You could say that there is a bug in List[A]: it is precisely because tail is mutable and assigned without a write-barrier, lazy val becomes thread-unsafe without a write-barrier as well. However, if List[A] was properly fenced, that would not be a problem. I agree that it is nice to have an option to add a barrier.

A properly written IORef has var as either @volatile or as an AtomicReference, so there should not be any non-volatile vars in purely functional code.

@Blaisorblade
Copy link

@alexknvl I agree List is questionable, so forget that. But you pointed out correctly the issue with nested lazy vals, and that’s what I was talking about.
I might be wrong, but it seems that nested lazy vals are allowed in Scalazzi (and they should IMHO, they’re allowed in Haskell) and using them does not count as using mutation. Hence, it takes a barrier on the outermost lazy val to be thread-safe.
But apart from my opinions, it’d be good that statements about thread-safety were formal enough to be either true or false, rather than “it depends on what exactly you mean by mutable”.

@Blaisorblade
Copy link

@alexknvl BTW, to actually make the barriers be guaranteed to work, you’ll also need to read from a volatile on the read path, otherwise there’s no JMM guarantee as you don’t have a happens-before edge. X86 might be more lenient, or you might have been lucky, but I don’t trust synchronization code based on testing (and IMO you shouldn’t either): you need to use the relevant math (JMM), even tho it’s not phrased through types. (Studying platform-specific memory models is also acceptable).

@edmundnoble
Copy link
Contributor

edmundnoble commented Jul 28, 2018

As @Blaisorblade says, your "fence" is not an actual fence @alexknvl. x86 is the worst architecture to test on for this, but even if you wanted to do so you should use jcstress. See https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/#wishful-unobserved-volatiles for a specific example of why this strategy doesn't work.

Edit: Interesting case from the post, apparently Graal can entirely elide the "barrier".

@fommil
Copy link

fommil commented Jul 28, 2018

Will this encoding allow variables captured in the scope of init to be garbage collected? That's a pretty big deal, e.g. for Need

@sir-wabbit
Copy link
Contributor

sir-wabbit commented Jul 28, 2018

@Blaisorblade @edmundnoble Right. We can use the one from scala/scala#6425 if need be.

@fommil 👍 Haven't even thought about it. We could ensure garbage collection using:

final class $Init[A](final val init: () => A) // defined once somewhere

private[this] var foo$value: AnyRef = new $Init(() => init);
def foo: T = {
  val old = foo$value
  if (old.getClass eq classOf[$Init[_]]) {
    val value: AnyRef = old.asInstanceOf[$Init[AnyRef]].init()
    foo$value = value
    value.asInstanceOf[T]
  } else old.asInstanceOf[T]
}

EDIT: Need to test whether the resulting bytecode closes over constructor variables inside of () => init rather than pushing them into this.

EDIT2: Neither the above snippet nor

trait $Init {
  def apply(): AnyRef
}

class LazyCellTest(i: Int, a: String) {
  private[this] var foo$value: AnyRef = new $Init {
    def apply() = { a + i }
  }

  def foo: String = {
    val old = foo$value
    if (old.getClass eq classOf[$Init]) {
      val value: AnyRef = old.asInstanceOf[$Init].apply()
      foo$value = value
      value.asInstanceOf[String]
    } else old.asInstanceOf[String]
  }
}

work w.r.t. GC:

public class LazyCellTest {
  public final int LazyCellTest$$i;

  public final java.lang.String LazyCellTest$$a;

  public java.lang.String foo();
    Code:
       0: aload_0
       1: getfield      #20                 // Field foo$value:Ljava/lang/Object;
       4: astore_1
       5: aload_1
       6: invokevirtual #24                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
       9: ldc           #26                 // class $Init
      11: if_acmpne     36
      14: aload_1
      15: checkcast     #26                 // class $Init
      18: invokeinterface #30,  1           // InterfaceMethod $Init.apply:()Ljava/lang/Object;
      23: astore_2
      24: aload_0
      25: aload_2
      26: putfield      #20                 // Field foo$value:Ljava/lang/Object;
      29: aload_2
      30: checkcast     #32                 // class java/lang/String
      33: goto          40
      36: aload_1
      37: checkcast     #32                 // class java/lang/String
      40: areturn

  public LazyCellTest(int, java.lang.String);
    Code:
       0: aload_0
       1: iload_1
       2: putfield      #42                 // Field LazyCellTest$$i:I
       5: aload_0
       6: aload_2
       7: putfield      #44                 // Field LazyCellTest$$a:Ljava/lang/String;
      10: aload_0
      11: invokespecial #47                 // Method java/lang/Object."<init>":()V
      14: aload_0
      15: new           #10                 // class LazyCellTest$$anon$1
      18: dup
      19: aload_0
      20: invokespecial #50                 // Method LazyCellTest$$anon$1."<init>":(LLazyCellTest;)V
      23: putfield      #20                 // Field foo$value:Ljava/lang/Object;
      26: return
}

EDIT3:
This one works:

class LazyCellTest$foo$init(i: Int, a: String) {
  def apply(): AnyRef = a + i
}

class LazyCellTest(i: Int, a: String) {
  private[this] var foo$value: AnyRef = new LazyCellTest$foo$init(i, a)

  def foo: String = {
    val old = foo$value
    if (old.getClass eq classOf[LazyCellTest$foo$init]) {
      val value: AnyRef = old.asInstanceOf[LazyCellTest$foo$init].apply()
      foo$value = value
      value.asInstanceOf[String]
    } else old.asInstanceOf[String]
  }
}

@heyrutvik heyrutvik closed this Nov 16, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants