
In Process Sandbox

Ben Nordick edited this page May 15, 2020 · 10 revisions

To run the compiled code quickly, Jeed loads it into the current JVM process. Because the submitted code is untrusted, Jeed confines each task to protect both the host process and the other confined tasks. There are three categories of concern:

  1. Confined tasks must not be able to perform privileged operations to manipulate the host machine.
  2. Confined tasks must completely stop after reaching the configured timeout.
  3. Confined tasks must not be able to ruin the environment for other tasks.

Java security

TODO!

Security manager

TODO!

Class loading restrictions

TODO!

Reloaded classes

Classes from the classpath (the Java standard library and other libraries we use) are shared between all code running in the JVM process, so changes to their static fields can affect all tasks. This is a severe problem when those static fields refer to threads or thread groups. For example, the standard Kotlin coroutines dispatcher keeps a pool of threads and distributes work among them. In the best case, the pool gets created inside the sandbox and is left unusable once that confined task is done; in the worst case, untrusted code gets dispatched to unconfined threads used by Jeed itself.

To give different confined tasks different sets of static variables, we reload the problematic library classes, isolating the confined tasks' copies from each other. When asked for a class that is set to be isolated, our sandboxed classloader defines a new copy instead of getting the existing Class object from the parent classloader. The bytecode for reloaded classes is cached across all tasks for speed, but every confined task gets a different live Class based on it. To ensure that confined tasks can be shut down reliably, reloaded classes are subject to the same bytecode transformation (described below) as untrusted classes.
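The reloading idea can be illustrated with a minimal sketch that uses only the JDK. All names here (IsolationSketch, SharedCounter, IsolatingLoader) are hypothetical and not taken from Jeed. The sketch assumes the parent loader can serve the class file as a resource, and it rereads the bytecode on every load instead of caching it as described above.

```java
import java.io.IOException;
import java.io.InputStream;

class IsolationSketch {
    /** Hypothetical stand-in for a library class whose static state must not be shared. */
    public static class SharedCounter {
        private static int count = 0;

        public static int increment() {
            return ++count;
        }
    }

    /** Defines its own copy of one named class; everything else is delegated to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> copy = findLoadedClass(name);
                if (copy == null) {
                    // Define a fresh Class from the same bytecode instead of asking the parent.
                    byte[] bytecode = readBytecode(name);
                    copy = defineClass(name, bytecode, 0, bytecode.length);
                }
                if (resolve) {
                    resolveClass(copy);
                }
                return copy;
            }
        }

        /** Assumption: the parent loader exposes the class file as a resource. */
        private byte[] readBytecode(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Calls increment() on a freshly defined copy of SharedCounter, as one confined task might. */
    static int incrementInFreshCopy() {
        String name = SharedCounter.class.getName();
        ClassLoader loader = new IsolatingLoader(IsolationSketch.class.getClassLoader(), name);
        try {
            Class<?> copy = loader.loadClass(name);
            return (Integer) copy.getMethod("increment").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each call to incrementInFreshCopy sees a counter that starts from zero, no matter how often the original SharedCounter or an earlier copy has been incremented, because every loader defines a distinct Class with its own static fields.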

Bytecode transformation

The Java SecurityManager is an excellent start, preventing untrusted threads and classes from invoking privileged methods, but it is unfortunately not sufficient to address all of the concerns above.

Try-catch instrumentation

After the timeout for a confined task expires, we call Thread.stop on every thread in the task's thread group. This method is deprecated and generally advised against, but it is the only way to stop a thread without its cooperation. It works by fabricating a ThreadDeath error and throwing it in the target thread. There is nothing special about ThreadDeath as a throwable: it can be caught by catch blocks and will trigger finally blocks, so the thread is not actually killed immediately. Malicious code can simply ignore the exception and continue running. The security manager is not notified when methods catch exceptions. To gain control over exception handling, we need the cooperation of the thread's code, so we have to change the bytecode: we create a new sandboxed classloader with a transformed version of every untrusted class.
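The following sketch illustrates why an injected error alone cannot stop hostile code. Because Thread.stop is deprecated and may be unavailable on newer JDKs, the sketch throws a hypothetical StopSignal error in place of ThreadDeath, and it throws it synchronously rather than from another thread. The class and method names are illustrative only.

```java
class SurvivableStopSketch {
    /** Hypothetical stand-in for ThreadDeath: an ordinary Error with no special treatment. */
    static final class StopSignal extends Error {}

    /** Simulates a task that is hit by the stop signal `signals` times and swallows each one. */
    static int workAfterSignals(int signals) {
        int unitsOfWork = 0;
        for (int i = 0; i < signals; i++) {
            try {
                throw new StopSignal(); // where an asynchronous ThreadDeath would surface
            } catch (Throwable ignored) {
                // A broad catch swallows the signal, so the loop simply keeps going.
            } finally {
                unitsOfWork++; // finally blocks also get to run arbitrary code
            }
        }
        return unitsOfWork;
    }
}
```

The loop completes every iteration even though each one is "stopped", which is the behavior the instrumentation has to prevent.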

The ASM framework allows us to examine and manipulate classes element by element. Each visitor receives calls from upstream (the ClassReader for the input bytecode, or the previous visitor in the chain) and can intercept or modify aspects of the class before passing the information to the downstream visitor, toward the output ClassWriter. Our SandboxingMethodVisitor is responsible for most of our transformation.

ASM notifies our visitor of try-catch blocks through visitTryCatchBlock before it visits any of the labels those blocks refer to. Each try-catch or try-finally is represented in bytecode as a range of protected instructions (the try block), the class of exception to catch, and the handler position to jump to if that exception occurs in the protected range. (Finally blocks are implemented as try-catch blocks without a filter, so they are entered for every exception and can run the finally code before rethrowing it. Interestingly, each finally block appears twice in the bytecode: once for normal completion and once for the case where an exception is thrown.)

If the exception filter is broad enough to catch an exception type the task shouldn't be allowed to survive, our visitor records the handler label as needing a rewrite. When such a label is visited with visitLabel, the visitor prepares to rewrite it but waits until the following visitFrame. That way, the inserted code is covered by the stack map entry for the catch/finally block, in which the caught exception has been pushed onto the stack.
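The "broad enough" test can be sketched without ASM. The catch type is taken as an internal name such as java/lang/Throwable, or null for a filterless (finally-style) entry, which is how ASM reports it to visitTryCatchBlock. The class name, the StopSignal type, the UNSURVIVABLE list, and the decision to instrument handlers for unresolvable types are all assumptions made for illustration, not Jeed's actual rules.

```java
import java.util.List;

class HandlerFilterSketch {
    /** Hypothetical stand-in for a throwable type that confined code must not survive. */
    static final class StopSignal extends Error {}

    /** Assumed list of unsurvivable types; the real list is not given on this page. */
    static final List<Class<? extends Throwable>> UNSURVIVABLE = List.of(StopSignal.class);

    /**
     * Decides whether the handler of a try-catch entry should be instrumented.
     *
     * @param catchType internal name such as "java/lang/Throwable", or null for a
     *                  filterless entry that catches everything
     */
    static boolean handlerNeedsRewrite(String catchType) {
        if (catchType == null) {
            return true; // catch-all entry, e.g. a finally block
        }
        Class<?> caught;
        try {
            // Load without initializing; only the type hierarchy matters here.
            caught = Class.forName(catchType.replace('/', '.'), false,
                    HandlerFilterSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return true; // assumption: instrument when the type cannot be resolved
        }
        for (Class<? extends Throwable> type : UNSURVIVABLE) {
            if (caught.isAssignableFrom(type)) {
                return true; // the filter is a supertype of an unsurvivable throwable
            }
        }
        return false;
    }
}
```

With StopSignal extending Error, handlers for Throwable and Error are flagged, while a handler for Exception or IOException is left alone because it could never receive the signal.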

Immediately at the beginning of the suspicious catch/finally block, the visitor inserts a call to the checkException function of our RewriteBytecode object, passing the caught exception. That function checks whether the runtime type of the throwable should be unsurvivable. If so, it rethrows the throwable, unwinding the stack before letting the untrusted class run more code. This function also helps accelerate shutdown by always throwing if the timeout has been exhausted.
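In source-level terms, the effect of the inserted call looks roughly like the sketch below. Only the name checkException comes from the description above; StopSignal, the timeoutExhausted flag, runTask, and the choice to rethrow the caught throwable itself after a timeout are assumptions for illustration.

```java
class CheckExceptionSketch {
    /** Hypothetical stand-in for a throwable type that confined code must not survive. */
    static final class StopSignal extends Error {}

    /** Assumed flag that the sandbox would set once the task's timeout has expired. */
    static volatile boolean timeoutExhausted = false;

    /** Rethrows the caught throwable if the handler must not be allowed to continue. */
    static void checkException(Throwable caught) {
        if (timeoutExhausted || caught instanceof StopSignal) {
            CheckExceptionSketch.<RuntimeException>rethrow(caught);
        }
    }

    /** Rethrows any throwable without declaring it, as bytecode-level athrow would. */
    @SuppressWarnings("unchecked")
    private static <T extends Throwable> void rethrow(Throwable t) throws T {
        throw (T) t;
    }

    /** What an untrusted catch-all handler looks like after instrumentation. */
    static String runTask(Runnable body) {
        try {
            body.run();
            return "completed";
        } catch (Throwable t) {
            checkException(t); // inserted as the first action of the handler
            return "handled " + t.getClass().getSimpleName();
        }
    }
}
```

An ordinary exception still reaches the untrusted handler code, while StopSignal (or any throwable once the timeout flag is set) leaves runTask before that code can run.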

Try-catch filtering

In Java, it is impossible to write a try-catch block that "handles" an exception by directly retrying the try block from the top. In bytecode, however, the handler label for a try-catch block is allowed to be in the protected "try" region. The Java compiler sometimes produces such try-catch entries near synchronized blocks for unclear reasons. Usually they don't cause a problem because no exception ever gets thrown in them, but if one does due to our thread termination, it will be thrown and caught infinitely, freezing or even crashing the JVM!

Clearly that is a problem, so such try-catch entries must be removed entirely. Unfortunately, ASM requires that visitTryCatchBlock is called before the start, end, and handler labels are visited with visitLabel, so SandboxingMethodVisitor cannot by itself determine whether a try-catch entry is bad until the entry has already been passed along. We therefore do a "preinspection" of the same bytecode with another class visitor that visits methods with PreviewingMethodVisitor. This method visitor doesn't pass anything toward a ClassWriter; it just records what it's told by the ClassReader, keeping track of the order in which try-catch entries and labels are visited. Once the method has been completely visited, the visitor cross-references the try-catch entries with the labels to see which entries handle exceptions by jumping back inside the protected region. Fortunately, try-catch entries have a fixed order in the bytecode, so the entry visited first in one read is also visited first in another. The indexes of bad try-catch entries are passed to the SandboxingMethodVisitor, which removes the offending entries by not forwarding the corresponding visitTryCatchBlock calls downstream.
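The cross-referencing step can be sketched independently of ASM. Here labels are represented by hypothetical string names, and the recorded visit order stands in for bytecode position, which assumes that distinct labels mark distinct positions. The class and method names are illustrative and not Jeed's.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SelfHandlingEntriesSketch {
    /** One recorded try-catch entry, with labels identified by hypothetical names. */
    static final class TryCatchEntry {
        final String start;
        final String end;
        final String handler;

        TryCatchEntry(String start, String end, String handler) {
            this.start = start;
            this.end = end;
            this.handler = handler;
        }
    }

    /**
     * Returns the indexes of entries whose handler lies inside their own protected range.
     * The range is treated as start-inclusive and end-exclusive, as in the JVM exception table.
     */
    static List<Integer> findSelfHandling(List<TryCatchEntry> entries, List<String> labelOrder) {
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < labelOrder.size(); i++) {
            position.put(labelOrder.get(i), i);
        }
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < entries.size(); i++) {
            TryCatchEntry entry = entries.get(i);
            int start = position.get(entry.start);
            int end = position.get(entry.end);
            int handler = position.get(entry.handler);
            if (start <= handler && handler < end) {
                bad.add(i); // an exception here would jump back into the protected range
            }
        }
        return bad;
    }
}
```

The returned indexes play the role of the list handed to the second pass, which can then skip the matching visitTryCatchBlock calls by counting entries in the same order.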

Finalizer removal

Objects can override finalize to perform cleanup before their storage is freed by the garbage collector. This feature is, as the Javadoc says, "inherently problematic." Finalizers run on a dedicated thread outside any of our thread groups, and by the time they run the untrusted object's confined task might have completed and been forgotten by Jeed, so they would allow untrusted code to perform arbitrary operations. They are therefore removed: our sandboxing class visitor simply does not pass finalize() methods downstream.
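The filtering decision itself is small. The sketch below assumes that a finalizer is identified by the name finalize together with the descriptor ()V (no arguments, void return); whether Jeed also checks the descriptor is not stated here. In an ASM class visitor, the same test would typically sit in visitMethod, returning null instead of delegating downstream. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FinalizerFilterSketch {
    /** Assumption: only the no-argument, void-returning finalize counts as a finalizer. */
    static boolean isFinalizer(String name, String descriptor) {
        return name.equals("finalize") && descriptor.equals("()V");
    }

    /** Returns the methods that would be passed downstream, as name plus descriptor. */
    static List<String> forwardedMethods(String[][] methods) {
        List<String> kept = new ArrayList<>();
        for (String[] method : methods) {
            if (!isFinalizer(method[0], method[1])) {
                kept.add(method[0] + method[1]);
            }
        }
        return kept;
    }
}
```

An overload such as finalize(I)V is not a finalizer in this sketch and is kept, since the garbage collector never invokes it.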
