A way for task preemption in Big data analytics platform
- a simple shell script for monitoring node's metadata (e.g. disk access, network Tx Rx etc) in a cluster
- read and write for chekcpointing data (note: checkpointRead is private in spark, we need to package function into org.apache.spark)
The performance gain is 15-30%
- Measure checkpoint latency using Spark
- Word Count with checkpointing
- Sorting with checkpoint
- GroupByKey with Checkpointing
- DecisionTree with periodic Checkpointing
- We now design schemes for evaluate best gain we can get using Benefault
- find sweet spot for whether kill or preempt