
EVF Tutorial Overview

Paul Rogers edited this page May 17, 2019 · 3 revisions

The Log Plugin

The Drill log plugin is the focus of this tutorial. A simplified version of this plugin is explained in the Learning Apache Drill book. The version used here is the one which ships with Drill.

The focus here is on the conversion to EVF, rather than the details of the plugin. Each plugin has its own internal structure, so we leave it to the reader to map from the log reader to some other plugin.

Plugin Design

Most format plugins are based on the "easy" framework. EVF still uses the "easy" framework, but the implementation differs.

"Legacy" plugins are based on the idea of a "record reader" (a concept borrowed from Hive). Unlike the Hive record readers, Drill's readers never read a single record: they all read a batch of records. In EVF, the reader becomes a "row batch reader" which implements a new interface.
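To make the batch-at-a-time idea concrete, here is a minimal, self-contained sketch. The `BatchReader` interface and `LineBatchReader` class are hypothetical stand-ins invented for illustration; they are not Drill's interfaces, and a plain list of strings stands in for a batch of value vectors.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Drill classes.
class BatchReaderSketch {

  /** Assumed contract: each call fills one batch rather than returning one record. */
  interface BatchReader {
    /** Clears and refills {@code batch}; returns false once no rows remain. */
    boolean next(List<String> batch);
  }

  /** Reads up to {@code batchSize} lines per call from an in-memory source. */
  static class LineBatchReader implements BatchReader {
    private final Iterator<String> source;
    private final int batchSize;

    LineBatchReader(List<String> lines, int batchSize) {
      this.source = lines.iterator();
      this.batchSize = batchSize;
    }

    @Override
    public boolean next(List<String> batch) {
      batch.clear();
      while (batch.size() < batchSize && source.hasNext()) {
        batch.add(source.next());
      }
      return !batch.isEmpty();
    }
  }

  /** Drives the reader to completion and records the size of each batch. */
  static List<Integer> batchSizes(List<String> lines, int batchSize) {
    BatchReader reader = new LineBatchReader(lines, batchSize);
    List<String> batch = new ArrayList<>();
    List<Integer> sizes = new ArrayList<>();
    while (reader.next(batch)) {
      sizes.add(batch.size());
    }
    return sizes;
  }
}
```

The caller loops on `next()` until it returns false, so the unit of work is always a batch, never a single record.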

In Drill 1.16 and earlier, the LogRecordReader writes to value vectors in the typical way, through the associated Mutator class.

Other readers are more clever: the "V2" text reader (Drill 1.16 and earlier) worked with direct memory itself, handling its own buffer allocation, offset vector calculations and so on.

With the EVF, we'll replace the Mutator with a ColumnWriter. We'll first do the simplest possible conversion, then look at how to use advanced features, such as type conversions, schema and table properties.
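The appeal of a column writer is that the reader no longer tracks row indexes or buffers itself. The toy sketch below illustrates only that idea. `ToyRowWriter` and this `ScalarWriter` are hypothetical classes written for this example, and the method names are assumptions chosen for illustration; the sketch should not be read as Drill's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; these are NOT Drill classes and the API is assumed.
class ColumnWriterSketch {

  /** Hypothetical per-column writer: the caller never supplies a row index. */
  interface ScalarWriter {
    void setString(String value);
  }

  /** Hypothetical row writer that owns row positioning and storage. */
  static class ToyRowWriter {
    private final List<String[]> rows = new ArrayList<>();
    private final int columnCount;
    private String[] current;

    ToyRowWriter(int columnCount) {
      this.columnCount = columnCount;
    }

    /** Begins a new row. */
    void start() {
      current = new String[columnCount];
    }

    /** Returns a writer bound to one column of the row being built. */
    ScalarWriter scalar(int col) {
      return value -> current[col] = value;
    }

    /** Commits the row being built. */
    void save() {
      rows.add(current);
      current = null;
    }

    int rowCount() {
      return rows.size();
    }

    String get(int row, int col) {
      return rows.get(row)[col];
    }
  }

  /** Splits each line into a level and a message, writing one row per line. */
  static ToyRowWriter load(List<String> lines) {
    ToyRowWriter writer = new ToyRowWriter(2);
    for (String line : lines) {
      String[] parts = line.split(" ", 2);
      writer.start();
      writer.scalar(0).setString(parts[0]);
      writer.scalar(1).setString(parts.length > 1 ? parts[1] : null);
      writer.save();
    }
    return writer;
  }
}
```

The reader code in `load()` deals only with columns and values; where each value lands is the writer's concern.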

Let's work through the needed changes one by one.


Next: Plugin Revisions
