Zipkin 1.5
Zipkin 1.5 is all about the dependency view in the UI.
Many of you may have seen the dependency tab but never any data in it. That would be the case if you were running Cassandra or Elasticsearch.
What you should have seen is a diagram showing the relative number of calls between services, something like this (except with your services present!):
Zipkin 1.5 adds support for populating the data behind this screen for all storage options (MySQL, Cassandra and Elasticsearch).
The job that produces this data is called zipkin-dependencies. Zipkin Dependencies aggregates links between services into a daily bucket. This means you should run it daily, like a batch job (even though underneath it is Spark). In fact, our Docker image includes a cron setup to do that for you!
For example, here's a run against a small Cassandra DB using Spark standalone (the default):
$ STORAGE_TYPE=cassandra CASSANDRA_CONTACT_POINTS=192.168.99.100 java -jar zipkin-dependencies.jar
Running Dependencies job for 2016-07-23: 1469232000000000 ≤ Span.timestamp 1469318399999999
11:05:09.653 [main] WARN o.a.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11:05:09.706 [main] WARN org.apache.spark.util.Utils - Your hostname, acole resolves to a loopback address: 127.0.0.1; using 192.168.1.10 instead (on interface en0)
11:05:09.706 [main] WARN org.apache.spark.util.Utils - Set SPARK_LOCAL_IP if you need to bind to another address
11:05:11.078 [main] WARN com.datastax.driver.core.NettyUtil - Found Netty's native epoll transport, but not running on linux-based operating system. Using NIO instead.
Saved with day=2016-07-23
Dependencies: [{"parent":"brave-resteasy-example","child":"brave-resteasy-example","callCount":1}, {"parent":"zipkin-server","child":"cassandra","callCount":14}]
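To make the "daily bucket" idea concrete, here is a minimal Python sketch of the two things visible in the log above: the inclusive microsecond range for one UTC day, and the per-day count of parent/child calls. It is not the Spark job's code. The function names and the simplified `(timestamp, parent, child)` input are assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

MICROS_PER_DAY = 24 * 60 * 60 * 1_000_000


def day_bounds_micros(day):
    """Return the inclusive (start, end) epoch-microsecond range of one UTC day."""
    midnight = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    start = int(midnight.timestamp()) * 1_000_000
    return start, start + MICROS_PER_DAY - 1


def aggregate_links(calls, day):
    """Count calls per (parent, child) pair that fall inside the day's bucket.

    `calls` is a hypothetical, simplified input of
    (timestamp_micros, parent, child) tuples, not Zipkin's span model.
    """
    start, end = day_bounds_micros(day)
    counts = Counter(
        (parent, child) for ts, parent, child in calls if start <= ts <= end
    )
    return [
        {"parent": parent, "child": child, "callCount": count}
        for (parent, child), count in sorted(counts.items())
    ]


calls = [
    (1469232000000000, "zipkin-server", "cassandra"),
    (1469232000000001, "zipkin-server", "cassandra"),
    (1469318400000000, "zipkin-server", "cassandra"),  # next day, so excluded
]

print(day_bounds_micros("2016-07-23"))  # (1469232000000000, 1469318399999999)
print(aggregate_links(calls, "2016-07-23"))
```

The bounds match the range printed in the log for 2016-07-23. Because each run only looks at one day's range, running the job once per day covers every span exactly once.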
Upgrading
If you are using Cassandra or Elasticsearch, you should upgrade to Zipkin 1.5; no schema-related change is required.
If you are using MySQL, you'll need to add a new table for this to work. Here's the DDL to copy and paste, for your convenience.
CREATE TABLE IF NOT EXISTS zipkin_dependencies (
`day` DATE NOT NULL,
`parent` VARCHAR(255) NOT NULL,
`child` VARCHAR(255) NOT NULL,
`call_count` BIGINT
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;
ALTER TABLE zipkin_dependencies ADD UNIQUE KEY(`day`, `parent`, `child`);
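The unique key on (day, parent, child) means there can be at most one row per link per day. The sketch below shows that effect using Python's built-in sqlite3 as a stand-in for MySQL. Two things here are assumptions: the SQL is SQLite syntax rather than the MySQL DDL above, and the replace-on-conflict write in the hypothetical `save_link` helper is only illustrative, not necessarily how Zipkin writes these rows.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE zipkin_dependencies (
         day DATE NOT NULL,
         parent VARCHAR(255) NOT NULL,
         child VARCHAR(255) NOT NULL,
         call_count BIGINT,
         UNIQUE (day, parent, child)
       )"""
)


def save_link(day, parent, child, call_count):
    """Hypothetical helper: write one link, replacing any row with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO zipkin_dependencies VALUES (?, ?, ?, ?)",
        (day, parent, child, call_count),
    )


save_link("2016-07-23", "zipkin-server", "cassandra", 10)
save_link("2016-07-23", "zipkin-server", "cassandra", 14)  # rerun for the same day

rows = conn.execute(
    "SELECT day, parent, child, call_count FROM zipkin_dependencies"
).fetchall()
print(rows)  # [('2016-07-23', 'zipkin-server', 'cassandra', 14)]
```

A plain INSERT of the second row would fail with a constraint error instead, which is the other way the key protects against duplicate links for a day.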
Credits
The Spark job was originally written by @yurishkuro, based on a Hadoop job written by @eirslett years ago. In other words, the job itself isn't new; what's new is how accessible it is. Before, it only worked with Cassandra and wasn't published to Maven Central or integrated with Docker. Now, it should be easy for anyone to include this functionality in their deployment.