Releases: openzipkin/zipkin

Zipkin 1.3

10 Jul 08:11

Zipkin 1.3 includes highlighting of spans in error state and improvements to the Cassandra storage component.

Error annotations

Inspired by recent work in OpenTracing, we've added a new annotation, "error". As an annotation value, it indicates when a potentially transient error occurred. As a binary annotation key, its value is a human-readable message associated with an error that resulted in a failed span. See #1140 for details.

Thanks to @virtuald the UI acts according to these rules, highlighting degraded spans yellow, and failed ones red.

Instrumentation libraries (such as Brave and zipkin-tracer) need to change to support this. Please help if you have time!
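
The rules above can be sketched as a small classifier. This is an illustration only, not the UI's actual code: it assumes spans are dicts in Zipkin's JSON shape (`annotations` with a `value`, `binaryAnnotations` with a `key`), and the function name `span_state` and the choice to let "failed" win over "degraded" are assumptions.

```python
def span_state(span):
    """Classify a span dict as 'failed', 'degraded', or 'ok'."""
    # A binary annotation keyed "error" marks a failed span; its value is the message.
    for binary in span.get("binaryAnnotations") or []:
        if binary.get("key") == "error":
            return "failed"
    # An annotation whose value is "error" marks a potentially transient error.
    for annotation in span.get("annotations") or []:
        if annotation.get("value") == "error":
            return "degraded"
    return "ok"


# Colors as described in the release note; "ok" spans get no highlight.
COLORS = {"failed": "red", "degraded": "yellow", "ok": None}
```

For example, `COLORS[span_state(span)]` would give the highlight color for a span under these assumptions.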

Span.timestamp, duration 0 coerce to null

We've noticed that some instrumentation logs a timestamp or duration of 0 when it meant to log null. A timestamp or duration of 0 microseconds is invalid or doesn't explain latency, so we now coerce these 0s to null. If a span's duration was below one microsecond, round it up to 1. See #1155 and #1176.
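
A minimal sketch of both sides of this change, assuming spans are plain dicts and that the instrumentation measures time in nanoseconds. The helper names `coerce_zero_to_none` and `duration_micros` are hypothetical and are not part of Zipkin.

```python
def coerce_zero_to_none(span):
    """Return a copy of the span with a timestamp or duration of 0 replaced by None."""
    cleaned = dict(span)
    for field in ("timestamp", "duration"):
        if cleaned.get(field) == 0:
            cleaned[field] = None
    return cleaned


def duration_micros(start_nanos, end_nanos):
    """Duration in whole microseconds, rounding sub-microsecond spans up to 1."""
    return max(1, (end_nanos - start_nanos) // 1000)
```

The first function mirrors what the server now does on receipt. The second is what instrumentation could do so that a very fast span reports 1 rather than 0.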

Elasticsearch daily bucket fix

We found and fixed a concurrency bug that could put spans into the wrong daily buckets. See #1175

Cassandra

Schema bug fix

We found a bug where traces against the same service in the same millisecond weren't indexed. This affects indexes only (trace data itself wasn't lost). For example, you might find a trace that exists in Cassandra, but you can't query it using the API.

Specifically, the following indexes now have trace_id added to their PRIMARY KEY definitions.

  • service_span_name_index
  • service_name_index
  • annotations_index

There's no automatic data migration available. The most straightforward way to address this in an existing cluster is to drop the indexes listed above and restart a Zipkin server, which will recreate them as long as CASSANDRA_ENSURE_SCHEMA=true. You can also update the indexes manually based on the schema.
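
To see why adding trace_id to the primary key matters, here is a simplified model rather than the real schema. It treats an index table as a dict keyed by primary key, since a Cassandra write with an existing primary key overwrites the earlier row. The column set (service, millisecond timestamp, trace id) and the sample values are assumptions for illustration.

```python
def build_index(rows, include_trace_id):
    """Model a table as a dict keyed by primary key; later writes overwrite earlier ones."""
    table = {}
    for service, ts_millis, trace_id in rows:
        key = (service, ts_millis, trace_id) if include_trace_id else (service, ts_millis)
        table[key] = trace_id
    return table


rows = [
    ("frontend", 1467000000000, "trace-a"),
    ("frontend", 1467000000000, "trace-b"),  # same service, same millisecond
]

old_index = build_index(rows, include_trace_id=False)
new_index = build_index(rows, include_trace_id=True)
print(sorted(old_index.values()))  # ['trace-b']: the first trace was overwritten
print(sorted(new_index.values()))  # ['trace-a', 'trace-b']: both are indexed
```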

Tuning

We've done a lot of work tuning the amount of data written to indexes on a per-span basis. Those using Cassandra should see a significant drop in index size due to reasons documented in the tuning section of the README.

Query logging

Those supporting zipkin may need to debug query latency. We now use the QueryLogger which is enabled when the log category "com.datastax.driver.core.QueryLogger" is at debug or trace level. Trace level includes bound values. See #1156

Zipkin 1.2

29 Jun 05:52

Zipkin 1.2.1 includes Prometheus metrics and Elasticsearch bug fixes.

Prometheus metrics are enabled by default, under the /prometheus endpoint.
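
As a rough sketch of reading that endpoint, the following fetches and parses the Prometheus text format. The base URL `http://localhost:9411` is an assumption about your deployment, the function names are hypothetical, and the parser assumes lines carry no trailing timestamp.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse 'name value' lines of Prometheus text format into a dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        # The value is the last space-separated token; labels stay in the name.
        name, value = line.rsplit(" ", 1)
        metrics[name] = float(value)
    return metrics


def fetch_metrics(base_url="http://localhost:9411"):
    """Fetch and parse metrics from a Zipkin server (base_url is an assumption)."""
    with urlopen(base_url + "/prometheus") as response:
        return parse_metrics(response.read().decode("utf-8"))
```

In practice a Prometheus server would scrape this endpoint directly. The sketch is only meant for a quick manual check.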

Many thanks to Kristian from Iterate for developing this feature!

1.1.5

21 Jun 14:35

This is a patch release that fixes a bug where JSON received with optional fields set to null failed to parse. You should update to this patch, particularly if your apps use the zipkin ruby gem.
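
The intended behavior is that an explicit null reads the same as an absent field. The sketch below illustrates that idea and is not Zipkin's actual parser: `parse_span` is a hypothetical helper, and the list of optional fields is an assumed subset.

```python
import json

# Assumed subset of optional span fields, for illustration only.
OPTIONAL_FIELDS = ("parentId", "timestamp", "duration", "debug")


def parse_span(text):
    """Parse one span, treating an explicit null the same as an absent field."""
    raw = json.loads(text)
    span = {"traceId": raw["traceId"], "id": raw["id"], "name": raw["name"]}
    for field in OPTIONAL_FIELDS:
        value = raw.get(field)  # None for both a missing field and a JSON null
        if value is not None:
            span[field] = value
    return span
```

With this approach, `{"traceId": "a", "id": "b", "name": "get", "timestamp": null}` and the same document without `timestamp` produce identical spans.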

See #1136 for details

Zipkin 1.1.4

04 Jun 03:21

This is a patch release that fixes a bug where CASSANDRA_ENSURE_SCHEMA didn't work when the keyspace was absent. See #1128 for details

Zipkin 1.1.0

27 Aug 22:47

Zipkin 1.1 most notably offers improvements in the UI and the Cassandra storage component.

Zipkin UI

Thanks to @virtuald, the Zipkin UI now includes a JSON button! This lets you see the JSON behind a trace diagram, which is quite useful in support. For example, when people report problems in Zipkin, we often ask for the JSON, and this feature makes that easier.

@virtuald also improved error handling dramatically. Before, Zipkin wouldn't show server errors, so you had to use the JavaScript console to troubleshoot problems. Now, errors show in pink boxes.

Cassandra Storage

Thanks to @michaelsembwever, the Cassandra schema now includes default TTLs. These obviate the explicit TTLs we were programmatically adding when storing each span. Those using CASSANDRA_ENSURE_SCHEMA (the default) will be upgraded to this schema automatically. Those manually controlling the schema should run cassandra-schema-cql3-upgrade-1.txt before Zipkin 2.0 is released (not yet planned).

Throttled retries when storage is down

Zipkin can start when storage is unavailable. The health check will report storage as unavailable until a connection succeeds. Before, any activation of storage would re-attempt to connect. Retries are now throttled to no more than once per second to avoid thrashing the network or process.
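
The throttling idea can be sketched as follows. This is not Zipkin's implementation: the class name `ThrottledConnector`, the `connect` callable, and the injectable `clock` are all hypothetical, chosen to keep the example self-contained.

```python
import time


class ThrottledConnector:
    """Attempt to connect at most once per interval, however often it is asked."""

    def __init__(self, connect, min_interval=1.0, clock=time.monotonic):
        self._connect = connect            # callable that returns True on success
        self._min_interval = min_interval  # seconds between attempts
        self._clock = clock
        self._last_attempt = None
        self.connected = False

    def ensure_connected(self):
        if self.connected:
            return True
        now = self._clock()
        if self._last_attempt is not None and now - self._last_attempt < self._min_interval:
            return False  # too soon: skip the attempt instead of retrying
        self._last_attempt = now
        try:
            self.connected = bool(self._connect())
        except Exception:
            self.connected = False
        return self.connected
```

Callers invoke `ensure_connected()` on every use of storage. While storage is down, only one real connection attempt is made per second, and a health check could report on the `connected` flag.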

Zipkin 1.0

03 Jun 10:07

Many of you noticed we've been working on a new dependency-light codebase for Zipkin.

Over the last 10 months we've gone beyond feature parity with the previous server and reduced moving parts until Zipkin could be deployed as a single process. The new codebase also includes new features like Elasticsearch storage and better metrics.

Those already using Zipkin should know that zipkin-java is schema, API, and environment variable compatible with the old servers. We took great care to ensure it is a drop-in replacement.

Those worried about being first to the fire will be interested to know that this code has been in development for 10 months and is already used in production by companies like Buoyant and LINE.me, and by anyone using Spring Cloud Sleuth.