Distributed Tracing 

Diego Parra|@diegolparra|Cristian Spinetta|@cebspinetta

Agenda

Microservices

Observability

Distributed Tracing

Zipkin/Jaeger

Kamon

Demo

Why Microservices?

Improved Modularity

Team Autonomy

Heterogeneous

Environmental Isolation

Better Functional Composition

What happens internally?

Expectation

Reality

The 3 Pillars of Observability

Why Do We Want Distributed Tracing?

A few of the critical questions that distributed tracing can answer quickly and easily:

Which services did a request pass through?

Where are the bottlenecks?

How much time is lost to network lag in communication between services?

What occurred in each service for a given request?

Distributed Tracing

  • There are two services involved in serving the /users endpoint of this system.
  • Service A makes three HTTP calls to Service B, and they run in parallel.
  • Storing the session token happens after all HTTP calls to Service B have completed.
  • A substantial amount of time was spent on storing the session token.

Distributed Tracing Components

  • A Span represents a logical unit of work
  • Tags and Marks add extra information to spans.
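
To make the two bullets above concrete, here is a minimal sketch of a span as a plain data structure. The `Span` and `Mark` types and their fields are hypothetical, written only for illustration; they are not the classes shipped by Kamon, Zipkin, or Jaeger.

```scala
import java.time.{Duration, Instant}

// Hypothetical model for illustration; not the API of any tracing library.
final case class Mark(at: Instant, key: String)

final case class Span(
    operationName: String,
    from: Instant,
    to: Instant,
    tags: Map[String, String] = Map.empty,
    marks: List[Mark] = Nil
) {
  // Tags describe the span as a whole (for example span.kind).
  def withTag(key: String, value: String): Span =
    copy(tags = tags + (key -> value))

  // Marks pin a named event to a point in time inside the span.
  def mark(at: Instant, key: String): Span =
    copy(marks = marks :+ Mark(at, key))

  // The unit of work is measured from its start to its finish.
  def durationMillis: Long = Duration.between(from, to).toMillis
}

object SpanExample {
  val start: Instant = Instant.parse("2018-01-01T00:00:00Z")

  // Made-up values: a 120 ms server span with one tag and one mark.
  val findUsers: Span =
    Span("find-users", start, start.plusMillis(120))
      .withTag("span.kind", "server")
      .mark(start.plusMillis(30), "cache-miss")
}
```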

Distributed Tracing Components

  • A Trace is an end-to-end latency graph, composed of spans.
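
As a rough sketch of how a trace can be assembled from spans, the snippet below links spans through a parent identifier and derives the end-to-end latency. The `SpanRecord` and `Trace` types are hypothetical, and the timings are made up to resemble the /users example (three parallel calls followed by a slow session-token store).

```scala
// Hypothetical record shape for illustration; times are milliseconds
// measured from the start of the trace.
final case class SpanRecord(
    id: String,
    parentId: Option[String],
    operation: String,
    startMs: Long,
    endMs: Long
) {
  def durationMs: Long = endMs - startMs
}

final case class Trace(spans: Seq[SpanRecord]) {
  // The root is the span without a parent.
  def root: Option[SpanRecord] = spans.find(_.parentId.isEmpty)

  // Direct children of a span, ordered by start time.
  def childrenOf(id: String): Seq[SpanRecord] =
    spans.filter(_.parentId.contains(id)).sortBy(_.startMs)

  // End-to-end latency runs from the earliest start to the latest finish.
  def latencyMs: Long =
    if (spans.isEmpty) 0L else spans.map(_.endMs).max - spans.map(_.startMs).min

  // The longest non-root span is a first candidate when hunting for a bottleneck.
  def slowestChild: Option[SpanRecord] = {
    val children = spans.filter(_.parentId.nonEmpty)
    if (children.isEmpty) None else Some(children.maxBy(_.durationMs))
  }
}

object TraceExample {
  // Made-up timings: three parallel calls to Service B, then the token store.
  val trace: Trace = Trace(Seq(
    SpanRecord("a", None, "GET /users", 0, 200),
    SpanRecord("b1", Some("a"), "call service-b #1", 10, 50),
    SpanRecord("b2", Some("a"), "call service-b #2", 10, 60),
    SpanRecord("b3", Some("a"), "call service-b #3", 10, 55),
    SpanRecord("s", Some("a"), "store session token", 60, 190)
  ))
}
```

With this data, `TraceExample.trace.slowestChild` points at the session-token store, which is the kind of observation a tracing UI makes visible at a glance.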

Context Propagation

For a new trace, the root span has a new TraceID, a new SpanID, and no ParentSpanID.

A child span that continues a trace has the same TraceID as the incoming request, a new SpanID, and a ParentSpanID pointing to the incoming request's SpanID.

The sampling decision indicates whether the span must be reported to the tracing system.
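
The rules above can be sketched in a few lines. The `SpanContext` type and its helper methods are hypothetical and exist only for illustration; the header names follow Zipkin's B3 propagation format. Inheriting the sampling decision in the child is an assumption of this sketch, made so that a trace is reported or dropped as a whole.

```scala
import java.util.concurrent.ThreadLocalRandom

// Hypothetical type for illustration; real tracers ship their own context classes.
final case class SpanContext(
    traceId: String,
    spanId: String,
    parentSpanId: Option[String],
    sampled: Boolean
)

object SpanContext {
  // 64-bit identifier rendered as 16 lowercase hex characters.
  private def newId(): String =
    f"${ThreadLocalRandom.current().nextLong()}%016x"

  // New trace: fresh TraceID and SpanID, no ParentSpanID.
  def newRoot(sampled: Boolean): SpanContext =
    SpanContext(newId(), newId(), None, sampled)

  // Continuing a trace: same TraceID, fresh SpanID, and a ParentSpanID
  // that points at the incoming SpanID. The sampling decision is inherited.
  def childOf(incoming: SpanContext): SpanContext =
    SpanContext(incoming.traceId, newId(), Some(incoming.spanId), incoming.sampled)

  // Encodes the context as B3 HTTP headers for the outgoing request.
  def toB3Headers(ctx: SpanContext): Map[String, String] =
    Map(
      "X-B3-TraceId" -> ctx.traceId,
      "X-B3-SpanId" -> ctx.spanId,
      "X-B3-Sampled" -> (if (ctx.sampled) "1" else "0")
    ) ++ ctx.parentSpanId.map("X-B3-ParentSpanId" -> _)
}
```

A service would call `SpanContext.childOf` on the context decoded from the incoming request and attach `toB3Headers` to every outgoing call, so the next hop can do the same.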

Distributed Tracing

Brave

Things To Keep In Mind

Sampling reduces Overhead 

Observability tools are unintrusive

Instrumentation can be delegated to common frameworks

Don't trace every single operation 
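
To illustrate why sampling reduces overhead, here is a minimal sketch of a probabilistic sampler that keeps only a fraction of traces. `ProbabilisticSampler` is a hypothetical class written for this example and is not Kamon's implementation; it assumes the decision is taken once per trace, when the root span is created.

```scala
import scala.util.Random

// Hypothetical sampler for illustration; not the implementation of any library.
final class ProbabilisticSampler(probability: Double, random: Random = new Random()) {
  require(probability >= 0.0 && probability <= 1.0, "probability must be within [0, 1]")

  // nextDouble() is uniform in [0, 1), so a trace is kept with the given probability.
  def decide(): Boolean = random.nextDouble() < probability
}

object SamplerExample {
  // With probability 0.1, roughly one trace in ten is reported.
  def keptOutOf(total: Int, probability: Double, seed: Long): Int = {
    val sampler = new ProbabilisticSampler(probability, new Random(seed))
    (1 to total).count(_ => sampler.decide())
  }
}
```

Only the sampled traces are sent to the tracing system, so the reporting cost shrinks in proportion to the chosen probability.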

Zipkin

Distributed Tracing System

Based on Google Dapper (2010)

Created by Twitter (2012)

OpenZipkin (2015)

Active Community

Zipkin UI

Jaeger

Distributed Tracing System

Based on Google Dapper (2010)

Inspired by OpenZipkin

Created by Uber

Jaeger UI

Tracers

Distributed Tracing, Metrics and Context Propagation for applications running on the JVM.

- Observability SDK (metrics, tracing).

Trace instrumentation API definitions.

- OpenZipkin's java library and instrumentation.

Brave

Kamon at a Glance

So, how do I Kamonize my service?

Kamonization

// build.sbt
libraryDependencies ++= Seq(
  "io.kamon" %% "kamon-core" % "1.1.3",
  "io.kamon" %% "kamon-prometheus" % "1.1.3",
  "io.kamon" %% "kamon-zipkin" % "1.1.3",
  "io.kamon" %% "kamon-jaeger" % "1.1.3")

Add Dependencies

// application.conf
kamon {
  environment {
    service = "kamon-showcase"
  }
  trace {
    sampler = "random"
  }
}

Add Configuration

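import kamon.Kamon
import kamon.prometheus.PrometheusReporter
import kamon.zipkin.ZipkinReporter

// Register each reporter explicitly
Kamon.addReporter(new PrometheusReporter())
Kamon.addReporter(new ZipkinReporter())

// OR load the reporters declared in the configuration
Kamon.loadReportersFromConfig()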

Start the Reporters

java -javaagent:/path/to/kanela-agent.jar

Start with Kanela (optional step)

Kamonization

val span = Kamon.buildSpan("find-users")
  .withTag("span.kind", "server")
  .withTag("string-tag", "hello")
  .withTag("number-tag", 42)
  .withTag("boolean-tag", true)
  .withMetricTag("early-tag", "value")
  .start()

// Do your stuff here

span.finish()
// You got traces, you got metrics!

Using a Span
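
When a span is managed by hand, it is easy to skip the call that finishes it if the traced code throws. One way to guard against that is a small loan-pattern helper, sketched below. `Traced` is a hypothetical helper, not part of Kamon; it only assumes that finishing a span is a side-effecting call, like `span.finish()` above.

```scala
object Traced {
  // Runs body and always invokes finish afterwards,
  // whether body returns normally or throws.
  def apply[A](finish: () => Unit)(body: => A): A =
    try body
    finally finish()
}
```

With the span built as shown above, usage could look like `Traced(() => span.finish()) { /* do your stuff here */ }`.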

Or

@Trace(operationName = "find-users", tags = "${'span.kind':'server', 'string-tag':'hello'}")
public List<Users> findUsers(List<Long> ids) {}

Annotate your Service

@Trace(operationName = "find-users", tags = "${'span.kind':'server', 'string-tag':'hello'}")
def findUsers(ids:Seq[Long]): Seq[Users]

Java

Scala

  • Akka Actors, Routers, Dispatchers. Local and Remote
  • Scala Futures
  • JDBC and Hikari CP
  • Akka HTTP
  • Play Framework
  • Http4S
  • Spring Web
  • OkHttp3
  • Servlets
  • Cassandra Driver
  • Logback (AsyncAppender)
  • Executors
  • System and JVM Metrics

Instrumentation

Reporters

  • Prometheus
  • InfluxDB
  • StatsD
  • Zipkin
  • Jaeger
  • Datadog (only metrics)
  • Sematext SPM
  • Kamino

Planned:

  • Stackdriver
  • Amazon CloudWatch + X-Ray

Demo

Thanks for Coming!

Get more info at http://kamon.io/

https://github.com/kamon-io

@kamonteam

Questions?

Distributed Tracing

By Diego Parra

Are you looking at a bunch of fragments of your distributed infrastructure? Your microservices structure has been shattered, your users are complaining, and you don't have a clue where to start: which of your hundreds of services is slowing down requests? Distributed Tracing comes to the rescue! In this talk we will show you a possible solution based on Zipkin and Kamon.
