Collecting Data with Brave

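In the Java ecosystem, the Zipkin team officially provides Brave for collecting trace data and reporting it to a Zipkin Server. Other frameworks for collecting such data include cassandra-zipkin-tracing, Dropwizard Zipkin, htrace, Spring Cloud Sleuth, and Wingtips.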

Example code

Configuring the Tracer

Configure a Tracer to upload data to the Zipkin Server.

// Configure a reporter, which controls how often spans are sent
//   (this dependency is io.zipkin.reporter2:zipkin-sender-okhttp3)
OkHttpSender sender = OkHttpSender.create("http://127.0.0.1:9411/api/v2/spans");
//   (this dependency is io.zipkin.reporter2:zipkin-reporter-brave)
AsyncZipkinSpanHandler zipkinSpanHandler = AsyncZipkinSpanHandler.create(sender);

// Create a tracing component with the service name you want to see in Zipkin.
Tracing tracing = Tracing.newBuilder()
                         .localServiceName("my-service")
                         .addSpanHandler(zipkinSpanHandler)
                         .build();

// Tracing exposes objects you might need, most importantly the tracer
Tracer tracer = tracing.tracer();

// Failing to close resources can result in dropped spans! When tracing is no
// longer needed, close the components you made in reverse order. This might be
// a shutdown hook for some users.
tracing.close();
zipkinSpanHandler.close();
sender.close();

In-process tracing

// Start a new trace or a span within an existing trace representing an operation
ScopedSpan span = tracer.startScopedSpan("encode");
try {
  // The span is in "scope" meaning downstream code such as loggers can see trace IDs
  return encoder.encode();
} catch (RuntimeException | Error e) {
  span.error(e); // Unless you handle exceptions, you might not know the operation failed!
  throw e;
} finally {
  span.finish(); // always finish the span
}

For more advanced and flexible use cases, you can also trace with the following API:

// Start a new trace or a span within an existing trace representing an operation
Span span = tracer.nextSpan().name("encode").start();
// Put the span in "scope" so that downstream code such as loggers can see trace IDs
try (SpanInScope ws = tracer.withSpanInScope(span)) {
  return encoder.encode();
} catch (RuntimeException | Error e) {
  span.error(e); // Unless you handle exceptions, you might not know the operation failed!
  throw e;
} finally {
  span.finish(); // note the scope is independent of the span. Always finish a span.
}

Key classes

Span

Span is the container that holds trace data. Its main properties and operations are:

// Main properties
public abstract TraceContext context();
@Override public abstract Span name(String name);
@Override public abstract Span annotate(String value);
@Override public abstract Span tag(String key, String value);

// Main operations
public abstract Span start();
public abstract void finish();
public abstract void abandon();
public abstract void flush();

Tracer

A Tracer creates the various kinds of Span. Its main fields are:

public class Tracer {
  
  final Clock clock;
  final Propagation.Factory propagationFactory;
  final SpanHandler spanHandler; // only for toString
  final PendingSpans pendingSpans;
  final Sampler sampler;
  final CurrentTraceContext currentTraceContext;
  final boolean traceId128Bit, supportsJoin, alwaysSampleLocal;
  final AtomicBoolean noop;

}

The following pseudocode shows what happens when tracer.startScopedSpan("encode") runs:

// Get the parent context, if any
TraceContext parent = currentTraceContext.get();
// Derive a decorated child context from the parent, or create a new root context
TraceContext context = parent != null ? decorateContext(parent, parent.spanId()) : newRootContext(0);

// Create the RealScopedSpan
Scope scope = currentTraceContext.newScope(context);
PendingSpan pendingSpan = pendingSpans.getOrCreate(parent, context, true);
Clock clock = pendingSpan.clock();
MutableSpan state = pendingSpan.state();
state.name(name);
return new RealScopedSpan(context, scope, state, clock, pendingSpans);
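
To make the currentTraceContext.newScope(context) step more concrete, here is a minimal stand-alone sketch of a thread-local scope. It is a simplified model under stated assumptions, not Brave's implementation: ScopeDemo, CURRENT, and this newScope are hypothetical names, and a plain String stands in for TraceContext.

```java
// Simplified model of a thread-local "current context"; not Brave's actual code.
class ScopeDemo {
  // Holds whatever context is "in scope" on the current thread (hypothetical stand-in).
  static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

  /** A scope that restores the previously current context when closed. */
  interface Scope extends AutoCloseable {
    @Override void close(); // narrowed so callers need not handle a checked exception
  }

  static Scope newScope(String context) {
    final String previous = CURRENT.get();
    CURRENT.set(context);
    // Closing the scope puts the previous context back, which makes nesting safe.
    return () -> CURRENT.set(previous);
  }

  public static void main(String[] args) {
    try (Scope outer = newScope("span-a")) {
      System.out.println(CURRENT.get());   // span-a
      try (Scope inner = newScope("span-b")) {
        System.out.println(CURRENT.get()); // span-b
      }
      System.out.println(CURRENT.get());   // span-a again
    }
    System.out.println(CURRENT.get());     // null: nothing is in scope
  }
}
```

Restoring the previous value on close, rather than clearing it, is what lets a child span be scoped inside a parent span without losing the parent afterwards.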

While the context is being built as shown above, IDs such as spanId and traceId are generated as follows:

// Generate a 64-bit spanId
if (spanId == 0L) spanId = nextId();
// Generate the trace ID
if (traceId == 0L) { // make a new trace ID
    traceIdHigh = traceId128Bit ? Platform.get().nextTraceIdHigh() : 0L;
    traceId = spanId;
}
// Set the localRootId
if (localRootId == 0L) {
    localRootId = spanId;
}

/** Generates a new 64-bit ID, taking care to dodge zero which can be confused with absent */
long nextId() {
    long nextId = Platform.get().randomLong();
    while (nextId == 0L) {
        nextId = Platform.get().randomLong();
    }
    return nextId;
}
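
The ID rules above can be tried out in a small stand-alone sketch. It makes a few assumptions: ThreadLocalRandom stands in for Platform.get().randomLong(), the high 64 bits are simply random here, and TraceIdDemo and toLowerHex are hypothetical names. It also shows the lower-hex form in which Zipkin displays IDs (16 characters for 64-bit, 32 for 128-bit).

```java
import java.util.concurrent.ThreadLocalRandom;

// Stand-alone sketch of the ID rules; not Brave's actual code.
class TraceIdDemo {
  /** Generates a non-zero random 64-bit ID; zero is reserved to mean "absent". */
  static long nextId() {
    long id;
    do {
      id = ThreadLocalRandom.current().nextLong();
    } while (id == 0L);
    return id;
  }

  /** Renders a trace ID as lower-hex: 16 chars, or 32 chars when traceIdHigh is set. */
  static String toLowerHex(long traceIdHigh, long traceId) {
    String low = String.format("%016x", traceId);
    return traceIdHigh == 0L ? low : String.format("%016x", traceIdHigh) + low;
  }

  public static void main(String[] args) {
    long spanId = nextId();
    long traceId = spanId;       // a root span reuses its span ID as the trace ID
    long traceIdHigh = nextId(); // assumption: random value, for illustration only
    System.out.println(toLowerHex(0L, traceId));          // 16 hex characters
    System.out.println(toLowerHex(traceIdHigh, traceId)); // 32 hex characters
  }
}
```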

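Context: TraceContext

A TraceContext holds the identifiers of a trace (trace ID, span ID, parent ID), its sampling flags, and any extra propagated data. CurrentTraceContext, used in the pseudocode above, tracks which TraceContext is currently in scope. The main fields of TraceContext are: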

public final class TraceContext extends SamplingFlags {
  
  final long traceIdHigh, traceId, localRootId, parentId, spanId;
  final List<Object> extraList;
  
}

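Sampler

Should every collected trace be sent to the server? Would the volume be too large? Would much of the data be redundant? A Sampler lets you decide which traces are reported and which are not.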

public abstract class Sampler {
    
    // Whether the trace identified by this traceId should be sampled (recorded and reported)
    public abstract boolean isSampled(long traceId);

}
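
As an illustration of the isSampled(long traceId) contract, here is a minimal sketch of a probability-based sampler that decides from the trace ID alone, so every process that sees the same trace ID reaches the same decision. ProbabilitySamplerSketch is a hypothetical stand-alone class and does not extend Brave's Sampler; it is one possible approach, not Brave's implementation.

```java
// Stand-alone sketch of a trace-ID-based probability sampler; not Brave's actual code.
class ProbabilitySamplerSketch {
  private final long boundary; // how many of the 10,000 buckets are sampled

  ProbabilitySamplerSketch(double probability) {
    if (probability < 0.0 || probability > 1.0) {
      throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
    }
    this.boundary = Math.round(probability * 10_000);
  }

  boolean isSampled(long traceId) {
    // floorMod keeps the bucket in [0, 9999] even for negative trace IDs
    return Math.floorMod(traceId, 10_000L) < boundary;
  }

  public static void main(String[] args) {
    ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(0.1);
    int sampled = 0;
    for (long id = 0; id < 100_000; id++) {
      if (sampler.isSampled(id)) sampled++;
    }
    System.out.println(sampled); // 10000, i.e. 10% of the sequential IDs
  }
}
```

Because the decision is a pure function of the trace ID, no coordination between services is needed to keep a trace either fully sampled or fully dropped.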

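Brave ships with several samplers, including Sampler.ALWAYS_SAMPLE, Sampler.NEVER_SAMPLE, CountingSampler, BoundarySampler, and RateLimitingSampler.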

Span listener: SpanHandler

public abstract class SpanHandler {

    public boolean begin(TraceContext context, MutableSpan span, @Nullable TraceContext parent) {
        return true;
    }

    public boolean end(TraceContext context, MutableSpan span, Cause cause) {
        return true;
    }

}

Clock

public interface Clock {

  long currentTimeMicroseconds();

}

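The implementation shown here is TickClock, which anchors an epoch timestamp once and then measures elapsed time with System.nanoTime():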

final class TickClock implements Clock {
  
  final long baseEpochMicros;
  final long baseTickNanos;

  TickClock(long baseEpochMicros, long baseTickNanos) {
    this.baseEpochMicros = baseEpochMicros;
    this.baseTickNanos = baseTickNanos;
  }

  @Override public long currentTimeMicroseconds() {
    return ((System.nanoTime() - baseTickNanos) / 1000) + baseEpochMicros;
  }

}
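
The benefit of this design is that timestamps within one trace come from a monotonic tick source, so a wall-clock adjustment in the middle of an operation cannot produce a negative duration. The sketch below re-creates the same arithmetic in a stand-alone class so it can be run without Brave; TickClockDemo is a hypothetical name, and the way the base values are captured in main is an assumption.

```java
// Stand-alone re-creation of the TickClock arithmetic, for experimentation only.
class TickClockDemo {
  final long baseEpochMicros; // wall-clock time captured once, in microseconds
  final long baseTickNanos;   // System.nanoTime() reading captured at the same moment

  TickClockDemo(long baseEpochMicros, long baseTickNanos) {
    this.baseEpochMicros = baseEpochMicros;
    this.baseTickNanos = baseTickNanos;
  }

  long currentTimeMicroseconds() {
    // Elapsed monotonic nanoseconds, converted to microseconds, added to the epoch base
    return ((System.nanoTime() - baseTickNanos) / 1000) + baseEpochMicros;
  }

  public static void main(String[] args) throws InterruptedException {
    TickClockDemo clock =
        new TickClockDemo(System.currentTimeMillis() * 1000, System.nanoTime());
    long start = clock.currentTimeMicroseconds();
    Thread.sleep(5);
    long end = clock.currentTimeMicroseconds();
    System.out.println("elapsed micros: " + (end - start));
  }
}
```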

Propagation

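Propagation serializes the information carried in a TraceContext into text so that it can be placed in a request (for example, as HTTP headers) and used to continue the trace across process boundaries.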
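
As a rough illustration of what "converting a TraceContext to text" means, the sketch below writes IDs into a header map using the B3 header names X-B3-TraceId, X-B3-SpanId, and X-B3-Sampled, then reads them back. It is a simplified model, not Brave's Propagation API: B3Sketch, inject, and extractId are hypothetical names, only 64-bit IDs are handled, and header-name case-insensitivity is ignored.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of B3-style header propagation; not Brave's actual code.
class B3Sketch {
  /** Writes the trace identifiers into the outgoing request headers as lower-hex text. */
  static void inject(long traceId, long spanId, boolean sampled, Map<String, String> headers) {
    headers.put("X-B3-TraceId", String.format("%016x", traceId));
    headers.put("X-B3-SpanId", String.format("%016x", spanId));
    headers.put("X-B3-Sampled", sampled ? "1" : "0");
  }

  /** Reads a 64-bit ID back from its 16-character lower-hex header value. */
  static long extractId(Map<String, String> headers, String name) {
    return Long.parseUnsignedLong(headers.get(name), 16);
  }

  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<>();
    inject(0x463ac35c9f6413adL, 0xa2fb4a1d1a96d312L, true, headers);
    System.out.println(headers.get("X-B3-TraceId")); // 463ac35c9f6413ad
    System.out.println(headers.get("X-B3-SpanId"));  // a2fb4a1d1a96d312
    // The receiving process parses the same values to continue the trace
    System.out.println(extractId(headers, "X-B3-TraceId") == 0x463ac35c9f6413adL); // true
  }
}
```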