
OpenTracing: Emerging Distributed Tracing Standard


 1  Introduction 

 

As organizations embrace the cloud-native movement and migrate their applications from monolithic to microservice architectures, general visibility and observability into software behavior becomes an essential requirement. Because the monolithic code base is segregated into multiple independent services running inside their own processes, each of which can scale to multiple instances, such a trivial task as diagnosing the latency of an HTTP request issued from the client can turn into a serious undertaking. To fulfill the request, it has to propagate through load balancers, routers and gateways, cross machine boundaries to communicate with other microservices, send asynchronous messages to message brokers, etc. Anywhere along this pipeline there could be a bottleneck, contention or communication issue in any of the aforementioned components. Debugging such a complex workflow wouldn't be feasible without relying on some kind of tracing/instrumentation mechanism. That's why distributed tracers like Zipkin, Jaeger or AppDash were born (most of them inspired by Google's Dapper large-scale distributed tracing platform). All of the aforementioned tracers help engineers and operations teams understand and reason about system behavior as the complexity of the infrastructure grows exponentially. Tracers expose the source of truth for the interactions originated within the system. Every transaction (if properly instrumented) might reflect performance anomalies in an early phase, when new services are being introduced by (probably) independent teams with polyglot software stacks and continuous deployments.

However, each tracer sticks with its own proprietary API and other peculiarities, which makes it costly for developers to switch between different tracer implementations. Since implanting instrumentation points requires code modification, OSS services, application frameworks and other platforms would have a hard time tying themselves to a single tracer vendor.


OpenTracing aims to offer a consistent, unified and tracer-agnostic instrumentation API for a wide range of frameworks, platforms and programming languages. It abstracts away the differences among numerous tracer implementations, so shifting from an existing tracer to a new one would only require the configuration changes specific to that new tracer. For what it's worth, we should mention the benefits of distributed tracing:

- out-of-the-box infrastructure overview: how the interactions between services are done and what their dependencies are
- efficient and fast detection of latency issues
- intelligent error reporting: spans transport error messages and stack traces, and we can take advantage of that insight to identify root cause factors or cascading failures
- trace data can be forwarded to log processing platforms for query and analysis

 2  OpenTracing basics 

In a distributed system, a trace encapsulates the transaction's state as it propagates

through the system. During the journey of the transaction, it can create one or multiple spans. A span represents a single unit of work inside a transaction, for example, an RPC client/server call, sending a query to the database server, or publishing a message to the message bus. Speaking in terms of the OpenTracing data model, the trace can also be seen as a collection of spans structured around a directed acyclic graph (DAG). The edges indicate the causal relationships (references) between spans. A span is identified by its unique ID, and optionally may include the identifier of its parent. If the parent identifier is omitted, we call that span the root span. A span also comprises a human-readable operation name, and start and end timestamps. All spans are grouped under the same trace identifier.
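The data model above can be sketched in plain Java. This is a minimal illustration of the trace/span/parent relationships only, not the real OpenTracing API; all names here are invented:

```java
import java.util.UUID;

// Minimal illustration of the OpenTracing data model: a trace is a
// collection of spans sharing a trace id; edges are parent references.
record Span(String traceId, String spanId, String parentId,
            String operationName, long startMicros, long endMicros) {
    // A span with no parent identifier is the root span.
    boolean isRoot() { return parentId == null; }
}

public class TraceModel {
    public static void main(String[] args) {
        String traceId = UUID.randomUUID().toString();
        // Root span: no parent identifier.
        Span parent = new Span(traceId, "1", null, "http-request", 0, 500);
        // Child span: references the parent, grouped under the same trace id.
        Span child = new Span(traceId, "2", "1", "db-query", 100, 300);
        System.out.println(parent.isRoot()); // prints true
        System.out.println(child.isRoot());  // prints false
    }
}
```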

 


 

The diagram above depicts the transit of a hypothetical RPC request. The client makes an HTTP request to the server, which results in generating one parent span. In order to satisfy the client's request, the server sends a query to the storage engine; that operation produces one more span. The response from the database engine to the server and from the server to the client creates two additional spans.

Spans may contain tags that represent contextual metadata relevant to a specific request.


They consist of an unbounded sequence of key-value pairs, where keys are strings and values can be strings, numbers, booleans or date data types. Tags allow for context enrichment that may be useful for monitoring or debugging system behavior. While not mandatory, it's highly recommended to follow the OpenTracing semantics guidelines when naming tags. For instance, we should assign the component tag to the framework, module or library which generates the span/spans, use peer.hostname and peer.port to describe target hosts, etc. Another reason for tag standardization is making the tracer aware of the existence of certain tags that would add intelligence, or instructing the tracer to put special emphasis on them.

As illustrated in Figure 2, the spans are annotated with tags that obey the OpenTracing semantic conventions. Furthermore, the spans are rendered in a waterfall-like chart. This type of visualization adds the dimension of time and thus makes it easier to spot the duration of each span.

Besides tags, OpenTracing has a notion of log events. They represent timestamped textual (although not limited to textual content) annotations that may be recorded along the duration of a span. Events could express any occurrence of interest to the active span, like timer expiration, cache miss events, build or deployment starting events, etc.
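A log event is just a timestamped annotation attached to a span. The following is a stdlib-only sketch of that idea with invented names, not the real OpenTracing API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: timestamped log events recorded along the duration of a span.
public class SpanLogs {
    record LogEvent(long timestampMicros, Map<String, Object> fields) {}

    static class Span {
        final List<LogEvent> logs = new ArrayList<>();

        // Record a set of fields with the current timestamp.
        void log(Map<String, Object> fields) {
            logs.add(new LogEvent(System.nanoTime() / 1000, fields));
        }
    }

    public static void main(String[] args) {
        Span span = new Span();
        // Any occurrence of interest to the active span can be logged:
        span.log(Map.of("event", "cache-miss", "key", "user:42"));
        span.log(Map.of("event", "timer-expired"));
        System.out.println(span.logs.size()); // prints 2
    }
}
```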

 


Baggage items allow for cross-span propagation, i.e., they let us associate metadata that also propagates to future children of the root span. In other words, the local data is transported along the full path as the request travels downstream through the system. However, this powerful feature should be used carefully, because it can easily saturate network links if the propagated items are injected into many descendant spans.

As of the time of writing, OpenTracing supports two types of relationships:

ChildOf – expresses causal references between two spans. Following our RPC scenario, the server-side span would be the ChildOf the initiator (request) span.

FollowsFrom – the parent span isn't linked to the outcome of the child span. This relationship is usually used to model asynchronous executions, like emitting messages to the message bus.
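The difference between span-local tags and trace-wide baggage can be sketched as follows. This is a simplified model with invented names, not the real OpenTracing API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: tags stay local to a span, baggage is copied to all descendants.
public class Baggage {
    static class Span {
        final Map<String, String> tags = new HashMap<>();
        final Map<String, String> baggage = new HashMap<>();

        // A child inherits baggage items, but not tags.
        Span newChild() {
            Span child = new Span();
            child.baggage.putAll(this.baggage);
            return child;
        }
    }

    public static void main(String[] args) {
        Span root = new Span();
        root.tags.put("http.method", "POST");
        root.baggage.put("user.id", "42");

        Span grandchild = root.newChild().newChild();
        System.out.println(grandchild.baggage.get("user.id")); // prints 42
        System.out.println(grandchild.tags.isEmpty());         // prints true
    }
}
```

This also makes the saturation caveat concrete: every baggage item is duplicated into every descendant span, so large baggage multiplies across the whole causal chain.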

 

 


 3  OpenTracing API 

 

OpenTracing API is modeled around two fundamental types: 

Tracer – knows how to create a new span, as well as inject/extract span contexts across process boundaries. All OpenTracing-compatible tracers must provide a client with an implementation of the Tracer interface.

Span – the tracer's build method yields a brand new span. We can invoke a number of operations after the span has been started, like aggregating tags, changing the span's operation name, binding references to other spans, adding baggage items, etc.

SpanContext – the consumers of the API only interact with this type when injecting/extracting the span context from the transport protocol.

Let's see some code. Although we'll focus on Java, the API semantics are identical (or at least they should be) for any other programming language (OpenTracing has API specs for Go, Python, JavaScript, Java, C#, Objective-C, C++, Ruby, PHP).

Figure 4 represents the role of OpenTracing API instrumentation within the tracing

landscape. It's important for the tracer clients to be compatible with the OpenTracing specification. For instance, we could be biased to use the Zipkin tracing system. The instrumentation points in our applications are created via the OpenTracing API even though we're using Zipkin clients for span reporting. After evaluating other tracers, we could figure out that Jaeger fits our needs better. In that case, switching from Zipkin to Jaeger would be a matter of registering the corresponding instance of the tracer, while the instrumentation points would remain the same, i.e., we wouldn't have to adapt any code.


Because the Jaeger tracer is compatible with Zipkin span formats, we could use the same Zipkin client to submit span requests to Jaeger.


…endpoint and the component which sends the instrumentation data to the tracer. In the case of the Jaeger tracer, we would have the following code snippet:

 

import com.uber.jaeger.Configuration;
import io.opentracing.util.GlobalTracer;

Configuration config = new Configuration(component,
    new Configuration.SamplerConfiguration("const", 1),
    new Configuration.ReporterConfiguration(
        true, host, port, 1000, 10000));
GlobalTracer.register(config.getTracer());

To start a new span, use the buildSpan method within a try block, which automatically finishes the span and handles any exceptions:

io.opentracing.Tracer tracer = GlobalTracer.get();
try (ActiveSpan span = tracer.buildSpan("create-octi")
        .setTag("http.url", "/api/octi")
        .setTag("http.method", "POST")
        .setTag("peer.hostname", "apps.sematext.com")
        .startActive()) {
    // HTTP request code here
}

 


Tracer registration and span management can be simplified with the Sematext opentracing-common library.

TracerInitializer tracerInitializer = new TracerInitializer(Tracers.ZIPKIN);
tracerInitializer.setup("localhost", 9411, "log-service");

 

SpanOperations spanOps = new SpanTemplate(tracerInitializer.getTracer());

try (ActiveSpan span = spanOps.startActive("create-octi")) {
    // add tags and make the HTTP request
}

 

The best practice is to create the instances of TracerInitializer and SpanTemplate via a dependency injection container, so those references can be reused from any place within the application.

 4  Context propagation 

One of the most compelling and powerful features attributed to tracing systems is

distributed context propagation. Context propagation composes the causal chain and dissects the transaction from inception to finalization – it illuminates the request's path to its final destination. From a technical point of view, context propagation is the ability for the system or application to extract the propagated span context from a variety of carriers, like HTTP headers, AMQP message headers or Thrift fields, and then join the trace from that point. Context propagation is very efficient, since it only involves propagating identifiers and baggage items. All other metadata, like tags and logs, isn't propagated but transmitted asynchronously to the tracer system. It's the responsibility of the tracer to assemble and construct the full trace from the distinct spans that might be injected in-band or out-of-band.
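The Inject/Extract pattern can be sketched in plain Java over a map of HTTP-style headers. The header names below are invented for illustration; real tracers define their own (e.g. Zipkin's B3 headers):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of context propagation: only identifiers travel with the request.
public class Propagation {
    record SpanContext(String traceId, String spanId) {}

    // Inject: serialize the span context into a carrier (here, HTTP headers).
    static void inject(SpanContext ctx, Map<String, String> headers) {
        headers.put("x-trace-id", ctx.traceId());
        headers.put("x-span-id", ctx.spanId());
    }

    // Extract: rebuild the context on the receiving side, or null if absent.
    static SpanContext extract(Map<String, String> headers) {
        String traceId = headers.get("x-trace-id");
        return traceId == null ? null
                : new SpanContext(traceId, headers.get("x-span-id"));
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(new SpanContext("abc123", "1"), headers);
        SpanContext remote = extract(headers); // joins the same trace
        System.out.println(remote.traceId()); // prints abc123
    }
}
```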


 

Figure 5 illustrates the flow of context propagation. A request hits the first service (probably triggered by user interaction from a mobile/web application). At this point no active span is scheduled, so service 1 will start a new span and populate the tags to contextualize the request. This is the parent of the subsequent spans. Let's suppose the context is injected and carried to service 2 (that lives on another machine) via HTTP headers. Service 2 attempts to extract the span context from the headers. If the context is decoded successfully, another child span will be generated under the same trace identifier. As we already pointed out, only identifiers are propagated – tags contributed by individual spans are sent out of band. Following the path, service 3 deserializes the span context, might add tags, baggage items, etc. Then it injects the context and crosses the process boundaries to collaborate with the next service, and so on until the causal chain is completed.

OpenTracing standardizes context propagation across process boundaries by the Inject/Extract pattern.

 5  Distributed tracers 

OpenTracing hides the differences between distributed tracer implementations,

so in order to instrument an application via the OpenTracing standard, it's necessary to have an OpenTracing-compatible tracer correctly deployed and listening for incoming span requests. The following section is a breakdown of some prominent distributed tracers.

 5.1  Zipkin 

Zipkin is a distributed tracing system implemented in Java and with an OpenTracing-compatible API. It's responsible for span ingestion and storage, providing a number of collectors (HTTP, Kafka, Scribe) as well as storage engines (in-memory, MySQL, Cassandra, Elasticsearch). The UI is also a self-contained web application (although it can be served separately) and is used to explore the traces and their associated spans.

Spans may be sent to collectors out-of-band, i.e., the data is reported asynchronously to Zipkin once the span is completed and trace/span identifiers don't have to propagate downstream, or in-band, if context propagation is required and headers are used to transport the identifiers.


The component that is responsible for transporting the spans is called a reporter. Every instrumented application contains a reporter: it records timing metrics, associates metadata and routes it to the collector.
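For illustration, a reporter ultimately delivers spans as JSON to the collector's HTTP endpoint (Zipkin v2 accepts POSTs on /api/v2/spans). The following sketch only builds such a payload; the ids and names are made up, and no request is actually sent:

```java
// Sketch: the JSON shape a reporter POSTs to Zipkin's /api/v2/spans.
public class ZipkinPayload {
    static String spanJson(String traceId, String id, String name,
                           long timestampMicros, long durationMicros) {
        // Zipkin v2 expects an array of span objects; timestamps and
        // durations are in microseconds.
        return String.format(
            "[{\"traceId\":\"%s\",\"id\":\"%s\",\"name\":\"%s\"," +
            "\"timestamp\":%d,\"duration\":%d}]",
            traceId, id, name, timestampMicros, durationMicros);
    }

    public static void main(String[] args) {
        String body = spanJson("5af7183fb1d4cf5f", "352bff9a74ca9ad2",
                               "get /api/octi", 1556604172355737L, 1431L);
        // A real reporter would POST this to http://localhost:9411/api/v2/spans
        System.out.println(body);
    }
}
```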


To get started with Zipkin, download and run zipkin-server as a standalone jar (note that JRE 8 is required to bootstrap the Zipkin server). Alternatively, run it with Docker:

- fetch the latest zipkin image from the remote Docker repository
- expose port 9411 on the host machine so you can browse the UI
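The two steps above typically correspond to a single command, assuming the stock openzipkin/zipkin image from Docker Hub:

```shell
# Pull the latest Zipkin image and expose the UI/collector port 9411.
docker run -d -p 9411:9411 openzipkin/zipkin
```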


The collectors are responsible for forwarding the span requests to the storage layer. The HTTP collector is the default ingress point for the span stream. Other than the HTTP collector, Zipkin also offers Kafka and Scribe for span ingestion.

 

 

 

 


 5.1.2  Storage 

 

As mentioned above, Zipkin supports in-memory, MySQL, Cassandra and Elasticsearch storage engines. The in-memory store comes in handy for dev environments and for POC scenarios where persistence is not required. The MySQL storage type is discouraged for production environments due to known performance issues. For production workloads, Cassandra or Elasticsearch are more suitable options.
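The storage backend is selected via environment variables when launching the server. A sketch of the two common cases (the Elasticsearch endpoint value is a placeholder for your own cluster):

```shell
# In-memory storage (the default) – nothing to configure.
java -jar zipkin.jar

# Elasticsearch-backed storage for production workloads.
STORAGE_TYPE=elasticsearch ES_HOSTS=http://localhost:9200 java -jar zipkin.jar
```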

Posted: 12/11/2019, 22:26