Skip to main content
Version: 0.46.0

Traceability and Observability

This document describes Lana's observability infrastructure, including distributed tracing with OpenTelemetry.

Overview

Lana implements comprehensive observability through:

  • Distributed Tracing: OpenTelemetry integration
  • Structured Logging: Contextual log entries
  • Metrics: Performance and business metrics
  • Correlation: Request tracing across services

Architecture

OpenTelemetry Integration

Configuration

use opentelemetry::global;
use opentelemetry_otlp::WithExportConfig;
use tracing_subscriber::prelude::*;

pub fn init_telemetry() -> Result<()> {
let tracer = opentelemetry_otlp::new_pipeline()
.tracing()
.with_exporter(
opentelemetry_otlp::new_exporter()
.tonic()
.with_endpoint("http://localhost:4317")
)
.install_batch(opentelemetry::runtime::Tokio)?;

let telemetry = tracing_opentelemetry::layer()
.with_tracer(tracer);

tracing_subscriber::registry()
.with(telemetry)
.with(tracing_subscriber::fmt::layer())
.init();

Ok(())
}

Instrumentation

use tracing::{instrument, info, span, Level};

#[instrument(skip(self), fields(facility_id = %facility_id))]
pub async fn process_disbursal(
&self,
facility_id: CreditFacilityId,
) -> Result<Disbursal> {
info!("Processing disbursal");

let facility = self.repo.find_by_id(facility_id).await?;

// Nested span for ledger operation
let ledger_span = span!(Level::INFO, "ledger_transfer");
let _guard = ledger_span.enter();

self.ledger.transfer(facility.amount).await?;

info!("Disbursal completed");
Ok(disbursal)
}

Context Propagation

HTTP Headers

Trace context propagates through HTTP headers:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: lana=value

GraphQL Context

pub struct GraphQLContext {
trace_id: TraceId,
span_id: SpanId,
subject: SubjectId,
}

impl GraphQLContext {
pub fn from_request(req: &Request) -> Self {
let trace_context = extract_trace_context(req);
Self {
trace_id: trace_context.trace_id,
span_id: trace_context.span_id,
subject: extract_subject(req),
}
}
}

Structured Logging

Log Format

use tracing::{info, warn, error};

// Structured log with context
info!(
customer_id = %customer.id,
email = %customer.email,
"Customer created"
);

// Error with context
error!(
error = %e,
facility_id = %facility_id,
"Failed to process disbursal"
);

Log Levels

LevelUsage
ERRORSystem failures, requires attention
WARNUnexpected but recoverable conditions
INFOBusiness operations, audit events
DEBUGDetailed debugging information
TRACEVery detailed tracing

Metrics

Business Metrics

use metrics::{counter, gauge, histogram};

// Count operations
counter!("disbursals_processed_total").increment(1);

// Track current values
gauge!("active_facilities").set(count as f64);

// Measure durations
histogram!("disbursal_processing_seconds").record(duration);

System Metrics

  • Request latency
  • Error rates
  • Database connection pool
  • Memory usage

Correlation IDs

All requests carry correlation context:

#[derive(Debug, Clone)]
pub struct CorrelationContext {
pub trace_id: String,
pub span_id: String,
pub request_id: Uuid,
pub subject_id: Option<SubjectId>,
}

impl CorrelationContext {
pub fn new() -> Self {
Self {
trace_id: Span::current().context().trace_id().to_string(),
span_id: Span::current().context().span_id().to_string(),
request_id: Uuid::new_v4(),
subject_id: None,
}
}
}

Dashboards

Operations Dashboard

  • Request volume and latency
  • Error rates by endpoint
  • Active users
  • System health

Business Dashboard

  • Facility creations per day
  • Disbursement volume
  • Payment processing
  • Customer onboarding

Alerts

Critical Alerts

  • Service unavailable
  • High error rate (>5%)
  • Database connection failures
  • External service timeouts

Warning Alerts

  • Elevated latency
  • High queue depth
  • Memory pressure
  • Disk space low