Fixing dropped spans in async Python FastAPI routes

Capture the active OpenTelemetry context and re-attach it after crossing any asyncio boundary: BackgroundTasks, create_task, and thread pool executors can otherwise create orphaned root spans with a different trace ID.

Context and when it matters

OpenTelemetry Python’s SDK stores the current span in a contextvars.ContextVar. Within a single async def function this works transparently: await suspensions preserve the ContextVar value, so child spans nest correctly. The failure surfaces the moment execution leaves the current coroutine’s contextvars scope — into a BackgroundTask, a new asyncio.Task, or a thread pool worker. At that point the new execution unit sees either an empty context or a stale snapshot, and every span it creates becomes a separate root trace. In Jaeger or Grafana Tempo the symptom appears as two separate traces where you expected one, and orphaned spans in the “no parent” filter.

This failure mode is especially common in FastAPI because the framework explicitly schedules BackgroundTasks after the HTTP response is committed — by design, outside the request middleware scope.

How context detaches: the three failure modes

[Figure: Async context detachment failure modes in FastAPI. Three columns (BackgroundTasks, asyncio.create_task(), run_in_executor / to_thread) each show the request span GET /api/process (trace_id: abc123) at the top, an async boundary in the middle (after response sent; post-attach mutations lost; OS thread, no ContextVar), and an orphaned root span at the bottom (db_sync_background, task_child_span, cpu_work_span) with a broken context link to the parent.]

The OpenTelemetry Python SDK uses contextvars to store the active span. Context loss occurs through three distinct mechanisms:

  1. BackgroundTask detachment. FastAPI’s BackgroundTasks schedules work after the ASGI response is sent. By that point the OTel middleware has already detached its context token, so the background coroutine sees an empty ContextVar and any span it creates becomes an orphan with a new trace ID.

  2. asyncio.create_task() post-attach mutations. Python 3.7+ copies the current contextvars context when a task is created. This works correctly for spans active at creation time, but any subsequent context.attach() call in middleware — common with per-request propagator setup — does not propagate into already-created tasks.

  3. Thread pool context stripping. loop.run_in_executor() hands work to OS threads without copying contextvars state, so spans produced in the worker start a new trace with a freshly generated ID and no parent. asyncio.to_thread() (Python 3.9+) copies the current context for you, but a callable submitted directly to an executor gets no such treatment.
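The boundary behaviour described in the list above can be observed with the standard library alone. The following is a minimal sketch, not OpenTelemetry code: current_span is a hypothetical ContextVar standing in for the slot the SDK uses for the active span, and the string values are placeholders.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the ContextVar that holds the active span.
current_span = contextvars.ContextVar("current_span", default=None)


def read_span():
    """Blocking helper: report what the worker thread sees."""
    return current_span.get()


async def read_span_async():
    """Coroutine helper: report what a spawned task sees."""
    return current_span.get()


async def main():
    current_span.set("request-span")

    # create_task() copies the context at creation time.
    task = asyncio.create_task(read_span_async())
    current_span.set("mutated-after-create")  # not visible inside `task`
    seen_by_task = await task

    loop = asyncio.get_running_loop()
    # run_in_executor() does not copy contextvars into the worker thread.
    seen_by_executor = await loop.run_in_executor(None, read_span)
    # asyncio.to_thread() copies the current context before switching threads.
    seen_by_to_thread = await asyncio.to_thread(read_span)

    return seen_by_task, seen_by_executor, seen_by_to_thread


results = asyncio.run(main())
print(results)
```

On a standard CPython build, the task should report the value that was set before it was created, the executor call should report the default, and to_thread() should report the latest value. That last result is why asyncio.to_thread() needs no manual context copy, while run_in_executor() does.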

Minimal reproduction

This route demonstrates two of the three failure modes, the BackgroundTasks detachment and the thread pool hand-off, before any fix is applied:

# dropped_spans_repro.py
import asyncio

from fastapi import BackgroundTasks, FastAPI
from opentelemetry import trace

app = FastAPI()
tracer = trace.get_tracer(__name__)

async def background_db_sync():
    # Runs after the response is sent; the request context is gone, so this span is an orphan
    with tracer.start_as_current_span("db_sync_background"):
        await asyncio.sleep(0.05)

def cpu_bound_work():
    # Runs in a pool thread; run_in_executor() does not copy contextvars
    with tracer.start_as_current_span("cpu_work"):
        return 42

@app.get("/api/process")
async def process_request(background_tasks: BackgroundTasks):
    with tracer.start_as_current_span("process_logic"):
        background_tasks.add_task(background_db_sync)   # detaches here
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(None, cpu_bound_work)  # thread strips context
        return {"result": result}

Broken trace structure visible in Jaeger/Tempo (trace IDs are placeholders):

Trace abc123
  └─ GET /api/process
       └─ process_logic

Trace xyz999          ← separate root, wrong trace_id
  └─ db_sync_background

Trace def456          ← separate root, wrong trace_id
  └─ cpu_work

Implementation detail: the complete fix

# fastapi_context_fix.py — production-ready, all three fixes applied
from contextlib import asynccontextmanager
from contextvars import copy_context
from fastapi import FastAPI, BackgroundTasks
from opentelemetry import trace, context
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
import asyncio

# ── 1. Lifespan handler guarantees buffered spans are flushed on shutdown ──
@asynccontextmanager
async def lifespan(app: FastAPI):
    yield                                          # startup: nothing extra needed
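    provider = trace.get_tracer_provider()
    if hasattr(provider, "shutdown"):              # only the SDK TracerProvider defines shutdown()
        provider.shutdown()                        # flush BatchSpanProcessor queue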

app = FastAPI(lifespan=lifespan)
FastAPIInstrumentor.instrument_app(app)            # must wrap before first request
tracer = trace.get_tracer(__name__)

# ── 2. Context-safe background task wrapper ────────────────────────────────
async def background_db_sync():
    # The wrapper in the route re-attaches the captured context before calling this
    with tracer.start_as_current_span("db_sync_background"):
        await asyncio.sleep(0.05)

@app.get("/api/process")
async def process_request(background_tasks: BackgroundTasks):
    with tracer.start_as_current_span("process_logic"):

        # Capture context NOW, while it is still attached to this coroutine
        captured_ctx = context.get_current()       # snapshot of active span + baggage

        async def context_aware_background():
            token = context.attach(captured_ctx)   # re-attach in new execution unit
            try:
                await background_db_sync()
            finally:
                context.detach(token)              # O(1), prevents context memory leak

        background_tasks.add_task(context_aware_background)

        # ── 3. Thread pool fix: copy_context().run() carries contextvars ──
        def cpu_bound_work():
            # Runs in thread with a copy of the current contextvars context
            with tracer.start_as_current_span("cpu_work"):
                return sum(range(1_000_000))

        # copy_context() snapshots the full ContextVar mapping before crossing
        # the thread boundary — child span will have correct parent_span_id
        result = await asyncio.get_running_loop().run_in_executor(
            None, copy_context().run, cpu_bound_work
        )

        return {"result": result}

Each annotated line maps directly to a tracing concept:

  • context.get_current(): returns the Context object currently stored in the ContextVar, which holds the active span and any baggage values. Context objects are immutable, so the reference acts as a snapshot.
  • context.attach(captured_ctx): installs that snapshot as the active context in the new coroutine or thread, so tracer.start_as_current_span() can find the correct parent.
  • context.detach(token): restores the context that was active before attach(), equivalent to popping a stack frame. Omitting it leaves the captured context active for the rest of that execution unit.
  • copy_context().run(fn): Python standard-library mechanism that runs fn inside a copy of the current ContextVar mapping. Passed to run_in_executor(), it carries the active span onto the worker thread without requiring OTel-specific instrumentation.
  • FastAPIInstrumentor.instrument_app(app): installs the ASGI middleware that extracts W3C TraceContext headers from inbound requests and starts the server span. Injecting headers on outbound calls is the job of client instrumentation (for example the httpx or requests instrumentors), not of this call.
  • provider.shutdown(): shuts down the SDK provider's span processors. A BatchSpanProcessor exports the spans still buffered in memory before it stops, so they are not lost when the event loop exits.
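If several routes offload blocking work, the copy_context().run pattern can be factored into one helper. The sketch below uses only the standard library and assumes the tracing context lives in contextvars, as described above. The helper name run_in_executor_with_context, the request_tag variable, and blocking_work are hypothetical and exist only for illustration.

```python
import asyncio
import contextvars
import functools


async def run_in_executor_with_context(func, *args, executor=None, **kwargs):
    """Run a blocking callable in a thread pool with the caller's contextvars."""
    loop = asyncio.get_running_loop()
    ctx = contextvars.copy_context()  # snapshot taken in the calling coroutine
    call = functools.partial(ctx.run, func, *args, **kwargs)
    return await loop.run_in_executor(executor, call)


# Hypothetical stand-in for the ContextVar that holds the active span.
request_tag = contextvars.ContextVar("request_tag", default="no-context")


def blocking_work(multiplier):
    # Reads the context variable from inside the worker thread.
    return f"{request_tag.get()}:{multiplier * 21}"


async def handler():
    request_tag.set("req-1")
    return await run_in_executor_with_context(blocking_work, 2)


outcome = asyncio.run(handler())
print(outcome)
```

Each call takes a fresh copy of the context, which matters because a single Context object cannot be entered by two threads at the same time.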

Decision criteria

Use this pattern whenever:

  • A FastAPI route calls background_tasks.add_task() with any function that creates spans.
  • A coroutine spawns work via asyncio.create_task() after middleware has modified the context (common with per-request auth decorators).
  • CPU-bound or blocking logic is offloaded with loop.run_in_executor() or submitted directly to an executor (asyncio.to_thread() already copies the context).
  • A WebSocket or Server-Sent Event handler maintains a long-lived connection whose spans should nest under the initial handshake trace.

You do not need manual attach/detach when awaiting other coroutines directly within the same function — Python’s contextvars machinery handles those transitions automatically.

Common pitfalls

  • Capturing context too late. If you call context.get_current() inside the background function rather than in the route handler, the context is already detached. Always snapshot before the add_task() or create_task() call.
  • Forgetting context.detach(token) in the finally block. Each attach() returns a token that remembers the previously active context; without a matching detach(), the captured context stays active for the rest of that execution unit. In long-lived coroutines (WebSockets, SSE) later spans then nest under the wrong parent and stale Context objects are kept alive.
  • Middleware ordering. Starlette, and therefore FastAPI, runs middleware in reverse registration order: the last one added is the outermost. Where the OTel middleware ends up relative to a custom authentication middleware decides whether the auth code runs inside the request span. If auth calls attach() itself and then spawns tasks, those tasks still need the explicit capture pattern above.
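The missing-detach pitfall is easier to see with the token mechanics spelled out. This is a standard-library sketch under the assumption that the default OTel runtime context behaves like ContextVar.set() and ContextVar.reset(). The attach and detach functions here are local stand-ins, not the opentelemetry.context functions, and the string values are placeholders.

```python
import contextvars

# Hypothetical stand-in for the slot that holds the active OTel context.
active = contextvars.ContextVar("active", default="root")


def attach(value):
    """Install a value and return a token that remembers the previous one."""
    return active.set(value)


def detach(token):
    """Restore whatever was active before the matching attach()."""
    active.reset(token)


def with_detach():
    token = attach("captured")
    try:
        return active.get()
    finally:
        detach(token)  # always restores the previous value


def without_detach():
    attach("captured")  # token discarded, nothing restores the old value
    return active.get()


before = active.get()
inside = with_detach()
after_clean = active.get()

without_detach()
after_leak = active.get()

print(before, inside, after_clean, after_leak)
```

After with_detach() the previous value is back, while after without_detach() the captured value is still active, which is the state that causes later spans to pick up the wrong parent.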

Verification

After applying the fix, confirm correctness without deploying:

# tests/test_trace_propagation.py
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from httpx import AsyncClient, ASGITransport
import pytest, asyncio

@pytest.mark.asyncio
async def test_background_task_trace_continuity():
    exporter = InMemorySpanExporter()
    provider = TracerProvider()
    provider.add_span_processor(SimpleSpanProcessor(exporter))

    # Re-instrument with test provider
    from opentelemetry import trace
    trace.set_tracer_provider(provider)

    from fastapi_context_fix import app          # import AFTER setting provider
    async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as client:
        await client.get("/api/process")

    await asyncio.sleep(0.1)                     # allow background task to complete
    spans = exporter.get_finished_spans()
    trace_ids = {s.context.trace_id for s in spans}

    assert len(trace_ids) == 1, f"Expected 1 trace_id, got {len(trace_ids)}: {trace_ids}"
    names = {s.name for s in spans}
    assert "db_sync_background" in names
    assert "cpu_work" in names

You can also query Jaeger or Grafana Tempo directly: filter by parent_span_id = 0 (or null) across the last 5 minutes. After the fix, db_sync_background and cpu_work must not appear in that filter — only the inbound HTTP span should be a root.
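The same "no parent" check can be run against exported spans in a test. The helper below is a sketch that assumes each span exposes name and parent attributes, as the SDK's finished spans do. The function name root_span_names is hypothetical, and SimpleNamespace objects stand in for real spans so the example runs without OpenTelemetry installed.

```python
from types import SimpleNamespace


def root_span_names(spans, expected_roots=()):
    """Return sorted names of parentless spans that are not expected roots."""
    return sorted(
        span.name
        for span in spans
        if span.parent is None and span.name not in expected_roots
    )


# Stand-in spans; in a real test these would come from the in-memory exporter.
spans = [
    SimpleNamespace(name="GET /api/process", parent=None),
    SimpleNamespace(name="process_logic", parent="span-1"),
    SimpleNamespace(name="db_sync_background", parent=None),
]

unexpected = root_span_names(spans, expected_roots={"GET /api/process"})
print(unexpected)
```

An empty result means every span other than the inbound HTTP span has a parent. Any name that shows up points at the boundary that lost the context.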

For production monitoring, configure a Prometheus alert on the ratio of unexpected root spans (parentless spans that are not inbound HTTP server spans) to total spans. If that ratio stays above roughly 0.5% of trace volume, an async boundary is probably leaking context.

