j
Is anyone here familiar with the OpenTelemetry Kotlin implementation: https://github.com/open-telemetry/opentelemetry-java/tree/main/extensions/kotlin. I'm trying to find the bit that allows the Span to propagate context across coroutines from a non-suspendable function:
runBlocking {
        val error = IllegalAccessException("Bad things are happening!")
        val message = "Isn't it a nice day"
        Span.current()?.let { span ->
            val attributes = Attributes.builder().apply {
                message?.let { put("error_message", it) }
            }
                .build()

            span.setStatus(StatusCode.ERROR)
            span.recordException(error, attributes)
        }
        
        launch { 
            // suspendFunction() is annotated with @WithSpan. This should cause the spans to get
            // linked together and then sync metadata from the above.
            suspendFunction()
        }
    }
The reason I'm asking is that I'm trying to understand some strange behavior in our system where the context doesn't always propagate from a parent to child correctly.
e
o
Yes, using the ContextElement in coroutine land is the key, like so:
withContext(span.asContextElement()) {
    // ...
}
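For the original case of starting coroutines from a non-suspending function, the same idea can be sketched as follows. This is only a minimal illustration, not code from the thread: traceIdSeenInCoroutine is a hypothetical helper, the trace and span IDs are arbitrary fixed values, and the parent is a non-recording span created with Span.wrap so that no SDK needs to be configured. It assumes the opentelemetry-extension-kotlin and kotlinx-coroutines-core artifacts are on the classpath.

```kotlin
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.SpanContext
import io.opentelemetry.api.trace.TraceFlags
import io.opentelemetry.api.trace.TraceState
import io.opentelemetry.context.Context
import io.opentelemetry.extension.kotlin.asContextElement
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext

// Arbitrary but valid IDs, used only for this illustration.
const val PARENT_TRACE_ID = "0af7651916cd43dd8448eb211c80319c"
const val PARENT_SPAN_ID = "b7ad6b7169203331"

// Hypothetical helper: a non-suspending function that reports which trace ID
// a coroutine running on another thread sees as "current".
fun traceIdSeenInCoroutine(propagate: Boolean): String {
    // Non-recording span that only carries a span context.
    val parent = Span.wrap(
        SpanContext.create(
            PARENT_TRACE_ID,
            PARENT_SPAN_ID,
            TraceFlags.getSampled(),
            TraceState.getDefault()
        )
    )
    // makeCurrent() stores the span in the thread-local OpenTelemetry context
    // of the calling thread; the returned Scope restores the previous context.
    return parent.makeCurrent().use {
        // Capture the thread-local context as a coroutine context element,
        // or pass nothing to show what happens without propagation.
        val element: CoroutineContext =
            if (propagate) Context.current().asContextElement() else EmptyCoroutineContext
        runBlocking(element) {
            // Hop to a worker thread, where the caller's thread-local is not visible.
            withContext(Dispatchers.Default) {
                Span.current().spanContext.traceId
            }
        }
    }
}
```

With propagate = true the coroutine should see the parent's trace ID even after the thread hop. With propagate = false the worker thread only has its own thread-local context, which is one plausible source of a parent/child link that is sometimes missing.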
A stripped-down version of code I am working on looks like this:
import io.opentelemetry.api.GlobalOpenTelemetry
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.SpanBuilder
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer
import io.opentelemetry.extension.kotlin.asContextElement
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.withContext
import kotlin.coroutines.coroutineContext

val tracer: Tracer = GlobalOpenTelemetry.getTracer("myPackage", "0.0.0")

/**
 * Executes [block] in a tracing span with optional SpanBuilder [parameters].
 *
 * [parameters] example: `parameters = { setParent(parentContext); addLink(span1.spanContext) }`
 *
 * The span will be
 * * a child of a parent context, if set via [parameters], or
 * * a child of the current span (from the current coroutine context), or
 * * a top-level span.
 */
suspend fun <Result> withSpan(
    name: String,
    parameters: (SpanBuilder.() -> Unit)? = null,
    block: suspend (span: Span?) -> Result
): Result {
    val span: Span = tracer.spanBuilder(name).run {
        if (parameters != null)
            parameters()
        coroutineContext[CoroutineName]?.let {
            setAttribute("coroutine.name", it.name)
        }
        startSpan()
    }

    return withContext(span.asContextElement()) {
        try {
            block(span)
        } catch (throwable: Throwable) {
            span.setStatus(StatusCode.ERROR)
            span.recordException(throwable)
            throw throwable
        } finally {
            span.end()
        }
    }
}
a
@Oliver.O Curious as to if you ever wound up with something reasonable here - trying to come up with an abstraction of your own? I'm on the same path, and I came across your notes.
o
Well, did this already. Internally, we have a set of wrappers for telemetry, combined with multiplatform logging and scoped runtime configuration (sort of feature flags on steroids). Due to limited resources, nothing to publish, unfortunately. Currently also coupled with a custom variant of ksp (published as a PR), possibly to be replaced by a direct compiler plugin.
a
Thanks Oliver 🙂 Please let me know if you have any recommendations for the wrapper - did it wind up looking similar to the snippet you shared earlier? Or did you wind up trying to roll that into a compiler plugin as well?
o
It still looks pretty much the same as above, plus some additions for error handling and coroutine cancellations:
/**
 * Executes [block] in a tracing span with optional SpanBuilder [parameters].
 *
 * [parameters] example: `parameters = { setParent(parentContext); addLink(span1.spanContext) }`
 *
 * The span will be
 * - a child of a parent context, if set via [parameters], or
 * - a child of the current span (from the current coroutine context), or
 * - a top-level span.
 *
 * Guidelines:
 * - [Trace Semantic Conventions](https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/)
 * - [Attribute Naming](https://opentelemetry.io/docs/reference/specification/common/attribute-naming/)
 */
suspend fun <Result> withSpan(
    name: String,
    parameters: (SpanBuilder.() -> Unit)? = null,
    exceptionIsError: (Throwable) -> Boolean = { it !is CancellationException },
    block: suspend (span: Span?) -> Result
): Result {
    val span: Span = tracer.spanBuilder(name).run {
        if (parameters != null) {
            parameters()
        }
        coroutineContext[CoroutineName]?.let {
            setAttribute("coroutine.name", it.name)
        }
        startSpan()
    }

    return withContext(span.asContextElement()) {
        try {
            block(span).also {
                span.setStatus(StatusCode.OK)
            }
        } catch (throwable: Throwable) {
            if (exceptionIsError(throwable)) {
                span.setStatus(StatusCode.ERROR)
                span.recordException(throwable)
            } else {
                span.addEvent(
                    "Completed with exception",
                    attributes(
                        "exception.type" to throwable.javaClass.name,
                        "exception.message" to (throwable.message ?: "(none)")
                    )
                )
                span.setStatus(StatusCode.OK)
            }
            throw throwable
        } finally {
            span.end()
        }
    }
}
The compiler plugin exists to generate names for the "feature flags on steroids" stuff and is not required for the above.
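The snippet above calls an attributes(...) helper and uses CancellationException without showing either definition or import. A minimal sketch of what such a helper might look like, assuming it simply turns string pairs into an OpenTelemetry Attributes instance (the real internal helper is not shown in the thread and may differ):

```kotlin
import io.opentelemetry.api.common.Attributes

// Hypothetical reconstruction: builds Attributes from string key/value pairs.
fun attributes(vararg pairs: Pair<String, String>): Attributes {
    val builder = Attributes.builder()
    for ((key, value) in pairs) {
        // AttributesBuilder.put(String, String) stores a string-typed attribute.
        builder.put(key, value)
    }
    return builder.build()
}
```

CancellationException would presumably come from kotlinx.coroutines (import kotlinx.coroutines.CancellationException), so that normal coroutine cancellation is recorded as an event rather than as an error.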
a
Awesome! Thank you for sharing sir - very cool. What prompted you to add the StatusCode.OK yourself? I saw a caveat against that behaviour, so I generally avoid it.
From https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md
Generally, Instrumentation Libraries SHOULD NOT set the status code to Ok, unless explicitly configured to do so. Instrumentation Libraries SHOULD leave the status code as Unset unless there is an error, as described above.
o
In the above code, the result is intended to be final, so leaving it Unset does not make much sense to me. So in this case the above code is that of an 'application developer', I guess:
Application developers and Operators may set the status code to Ok.
When span status is set to Ok it SHOULD be considered final and any further attempts to change it SHOULD be ignored.
And no need for 'sir', I'm just Oliver. 😆