An Update on OpenTelemetry and WildFly
In a recent post, I worked through setting up OpenTelemetry support in your Jakarta EE application. Since that time, I’ve put quite a bit of work into integrating that support, as teased in the post, into WildFly. In this post, I’d like to provide an update on what that WildFly support currently looks like, and put out a request for feedback.
To get started experimenting with my changes, you need to do one of two things:
With the current state of changes, you get the following:
CDI injection of a Tracer instance
CDI injection of an OpenTelemetry instance, should you want to manually create a Tracer
Automatic context propagation on all incoming REST requests so long as the request adheres to the OpenTelemetry context propagation spec.
Automatic context propagation on all outgoing REST Client requests. This is done via an automatically registered ClientRequestFilter, so no additional work needs to be done in your application.
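The context propagation mentioned above rides on the W3C Trace Context `traceparent` header. As a minimal sketch of what that header carries (a hypothetical parser for illustration, not WildFly's actual implementation):

```java
// Sketch: parse a W3C Trace Context "traceparent" header, the header the
// OpenTelemetry propagation spec uses to link spans across services.
// Format: version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" flags (2 hex)
public class TraceParent {
    public final String traceId;
    public final String spanId;
    public final boolean sampled;

    private TraceParent(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    public static TraceParent parse(String header) {
        String[] parts = header.split("-");
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
            throw new IllegalArgumentException("malformed traceparent: " + header);
        }
        // Bit 0 of the flags byte is the "sampled" flag.
        boolean sampled = (Integer.parseInt(parts[3], 16) & 0x01) == 1;
        return new TraceParent(parts[1], parts[2], sampled);
    }

    public static void main(String[] args) {
        TraceParent tp = TraceParent.parse(
                "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01");
        System.out.println("trace=" + tp.traceId + " parent=" + tp.spanId
                + " sampled=" + tp.sampled);
    }
}
```

When a request arrives with such a header, the server-side span is created as a child of the remote parent; when none is present, a new trace is started.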
Along with that runtime functionality, you can configure how OpenTelemetry behaves:
service-name: The name of the service reported in the traces.
exporter: Can be either jaeger or otlp. The default is jaeger.
endpoint: The endpoint of the trace collector. The default is the Jaeger gRPC endpoint, http://localhost:14250.
span-processor: Can be either batch or simple. The default is batch.
batch-delay: The time in milliseconds to delay batch processing. This is only used if span-processor is set to batch. The default is 5000ms.
max-queue-size: The maximum size of the batch before sending. The default is 2048.
max-export-batch-size: The maximum number of samples to export at a time. The default is 512.
export-timeout: The maximum wait time while exporting traces. The default is 30000ms, or 30 seconds.
sampler: The sampler to use.
sampler-arg: The ratio to use when sampling traces.
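To illustrate what sampler-arg controls, here is a simplified stand-in for a trace-id-ratio sampler (the OpenTelemetry SDK's real sampler works on the same principle, deriving a bound from the ratio and comparing it against bits of the trace id so every service makes the same decision for a given trace; the class and method names here are hypothetical):

```java
// Sketch: a deterministic ratio-based sampler. Because the decision is a pure
// function of the trace id, all services in a trace agree on whether to sample.
public class RatioSampler {
    private final long bound; // trace ids hashing below this bound are sampled

    public RatioSampler(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0, 1]");
        }
        this.bound = (long) (ratio * Long.MAX_VALUE);
    }

    public boolean shouldSample(String traceId) {
        // Use the low 8 bytes (last 16 hex chars) of the 32-char trace id.
        long low = Long.parseUnsignedLong(traceId.substring(16), 16);
        return Math.abs(low) < bound;
    }

    public static void main(String[] args) {
        RatioSampler sampler = new RatioSampler(0.25);
        System.out.println(sampler.shouldSample("0af7651916cd43dd8448eb211c80319c"));
    }
}
```

A ratio of 1.0 samples everything, 0.0 samples nothing, and values in between keep a roughly proportional share of traces.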
From a WildFly configuration perspective, the configuration looks like this:
<subsystem xmlns="urn:wildfly:opentelemetry:1.0"
           exporter="jaeger"
           endpoint="http://localhost:14250"
           span-processor="batch"
           batch-delay="5000"
           max-queue-size="2048"
           max-export-batch-size="512"
           export-timeout="30000"/>
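For comparison, switching the exporter to otlp would presumably mean pointing the endpoint at an OTLP receiver instead; 4317 is the conventional OTLP/gRPC port, though whether the subsystem accepts exactly this combination is my assumption based on the schema shown above:

```xml
<subsystem xmlns="urn:wildfly:opentelemetry:1.0"
           exporter="otlp"
           endpoint="http://localhost:4317"
           span-processor="batch"/>
```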
As it stands now, it seems to work really well. In designing and implementing what I have so far, I’ve discussed things internally with other Red Hat engineers in the observability space, as well as with some in the CNCF Slack channel, but more input would be extremely helpful.
Are there features you’d like to see?
Are there any changes you’d like to see in the configuration?
Is there anything missing in the runtime support that you’d like to see?
Currently, the service name is the same for all applications deployed to a given WildFly instance. Is that acceptable? If not, if it’s technically possible, would a per-app service name be preferable?