An Update on OpenTelemetry and WildFly
In a recent post, I worked through setting up OpenTelemetry support in your Jakarta EE application. Since that time, I’ve put quite a bit of work into integrating that support, as teased in the post, into WildFly. In this post, I’d like to provide an update on what that WildFly support currently looks like, and put out a request for feedback.
To get started experimenting with my changes, you need to do one of two things:
With the current state of changes, you get the following:
CDI injection of a Tracer instance
CDI injection of an OpenTelemetry instance, should you want to manually create a Tracer
Automatic context propagation on all incoming REST requests so long as the request adheres to the OpenTelemetry context propagation spec.
Automatic context propagation on all outgoing REST Client requests. This is done via an automatically registered ClientRequestFilter, so no additional work needs to be done in your application.
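The context propagation mentioned above rides on the W3C Trace Context `traceparent` header. As a minimal sketch of what that header carries (a hypothetical parser for illustration, not WildFly's actual implementation):

```java
// Sketch: parse a W3C Trace Context "traceparent" header, the header the
// OpenTelemetry propagation spec uses to link spans across services.
// Format: version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" flags (2 hex)
public class TraceParent {
    public final String traceId;
    public final String spanId;
    public final boolean sampled;

    private TraceParent(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    public static TraceParent parse(String header) {
        String[] parts = header.split("-");
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
            throw new IllegalArgumentException("malformed traceparent: " + header);
        }
        // Bit 0 of the flags byte is the "sampled" flag.
        boolean sampled = (Integer.parseInt(parts[3], 16) & 0x01) == 1;
        return new TraceParent(parts[1], parts[2], sampled);
    }

    public static void main(String[] args) {
        TraceParent tp = TraceParent.parse(
                "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01");
        System.out.println("trace=" + tp.traceId + " parent=" + tp.spanId
                + " sampled=" + tp.sampled);
    }
}
```

When a request arrives with such a header, the server-side span is created as a child of the remote parent; when none is present, a new trace is started.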
Along with that runtime functionality, you can configure how OpenTelemetry behaves:
service-name: The name of the service reported in the traces.
exporter: Can be either jaeger or otlp. The default is jaeger.
endpoint: The endpoint of the trace collector. The default is the Jaeger gRPC endpoint, http://localhost:14250.
span-processor: Can be either batch or simple. The default is batch.
batch-delay: The time in milliseconds to delay batch processing. This is only used if span-processor is set to batch. The default is 5000ms.
max-queue-size: The maximum size of the batch before sending. The default is 2048.
max-export-batch-size: The maximum number of samples to export at a time. The default is 512.
export-timeout: The maximum wait time while exporting traces. The default is 30000ms, or 30 seconds.
sampler: The sampler to use.
sampler-arg: The ratio to use when sampling traces.
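To illustrate what sampler-arg controls, here is a simplified stand-in for a trace-id-ratio sampler (the OpenTelemetry SDK's real sampler works on the same principle, deriving a bound from the ratio and comparing it against bits of the trace id so every service makes the same decision for a given trace; the class and method names here are hypothetical):

```java
// Sketch: a deterministic ratio-based sampler. Because the decision is a pure
// function of the trace id, all services in a trace agree on whether to sample.
public class RatioSampler {
    private final long bound; // trace ids hashing below this bound are sampled

    public RatioSampler(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0, 1]");
        }
        this.bound = (long) (ratio * Long.MAX_VALUE);
    }

    public boolean shouldSample(String traceId) {
        // Use the low 8 bytes (last 16 hex chars) of the 32-char trace id.
        long low = Long.parseUnsignedLong(traceId.substring(16), 16);
        return Math.abs(low) < bound;
    }

    public static void main(String[] args) {
        RatioSampler sampler = new RatioSampler(0.25);
        System.out.println(sampler.shouldSample("0af7651916cd43dd8448eb211c80319c"));
    }
}
```

A ratio of 1.0 samples everything, 0.0 samples nothing, and values in between keep a roughly proportional share of traces.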
From a WildFly configuration perspective, the configuration looks like this:
<subsystem xmlns="urn:wildfly:opentelemetry:1.0"
           exporter="jaeger"
           endpoint="http://localhost:14250"
           span-processor="batch"
           batch-delay="5000"
           max-queue-size="2048"
           max-export-batch-size="512"
           export-timeout="30000"/>
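For comparison, switching the exporter to otlp would presumably mean pointing the endpoint at an OTLP receiver instead; 4317 is the conventional OTLP/gRPC port, though whether the subsystem accepts exactly this combination is my assumption based on the schema shown above:

```xml
<subsystem xmlns="urn:wildfly:opentelemetry:1.0"
           exporter="otlp"
           endpoint="http://localhost:4317"
           span-processor="batch"/>
```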
As it stands now, it seems to work really well. In designing and implementing what I have so far, I’ve discussed things internally with other Red Hat engineers in the observability space, as well as with some in the CNCF Slack channel, but more input would be extremely helpful.
Are there features you’d like to see?
Are there any changes you’d like to see in the configuration?
Is there anything missing in the runtime support that you’d like to see?
Currently, the service name is the same for all applications deployed to a given WildFly instance. Is that acceptable? If not, if it’s technically possible, would a per-app service name be preferable?