Jezen Thomas

Jezen Thomas

I'm a programmer and entrepreneur. Occasionally I write about technology, and speak at conferences. Find me on Instagram, or on GitHub, or write me an email.

Tracing User Requests in Yesod

Recently I decided I want to invest in my Yesod application’s logging system. Yesod provides sane defaults for logging using the Apache style and this is enough for most applications, however in my case I had a few extra demands. Namely, the logs should:

  • Contain an authenticated user’s ID so we have full visibility into every user’s behaviour
  • Generate a unique ID for each request, so requests can be discussed internally
  • Show the request’s total duration, which will help us find performance bottlenecks
  • Be formatted in JSON for ease of parsing with other tools

Most of these needs could be addressed with functionality the relevant libraries included in a typical scaffolded Yesod site already provide. One of my needs necessitated a little more work. More on that in a moment.

First, let’s take a look at how logging is typically set up in a scaffolded Yesod site. For our purposes, all we need to care about is what we define as our outputFormat. The value we use here should be different depending on whether you’re running the application in development or production — unless of course you want to see JSON logs in development.

Yesod applications are set to use detailed logging in development by default. This format is easier to visually parse than either Apache or JSON logs, but would be far too noisy in production. The True in Detailed True means we want the logs to use colours. A more descriptive type here might have been useful, but that’s by the by. Here’s how those detailed logs look (here without colour):

GET /auth/page/simple/login
  Accept: text/html,application/xhtml+xml,application/xml;q…
  Status: 200 OK 0.007939s
GET /favicon.ico
  Accept: image/webp,image/apng,image/*,*/*;q=0.8
  Status: 200 OK 0.001512s

Ok, so that’s just fine, but we’re more interested now in the production logs. We don’t want the Apache format, so that’s what we’ll change. The available data constructors can be found in the wai-extra package. You’ll see Apache IPAddrSource and Detailed Bool, both of which we’re familiar with from the scaffolding. There are a few others however with “custom output format” in their names which are a little more useful.

The penultimate data constructor has an associated JSON formatter called formatAsJSON. As it turns out, this combination of data constructor and formatting function (and also a custom middleware, but that’s internal) includes the request duration. So that’s two of my requirements taken care of! I believe logging the request duration is based on this prior art by Maximilian Tagher — the implementation looks identical.

So if we have the means to provide a custom formatter, why don’t we just add the ID of an authenticated user there? Well, here’s the thing. Request logging happens at the WAI level, which is the outer layer of the system. To read the authenticated user’s ID we need to check against the authentication system, but authentication happens at the Yesod level, which is an inner layer of the system. So first, we need some way to pass arbitrary data from the inner Yesod layer to the outer WAI layer. I think request and response headers are a suitable mechanism for transferring this kind of data, and we can use the addHeader function to easily add any response header we want. We also want this response header to be applied to every request, so we should use a middleware to add this functionality to all handlers.

Adding Yesod middleware is trivial:

In this middleware — which is just a polymorphic handler that composes with any other handler — we use maybeAuth to get a value of Maybe (Entity User). We have a function toKey which takes a user’s ID from an Entity User, and since our user entity is wrapped in a Maybe, and then again wrapped in a monadic action that retrieves that value from the database, we need to fmap twice over maybeAuth to get our Maybe UserUUID in one pass. The custom operator <<$>> gives us this double fmap, and is defined as such:

If the user is not authenticated, the value of mUserId will be Nothing, in which case we do nothing. If the user is authenticated, we add their ID it to a response header with a key of User-ID. The traverse_ function in this context gives us this behaviour of “if the value is a Just then do something with it, otherwise do nothing”.

We also need to apply our middleware, which is just a case of tacking it onto whatever other middleware chain we already have.

That’s great, but there’s a problem: How do we get these response headers in our request log formatter? I went through the source, and unfortunately none of the formatting middlewares actually provided the response headers to the log formatter. I sent a pull request which adds another OutputFormat data constructor, along with its associated custom middleware and log formatter to the wai-extra package, and it was merged earlier today. So as of version 3.0.27, you can use this data constructor:

…together with the new formatAsJSONWithHeaders function. So our makeLogWare function can now be changed to this:

This will now render all request logs in production as JSON, along with the request duration and any response headers. It will also redact any request header with a key of Cookie, or any response header with a key of Set-Cookie.

Almost there. Now we just want to generated a unique ID for each request as it’s coming in, and stuff that ID in a request header. We can do this easily enough at the outer WAI layer.

The first line uses the uuid package to generate a UUIDv4, and then converts it to a human-readable bytestring. We then construct a new value for the request headers — here marked as hs — which is just the new header prepended to the list of existing request headers. We then continue with the rest of the request lifecycle, sending along a modified request with the new headers.

As before, don’t forget to add this new middleware to the appropriate chain. This would be our WAI middleware chain, not the Yesod middleware chain we used earlier.

Go forth, and trace users.