HAProxy Charset


A common problem we encounter is characters like ñ not showing up correctly. This actually caused some issues in the recent Philippine elections, but this post isn’t about hash codes or anything like that.

By default, we use UTF-8 for text storage and rendering. The problem is that browsers don’t necessarily assume UTF-8 as the default, so you need either a <meta charset="utf-8" /> in the HTML or Content-Type: text/html; charset=utf-8 in the response headers. A few of our services don’t set the Content-Type with the charset=utf-8 part, so you’d get piÃ±ata instead of piñata.
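
To see where the garbled text comes from, here is a minimal Python sketch. It assumes the browser falls back to Windows-1252, which is a common default in Western locales but not a universal one; the variable names are only for illustration.

```python
# Illustration only: simulate a browser decoding UTF-8 bytes with the wrong charset.
text = "piñata"

# In UTF-8, ñ is stored as two bytes: 0xC3 0xB1.
utf8_bytes = text.encode("utf-8")

# Assumed fallback: Windows-1252 treats each of those bytes as a separate character.
garbled = utf8_bytes.decode("cp1252")

print(garbled)  # piÃ±ata
```

The bytes on the wire are fine in both cases. Only the charset used to interpret them differs, which is why declaring charset=utf-8 is enough to fix the display.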

Being lazy, we usually just correct this at the reverse proxy side. It’s trivial to do in nginx: you just need to add charset utf-8; to your configuration and you’re good. For HAProxy, though, I couldn’t readily find a solution and had to go through the docs to see what I could do.

After a bit of experimenting, I had success with this:

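# append "; charset=utf-8" to the response Content-Type unless a charset is already set
# shdr_sub inspects response headers; the hdr_* ACL keywords only look at request headers
acl has_charset shdr_sub(content-type) -i charset=
rspirep ^(Content-Type:.*) \1;\ charset=utf-8 unless has_charset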
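
The logic of the rule above can be sketched in a few lines of Python. This is not how HAProxy implements it; the function name add_default_charset is hypothetical and only mirrors the "append unless a charset is already present" behavior.

```python
import re


def add_default_charset(content_type, charset="utf-8"):
    """Append a charset parameter unless the header value already has one."""
    # Case-insensitive substring check, like the -i flag on the ACL.
    if re.search(r"charset=", content_type, flags=re.IGNORECASE):
        return content_type
    return f"{content_type}; charset={charset}"


print(add_default_charset("text/html"))
print(add_default_charset("text/html; Charset=ISO-8859-1"))
```

Like the proxy rule, this sketch does not look at the media type, so it would also tag non-text responses such as images. That is one more reason to eventually set the header correctly in the services themselves.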

This is probably not the best way to do it. Arguably, we should just fix our services to have the correct Content-Type in the first place, but I can do that some other time.