Saturday, December 10, 2011

Reverse proxy node.js websockets with HAProxy

The more I toy with it, the more I love node.js and websockets. Pushing notifications asynchronously has never been this easy. In the little project I use for this post, I have an Apache2-hosted app serving my pages, and I built a little notifier with node.js to broadcast various events to clients with socket.io.

Now the small issue here is that I want websocket traffic to go to node, while all my other HTTP requests should be handled by Apache. A first solution is to have node listen on another port, with client code looking like this:
<script src="http://www.test.tld:9000/socket.io/socket.io.js"></script>
<script>
    var socket = io.connect('http://www.test.tld:9000');
    ...
</script>

I have two obvious problems here. For one, some network setups only allow ports 80 and 443 to go through. And then it's just painfully ugly! A more graceful way would be to differentiate the requests based on subdomains rather than ports, and for that I need a reverse proxy.
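On the client side, the subdomain approach could look something like the sketch below. The `socketUrl` helper and the convention of putting the notifier on an `io.` subdomain of the page's domain are my assumptions for illustration; hard-coding `http://io.test.tld` would work just as well.

```javascript
// Hypothetical helper: derive the socket.io endpoint from the page's hostname.
// Assumes the notifier lives on an "io." subdomain of the same domain.
function socketUrl(hostname) {
    // strip a leading "www." so www.test.tld and test.tld both map to io.test.tld
    var domain = hostname.replace(/^www\./i, '');
    return 'http://io.' + domain;
}

// In the browser, this would replace the hard-coded port:
//     var socket = io.connect(socketUrl(window.location.hostname));
console.log(socketUrl('www.test.tld')); // http://io.test.tld
```

No port appears anywhere in the resulting URL, so the connection goes out on plain port 80.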

Apache2 already has a reverse proxy module, but it doesn't handle websockets (and doing this with Apache2 would somewhat defeat the point of using node.js). I played a little with nginx, which is very light and fast, but having to patch the source code to use tcp_proxy to handle websockets made me uncomfortable about future updates. I finally chose HAProxy, which handles websockets out of the box.

For the reverse proxy to work, we first need to change the ports node and Apache listen on. So I changed the Apache conf to have it listen locally on port 9010, and kept node on port 9000. HAProxy will now handle the initial requests on port 80 and dispatch them to node and Apache. I want the requests sent to the domain "io.test.tld" to be forwarded to node, and the rest to be forwarded to Apache. Here's a sample HAProxy configuration that does just that:
global
    daemon
    maxconn 4096
    user haproxy
    group haproxy

defaults
    log global

#this frontend interface receives the incoming http requests
frontend http-in
    mode http
    #process all requests made on port 80
    bind *:80
    #set a large timeout for websockets (24 hours, in milliseconds)
    timeout client 86400000
    #default behavior sends the requests to apache
    default_backend www_backend
    #it all happens here: a simple check on the Host header
    #when it ends with "io.test.tld" (case-insensitive), an acl
    #I arbitrarily call "websocket" triggers
    acl websocket hdr_end(host) -i io.test.tld
    #route to my node backend if the websocket acl triggered
    use_backend node_backend if websocket

#apache backend, transfer to port 9010
backend www_backend
    mode http
    timeout server 86400000
    timeout connect 5000
    server www_test localhost:9010

#node backend, transfer to port 9000
backend node_backend
    mode http
    timeout server 86400000
    timeout connect 5000
    server io_test localhost:9000

It's fairly straightforward and it just works. Of course the best way to handle this case is to have two different IP addresses, but as far as routing by subdomain is concerned, I'm very satisfied with this solution so far.
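To make the routing rule concrete, here is a rough JavaScript model of what the frontend decides for a given Host header. It is only an illustration, not HAProxy code: `pickBackend` is a made-up name, and the backend names simply mirror the configuration above.

```javascript
// Rough JavaScript model of the frontend's routing rule (not HAProxy code).
// Backend names mirror the configuration above.
function pickBackend(hostHeader) {
    var host = String(hostHeader || '').toLowerCase(); // -i: case-insensitive
    var suffix = 'io.test.tld';
    // hdr_end(host): true when the Host header ends with the suffix
    var websocket = host.length >= suffix.length &&
        host.lastIndexOf(suffix) === host.length - suffix.length;
    return websocket ? 'node_backend' : 'www_backend';
}

console.log(pickBackend('io.test.tld'));    // node_backend
console.log(pickBackend('www.test.tld'));   // www_backend
console.log(pickBackend('radio.test.tld')); // node_backend (suffix match)
```

The last case shows a side effect of matching on the end of the string: any host that merely ends in "io.test.tld" also lands on node. If that matters for your subdomains, an exact comparison with `hdr(host)` instead of `hdr_end(host)` should avoid it.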