The tool got some great press last week, with SwissMiss tweeting about us and Lifehacker picking it up. Accordingly, things broke. This was rather unfortunate, as I'd sort of planned for traffic spikes and this sort of thing shouldn't have happened. Two key things went wrong:

  • The pecl_http extension segfaulted on certain requests. supervisord promptly restarted the worker, which then picked up a similar job from the queue and segfaulted again. This happened enough times that supervisord gave up on the worker and left it shut down. Gearman detected that the job was never completed and re-queued it, ready to crash another worker when it came up. The issue was ultimately caused by incomplete HTTP response headers lacking the reason phrase. A number of systems seem to omit that message, and its absence crashed the worker.

    Ilia was able to patch pecl_http, and we've updated to the more recent release to obtain that fix.

  • Gearman re-submitted crashed jobs forever. This ensured that even a very small number of crash-inducing requests would eventually kill every worker. This is silly for a few reasons. First, it allowed this to happen. Second, these requests are time-sensitive: there's no point in trying something thirty times, as by then the web client has given up on the request and the data will never be used.

    We've re-configured the Gearman system to only retry a job once. This should allow random issues to be retried, but prevent pervasive problems from crashing the system. If you're running Gearman, I'd strongly suggest setting a maximum number of retries using --job-retries=N.
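
To see why an unbounded retry policy is so dangerous, here is a minimal Python sketch of the failure mode. It is a toy simulation, not Gearman's actual scheduling: the run_queue function, the (name, is_poison) job tuples, and the assumptions that a crashed worker stays down and that a failed job goes back to the front of the queue are all hypothetical simplifications.

```python
from collections import deque


def run_queue(jobs, workers, max_retries=None):
    """Simulate a queue where a crashing job is re-queued.

    jobs: list of (name, is_poison) tuples; a poison job crashes its worker.
    workers: number of workers; a crashed worker is assumed to stay down.
    max_retries: retries allowed after the first attempt; None means unlimited.
    Returns (completed, dropped, workers_left).
    """
    queue = deque((name, poison, 0) for name, poison in jobs)
    completed, dropped = [], []
    while queue and workers > 0:
        name, poison, attempts = queue.popleft()
        if not poison:
            completed.append(name)
            continue
        workers -= 1  # the worker crashed while handling this job
        if max_retries is None or attempts < max_retries:
            # Assumption: the failed job is the next one picked up.
            queue.appendleft((name, poison, attempts + 1))
        else:
            dropped.append(name)  # retry budget spent, give up on the job
    return completed, dropped, workers


jobs = [("a", False), ("bad", True), ("b", False), ("c", False)]
print(run_queue(jobs, workers=3))                 # (['a'], [], 0)
print(run_queue(jobs, workers=3, max_retries=1))  # (['a', 'b', 'c'], ['bad'], 1)
```

With unlimited retries, one bad job takes out all three workers and the healthy jobs behind it never run. With a single retry, the bad job costs two workers and is then dropped, leaving one worker to finish the rest.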

We've come through the outage and fixed both elements of the problem (though fixing either of them alone would have been enough to prevent an identical issue from causing an outage). We're also looking at better ways of monitoring this so that we're informed of problems sooner.

Our apologies for the outage.


There's a popular turn of phrase, "pave the cow paths", which was introduced to me by my friend Chris Shiflett in one of his talks. The essence (as I understand it) of paving the cow paths is that it's easier to encourage users to act the way they already want to than to have them change their behaviour.

Twitter has some great examples of paving the cow paths. Look at @replies: not a feature they built into the service, just something that developed through use. Later, developers included features within the product to better support this. Hashtags followed the same route; possibly co-tags will come later.

In each of these cases the developers had the opportunity to observe how their users behaved (creating their own paths), then worked to encourage and support their behaviour (paving them).

I think there's a lot to learn here for people working on a new project or product. You can't hope to guess all the ways users will want to interact with your products. Those unexpected use cases may turn out (in the long run) to be a major part of your application. You can, however, release your core feature with some flexibility in mind, then watch.

I've written before about the hard choice between getting something right and getting something up, and I argued then for just getting it up. I think this is another great reason to follow that route. In between releasing the product and determining where the paths lie, you'll have some time to round some corners and fix some bugs. Don't worry.

Finding the Paths.
Twitter had it easy: watch the stream and see what users are doing. Your project is different, and your paths will be too. Here are some ideas:

  • Log routes, not just hits.
  • Your access log will contain a record of all the pages your users visit. Turn this raw data into information by tracking individual users as they navigate your site. Your webserver should allow you to modify the logging format to include enough identifying information to track users (a session ID, perhaps). While IPs are not unique, they may be close enough for this purpose.

    Things to look out for
    • Common paths
    • Wasted steps (do users always go home -> friends -> news? Then include News as a top level link, or drop them there in the first place)
  • Give them somewhere to chat
  • If they'll use it, user forums can be a great resource. Some of your more passionate users will start arguing for features with each other, saving you time and presenting ideas.

  • Leave it open
  • Many sites still try to lock you down when you register: a service devoid of APIs, with restrictive comment fields and closed data. If your audience is at all techy, they'll try to pave the paths themselves while they're using the service, whether it's with Greasemonkey scripts, bookmarklets, or full API implementations. These tools are invaluable maps to where the paths are being laid. If you've provided users with a place to chat (forums), you've provided a natural place for the developers of these tools to congregate and share. Help them! Then make them redundant by improving your app. Keeping them in the loop throughout the feature development process ("Hey! We love what you've done here, so much in fact that we'd like our app to do it for everyone. Here's a preview, what do you think?") is a great way to solicit early feedback and avoid developer backlash.

  • Divide and Conquer
  • In Malcolm Gladwell's article The Ketchup Conundrum (also appearing in What the Dog Saw) we learned that there is no one perfect spaghetti sauce. Different people want different things: chunky vs smooth, thick vs thin, etc. Your service may be no different. Again, consider Twitter. There are users who tweet each inane portion of their lives to a few close friends, and others who tweet carefully and selectively. Since different people will want to use it differently, stop trying to find one trend within the whole. Instead, look for segments in your user base and discover what their needs are. Where needs prove to be mutually exclusive, provide the ability to customize the experience.
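
As a rough illustration of the "log routes, not just hits" idea above, the Python sketch below groups page views by session and counts the most common three-page sequences. The (session_id, path) input shape and the common_routes name are assumptions made for the example; you'd need to parse your own log format into that shape first.

```python
from collections import Counter, defaultdict


def common_routes(entries, length=3):
    """Count runs of `length` consecutive pages within each session.

    entries: iterable of (session_id, path) tuples in chronological order.
    Returns a Counter keyed by tuples of paths.
    """
    pages = defaultdict(list)
    for session_id, path in entries:
        pages[session_id].append(path)

    routes = Counter()
    for visited in pages.values():
        # Slide a window of `length` pages over this session's history.
        for i in range(len(visited) - length + 1):
            routes[tuple(visited[i:i + length])] += 1
    return routes


entries = [
    ("s1", "/home"), ("s2", "/home"), ("s1", "/friends"),
    ("s1", "/news"), ("s2", "/friends"), ("s2", "/news"),
    ("s3", "/home"), ("s3", "/profile"),
]
print(common_routes(entries).most_common(1))
# [(('/home', '/friends', '/news'), 2)]
```

A route like home -> friends -> news showing up at the top of that list is exactly the kind of wasted step worth shortening.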


Hi, I’m Paul Reinheimer, a developer working on the web.

I co-founded WonderProxy, which provides access to over 200 proxies around the world to enable testing of GeoIP-sensitive applications. We've since expanded to offer more granular tooling through Where's it Up.

My hobbies are cycling, photography, travel, and engaging Allison Moore in intelligent discourse. I frequently write about PHP and other related technologies.
