If in Doubt, Stick a Graph on It

The London Cycle Hire scheme launched on Friday, and whilst I’ve not actually had a chance to try it (my key appears to be duff and is being replaced), I’ve been poking around with the live availability data that drives the online map.

I set up a cronjob to fetch the HTML of the map page once a minute, and built a little Rails 3 app which imports this data and presents it using Protovis (which is fantastic, if a little tricky to get your head around at first).

Here’s the graph for the last 24 hours at Leonard Circus in Shoreditch:

[Figure: Graph of available bikes at Leonard Circus in Shoreditch]

As you can see, there are lots of little spikes, often jumping ±3 bikes from minute to minute. Taken at face value, that means 15 or more people are hiring and returning bikes in the space of 5 minutes. Even at 5:30am. That simply doesn’t tally with what I’ve observed in the wild: only 9000 people have keys so far, whilst everyone else just mills around looking interested.
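To put a number on that, summing the absolute minute-to-minute changes gives the minimum number of hires and returns needed to explain a run of readings. A rough sketch in Ruby; the readings are made up to mimic the ±3 pattern, and implied_movements is a hypothetical helper, not part of the app:

```ruby
# Minimum number of hires plus returns needed to explain a series of
# per-minute "bikes available" readings, if every change were real.
def implied_movements(counts)
  counts.each_cons(2).sum { |before, after| (after - before).abs }
end

# Made-up readings: six snapshots, so five one-minute intervals.
readings = [10, 13, 10, 13, 10, 13]
puts implied_movements(readings)  # prints 15
```

It is a lower bound because a hire and a return within the same minute cancel out and never show up in the readings.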

These spikes are consistent across all 343 stations I’ve got data for. For example, here’s Drury Lane in Covent Garden:

[Figure: Graph of available bikes at Drury Lane, Covent Garden]

I suspect something is wrong with either the availability data that TfL are getting from the docking stations, or with the map on the web. Filtering out the spikes should be possible, but it’s tricky, and not something I can be bothered with. For now, I’ll put the project on the back burner, but keep capturing the data and revisit it if it looks like the quality improves or an official API appears.
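For what it’s worth, one common way to suppress isolated one-minute spikes is a rolling median. A minimal sketch in Ruby, assuming the readings for a station are already in an array in time order; median_filter, the window size and the sample readings are all hypothetical, and none of this is code from the app:

```ruby
# Replace each reading with the median of itself and its neighbours.
# Near the ends the window is truncated; for an even-sized window the
# lower of the two middle values is used.
def median_filter(counts, window = 3)
  raise ArgumentError, "window must be odd and positive" unless window.positive? && window.odd?

  half = window / 2
  counts.each_index.map do |i|
    lo = [i - half, 0].max
    hi = [i + half, counts.length - 1].min
    neighbours = counts[lo..hi].sort
    neighbours[(neighbours.length - 1) / 2]
  end
end

# Made-up readings with two one-minute spikes (+3 and -3).
p median_filter([10, 10, 13, 10, 10, 7, 10, 10])  # prints [10, 10, 10, 10, 10, 10, 10, 10]
```

A window of 3 only removes spikes that last a single reading. Wider windows remove longer ones, but they also delay and blur genuine changes, which is part of why it’s tricky.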

Update: It turns out this was fairly easy to fix. If you enable cookies, everything works as expected. So, in my case, the cronjob now reads and writes a cookie file on each fetch, and looks like:

curl "https://web.barclayscyclehire.tfl.gov.uk/maps" --cookie ~/cyclehire/cookies.txt --cookie-jar ~/cyclehire/cookies.txt -A "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-gb) AppleWebKit/533.16 (KHTML, like Gecko) Version/5.0 Safari/533.16" -o ~/cyclehire/$(date +%s).html -s
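Because each snapshot is saved under its Unix timestamp (date +%s), an importer can recover the capture time from the filename alone. A minimal sketch in Ruby, assuming filenames follow the pattern in the command above; snapshot_time is a hypothetical helper, not code from the app:

```ruby
# Recover the capture time of a snapshot from its filename,
# e.g. ".../cyclehire/1280448000.html".
def snapshot_time(path)
  name = File.basename(path, ".html")
  raise ArgumentError, "unexpected snapshot name: #{path}" unless name.match?(/\A\d+\z/)

  Time.at(name.to_i).utc
end

puts snapshot_time("cyclehire/1280448000.html")  # prints 2010-07-30 00:00:00 UTC
```

Rejecting non-numeric names means cookies.txt, or any other stray file in the same directory, fails loudly instead of being imported as a snapshot from 1970.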