Internet Research #1

1. I've been enjoying noodling around with some electronics recently. One of the things I'm working on is converting a pair of beautiful old Bang and Olufsen CX50 speakers into an open source Sonos-esque multiroom audio system. I'm following HifiBerry's guide for the hardware part, and it calls for two 3D-printed parts.

So I go on 3D Hubs, upload the two .stl files provided, get a quote, enter my credit card details, and now it's printing somewhere in Farringdon. You still need to know whether you want PLA or ABS or whatever, but I was very impressed at how easy the whole thing was. Just another mundane e-commerce transaction. We're out the other side of the hype cycle.

2. Simon Willison linked to libpostal on Twitter. On the surface, it's a library for turning unformatted addresses into structured data. Useful, but nothing out of the ordinary. But on a closer look:

libpostal's international address parser uses machine learning (Conditional Random Fields) and is trained on over 1 billion addresses in every inhabited country on Earth.

Increasingly, just chucking loads and loads of data at a problem becomes an option. Why bother writing a parser to handle every condition and edge case if you can just throw all the known address data in the world at it and let the computer work it out? The writeups on how it works are interesting and thorough.

Statistical NLP on OpenStreetMap: Part 1, Part 2
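
To make the "every condition and edge case" problem concrete, here's a rough sketch of the hand-written approach. This is not how libpostal works; the pattern, the function name parse_by_rule and the sample addresses are all made up for illustration. One rule copes with a UK-style address and falls over as soon as the house number comes after the street name.

```python
import re

# Hypothetical hand-written rule: "<number> <road>, <city>, <postcode>".
# Illustrative only; libpostal uses a trained statistical model instead.
PATTERN = re.compile(
    r"^(?P<house_number>\d+)\s+(?P<road>[^,]+),"
    r"\s*(?P<city>[^,]+),\s*(?P<postcode>[A-Z0-9 ]+)$"
)


def parse_by_rule(address):
    """Return a dict of labelled parts, or None if the rule does not match."""
    match = PATTERN.match(address.strip())
    return match.groupdict() if match else None


# Fits the rule: the number comes first and there are three comma-separated parts.
uk_style = parse_by_rule("10 Downing Street, London, SW1A 2AA")

# Breaks the rule: street name first, postcode before the city, only two parts.
german_style = parse_by_rule("Hauptstraße 5, 10115 Berlin")

print(uk_style)
print(german_style)
```

Every new country's format means another rule, and then another for the exceptions to that rule, which is the treadmill a model trained on real addresses is meant to get you off.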

libpostal screenshot

3. Mapzen funded libpostal, and loads of other brilliant geodata projects. Sadly they're shutting down, but because it's all open source the tens of millions of pounds of value won't be lost. Their migration guide walks through the alternatives to all their hosted services.

Aaron's post about Who's On First, the gazetteer of places, is a brilliant example of how to ensure an important project lasts beyond the lifespan of a single organisation or corporate strategy. Lots of loosely joined pieces, many of them easy to self-host or community-host. If you want your project to survive the organisation that hosts it, you have to design that in from day one.

In many ways everything about the way Who’s On First has been designed has been done with this day in mind. We all endeavour to achieve the sort of “escape velocity” that immunizes us from circumstance but that is rare indeed and there was always a chance this day would come. So while “success” was the goal in many ways preventing what I call “the reset to zero” has always been of equal importance.

Who's On First, Chapter 2


4. I think I spotted this on FaveJet, probably via Russell.

David Rudnick's beautiful drawings of MiniDiscs. Drawings. In Photoshop.