Well, that took a little longer than planned…
[I won't go into the details, but suffice to say our internal deadline got squeezed between the combination of a fast-growing website, the usual issues of large datasets, and that tricky business of finding and managing coders who can program in Ruby, get data, and be really good at scraping tricky websites.]
But I’m pleased to say we’ve now well on our way to not just resurrecting PlanningAlerts in a sustainable, scalable way but a whole lot more too.
Where we’re heading: a open database of UK planning applications
First, let’s talk about the end goal. From the beginning, while we wanted to get PlanningAlerts working again – the simplicity of being able to put in your postcode and email address and get alerts about nearby planning applications is both useful and compelling – we also knew that if the service was going to be sustainable, and serve the needs of the wider community we’d need to do a whole lot more.
Particularly with the significant changes in the planning laws and regulations that are being brought in over the next few years, it’s important that everybody – individuals, community groups, NGOs, other websites, even councils – have good and open access to not just the planning applications in their area, but in the surrounding areas too.
In short, we wanted to create the UK’s first open database of planning applications, free for reuse by all.
That meant not just finding when there was a planning application, and where (though that’s really useful), but also capturing all the other data too, and also keep that information updated as the planning application went through the various stages (the original PlanningAlerts just scraped the information once, when it was found on the website, and even then pretty much just got the address and the description).
Of course, were local authorities to publish the information as open data, for example through an API, this would be easy. As it is, with a couple of exceptions, it means an awful lot of scraping, and some pretty clever scraping too, not to mention upgrading the servers and making OpenlyLocal more scalable.
Where we’ve got to
Still, we’ve pretty much overcome these issues and now have hundreds of scrapers working, pulling the information into OpenlyLocal from well over a hundred councils, and now have well over half a million planning applications in there.
There are still some things to be sorted out – some of the council websites seem to shut down for a few hours overnight, meaning they appear to be broken when we visit them, others change URLs without redirecting to the new ones, and still others are just, well, flaky. But we’ve now got to a stage where we can start opening up the data we have, for people to play around with, find issues with, and start to use.
For a start, each planning application has its own permanent URL, and the information is also available as JSON or XML:
There’s also a page for each council, showing the latest planning applications, and the information here is available via the API too:
There’s also a GeoRSS feed for each council too allowing you to keep up to date with the latest planning applications for your council. It also means you can easily create maps or widgets for the council, showing the latest applications of the council.
Finally, Andrew Speakman, who’d coincidentally been doing some great stuff in this area, has joined the team as Planning editor, to help coordinate efforts and liaise with the community (more on this below).
The next main task is to reinstate the original PlanningAlert functionality. That’s our focus now, and we’re about halfway there (and aiming to get the first alerts going out in the next 2-3 weeks).
We’ve also got several more councils and planning application systems to add, and this should bring the number of councils we’ve got on the system to between 150 and 200. This will be an ongoing process, over the next couple of months. There’ll also be some much-overdue design work on OpenlyLocal so that the increased amount of information on there is presented to the user in a more intuitive way – please feel free to contact us if you’re a UX person/designer and want to help out.
We also need to improve the database backend. We’ve been using MySQL exclusively since the start, but MySQL isn’t great at spatial (i.e. geographic) searches, restricting the sort of functionality we can offer. We expect to sort this in a month or so, probably moving to PostGIS, and after that we can start to add more features, finer grained searches, and start to look at making the whole thing sustainable by offering premium services.
We’ll be working too on liaising with councils who want to offer their applications via an API – as the ever pioneering Lichfield council already does – or a nightly data dump. This not only does the right thing in opening up data for all to use, but also means we don’t have to scrape their websites. Lichfield, for example, uses the Idox system, and the web interface for this (which is what you see when you look at a planning application on Lichfield’s website) spreads the application details over 8 different web pages, but the API makes this available on a single URL, reducing the work the server has to do.
Finally, we’re going to be announcing a bounty scheme for the scraper/developer community to write scrapers for those areas that don’t use one of the standard systems. Andrew will be coordinating this, and will be blogging about this sometime in the next week or so (and you can contact him at planning at openlylocal dot com). We’ll also be tweeting progress at @planningalert.
Thanks for your patience.
One of the first and best examples of how data could make a difference to ordinary people’s lives was the inspirational PlanningAlerts.com, built by Richard Pope, Mikel Maron, Sam Smith, Duncan Parkes, Tom Hughes and Andy Armstrong.
In doing one simple thing – allowing ordinary people to subscribe to an email alert when there was a planning application near them, regardless of council boundaries – it showed that data mattered, and more than that had the power to improve the interaction between government and the community.
It did so many revolutionary things and fought so many important battles that everyone in the open data world (and not just the UK) owes all those who built it a massive debt of gratitude. Richard Pope and Duncan Parkes in particular put masses of hours writing scrapers, fighting the battle to open postcodes and providing a simple but powerful user experience.
However, over the past year it had become increasingly difficult to keep the site going, with many of the scrapers falling into disrepair (aka scraper rot). Add to that the demands of a day job, and the cost of running a server, and it’s a tribute to both Richard and Duncan that they kept PlanningAlerts going for as long as they did.
So when Richard reached out to OpenlyLocal and asked if we were interested in taking over PlanningAlerts we were both flattered and delighted. Flattered and delighted, but also a little nervous. Could we take this on in a sustainable manner, and do as good a job as they had done?
Well after going through the figures, and looking at how we might architect it, we decided we could – there were parts of the problem that were similar to what we were already doing with OpenlyLocal – but we’d need to make sustainability a core goal right from the get-go. That would mean a business plan, and also a way for the community to help out.
Both of those had been given thought by both us and by Richard, and we’d come to pretty much identical ideas, using a freemium model to generate income, and ScraperWiki to allow the community help with writing scrapers, especially for those councils didn’t use one of the common systems. But we also knew that we’d need to accelerate this process using a bounty model, such as the one that’s been so successful for OpenCorporates.
Now all we needed was the finance to kick-start the whole thing, and we contacted Nesta to see if they were interested in providing seed funding by way of a grant. I’ve been quite critical of Nesta’s processes in the past, but to their credit they didn’t hold this against us, and more than that showed they were capable and eager to working in a fast, lightweight & agile way.
We didn’t quite manage to get the funding or do the transition before Richard’s server rental ran out, but we did save all the existing data, and are now hard at work building PlanningAlerts into OpenlyLocal, and gratifyingly making good progress. The PlanningAlerts.com domain is also in the middle of being transferred, and this should be completed in the next day or so.
We expect to start displaying the original scraped planning applications over the next few weeks, and have already started work on scrapers for the main systems used by councils. We’ll post here, and on the OpenlyLocal and PlanningAlert twitter accounts as we progress.
We’re also liaising with PlanningAlerts Australia, who were originally inspired by PlanningAlerts UK, but have since considerably raised the bar. In particular we’ll be aiming to share a common data structure with them, making it easy to build applications based on planning applications from either source.
Tonight, hyperlocal bloggers (and in fact any ordinary members of the public) got two great boosts in their access to council meetings, and their ability to report on them.
Windsor & Maidenhead this evening passed a motion to allow members of the public to video the council meetings. This follows on from my abortive attempt late last year to video one of W&M’s council meeting – see the full story here, video embedded below – following on from the simple suggestion I’d made a couple of months ago to let citizens video council meetings. I should stress that that attempt had been pre-arranged with a cabinet member, in part to see how it would be received – not well as it turned out. But having pushed those boundaries, and with I dare say a bit of lobbying from the transparency minded members, Windsor & Maidenhead have made the decision to fully open up their council meetings.
Separately, though perhaps not entirely coincidentally, the Department for Communities & Local Government tonight issued a press release which called on councils across the country to fully open up their meetings to the public in general and hyperlocal bloggers in particular.
Councils should open up their public meetings to local news ‘bloggers’ and routinely allow online filming of public discussions as part of increasing their transparency, Local Government Secretary Eric Pickles said today.
To ensure all parts of the modern-day media are able to scrutinise Local Government, Mr Pickles believes councils should also open up public meetings to the ‘citizen journalist’ as well as the mainstream media, especially as important budget decisions are being made.
Local Government Minister Bob Neill has written to all councils urging greater openness and calling on them to adopt a modern day approach so that credible community or ‘hyper-local’ bloggers and online broadcasters get the same routine access to council meetings as the traditional accredited media have.
The letter sent today reminds councils that local authority meetings are already open to the general public, which raises concerns about why in some cases bloggers and press have been barred.
Importantly, the letter also tells councils that giving greater access will not contradict data protection law requirements, which was the reason I was given for W&M prohibiting me filming.
So, hyperlocal bloggers, tweet, photograph and video away. Do it quietly, do it well, and raise merry hell in your blogs and local press if you’re prohibited, and maybe we can start another scoreboard to measure the progress. To those councils who videocast, make sure that the videos are downloadable under the Open Government Licence, and we’ll avoid the ridiculousness of councillors being disciplined for increasing access to the democratic process.
And finally if we can collectively think of a way of tagging the videos on Youtube or Vimeo with the council and meeting details, we could even automatically show them on the relevant meeting page on OpenlyLocal.
A couple of months ago, I blogged about the ridiculous situation of a local councillor being hauled up in front of the council’s standards committee for posting a council webcast onto YouTube, and worse, being found against (note: this has since been overturned by the First Tier Tribunal for Local Government Standards, but not without considerable cost for the people of Brighton).
At the time I said we should make the following demand:
Give the public the right to record any council meeting using any device using Flip cams, tape recorders, frankly any darned thing they like as long as it doesn’t disrupt the meeting.
Step forward councillor Liam Maxwell from the Royal Borough of Windsor & Maidenhead, who as the cabinet member for transparency has a personal mission to make RBWM the most transparent council in the country. I don’t see why you couldn’t do that our council, he said.
So, last night, I headed over to Maidenhead for the scheduled council meeting to test this out, and either provide a shining example for other councils, or show that even the most ‘transparent’ council can’t shed the pomposity and self-importance that characterises many council meetings, and allow proper open access.
The video below, less than two minutes long, is the result, and as you can see, they chose the latter course:
Interestingly, when asked why videoing was not allowed, they claimed ‘Data Protection’, the catch-all excuse for any public body that doesn’t want to publish, or open up, something. Of course, this is nonsense in the context of a public meeting, and where all those being filmed were public figures who were carrying out a civic responsibility.
There’s also an interesting bit to the end when a councillor answered that they were ‘transparent’ in response to the observation that they were supposed to be open. This is the same old you-can-look-but-don’t touch attitude that has characterised much of government’s interactions with the public (and works so well at excluding people from the process). Perhaps naively, I was a little shocked to hear this from this particular council.
So there you have it. That, I guess, is where the boundaries of transparency lies at Windsor & Maidenhead. Why not test them out at your council, and perhaps we can start a new scoreboard at OpenlyLocal to go with the open data scoreboard, and the 10:10 council scoreboard
A couple of weeks ago Will Perrin and I, along with some feedback from the Local Public Data Panel on which we sit, came up with some guidelines for publishing local spending data. They were a first draft, based on a request by Camden council for some guidance, in light of the announcement that councils will have to start publishing details of spending over £500.
Now I’ve got strong opinions about standards: they should be developed from real world problems, by the people using them and should make life easier, not more difficult. It slightly concerned me that in this case I wasn’t actually using any of the spending data – mainly because I hadn’t got around to adding it in to OpenlyLocal yet.
This week, I remedied this, and pulled in the data from those authorities that had published their local spending data – Windsor & Maidenhead, the GLA and the London Borough of Richmond upon Thames. Now there’s a couple of sites (including Adrian Short’s Armchair Auditor, which focuses on spending categories) already pulling the Windsor & Maidenhead data but as far as I’m aware they don’t include the other two authorities, and this adds a different dimension to things, as you want to be able to compare the suppliers across authorities.
First, a few pages from OpenlyLocal showing how I’ve approached it (bear in mind they’re a very rough first draft, and I’m concentrating on the data rather than the presentation). You can see the biggest suppliers to a council right there on the council’s main page (e.g. Windsor & Maidenhead, GLA, Richmond):
Click through to more info gets you a pagination view of all suppliers (in Windsor & Maidenhead’s case there are over 2800 so far):
Clicking any of these will give you the details for that supplier, including all the payments to them:
And clicking on the amount will give you a page just with the transaction details, so it can be emailed to others
But we’re getting ahead of ourselves. The first job is to import the data from the CSV files into a database and this was where the first problems occurred. Not in the CSV format – which is not a problem, but in the consistency of data.
Take Windsor & Maidenhead (you should just be able to open these files an any spreadsheet program). Looking at each data set in turn and you find that there’s very little consistency – the earliest sets don’t have any dates and aggregate across a whole quarter (but do helpfully have the internal Supplier ID as well as the supplier name). Later sets have the transaction date (although in one the US date format is used, which could catch out those not looking at them manually), but omit supplier ID and cost centre.
On the GLA figures, there’ a similar story, with the type of data and the names used to describe changing seemingly randomly between data sets. Some of the 2009 ones do have transaction dates, but the 2010 one generally don’t, and the supplier field has different names, from Supplier to Supplier Name to Vendor.
This is not to criticise those bodies – it’s difficult to produce consistent data if you’re making the rules up as you go along (and given there weren’t any established precedents that’s what they were doing), and doing much of it by hand. Also, they are doing it first and helping us understand where the problems lie (and where they don’t). In short they are failing forward –getting on with it so they can make mistakes from which they (and crucially others) can learn.
But who are these suppliers?
The bigger problem, as I’ve said before, is being able to identify the suppliers, and this becomes particularly acute when you want to compare across bodies (who may name the same company or body slightly differently). Ideally (as we put in the first draft of the proposals), we would have the company number (when we’re talking about a company, at any rate), but we recognised that many accounts systems simply won’t have this information, and so we do need some information that helps use identify them.
Why do we want to know this information? For the same reason we want any ID (you might as well ask why Companies House issues Company Numbers and requires all companies to put that number on their correspondence) – to positively identify something without messing around with how someone has decided to write the name.
With the help of the excellent Companies Open House I’ve had a go at matching the names to company numbers, but it’s only been partially successful. When it is, you can do things like this (showing spend with other councils on a suppliers’ page):
It’s also going to allow me to pull in other information about the company, from Companies House and elsewhere. For other bodies (i.e. those without a company number), we’re going to have to find another way of identifying them, and that’s next on the list to tackle.
Thoughts on those spending data guidelines
In general I still think they’re fairly good, and most of the shortcomings have been identified in the comments, or emailed to us (we didn’t explicitly state that the data should be available under an open licence such as the one at data.gov.uk, and be definitely should have done). However, adding this data to OpenlyLocal (as well as providing a useful database for the community) has crystalised some thoughts:
- Identification of the bodies is essential, and it think we were right to make this a key point, but it’s likely we will need to have the government provide a lookup table between VAT numbers and Company Numbers.
- Speaking of Government datasets, there’s no way of finding out the ancestry of a company – what its parent company is, what its subsidiaries are, and that’s essential if we’re to properly make use of this information, and similar information released by the government. Companies House bizarrely doesn’t hold this information, but the Office For National Statistics does, and it’s called the Inter Departmental Business Register. Although this contains a lot of information provided in confidence for statistical reasons, the relationships between companies isn’t confidential (it just isn’t stored in one place), so it would be perfectly feasible to release this information.
- We should probably be explicit whether the figures should include VAT (I think the Windsor & Maidenhead ones don’t include it, but the GLA imply that theirs might).
- Categorisation is going to be a tricky one to solve, as can be seen from the raw data for Windsor & Maidenhead – for example the Children’s Services Directorate is written as both Childrens Services & Children’s Services, and it’s not clear how this, or the subcateogries, ties into standard classifications for government spending, making comparison across authorities tricky.
- I wonder what would be the downside to publishing the description details, even, potentially, the invoice itself. It’s probably FOI-able, after all.
As ever, comments welcome, and of course all the data is available through the API under an open licence.
As I mentioned in the last post, I’ve recently added council- and ward-level statistics to OpenlyLocal, using the data from the Office of National Statistics Neighbourhood Statistics database. All very well and nice to have it in the same place as the democratic info.
However, what I was really interested in was getting and showing statistics about local areas that’s a bit more, well, meaty. So when I did that statistical backend of OpenlyLocal I wanted to make sure that I could use it for other datasets from other sources.
The first of those is now online, and it’s a good one, the 2006-07 Local Spending Report for England, published in April 2009. What is this? In a nutshell it lists the spending by category for every council in England at the time of the report (there have been a couple of new ones since then).
Now this report has been available to download online if you knew it existed, as a pretty nasty and unwieldy spreadsheet (in fact the recent report to Parliament, Making local public expenditure data public and the development of Local Spending Reports, even has several backhanded references to the inaccessibility of it).
However, unless you enjoy playing with spreadsheets (and at the very minimum know how to unhide hidden sheets and read complex formulae), it’s not much use to you. Much more helpful, I think, is an accessible table you can drill down for more details.
Let’s start with the overview:
Here you can see the total spending for each council over all categories (and also a list of the categories). Click on the magnifying glass at the right of each row and you’ll see a breakdown of spending by main category:
Click again on the magnifying glass for any row now and you’ll see the breakdown of spending for the category of spending in that row:
Finally (for this part) if you click on the magnifying glass again you’ll get a comparison with councils of the same type (District, County, Unitary, etc) you can compare with other councils:
You can also compare between all councils. From the main page for the Local Spending Dataset, click on one of the categories and it will show you the totals for all councils. Click on one of the topics on that page and it will give you all councils for that topic. Well, hopefully you get the idea. Basically, have a play and give us some feedback.
[There'll also be a summary of the figures appearing on the front page for each council sometime in the next few hours.]
Comments, mistakes found, questions all welcome in the usual locations (comments below, twitter or email at CountCulture at gmail dot com).
A bit overdue (I’ve been talking about doing this for a couple of months), but at last there’s now a Ning app for OpenlyLocal local data so anyone who has a UK Ning hyperlocal site (well, anyone in the 90+ councils we’ve opened up data for) can now have information about their council right there in their site.
Like the OpenlyLocal Google gadget (see OpenlyLocal info on your website, Part 1: Google Gadgets), it’s fairly straightforward to use. You need to be the owner of the Ning community to add it, and then it’s automatically available to users as just another tab (like Forums, Videos, Photos, etc). Once you’ve done it you (and the other members of the community) can see the council’s key data, upcoming meetings, members and committees. More features and functionality will be added, but it’s already a very useful addition to any hyperlocal site.
This is what it looks like in action:
So how do you add the OpenlyLocal application to your Ning community (NB Ning Apps need to be added by the network creator). It’s a breeze, and should take no more than a couple of minutes (probably a lot less):
- Go to the ‘Manage’ Tab on your network:
- Click on Ning Apps and you’ll be shown the Ning App directory. Quickest thing is just to search for the OpenlyLocal app:
- Choose the OpenlyLocal application:
- You’ll then be redirected to the Tab management screeen, where you can change the name of the Tab for the app . By default it’s “Council info :: OpenlyLocal”, but might be better to be just “Our Council” or “Council watch” if space is tight. *Important*: If you do change the name, you must click the Save Tab Settings button, otherwise just click on the link:
- You should now be shown the OpenlyLocal page (if not, just click on the tab), and you should click on the ‘edit settings’ link (in the top RIGHT of the info area, not the ‘settings’ link just above it and to the left).
- Select your local authority, and then “Save Changes”
- The app will then get the data from OpenlyLocal (but some may be hidden – if so so just reload the page).
- That’s it.
There’s more features coming, but I hope you’ll agree it’s an essential addition to any Ning hyperlocal community. Comments as ever welcome, and the code behind the application will be shortly uploaded to the OpenlyLocal github tools page.
p.s. To remove the app, just go back to the tab management page and click on the ‘x’ beside the tab.