Like buses, you wait ages for local councils to publish their spending data, then a whole load come at once… and consequently OpenlyLocal has been importing the data pretty much non-stop for the past month or so.
We’ve now imported spending data for over 140 councils with more being added each day, and now have over a million and a half payments to suppliers, totalling over £10 billion. I think it’s worth repeating that figure: Ten Billion Pounds, as it’s a decent chunk of change, by anybody’s measure (although it’s still only a fraction of all spending by councils in the country).
Along with that we’ve also made loads of improvements to the analysis and data, some visible, other not so much (we’ve made loads of much-needed back-end improvements now that we’ve got so much more data), and to mark breaking the £10bn figure I thought it was worth starting a series of posts looking at the spending dataset.
Let’s start by having a look at those headline figures (we’ll be delving deeper into the data for some more heavyweight data-driven journalism over the next few weeks):
144 councils. That’s about 40% of the 354 councils in England (including the GLA). Some of the others we just haven’t yet imported (we’re adding them at about 2 a day); others have problems with the CSV files they are publishing (corrupted or invalid files, or where there’s some query about the data itself), and where there’s a contact email we’ve notified them of this.
The rest are refusing to publish the CSV files specified in the guidelines, deciding to make it difficult to automatically import by publishing an Excel file or, worse, a PDF (and here I’d like to single out Birmingham council, the biggest in the UK, which shamefully is publishing it’s spending only as a PDF, and even then with almost no detail at all. One wonders what they are hiding).
£10,184,169,404 in 1,512,691 transactions. That’s an average transaction value of £6,732 per payment. However this is not uniform across councils, varying from an average transaction value of £669 for Poole to £46,466 for Barnsley. (In future posts, I’ll perhaps have a look at using the R statistical language to do some histograms on the data, although I’d be more than happy if someone beat me to that).
194,128 suppliers. What does this mean? To be accurate, this is the total number of supplying relationships between the councils and the companies/people/things they are paying.
Sometimes a council may have (or appear to have) several supplier relationships with the same company (charity/council/police authority), using different names or supplier IDs. This is sometimes down to a mistake in keying in the data, or for internal reasons, but either way it means several supplier records are created. It’s also worth noting that redacted payments are often grouped together as a single ‘supplier’, as the council may not have given any identifier to show that a redacted payment of £50,000 to a company (and in general there’s little reason to redact such payments) is to a different recipient than a redacted payment of £800 to a foster parent, for example.
However, using some clever matching and with the help of the increasing number of users who are matching suppliers to companies/charities and other entities on OpenlyLocal (just click on ‘add info’ when you’re looking at a supplier you think you can match to a company or charity)., we’ve matched about 40% of these to real-world organisations such as companies and charities.
While that might not seem very high, a good proportion of the rest will be sole-traders, individuals, or organisations we’ve not yet got a complete list of (Parish and Town councils, for example). And what it does mean is we can start to get a first draft of who supplies local government. And this is what we’ve got:
66,165 companies, with total payments of £3,884,271,203 (£3.88 billion), 38.1% of the total £10bn, in 579,518 transactions, making an average payment of £6,702.
8,236 charities, with total payments of £415,878,177, 4.1% of the total, in 55,370 transactions, making an average payment of £7,511.
Next time, we’ll look at the company suppliers in a little more detail, and later on the charities too, but for the moment, as you can see we’re listing the top 20 matched indivudual companies and charities that supply local government. Bear in mind a company like Capita does business with councils through a variety of different companies, and there’s no public dataset of the relationships between the companies, but that’s another story.
Finally, the whole dataset is available to download as open data under the same share-alike attribution licence as the rest of OpenlyLocal, including the matches to companies/charities that are receiving the money (the link is at the bottom of the Council Spending Data Dashboard). Be warned, however, it’s a very big file (there’s a row for every transaction), and so is too big for Excel (or even Google Fusion tables for that matter), so it’s most use to those using a database, or doing academic research.
* Note: there are inevitably loads of caveats to this data, including that councils are (despite the guidance) publishing the data in different ways, including, occasionally, aggregating payments, and using over-aggressive redaction. It’s also, obviously, only 40% of the councils in England., although that’s a pretty big sample size. Finally there may be errors both in the data as published, and in the importing of it. Please do let us know at firstname.lastname@example.org if you see any errors, or figures that just look wrong.
Since OpenlyLocal started pulling in council spending data, it’s niggled at me that it’s only half the story. Yes, as more and more data is published you’re beginning to get a much clearer idea of who’s paid what. And if councils publish it at a sufficient level of detail and consistently categorised, we’ll have a pretty good idea of what it’s spent on too.
However, useful though that is, that’s like taking a peak at a company’s bank statement and thinking it tells the whole story. Many of the payments relate to goods or services delivered some time in the past, some for things that have not yet been delivered, and there are all sorts of things (depreciation, movements between accounts, accruals for invoices not yet received) that won’t appear on there.
That’s what the council’s accounts are for — you know, those impenetrable things locked up in PDFs in some dusty corner of the council’s website, all sufficiently different from each other to make comparison difficult:
For some time, the holy grail for projects like OpenlyLocal and Where Does My Money Go has been to get the accounts in a standardized form to make comparison easy not just for accountants but for regular people too.
The thing is, such a thing does exist, and it’s sent by councils to central Government (the Department for Communities and Local Government to be precise) for them to use in their own figures. It’s a fairly hellishly complex spreadsheet called the Revenue Outturn form that must be filled in by the council (to get an idea have a look at the template here).
They’re not published anywhere by the DCLG, but they contain no state secrets or sensitive information; it’s just that the procedure being followed is the same one as they’ve always followed, and so they are not published, even after the statistics have been calculated from the data (the Statistics Act apparently prohibit publication until the stats have been published).
So I had an idea: wouldn’t it be great if we could pull the data that’s sitting in all these spreadsheets into a database and so allow comparison between councils’ accounts, thus freeing it from those forgotten corners of government computers.
This would seem to be a project that would be just about simple enough to be doable (though it’s trickier than it seems) and could allow ordinary people to understand their council’s spending in all sorts of ways (particularly if we add some of those sexy Where Does My Money Go visualisations). It could also be useful in ways that we can barely imagine – some of the participatory budget experiments going in on in Redbridge and other councils would be even more useful if the context of similar councils spending was added to the mix.
So how would this be funded. Well, the usual route would be for DCLG or perhaps the one of the Local Government Association bodies such as IDeA to scope out a proposal, involving many hours of meetings, reams of paper, and running up thousands of pounds in costs, even before it’s started.
They’d then put the process out to tender, involving many more thousands in admin, and designed to attract those companies who specialise in tendering for public sector work. Each of those would want to ensure they make a profit, and so would work out how they’re going to do it before quoting, running up their own costs, and inflating the final price.
So here’s part two of my plan, instead going down that route, I’d come up with a proposal that would:
- be a fraction of that cost
- be specified on a single sheet of paper
- paid for only if I delivered
Obviously there’s a clear potential conflict of interest here – I sit on the government’s Local Public Data Panel and am pushing strongly for open data, and also stand to benefit (depending on how good I am at getting the information out of those hundreds of spreadsheets, each with multiple worksheets, and matching the classification systems). The solution to that – I think – is to do the whole thing transparently, hence this blog post.
In a sense, what I’m proposing is that I scope out the project, solving those difficult problems of how to do it, with the bonus of instead of delivering a report, I deliver the project.
Is it a good thing to have all this data imported into a database, and shown not just on a website in a way non-accountants can understand, but also available to be combined with other data in mashups and visualisations? Definitely.
Is it a good deal for the taxpayer, and is this open procurement a useful way of doing things? Well you can read the proposal for yourself here, and I’d be really interested in comments both on the proposal and the novel procurement model.
As I mentioned in the last post, I’ve recently added council- and ward-level statistics to OpenlyLocal, using the data from the Office of National Statistics Neighbourhood Statistics database. All very well and nice to have it in the same place as the democratic info.
However, what I was really interested in was getting and showing statistics about local areas that’s a bit more, well, meaty. So when I did that statistical backend of OpenlyLocal I wanted to make sure that I could use it for other datasets from other sources.
The first of those is now online, and it’s a good one, the 2006-07 Local Spending Report for England, published in April 2009. What is this? In a nutshell it lists the spending by category for every council in England at the time of the report (there have been a couple of new ones since then).
Now this report has been available to download online if you knew it existed, as a pretty nasty and unwieldy spreadsheet (in fact the recent report to Parliament, Making local public expenditure data public and the development of Local Spending Reports, even has several backhanded references to the inaccessibility of it).
However, unless you enjoy playing with spreadsheets (and at the very minimum know how to unhide hidden sheets and read complex formulae), it’s not much use to you. Much more helpful, I think, is an accessible table you can drill down for more details.
Let’s start with the overview:
Here you can see the total spending for each council over all categories (and also a list of the categories). Click on the magnifying glass at the right of each row and you’ll see a breakdown of spending by main category:
Click again on the magnifying glass for any row now and you’ll see the breakdown of spending for the category of spending in that row:
Finally (for this part) if you click on the magnifying glass again you’ll get a comparison with councils of the same type (District, County, Unitary, etc) you can compare with other councils:
You can also compare between all councils. From the main page for the Local Spending Dataset, click on one of the categories and it will show you the totals for all councils. Click on one of the topics on that page and it will give you all councils for that topic. Well, hopefully you get the idea. Basically, have a play and give us some feedback.
[There'll also be a summary of the figures appearing on the front page for each council sometime in the next few hours.]
Comments, mistakes found, questions all welcome in the usual locations (comments below, twitter or email at CountCulture at gmail dot com).
Those who follow me on twitter will know that for the past couple of months I’ve been on-and-off looking at the Official for National Statistics Neighbourhood Statistics, and whether it would be possible and useful to show some of that information on OpenlyLocal.
Usually, when I’ve mentioned it on twitter it has usually been in the context of moaning about the less-than-friendly SOAP interface to the data (even by SOAP standards it’s unwieldy). There’s also the not insignificant issue of getting to grips with the huge amount of data, and how it’s stored on the ONS’s servers (at one stage I looked at downloading the raw data, but we’re talking about tens of thousands of files).
Still, like a person with a loose tooth, I’ve worried the problem on and off in quiet times with occasionally painful results (although the people at the ONS have been very helpful), and have now got to a level where (I think) it’s pretty useful.
Specifically, you can now see general demographic info for pretty much all the councils in England & Wales (unfortunately the ONS database doesn’t include Scotland or Northern Ireland, so if there’s anyone who can help me with those areas, I’d be pleased to hear from them).
More significantly, however, we’ve added a whole load of ward-level statistics:
Inevitably, much of the data comes from the 2001 Census (the next is due in 2011), and so it’s not bang up to date. However, it’s still useful and informative, particularly as you can compare the figures with the other wards in the council, or compare councils of similar type. Want to know which ward has the greatest proportion of people over the age of 90 years old. No prob, just click on the description (‘People aged 90 and over in this case) and you have it:
Doing the same on councils will bring up a comparison with similar councils (e.g. District councils are compared with other district councils, London Authorities with other London Authorities):
As you can see from the list of ONS datasets, there’s huge amounts of data to be shown, and we’ve only imported a small section, in part while we’re working out the best way of making it manageable. As you can see from the religion graph, where it makes more sense for it to be graphed we’ve done it that way, and you can expect to see more of that in the futrue.
It’s also worth mentioning that there are some gaps in the ONS’s database — principally where ward boundaries have changed, or where new local authorities have been formed, and if there’s only a small amount of info for a ward or council, that’s why.
In the meantime, have a play, and if there’s a dataset you want us to expose sooner rather than later, let me know in the comments or via twitter (or email, of course).
p.s. In case you’re wondering the graphs and data are fully accessible so should be fine for screenreaders. The comparison tables are just plain ordinary HTML tables with a bit of CSS styling to make them look like graphs, and the pie charts have the underlying data accompanying them as tables on the page (and can be seen by anyone else just by clicking on the chart).