A simple demand: let us record council meetings

A couple of months ago we had the ridiculous situation of a local council hauling one of its councillors up in front of a disciplinary hearing for posting videos of the council meeting on YouTube.

The video originated from the council’s own webcasts, and the complaint by Councillor Kemble was that in posting these videos on YouTube, another councillor, Jason Kitcat

(i) had failed to treat his fellow councillors with respect, by posting the clips without the prior knowledge or express permission of Councillor Theobald or Councillor Mears; and
(ii) had abused council facilities by infringing the copyright in the webcast images

and in doing so had breached the Members Code of Conduct.

Astonishingly, the standards committee found against Kitcat and ruled that he should be suspended for up to six months unless he writes an apology to Cllr Theobald and submits to re-training on the roles and responsibilities of being a councillor. It is only the fact that he is appealing to the First-Tier Tribunal (which the council has apparently decided to fight by hiring outside counsel) that has allowed him to continue.

It’s worth reading the investigator’s report (PDF, of course) in full for a fairly good example of just how petty and ridiculous these issues become, particularly when the investigator writes things such as:

I consider that Cllr Kitcat did use the council’s IT facilities improperly for political purposes. Most of the clips are about communal bins, a politically contentious issue at the time. The clips are about Cllr Kitcat holding the administration politically to account for the way the bins were introduced, and were intended to highlight what he believed were the administration’s deficiencies in that regard, based on feedback from certain residents.
Most tellingly, clip no. 5 shows the Cabinet Member responsible for communal bins in an unflattering and politically unfavourable light, and it is hard to avoid the conclusion that this highly abridged clip was selected and posted for political gain.

The ‘using IT facilities’, by the way, refers not to using the council’s own computers to upload or edit the videos (it seems agreed by all that he used his own computer for this), but to the fact that the webcasts were made and published on the web using the council’s equipment (or at least that of its supplier, Public-i). Presumably, if he’d taken an extract from the minutes of a meeting published on the council’s website, that too would have been using the council’s IT resources.

However, let’s step back a bit. This, ultimately, is not about councillors not understanding the web, failing to get new technology and the ways it can open up debate. This is not even about the somewhat restrictive webcasting system, which apparently only holds the past six months’ meetings and is somewhat unpleasant to use (particularly if you use a Mac, or Linux — see a debate of the issues here).

This is about councillors failing to understand democracy: the ability to take the same material, make up your own mind and, critically, try to persuade others of that view.

In fact the investigator’s statement above, taking “a politically contentious issue at the time… holding the administration politically to account for the way the bins were introduced… to highlight what he believed were the administration’s deficiencies in that regard” is surely a pretty good benchmark for a democracy.

So here’s a simple suggestion for those drawing up the local government legislation at the moment. No, let’s make that a demand, since that’s what it should be in a democracy (not a subservient request to your ‘betters’):

Give the public the right to record any council meeting using any device: Flip cams, tape recorders, frankly any darned thing they like, as long as it doesn’t disrupt the meeting.

Not only would this open up council meetings and their obscure committees to wider scrutiny, it would also be a boost to hyperlocal sites that are beginning to take the place of the local media.

And if councils want to go to the expense of webcasting their meetings, then require them to make the webcasts available to download under an open licence. That way people can share them, convert them into open formats that don’t require proprietary software, subtitle them, and yes, even post them on YouTube.

I can already hear local politicians saying it will reduce the quality of political discourse, that people may use it in ways they don’t like and can’t control.

Does this seem familiar? It should. It’s the same arguments being given against publishing raw data. The public won’t understand. There may be different interpretations. How will people use it?

Well, folks, that’s the point of a democracy. And that’s the point of a data democracy. We can use it in any way we damn well please. The public record is not there to make incumbent councillors or senior staff members look good. It’s there to allow them to be held to account. And to allow people to make up their own minds. Stop that, and you’re stopping democracy.

Links: For more posts relating to this case, see also Jason Kitcat’s own blog posts, the Brighton Argus post, and posts from Mark Pack at Liberal Democrat Voice, Jim Killock, Conservative Home, and even a tweet from Local Government minister Grant Shapps.


Drawing up the Local Spending Data guidelines… and how Google Docs saved the day

Last Thursday, the Local Public Data Panel on which I sit approved the final draft of the guidelines for publishing by councils of their spending over £500 (version 1.0 if you like). These started back in June, with a document Will Perrin and I drew up in response to a request from Camden council, and attracted a huge number of really helpful comments.

Since then, things have moved on a bit. The loose guidelines were fine as a starting point, especially as at that time we were talking theoretically and hadn’t really had any concrete situations or data to deal with, but from speaking to councils, and actually using the data, it became clear that something much firmer was needed.

What followed then was the usual public sector drafting nightmare, with various Word documents being emailed around, people getting very territorial, offline conversations, and frankly something that wasn’t getting very far.

However, a week beforehand I’d successfully used a shared Google Spreadsheet to free up a similar problem. In that case there were a bunch of organisations (including OpenlyLocal, the Local Government Association and the Department for Communities and Local Government) that needed an up-to-date list of councils publishing spending data, together with the licence, URL and whether it was machine-readable (basically what Adrian Short was doing here at one time – I’d asked him if he wanted to do it, but he didn’t have the time to keep his up to date). In addition, it was clear that we each knew about councils the others didn’t.

The answer could have been a dedicated web app, or a Word document that was added to and emailed around (actually that’s what started to happen). In the end, it was something much simpler – a Google spreadsheet with edit access given to multiple people. I used the OpenlyLocal API to populate the basic structure (including OpenlyLocal URLs, which mean that anyone getting the data via the API, or as a CSV, would have a place they could query for more data), and bingo, it was sorted.

So given this success, Jonathan Evans from the LGA and I agreed to use the Google Docs approach with the spending guidelines. There are multiple advantages to this, but some are particularly relevant for tackling such a problem:

  • We can all work on the document at the same time, messaging each other as we go, avoiding the delays, arguments and territoriality of the document-emailing approach.
  • The version tracking means that all your changes, not just those in the saved version, are visible to all participants (and to people who subsequently become participants). This seems to lead to a spirit of collaboration rather than position-taking, and at least on this occasion avoided edit-wars.
  • The world can see the product of your work, without having to publish it separately (though see note below).

You can also automatically get the information as data, either through the Google Docs API or, more likely in the case of a spreadsheet, as a CSV file. Construct it with this in mind (i.e. one header row), and you’ve got something that can be instantly used in mashups and visualisations.
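That one-header-row convention is what makes the spreadsheet instantly usable as data. A minimal sketch in Ruby of consuming such a CSV (the column names below are illustrative, not the actual spreadsheet’s):

```ruby
require 'csv'

# A fragment like the one exported from a shared spreadsheet:
# one header row, then one council per line (columns are illustrative).
data = <<~CSV
  council,url,licence,machine_readable
  Windsor & Maidenhead,http://example.org/wandm.csv,OGL,true
  Uttlesford,http://example.org/uttlesford.csv,OGL,true
CSV

# With headers: true, each row comes back as a keyed record,
# so no positional bookkeeping is needed.
rows = CSV.parse(data, headers: true)
rows.each do |row|
  puts "#{row['council']} -> #{row['url']} (#{row['licence']})"
end
```

Point the same three lines of parsing code at the live CSV export URL and you have a mashup-ready feed.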

    Important note 1: The biggest problem with this approach in central government is Internet Explorer 6, which the Department of Communities & Local Government are stuck on and have no plans to upgrade. This means the approach only works when people are prepared to make the additions at home, or some other place that has a browser less than 9 years old.

    Important note 2: Despite having put together the spending scoreboard spreadsheet, we were hopeless at telling the wider world about it, meaning that Simon Rogers at the Guardian ended up duplicating much of the work. Interestingly he was missing some that we knew about, and vice versa, and I’ve offered him edit access to the main spreadsheet so we can all work together on the same one.

    Important note 3: A smaller but nevertheless irritating problem with Google Documents (and this seems to be true of Word and OpenOffice too) is that when they contain tables you get a mess of inaccessible HTML, with the result that when the spending guidance was put on the Local Public Data Panel website, the HTML had to be largely rewritten from scratch (by one of the data.gov.uk stars late at night). So Google, if you’re listening, please allow an option to export as accessible HTML.


    Introducing OpenCharities: Opening up the Charities Register

    A couple of weeks ago I needed a list of all the charities in the UK and their registration numbers so that I could try to match them up to the local council spending data OpenlyLocal is aggregating and trying to make sense of. A fairly simple request, you’d think, especially in this new world of transparency and open data, and for a dataset that’s uncontentious.

    Well, you’d be wrong. There’s nothing at data.gov.uk, nothing at CKAN and nothing on the Charity Commission website; in fact you can’t even see the whole register on the website, just the first 500 results of any search/category. Here’s what the Charity Commission says on its website (NB: extract below is truncated):

    The Commission can provide an electronic copy in discharge of its duty to provide a legible copy of publicly available information if the person requesting the copy is happy to receive it in that form. There is no obligation on the Commission to provide a copy in this form…

    The Commission will not provide an electronic copy of any material subject to Crown copyright or to Crown database right unless it is satisfied… that the Requestor intends to re-use the information in an appropriate manner.

    Hmmm. Time for Twitter to come to the rescue to check that some other independently minded person hasn’t already solved the problem. Nothing, but I did get pointed to this request for the data to be unlocked, with the very recent response by the Charity Commission, essentially saying, “Nope, we ain’t going to release it”:

    For resource reasons we are not able to display the entire Register of Charities. Searches are therefore limited to 500 results… We cannot allow full access to all the data, held on the register, as there are limitations on the use of data extracted from the Register… However, we are happy to consider granting access to our records on receipt of a written request to the Departmental Record Officer

    OK, so it seems as though they have no intention of making this data available anytime soon (I actually don’t buy that there are Intellectual Property or Data Privacy issues with making basic information about charities available, and if there really are this needs to be changed, pronto), so time for some screen-scraping. Turns out it’s a pretty difficult website to scrape, because it requires both cookies and javascript to work properly.

    Try turning off both in your browser and see how far you get, and then you’ll also get an idea of how difficult it is to use if you have accessibility issues – and check out their poor excuse for an accessibility statement, i.e. tough luck.

    Still, there’s usually a way, even if it does mean some pretty tortuous routes, and like the similarly inaccessible Birmingham City Council website, this is just the sort of challenge that stubborn so-and-so’s like me won’t give up on.

    And the way to get the info seems to be through the geographical search (other routes relied upon Javascript), and although it was still problematic, it was doable. So, now we have an open data register of charities, incorporated into OpenlyLocal, and tied in to the spending data being published by councils.
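Once a results page has been fetched (cookies and all), the extraction itself can be simple. A sketch of pulling charity numbers and names out of table rows, using a made-up stand-in for the Charity Commission’s markup, which in reality is considerably messier:

```ruby
# Hypothetical fragment of a results page; the real markup differs.
html = <<~HTML
  <tr><td>200001</td><td>EXAMPLE VILLAGE HALL</td></tr>
  <tr><td>200002</td><td>EXAMPLE ALMSHOUSE TRUST</td></tr>
HTML

# Scan out (number, title) pairs and build simple records from them.
charities = html.scan(%r{<td>(\d+)</td><td>([^<]+)</td>}).map do |number, title|
  { charity_number: number, title: title.strip }
end

charities.each { |c| puts "#{c[:charity_number]}: #{c[:title]}" }
```

On a real scrape you’d use a proper HTML parser rather than a regex, but the principle is the same: the hard part is getting the pages at all, not reading them.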

    Charity supplier to Local authority

    And because this sort of thing is so easy, once you’ve got it in a database (Charity Commission take note), there are a couple of bonuses.

    First, it was relatively easy to knock up a quick and very simple Sinatra application, OpenCharities:

    Open Charities :: Opening up the UK Charities Register

    If there’s any interest, I’ll add more features to it, but for now it’s just the simplest of things: a web application with a unique URL for every charity based on its charity number, and with the basic information for each charity available as data (XML, JSON and RDF). It’s also searchable, and sortable by most recent income and spending, and for linked-data people there are dereferenceable Resource URIs.
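For a flavour of the data side: once the register is in a database, each charity’s record is just a serialisation away. A minimal sketch (the field names and URL pattern here are my assumptions for illustration, not OpenCharities’ actual schema):

```ruby
require 'json'

# Illustrative record; field names are guesses at the sort of thing
# OpenCharities exposes, not its actual schema.
charity = {
  title:          'EXAMPLE VILLAGE HALL',
  charity_number: '200001',
  income:         12_345,
  spending:       11_000
}

# A unique URL per charity, based on its charity number (pattern assumed).
url  = "http://opencharities.org/charities/#{charity[:charity_number]}"
json = JSON.pretty_generate(charity)

puts url
puts json
```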

    This is very much an alpha application: the design is very basic and it’s possible that there are a few charities missing – for two reasons. One: the Charity Commission kept timing out (I think I managed to pick up all of those, and they should get picked up when I periodically run the scraper); and two: there appears to be a bug in the Charity Commission website, so that when there are between 10 and 13 entries, only 10 are shown, and there is no way of seeing the additional ones. As a benchmark, there are currently 150,422 charities in the OpenCharities database.

    It’s also worth mentioning that due to inconsistencies with the page structure, the income/spending data for some of the biggest charities is not yet in the system. I’ve worked out a fix, and the entries will be gradually updated, but only as they are re-scraped.

    The second bonus is that the entire database is available to download and reuse (under an open, share-alike attribution licence). It’s a compressed CSV file, weighing in at just under 20MB for the compressed version, and should probably only be attempted by those familiar with manipulating large datasets (don’t try opening it up in your spreadsheet, for example). I’m also in the process of importing it into Google Fusion Tables (it’s still churning away in the background) and will post a link when it’s done.
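If you do want to work with the full download, stream it rather than loading it whole: Ruby’s Zlib and CSV libraries will walk a gzipped CSV one row at a time in constant memory. A self-contained sketch (it builds a tiny stand-in file, since the real 20MB download and its exact columns aren’t reproduced here):

```ruby
require 'zlib'
require 'csv'
require 'tempfile'

# Write a tiny stand-in for the real (much larger) download.
sample = Tempfile.new(['charities', '.csv.gz'])
Zlib::GzipWriter.open(sample.path) do |gz|
  gz.write("charity_number,title\n" \
           "200001,EXAMPLE VILLAGE HALL\n" \
           "200002,EXAMPLE ALMSHOUSE TRUST\n")
end

# Stream it: only one row is in memory at a time, however big the file.
count = 0
Zlib::GzipReader.open(sample.path) do |gz|
  CSV.new(gz, headers: true).each { |_row| count += 1 }
end

puts "#{count} charities"
```

The same loop works unchanged on the real file; only the path differs.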

    Now, back to that spending data.


    A Local Spending Data wish… granted

    The very wonderful Stuart Harrison (aka pezholio), webmaster at Lichfield District Council, blogged yesterday with some thoughts about the publication of spending data following a local spending data workshop in Birmingham. Sadly I wasn’t able to attend this, but Stuart gives a very comprehensive account, and like all his posts it’s well worth reading.

    In it he made an important observation about those at the workshop who were pushing for linked data from the beginning, and wished there was a solution. First the observation:

    There did seem to be a bit of resistance to the linked data approach, mainly because agreeing standards seems to be a long, drawn out process, which is counter to the JFDI approach of publishing local data… I also recognise that there are difficulties in both publishing the data and also working with it… As we learned from the local elections project, often local authorities don’t even have people who are competent in HTML, let alone RDF, SPARQL etc.

    He’s not wrong there. As someone who’s been publishing linked data for some time, and who conceived and ran the Open Election Data project Stuart refers to, working with numerous councils to help them publish linked data, I’m probably as aware of the issues as anyone (ironically, and I think significantly, none of the councils involved in the local government e-standards body, and now pushing so hard for linked data, has actually published any linked data themselves).

    That’s not to knock linked data – just to be realistic about the issues and hurdles that need to be overcome (see the report for a full breakdown). To expect all the councils to solve all these problems at the same time as extracting the data from their systems, removing data relating to non-suppliers (e.g. foster parents), and including information from other systems (e.g. supplier data, which may be on procurement systems), and all by January, is unrealistic at best, and could undermine the whole process.

    So what’s to be done? I think the sensible thing, particularly in these straitened times, is to concentrate on getting the raw data out, and as much of it as possible, and come down hard on those councils who publish it badly (e.g. by locking it up in PDFs or giving it a closed licence), or who wilfully ignore the guidance (it’s worrying how many of the councils publishing data at the moment don’t even include the transaction ID or date of the transaction, never mind supplier details).

    Beyond that we should take the approach the web has always done, and which is the reason for its success: a decentralised, messy variety of implementations and solutions that allows a rich eco-system to develop, with government helping solve bottlenecks and structural problems rather than trying to impose highly centralised solutions that are already being solved elsewhere.

    Yes, I’d love it if the councils were able to publish the data fully marked up, in a variety of forms (not just linked data, but also XML and JSON), but the ugly truth is that not a single council has so far even published their list of categories, never mind matched it up to a recognised standard (CIPFA BVACOP, COFOG or that used in their submissions to the CLG), still less done anything like linked data. So there’s a long way to go, and in the meantime we’re going to need some tools and cheap commodity services to bridge the gap.

    [In a perfect world, maybe councils would develop some open-source tools to help them publish the data, perhaps using something like Adrian Short's Armchair Auditor code as the basis (this is a project that took a single council, Windsor & Maidenhead, and added a web interface to the figures). However, when many councils don't even have competent HTML skills (having outsourced much of it), this is only going to happen at a handful of councils at best, unless considerable investment is made.]

    Stuart had been thinking along similar lines, and made a suggestion, almost a wish in fact:

    I think the way forward is a centralised approach, with authorities publishing CSVs in a standard format on their website and some kind of system picking up these CSVs (say, on a monthly basis) and converting this data to a linked data format (as well as publishing in vanilla XML, JSON and CSV format).

    He then expanded on the idea, talking about a single URL for each transaction, standard identifiers, “a human-readable summary of the data, together with links to the actual data in RDF, XML, CSV and JSON”. I’m a bit iffy about that ‘centralised approach’ phrase (the web is all about decentralisation), but I do think there’s an opportunity to help both the community and councils by solving some of these problems.

    And that’s exactly what we’ve done at OpenlyLocal, adding the data from all the councils who’ve published their spending data, acting as a central repository, generating the URLs, and connecting the data together with other datasets and identifiers (councils with SNAC IDs, companies with Companies House numbers). We’ve even extracted data from those councils who unhelpfully try to lock up their data as PDFs.

    There are, at the time of writing, 52,443 financial transactions from 9 councils in the OpenlyLocal database. And that’s not all; there are also the following features:

    • Each transaction is tied to a supplier record for the council, and increasingly these are linked to company info (including their company number), or other councils (there’s a lot of money being transferred between councils), and users can add information about the supplier if we haven’t matched it up.
    • Every transaction, supplier and company has a permanent unique URL and is available as XML and JSON
    • We’ve sorted out some of the date issues (adding a date fuzziness field for those councils who don’t specify when in the month or quarter a transaction relates to).
    • Transactions are linked to the URL from which the file was downloaded (and usually the line number too, though obviously this is not possible if we’ve had to extract it from a PDF), meaning anyone else can recreate the dataset should they want to.
    • There’s an increasing amount of analysis, showing ordinary users the spend by month, and the biggest suppliers and transactions, for example.
    • The whole spending dataset is available as a single, zipped CSV file to download for anyone else to use.
    • It’s all open data.

    There are a couple of features Stuart mentions that we haven’t yet implemented, for good reason.

    First, we’re not yet publishing it as linked data, for the simple reason that the vocabulary hasn’t yet been defined, nor even the standards on which it will be based. When this is done, we’ll add this as a representation.

    And although we use standard identifiers such as SNAC IDs for councils (and wards) on OpenlyLocal, the URL structure Stuart mentions is not yet practical, in part because SNAC IDs don’t cover all authorities (they don’t include the GLA, or other public bodies, for example), and only a tiny fraction of councils are publishing their internal transaction ids.

    Also we haven’t yet implemented comments on the transactions, for the simple reason that distributed comment systems such as Disqus are javascript-based and thus are problematic for those with accessibility issues, and site-specific ones don’t allow the conversation to be carried on elsewhere (we think we might have a solution to this, but it’s at an early stage, and we’d be interested to hear other ideas).

    But all in all, we reckon we’re pretty much there with Stuart’s wish list, and would hope that councils can get on with extracting the raw data, publishing it in an open, machine-readable format (such as CSV), and then move to linked data as their resources allow.


    Local Spending in OpenlyLocal: what features would you like to see?

    As I mentioned in a previous post, OpenlyLocal has now started importing council local spending data to make it comparable across councils and linkable to suppliers. We’ve now added some more councils, and some more features, with some interesting results.

    As well as the original set of the Greater London Authority, Windsor & Maidenhead and Richmond upon Thames, we’ve added data from Uttlesford, King’s Lynn & West Norfolk and Surrey County Council (incidentally, given the size of Uttlesford and of King’s Lynn & West Norfolk, if they can publish this data, any council should be able to).

    We’ve also added a basic Spending Dashboard, to give an overview of the data we’ve imported so far:

    Of course the data provided is of variable quality and in various formats. Some, like King’s Lynn & West Norfolk, are in simple, clean CSV files. Uttlesford have done it as a spreadsheet with each payment broken down to the relevant service, which is a bit messy to import but adds greater granularity than pretty much any other council.

    Others, like Surrey, have taken data that should be in a CSV file and for no apparent reason have put it in a PDF, which can be converted, but which is a bit of a pain to do, and means manual intervention in what should be a largely automatic process (challenge for journos/dirt-hunters: is there anything in the data that they’d want to hide, or is it just pig-headedness?).

    But now we’ve got all that information in there we can start to analyse it, play with it, and ask questions about it, and we’ve started off by showing a basic dashboard for each council.

    For each council, it’s got the total spend, the number of suppliers & transactions, and the biggest suppliers and biggest transactions. It’s also got the spend per month (where a figure is given for a quarter, or a two-month period, we’ve averaged it out over the relevant months). Here, for example, is the one for the Greater London Authority:

    Lots of interesting questions here, from getting to understand all those leasing costs paid via the Amas Ltd Common Receipts Account, to what the £4m paid to Jack Morton Worldwide (which describes itself as a ‘global brand experience agency’) was for. Of course you can click on the supplier name for details of the transactions and any info that we’ve got on them (in this case it’s been matched to a company – but you can now submit info about a company if we haven’t matched it up).

    You can then click on the transaction to find out more info on it, if that info was published – and either way it’s perhaps the start of an FoI request:

    It’s also worth looking at the Spend By Month, as a raw sanity-check. Here’s the dashboard for Windsor & Maidenhead:

    See that big gap for July & August 09? My first thought was that there was an error with importing the data, which is perfectly possible, especially when the formatting changes as frequently as it does in W&M’s data files, but looking at the actual file, there appear to be no entries for July & August 09 (I’ve notified them and hopefully we’ll get corrected data published soon). This, for me, is one of the advantages of visualisations: being able to easily spot anomalies in the data that looking at tables or databases wouldn’t show.
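Incidentally, the quarterly averaging mentioned above (spreading a figure reported for a quarter evenly across its months) is simple enough to sketch; the method name and data shape here are mine, not OpenlyLocal’s internals:

```ruby
# Spread an amount reported for a period evenly over its months.
# (Illustrative sketch, not OpenlyLocal's actual implementation.)
def spread_over_months(amount, months)
  share = amount / months.size.to_f
  months.map { |m| [m, share] }.to_h
end

q2 = spread_over_months(30_000, %w[2009-07 2009-08 2009-09])
q2.each { |month, value| puts "#{month}: #{value}" }
```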

    So what further analyses would you like out of the box: average transaction size, number of transactions over £1m, percentage of transactions for a round number (i.e. with a zero at the end), more visualisations? We’d love your suggestions – please leave them in the comments or tweet me.


    Local spending data in OpenlyLocal, and some thoughts on standards

    A couple of weeks ago Will Perrin and I, along with some feedback from the Local Public Data Panel on which we sit, came up with some guidelines for publishing local spending data. They were a first draft, based on a request by Camden council for some guidance, in light of the announcement that councils will have to start publishing details of spending over £500.

    Now I’ve got strong opinions about standards: they should be developed from real world problems, by the people using them and should make life easier, not more difficult. It slightly concerned me that in this case I wasn’t actually using any of the spending data – mainly because I hadn’t got around to adding it in to OpenlyLocal yet.

    This week, I remedied this, and pulled in the data from those authorities that had published their local spending data – Windsor & Maidenhead, the GLA and the London Borough of Richmond upon Thames. Now there’s a couple of sites (including Adrian Short’s Armchair Auditor, which focuses on spending categories) already pulling the Windsor & Maidenhead data but as far as I’m aware they don’t include the other two authorities, and this adds a different dimension to things, as you want to be able to compare the suppliers across authorities.

    First, a few pages from OpenlyLocal showing how I’ve approached it (bear in mind they’re a very rough first draft, and I’m concentrating on the data rather than the presentation). You can see the biggest suppliers to a council right there on the council’s main page (e.g. Windsor & Maidenhead, GLA, Richmond):

    Clicking through to more info gets you a paginated view of all suppliers (in Windsor & Maidenhead’s case there are over 2,800 so far):

    Clicking any of these will give you the details for that supplier, including all the payments to them:

    And clicking on the amount will give you a page with just the transaction details, so it can be emailed to others:

    But we’re getting ahead of ourselves. The first job is to import the data from the CSV files into a database, and this was where the first problems occurred. Not with the CSV format, which is not a problem, but with the consistency of the data.

    Take Windsor & Maidenhead (you should just be able to open these files in any spreadsheet program). Look at each data set in turn and you find that there’s very little consistency – the earliest sets don’t have any dates and aggregate across a whole quarter (but do helpfully have the internal supplier ID as well as the supplier name). Later sets have the transaction date (although in one the US date format is used, which could catch out those not looking at them manually), but omit the supplier ID and cost centre.
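That US date format hazard is worth spelling out, because a string like 04/07/2010 parses happily either way; you have to establish the format per file before trusting any of the dates. In Ruby:

```ruby
require 'date'

raw = '04/07/2010'

# The same string yields two different dates depending on the
# format you assume, and neither parse raises an error.
uk = Date.strptime(raw, '%d/%m/%Y')   # 4 July 2010
us = Date.strptime(raw, '%m/%d/%Y')   # 7 April 2010

puts uk  # 2010-07-04
puts us  # 2010-04-07
```

One practical defence: scan a whole file first, and if any day-of-month field exceeds 12, the format is unambiguous for that file.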

    In the GLA figures there’s a similar story, with the type of data, and the names used to describe it, changing seemingly at random between data sets. Some of the 2009 ones do have transaction dates, but the 2010 ones generally don’t, and the supplier field has different names, from Supplier to Supplier Name to Vendor.

    This is not to criticise those bodies – it’s difficult to produce consistent data if you’re making the rules up as you go along (and given there weren’t any established precedents, that’s what they were doing), and doing much of it by hand. Also, they are doing it first, and helping us understand where the problems lie (and where they don’t). In short they are failing forward: getting on with it so they can make mistakes from which they (and crucially others) can learn.

    But who are these suppliers?

    The bigger problem, as I’ve said before, is being able to identify the suppliers, and this becomes particularly acute when you want to compare across bodies (which may name the same company or body slightly differently). Ideally (as we put in the first draft of the proposals), we would have the company number (when we’re talking about a company, at any rate), but we recognised that many accounts systems simply won’t have this information, and so we do need some information that helps us identify them.

    Why do we want to know this information? For the same reason we want any ID (you might as well ask why Companies House issues Company Numbers and requires all companies to put that number on their correspondence) – to positively identify something without messing around with how someone has decided to write the name.
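In the absence of IDs, a cheap first step before any matching is normalising the names, so that trivially different spellings compare equal. A sketch (these rules are illustrative; real matching needs many more, which is part of why it only ever partially succeeds):

```ruby
# Normalise supplier names so trivially different spellings compare equal.
# These rules are illustrative; real-world matching needs many more.
def normalise(name)
  name.upcase
      .gsub(/[.,'&]/, ' ')                 # punctuation to spaces
      .gsub(/\b(LTD|LIMITED|PLC)\b/, '')   # drop common suffixes
      .squeeze(' ')
      .strip
end

puts normalise('Acme Widgets Ltd.') == normalise('ACME WIDGETS LIMITED')
```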

    With the help of the excellent Companies Open House, I’ve had a go at matching the names to company numbers, but it’s only been partially successful. When it is, you can do things like this (showing spend with other councils on a supplier’s page):

    It’s also going to allow me to pull in other information about the company, from Companies House and elsewhere. For other bodies (i.e. those without a company number), we’re going to have to find another way of identifying them, and that’s next on the list to tackle.

    Thoughts on those spending data guidelines

    In general I still think they’re fairly good, and most of the shortcomings have been identified in the comments, or emailed to us (we didn’t explicitly state that the data should be available under an open licence such as the one at data.gov.uk, and we definitely should have done). However, adding this data to OpenlyLocal (as well as providing a useful database for the community) has crystallised some thoughts:

    • Identification of the bodies is essential, and I think we were right to make this a key point, but it’s likely we will need to have the government provide a lookup table between VAT numbers and Company Numbers.
    • Speaking of Government datasets, there’s no way of finding out the ancestry of a company – what its parent company is, what its subsidiaries are – and that’s essential if we’re to properly make use of this information, and similar information released by the government. Companies House bizarrely doesn’t hold this information, but the Office for National Statistics does, in what’s called the Inter-Departmental Business Register. Although this contains a lot of information provided in confidence for statistical reasons, the relationships between companies aren’t confidential (the information just isn’t stored in one place), so it would be perfectly feasible to release it.
    • We should probably be explicit about whether the figures should include VAT (I think the Windsor & Maidenhead ones don’t include it, but the GLA imply that theirs might).
    • Categorisation is going to be a tricky one to solve, as can be seen from the raw data for Windsor & Maidenhead – for example the Children’s Services Directorate is written as both ‘Childrens Services’ and ‘Children’s Services’ – and it’s not clear how this, or the subcategories, ties into standard classifications for government spending, making comparison across authorities tricky.
    • I wonder what would be the downside to publishing the description details, even, potentially, the invoice itself. It’s probably FOI-able, after all.

    As ever, comments welcome, and of course all the data is available through the API under an open licence.
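    By way of illustration, pulling the data from the API might look something like the Python sketch below. The /suppliers/<id> route is a hypothetical example, not a documented endpoint – check the OpenlyLocal API pages for the actual routes and formats on offer:

```python
import json
from urllib.request import urlopen

BASE = "http://openlylocal.com"

def api_url(resource, resource_id, fmt="json"):
    # e.g. api_url("suppliers", 42) -> ".../suppliers/42.json"
    # (the resource names here are assumptions for illustration)
    return "%s/%s/%s.%s" % (BASE, resource, resource_id, fmt)

def fetch(resource, resource_id):
    """Fetch a record from the API and parse the JSON response."""
    with urlopen(api_url(resource, resource_id)) as resp:
        return json.load(resp)
```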

    C


    New feature: search for information by postcode

    Why was it important that the UK government open up the geographic infrastructure? Because it makes almost trivial so many location-based things that were previously tortuous.

    Previously, getting open data about your local councillors, given just a postcode, was a tortuous business, requiring multiple calls to different sites. Now, it is easy. Just go to http://openlylocal.com/areas/postcodes/[yourpostcodehere] and, bingo, you’re done.

    You can also just put your postcode in the search box on any OpenlyLocal page to do the same thing. And, obviously, you can also download the data as XML or JSON, and with an open data licence that allows reuse by anybody, even commercial reuse.
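    In code, the lookup above is a one-liner plus a bit of URL hygiene. A Python sketch (upper-casing the postcode and stripping the space are my assumptions about the form the URL expects):

```python
import json
from urllib.request import urlopen

def postcode_url(postcode, fmt="json"):
    """Build the lookup URL described above for a UK postcode."""
    cleaned = postcode.upper().replace(" ", "")
    return "http://openlylocal.com/areas/postcodes/%s.%s" % (cleaned, fmt)

def lookup(postcode):
    """Fetch ward/councillor data for a postcode as parsed JSON."""
    with urlopen(postcode_url(postcode)) as resp:
        return json.load(resp)
```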

    There’s still a little bit of tweaking to be done. I need to match up postcodes with county electoral divisions, and I’m planning on adding RDF to the data types returned. Finally, it’d be great to show the ward boundaries on a map, but I think that may take a little more work.


    The GLA and open data: did he really say that?

    The launch on Friday of the Greater London Authority’s open data initiative (aka London Datastore) was a curious affair, and judging from some of the discussions in the pub after, I think that the strangeness – a joint teleconferenced event with CES Las Vegas – possibly overshadowed its significance and the boldness of the GLA’s action.

    First off, the technology let it down – if Skype wanted to give a demo of just how far its video conferencing falls short of prime time, they did a perfect job. Boris did a great impromptu stand-up routine, looking for all the world like he was still up from the night before, but the people at CES in Las Vegas missed the performance, their images and words only occasionally stuttering into life between the Windows/Skype error messages.

    However in between the gags Boris came out with this nugget, “We will open up all the GLA’s data for free reuse”.

    What does that mean, I wondered – all their data? All that’s easy to do? Does it include info from Transport for London (TfL), or the Metropolitan Police? To be honest I sort of assumed it was Boris just paraphrasing. Nevertheless, I thought, it could be a good stick to enforce change later on.

    However then it was Deputy Mayor Sir Simon Milton’s turn to give the more scripted, more plodding, more coherent version. This was the bit where we would find out what’s really going to happen. What you need to realise is that the GLA doesn't actually have a lot of its own data - mostly it's just some internal stuff, slices of central government data, and groupings of London council info. The good stuff is owned by those huge bodies, such as TfL and the Met, that it oversees.

    So when Simon said: "I hope that our discussions with the GLA group will be fruitful and that in the short term we can encourage them to release that data which is not tied to commercial contracts and in the longer term encourage them when these contracts come up for renewal to apply different contractual principles that would allow for the release of all of their data into the public domain", all I heard was yada yada yada.

    The next bit, however, genuinely took me by surprise:

    "I can confirm today, however, that as a result of our discussions around the Datastore, TfL are willing to make raw data available through the Datastore. Initially this will be data which is already available for re-use via the TfL website, including live feeds from traffic cameras, geo-coded information on the location of Tube, DLR and Overground stations, the data behind the Findaride service to locate licensed mini-cab and private hire operators and data on planned works affecting weekend Tube services.

    "TfL will also be considering how best to make available detailed timetabling data for its services and welcomes examples of other data which could also be prioritised for inclusion in the Datastore such as the data on live departures and Tube incidents on TfL’s website"

    So stunned was I, in fact (and many others too), that we didn't ask any questions when he finished talking, or for that matter congratulate Boris/Simon on the steps they were taking.

    Yes, it's nothing that hasn't been done in Washington DC or San Francisco, and it isn't as big a deal as the Government's open data announcement on December 7 (which got scandalously little press coverage, even in the broadsheets, yet may well turn out to be the most important act of this government).

    However it is a huge step for local government in the UK and sets a benchmark for other local authorities to attain, and what the GLA has already achieved with Transport for London will only have come after a considerable trial of wills – one, significantly, that they won.

    So, Simon & Boris, and all those who fought the battle with TfL, well done. Now let's see some action with the other GLA bodies - the Met, London Development Agency, London Fire Brigade, and the London Pensions Fund Authority in particular (I'm still trying to figure out its relationship to Visit London and the London Travel Watch).

    Update: Video embedded below

    Useful links:


    Opening up Local Spending Reports on OpenlyLocal

    As I mentioned in the last post, I’ve recently added council- and ward-level statistics to OpenlyLocal, using the data from the Office for National Statistics Neighbourhood Statistics database. All very well, and nice to have it in the same place as the democratic info.

    However, what I was really interested in was getting and showing statistics about local areas that are a bit more, well, meaty. So when I built the statistical backend of OpenlyLocal I wanted to make sure that I could use it for other datasets from other sources.

    The first of those is now online, and it’s a good one, the 2006-07 Local Spending Report for England, published in April 2009. What is this? In a nutshell it lists the spending by category for every council in England at the time of the report (there have been a couple of new ones since then).

    Now this report has been available to download online if you knew it existed, as a pretty nasty and unwieldy spreadsheet (in fact the recent report to Parliament, Making local public expenditure data public and the development of Local Spending Reports, even has several backhanded references to the inaccessibility of it).

    However, unless you enjoy playing with spreadsheets (and at the very minimum know how to unhide hidden sheets and read complex formulae), it’s not much use to you. Much more helpful, I think, is an accessible table you can drill down into for more details.

    Let’s start with the overview:

    Overview of Local Spending by Council for England

    Here you can see the total spending for each council over all categories (and also a list of the categories). Click on the magnifying glass at the right of each row and you’ll see a breakdown of spending by main category:

    Local Spending breakdown for given council

    Click again on the magnifying glass for any row now and you’ll see the breakdown of spending for the category of spending in that row:

    Finally (for this part), if you click on the magnifying glass again you’ll get a comparison with other councils of the same type (District, County, Unitary, etc.):

    You can also compare between all councils. From the main page for the Local Spending Dataset, click on one of the categories and it will show you the totals for all councils. Click on one of the topics on that page and it will give you all councils for that topic. Well, hopefully you get the idea. Basically, have a play and give us some feedback.

    [There'll also be a summary of the figures appearing on the front page for each council sometime in the next few hours.]

    There’s no fancy javascript or visualizations yet (although we are talking with the guys at OKFN, who do the excellent WhereDoesMyMoneyGo, about collaborating), but that may come. For the moment, we’ve kept it simple, understandable, and accessible.

    Comments, mistakes found, questions all welcome in the usual locations (comments below, twitter or email at CountCulture at gmail dot com).



    About your local area: ward-level statistics come to OpenlyLocal

    Those who follow me on twitter will know that for the past couple of months I’ve been on-and-off looking at the Office for National Statistics Neighbourhood Statistics, and whether it would be possible and useful to show some of that information on OpenlyLocal.

    When I’ve mentioned it on twitter it has usually been in the context of moaning about the less-than-friendly SOAP interface to the data (even by SOAP standards it’s unwieldy). There’s also the not insignificant issue of getting to grips with the huge amount of data, and how it’s stored on the ONS’s servers (at one stage I looked at downloading the raw data, but we’re talking about tens of thousands of files).

    Still, like a person with a loose tooth, I’ve worried away at the problem on and off in quiet times, with occasionally painful results (although the people at the ONS have been very helpful), and have now got to a level where (I think) it’s pretty useful.

    Specifically, you can now see general demographic info for pretty much all the councils in England & Wales (unfortunately the ONS database doesn’t include Scotland or Northern Ireland, so if there’s anyone who can help me with those areas, I’d be pleased to hear from them).

    Area Statistics for Preston Council on OpenlyLocal

    More significantly, however, we’ve added a whole load of ward-level statistics:

    Example of ward-level ONS statistics

    Inevitably, much of the data comes from the 2001 Census (the next is due in 2011), and so it’s not bang up to date. However, it’s still useful and informative, particularly as you can compare the figures with the other wards in the council, or compare councils of similar type. Want to know which ward has the greatest proportion of people over the age of 90? No problem – just click on the description (‘People aged 90 and over’ in this case) and you have it:

    Doing the same on councils will bring up a comparison with similar councils (e.g. District councils are compared with other district councils, London Authorities with other London Authorities):

    As you can see from the list of ONS datasets, there are huge amounts of data to be shown, and we’ve only imported a small selection, in part while we’re working out the best way of making it manageable. As you can see from the religion graph, where it makes more sense for the data to be graphed we’ve done it that way, and you can expect to see more of that in the future.

    It’s also worth mentioning that there are some gaps in the ONS’s database — principally where ward boundaries have changed, or where new local authorities have been formed, and if there’s only a small amount of info for a ward or council, that’s why.

    In the meantime, have a play, and if there’s a dataset you want us to expose sooner rather than later, let me know in the comments or via twitter (or email, of course).

    C

    p.s. In case you’re wondering the graphs and data are fully accessible so should be fine for screenreaders. The comparison tables are just plain ordinary HTML tables with a bit of CSS styling to make them look like graphs, and the pie charts have the underlying data accompanying them as tables on the page (and can be seen by anyone else just by clicking on the chart).

