Arms race to liberate Africa’s data

Data clerks conduct a completeness of reports check at the simulation data centre, Kenya. Photo credit: Jef Karang’ae

Open data could add up to $3 trillion worth of economic activity per year worldwide, according to a study by the McKinsey Global Institute. But in the race to liberate thousands of datasets from the government and business sectors, the African continent is seen as lagging behind.

“Nowhere is the need for better data more urgent than in most African countries,” says the Data for African Development Working Group, whose final report cited data inaccuracies, insufficient funding and donors’ conflicting priorities as top factors that have marred projects in the past.

Enter Code for Africa, a new open-data project which provides ordinary citizens across the continent with valuable information based on data never utilised before.

One site helps Ghanaians confront the oil conglomerates who owe them money. Another maps election results in Kenya. Chief Strategist Justin Arenstein recently told an interviewer that he is currently helping to develop a smartphone camera microscope that will analyse contaminants found in local water sources, creating a live map of hazard spots.

eLearning Africa writer Steven Blum talked to Code for Africa’s ‘News Technologist’ Friedrich Lindenberg about the arms race to liberate data, why corruption is not the only reason some data remains hidden and how his organisation is driving civic engagement.

I recently learned that the South African government copyrights its data and uses PDFs to make it harder for computers to analyse government documents. What are a few methods journalists and activists have been using to circumvent these measures?

It’s important to point out that the South African government is not alone in doing so, but that virtually all governments – including in my home country, Germany – engage in these strategies to make access to public data more difficult. At the same time it’s an arms race, because the tools to extract information from PDFs (like Tabula) and other information sources are getting better all the time. I don’t know that governments will be able to sustain such an ‘obscurity race’ without losing the trust of their constituents.

It’s often also a question of engaging in a constructive discussion, in which we highlight the benefits of transparency and of developing a proper digital engagement strategy. In many cases, the release of data in inaccessible formats also comes down to the technical skill of those working within government, so you want to walk those public servants through the options they have but aren’t aware of.

Can you tell us about the ‘data scraper’ you’re working on? How does it work?

We develop quite a lot of scrapers – little programs that go through all the pages of a website and turn the published information into structured data that can be analysed systematically. Imagine a website listing all bills currently in parliament – it might allow you to filter them by MP, but not by party. Once you download all of the pages, you can do any analysis you want. Often there’s a good journalistic story hidden there, one that was previously locked away by a restrictive website design.
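The parliament example can be made concrete with a minimal sketch. It does not show Code for Africa's actual tooling: the page contents, the MP names, the party lookup and the helper names (PAGES, PARTY_OF, RowParser, scrape, bills_per_party) are all hypothetical, and the pages are inlined as strings so the sketch runs without a network connection. It uses only Python's standard library.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical listing pages, inlined so the sketch runs offline.
# A real scraper would download these pages from the website instead.
PAGES = [
    "<table><tr><td>Water Bill</td><td>A. Mwangi</td></tr>"
    "<tr><td>Roads Bill</td><td>B. Otieno</td></tr></table>",
    "<table><tr><td>Schools Bill</td><td>A. Mwangi</td></tr></table>",
]

# Hypothetical MP-to-party lookup that the website itself does not offer.
PARTY_OF = {"A. Mwangi": "Party X", "B. Otieno": "Party Y"}


class RowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data


def scrape(pages):
    """Turn every two-cell table row into a {'title', 'mp'} record."""
    bills = []
    for html in pages:
        parser = RowParser()
        parser.feed(html)
        parser.close()
        for row in parser.rows:
            if len(row) == 2:
                title, mp = (cell.strip() for cell in row)
                bills.append({"title": title, "mp": mp})
    return bills


def bills_per_party(bills, party_of):
    """Count bills by party, a filter the imagined website lacks."""
    return Counter(party_of.get(b["mp"], "unknown") for b in bills)


bills = scrape(PAGES)
counts = bills_per_party(bills, PARTY_OF)
```

Once the rows are held as structured records, grouping by party is a single pass over the list; the same records could just as easily be grouped by date or topic.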

Award-winning projects like Ciudadano Inteligente in Chile and PolitiFact in the US harness crowdsourcing to check the veracity of political statements. Are there similar initiatives in Africa to hold leaders accountable to their word?

AfricaCheck, which is one of the winners of the African News Innovation Challenge, does a tremendous job in fact-checking the assumptions behind a lot of reporting in the media. Their fact sheets on a variety of topics are often my first port of call to learn about an important topic, such as the Ebola outbreak in West Africa.

How does your team identify new projects and objectives? How are local communities involved in this process?

Our approach is to work with existing organisations and networks, whether they are media, CSOs or even government, and to collaborate with them to think through how open data and digital engagement strategies can help them to do their jobs in ways that maximise their impact.

Often, these groups already have questions that they can’t properly address without data-driven analysis or crowdsourcing of information, and sometimes they even have existing data. Some even have large numbers of documents, but lack the means to sift through them – good examples might be Code for South Africa’s work with the Parliamentary Monitoring Group and the work on medicines pricing.

In other cases, information is publicly available, but not in a form that encourages people to take action and engage with their government. Code for Kenya’s GotToVote application – which helped drive citizen participation in the Kenyan elections by making information about voting locations easier to access – is a good example here.

Finally, what’s one innovation in participatory or investigative journalism that you’re most excited about?

There’s a lot of amazing work being done regarding the crowdsourcing of information, such as VozData, a project by Argentinian newspaper La Nacion that invited citizens to extract information from senate disclosure filings. At the same time, I’m really excited about the prospect of creating information resources that will let us investigate the dealings of globalised companies, such as OpenCorporates, which collects information about companies worldwide.

Mr Lindenberg, thank you for this interview.

Image: USAID Kenya