Retention and discoverability: Disappearing tweets, archiving information and peacebuilding

The problem is simply stated.

Peacebuilding today, however defined and conducted, cannot ignore the importance and impact of content produced and shared on Twitter (and other social media platforms). This content is particularly resonant when produced and disseminated within cycles of violence, either during or after war. The volume of this content is growing. The pace with which it is produced and shared is accelerating. Content shared on Twitter includes, inter alia, eye-witness reports, photographs, audio and video, inconvenient truths, perspectives of minority groups, individuals, collectives and institutions at risk, alternative versions of contemporary history, situational awareness, locations of key events, vital testimony, conversations that aren’t replicated elsewhere, updates from the field, engagements with leading political and civil society voices as well as voices from the diaspora, and the recording of government propaganda.

In sum, this is information vital to a country’s historical narratives.

And this is all information that is being lost. “History, As Recorded on Twitter, Is Vanishing From The Web, Say Computer Scientists”, a recent article published on MIT’s Technology Review website, notes:

…the older the social media, the more likely its content was to be missing. In fact, they found an almost linear relationship between time and the percentage lost.

The numbers are startling… 11 per cent of the social media content had disappeared within a year and 27 per cent within 2 years…. the world loses 0.02 per cent of its culturally significant social media material every day.

That’s a sobering thought. Social media plays an important role in the spread of information around the world. Of course, opinions differ over the importance of its role in the Arab Spring. But few would deny that this form of communication defines our time.

And now it’s vanishing.

Twitter seems to have recognised the danger. The company has promised before the end of 2012 the ability to download all tweets, though how, in what format and whether replies and retweets are also included is still unknown.

In an article I wrote for Groundviews today (Comprehensive archive of tweets: 8 May to 1 October 2012), I used two platforms – TweetBook and TweetDownload – to archive the thousands of tweets @groundviews hosts. It’s an imperfect solution to a complex challenge.

The complexity of archiving human rights-related and civic media-generated social media content is an issue I’ve grappled with for a number of years. In March 2012, I was part of Archiving Human Rights for Advocacy, Justice and Memory, an online discussion curated by New Tactics in Human Rights that featured a number of interesting perspectives in this regard.

Recently, I interviewed well-known film-maker and photographer Anoma Rajakaruna on why archiving Sri Lanka’s audio and video productions, as cultural artefacts produced during and after war, is fundamentally important for a richer historical record of socio-political changes in Sri Lanka.

Four years ago, in Managing the catastrophic loss of information and knowledge, I asked the following questions, answers to which are hard to come by even today.

  1. What will biographers, researchers, social scientists and others do in the future when much, if not all, of the communications of their subjects may be rendered inaccessible by a single data centre outage, or lost to even the subjects themselves through a failure to back up data?
  2. With bigger hard drives comes the risk of more information loss. This century may create more information than all others before it, but it will also lose more information. What are the technologies that can be used for cheap, reliable, easy to use local data storage that can create mini data centres for communities of users unable to afford comprehensive backup solutions of their own? Is there a case here for e-gov initiatives that actually promote backup solutions amongst citizens? (I know of none to date.)
  3. What are the data standards that can be used to store information produced today by the machines that will replace PCs and mobile phones 25 years hence?
  4. Social networking sites are information black-holes as well as rich in personal information. If a site goes out of business, so does the information. How can we prevent this?
  5. For organisations such as the UN and even large NGOs (as well as corporations) information management in an age where there is more produced than can be stored is a nightmare. The organisations I work with can’t even find what they are producing today, leave aside searching for and accessing information produced a few years ago. How can institutional memory survive in a context of inevitable information loss?
  6. How does one harvest knowledge from all this information, much of it useless for the purposes and processes we want to be informed by?
  7. Old news is good news. What’s news to me is not just the latest RSS feed from BBC, but also resources that are pertinent to my life and work that may have been produced years ago. Buried in intranets or behind subscription walls or deep in social networks and websites, what are the technologies that will help locate and deliver these in a timely, easily and intuitively configurable and sustained manner across a range of media and devices?
