Vaultpress and content snapshot of Groundviews

As of last week, Groundviews is backed up 24/7 using Vaultpress. In part, this was prompted by the recent DDoS attack. Not for the first time, this attack demonstrated that even the most robust web companies are subject to network outages, and that these outages can severely, indeed life-threateningly, impact mission-critical applications that use their services. While I’ve written in the past about risk mitigation strategies for processes that are pegged to cloud-based information platforms and services, Groundviews is not on one. It’s currently on Dreamhost, which has over the years become progressively more unreliable as a host, and I am contemplating a move to Slicehost for greater reliability and uptime. Vaultpress, recommended to me by a friend at InfoShare, looks extremely interesting and is just the kind of tool that we needed.

The Basic plan at $15 a month was all we could afford (actually, it comes out of my pocket), but for self-hosted WordPress sites it offers a comprehensive backup of the entire WordPress installation, including:

  • Realtime Core Table Backup
  • Non Core Table Backups
  • Unlimited Snapshots
  • Downloadable Archives

As interesting as the backup service, if not more so, are the site statistics that Vaultpress provides. The backup itself, once installed via a plugin, requires nothing more from the end user (a bit like Apple’s Time Machine). I often tell people that before Nigel Nugawela joined as co-editor of the site in late 2009, I had read every single comment and article that went up on the site since its inception. To this day, since most contributors still send their submissions directly to me, I end up reading quite a bit of what’s on the site before it goes live. Though Nigel has now taken over the greater bulk of comment moderation, I still do a fair share of it.

What this added up to, I thought, was around five million words that I had read and approved since the site’s launch.

Vaultpress stats for Groundviews today

What Vaultpress gives is a breakdown of this figure. As of today, it tells me the following:

  • Most productive day: Friday, 21% of posts published on a Friday
  • Most productive time: 6am, 28% of posts published in the 6 o’clock hour
  • Most popular day: Friday, 18% of comments received on a Friday
  • Most popular time: 10am, 6% of comments received in the 10 o’clock hour

The most productive time is a bit of a misnomer, since what it reflects is the time at which Nigel and I schedule most articles to appear on the site. The rest of the statistics are invaluable in guiding when to publish content so that it reaches the largest number of readers.

Groundviews content snapshot, 20 March 2011

What Vaultpress also does is give an overview of the content it has backed up. Groundviews posts are around 1,500 words at a conservative average. This does not include the long-form journalism section we introduced this year, or the significantly longer contributions the End of War Special Edition featured. 1,788 posts at 1,500 words apiece comes to a total of at least 2.6 million words. A comment on Groundviews is around 150 words. Again, this is a conservative figure. A number of commentators who are featured regularly send in thoughts that are often longer than the original article. That said, 26,608 comments at 150 words each comes to a total just shy of 4 million words.

That’s a combined total of at least 6.6 million words on the site since its inception 5 years ago.
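For anyone who wants to check or rerun the arithmetic above, here is a minimal sketch in Python. The post and comment counts are the ones quoted from Vaultpress; the words-per-item figures are my own conservative averages rather than measured values, and the function name is purely illustrative.

```python
# Counts quoted in the post; the per-item word counts are conservative
# guesses by the author, not values measured from the database.
POSTS = 1_788
COMMENTS = 26_608
WORDS_PER_POST = 1_500
WORDS_PER_COMMENT = 150


def estimate_words(posts, comments, words_per_post, words_per_comment):
    """Return (post_words, comment_words, total_words) as integers."""
    post_words = posts * words_per_post
    comment_words = comments * words_per_comment
    return post_words, comment_words, post_words + comment_words


post_words, comment_words, total_words = estimate_words(
    POSTS, COMMENTS, WORDS_PER_POST, WORDS_PER_COMMENT
)
print(f"Posts:    {post_words:,} words")     # 2,682,000
print(f"Comments: {comment_words:,} words")  # 3,991,200
print(f"Total:    {total_words:,} words")    # 6,673,200
```

Changing the two averages is the quickest way to see how sensitive the total is to those guesses.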

It goes without saying that even if the site were to shut down today, this rich corpus of information and exchanges alone is a gold mine for anyone, researchers in particular, looking for exchanges, statements and examples of content that were either first published on Groundviews or not published anywhere else. Note that this does not take into account the podcasts, the videos and the photos on the site, which often tell their own story, or the site’s Facebook and Twitter updates. And as with the Special Edition on the End of the War, both the original content and commentary are of a timbre not found elsewhere online or in print.

From the blog alone, this is also a phenomenally rich dataset for rigorous semantic analysis, to see for example how conversations and content on the site, when juxtaposed against mainstream media as well as political developments, contested and connected with the status quo, particularly during the height of war and after it. Any takers? For example, I would love to see Deb Roy’s incredible machine-aided language analysis routines applied to Groundviews, which, even retroactively, could provide meaningful insights into the vexed interplay of media, citizenship and politics.
