How much information is too much information?

When it comes to Google, the size of the web and the size of their index are apparently very different.

What’s interesting to recognise here is that Google cannot afford to index ALL of the web. Coupled with the fact that we are irrevocably losing information that defines us as a larger humanity, or as identity groups and individuals, this raises the question of whether all this information has contributed to an equal growth in knowledge.

I think not.

I’ve raised a number of questions that trouble me greatly as someone deeply interested in saving the knowledge generated, used, abused and ignored in a peace process. Terabytes of information hugely pertinent to researchers, historians and scholars of a process as multi-faceted and complex as peacebuilding are often to be found in disparate proprietary systems with limited access, in proprietary formats whose encryption keys reside with people who are themselves at risk of being killed, in badly managed archives and on perishable media, frequently without backups, to name just a few of the problems.

I was struck by the fact that what people consider the web is actually what Google defines as the web:

But it’s also very expensive to index sites. And the fact that Google indexes many news sites, blogs and other rapidly changing web sites every 15 minutes makes all that indexing even more expensive. So they make value judgment on what to actually index and what not to. And most of the web is left out.

Emphasis mine. 

I find that last bit positively frightening.
