The Persistence of Loss

At some point in my youth, I discovered one of the sad truths of history: mankind has discovered a great many things and then lost that knowledge again, repeating this tragedy over and over. In some cases we can only speculate about what may have been lost (consider the construction of the pyramids, for example); in other cases we can confirm with relative certainty that we have rediscovered the lost information or technology, but the historical periods in which mankind was without it are nevertheless depressingly long. Of course, to my naïve thinking, computers clearly provided the solution to this problem: store all of the information digitally, and never discard it; hard drives are cheap, Moore's law, blah blah, etc.

Unfortunately, that's only the beginning of the problem, not the end. For starters, we still don't have the capacity to store all of the information produced, although we are in a much better position to deal with it; storing books is no problem, for example, but storing all of the audio/visual information produced is a much more difficult task. However, even as our storage capabilities increase, another problem looms: it's no use storing information if you can't retrieve it again later, and retrieval by location is a severely limited mechanism. What we really need is retrieval by description; in other words, searching and "filtering". In the early days of the Internet, searching even this "massive" distributed network was still a manageable problem, and so various search engines sprang up, indexing all of the content on the Internet (or at least, all of the content reachable via the hyperlink graph). As the Internet continued to grow, the difficulty of indexing it grew as well, and today search engines like Google rely on truly staggering farms of storage and indexing servers in an attempt to keep up with it all.
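To make the difference between the two kinds of retrieval concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: the documents and their `/archive/...` locations are made up, and the word-level inverted index is a toy stand-in, not a description of how any real search engine works.

```python
from collections import defaultdict

# Hypothetical documents, keyed by a made-up "location".
documents = {
    "/archive/0001": "construction techniques of the pyramids",
    "/archive/0002": "holonomic brain theory and memory",
    "/archive/0003": "memory storage on cheap hard drives",
}


def build_index(docs):
    """Map each lowercase word to the set of locations containing it."""
    index = defaultdict(set)
    for location, text in docs.items():
        for word in text.lower().split():
            index[word].add(location)
    return index


def search(index, query):
    """Return the locations whose text contains every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(documents)

# Retrieval by location: only works if you already know where to look.
by_location = documents["/archive/0002"]

# Retrieval by description: you know what you want, not where it lives.
by_description = search(index, "memory")
```

Retrieval by location is a single lookup, but it is useless unless you already hold the key. Retrieval by description works without the key, at the price of building and storing an index over everything, and that price is what grows along with the collection.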

Here we meet the real problem: the search engines *don't* keep up with it all anymore. Despite the massive infrastructure being applied to the problem, Google and its competitors still only manage to index a small fraction of the Internet today. They offset this by trying to make sure they index the "important" stuff, but the net result is that we're still losing information every day. Most pages will eventually fall off the Internet if left undiscovered, but even if the information remains on the network, it's of no use to anybody who can't actually retrieve it; and to retrieve it, you first need to find it.

There are parallels to this problem in other areas, such as the information storage mechanism of the human brain: our "memory". People often have the experience of struggling to retrieve a particular piece of information from their memory; it's still there, but they have to wait until something "jogs" their memory before they can finally retrieve it. Research into the functioning of the brain is only beginning to give us an inkling of how memory storage and retrieval actually work on a neurological level, but the high-level process certainly seems to have many of the same problems that our external information storage systems have. In passing, it's interesting to note some of the scientific theory that touches on my intuitive feelings about these issues; without departing into complete mysticism, you may find it interesting to look at Holonomic brain theory, as well as the Holographic principle, and maybe even at some of the crackpots trying to unify all of this. My intuition is that something scientifically sound will emerge in this area, but we'll have to wait and see how that turns out.

That digression aside, I'm not sure where this leaves us. The closest biological model we have seems to suffer from the same problems, so that doesn't help us, and I'm not sure where else to look. Is there some research in this area that I've missed? Some potential new technology that might solve the problem? Let me know if I'm missing out on something.

