Today, the Guardian warns:
“Too many of us suffer from a condition that is going to leave our grandchildren bereft,” Brindley states. “I call it personal digital disorder. Think of those thousands of digital photographs that lie hidden on our computers. Few store them, so those who come after us will not be able to look at them. It’s tragic.”
She believes similar gaps could appear in the national memory, pointing out that, contrary to popular assumption, internet companies such as Google are not collecting and archiving material of this type. It is left instead to the libraries and archives which have been gathering books, periodicals, newspapers and recordings for centuries. With an interim report from communications minister Lord Carter on the future of digital Britain imminent, Brindley makes the case for the British Library as the repository that will ensure emails and websites are preserved as reliably as manuscripts and books.
I don’t have a lot of sympathy for this imaginary plight of future historians, in spite of being a librarian. And it’s not because I don’t see the value in content that’s on the web. There are two sides of the question that I take issue with.
First: “everything should be archived”. This is simply impossible, and it actually misunderstands what the internet is. If you understand it as a vast publication domain, where things are published every day that just don’t happen to be books, then this desire to archive it all makes sense. But is the stuff of the internet really published? Well, what does “published” really mean?
To be honest, I think the term has no meaning anymore. At one point, “published” meant that a whole team of people thought what you wrote was worth producing, selling, and storing. It comes with a sense of authority, a kind of title. It’s a way we divide the masses into those we want to listen to and those we don’t, in many different arenas. It connotes a sense of value (to someone, at least). Many people object to the idea that there’s value of any kind on the wild open internet, because just anyone can “publish”. I learned in my reference class at library school that one should always check the author of a book to see who they are and what institution they’re associated with before taking them seriously; if you fall outside our institutions, why, surely you have nothing of value to say, and you’re probably lying! Wikipedia: case in point. We have our ways to determine whether we ought to consider what you’re saying not based on the content, but on who and what you are. Apparently this protects us from ever having to have critical reading skills. We are afraid of being duped, so we cling to our social structures.
So many people just turn that “publish” definition on its head and say everything on the internet is “published”, everyone has a pulpit, everyone can be heard in the same way. I object to this as well. Turning an ineffective idea upside down doesn’t get us any closer to a useful definition of a term, or a practice.
Currently, this is how I define “publication”: blocks of text that are published by a company have been vetted and determined to be sellable to whatever audience the company serves. This holds for fiction, for academic work, etc.
Is content on the web “published”? What does that even mean? I think we are starting to shift that meaning toward “available”. If I write something and post it online, it’s available to anyone who wants to see it, but it’s not “published” in any traditional sense. If I take it down, does it become unpublished? Can I only unpublish if I get to it before it gets cached by anyone’s browser, before Google gets to it? What if I post something online, but no search engine ever finds it and no one ever visits the page? Was it published then? If I put something online but lock it up and let no one see it, is it published?
I think we need a more sophisticated conception of publication to fully incorporate the way we use and interact with the web. I don’t think the traditional notion is helpful, and I think it presumes a kind of static life for web content that just isn’t there. Web content is read/write. It’s editable, it’s alterable. Rather than dislike that about the content, we should encourage and celebrate that. That’s what’s great about it.
There has always been ephemera. Most of it has been lost. Is that sad? I suppose so. As a (former) historian-in-training, I would have loved to get my hands on the ephemera of early modern women’s lives. I would love to know more about them, more about what drove them, what their lives were like. But I don’t feel like I’m owed that information. Ephemera is what fills our lives; when that ephemera becomes digital, we need to come to terms with our own privacy. Just because you can record and store things doesn’t mean you should.
And this comes to the heart of the matter, the second element of the desire to archive everything that irks me. The common statement is that we are producing more information now than ever before, and this information needs archiving. The reality is this: we are not producing “more information” per capita. We simply are not; I refuse to believe that. Medieval people swam in seas of information much as we do; it’s just that the vast majority of it was oral, or otherwise unstorable (for them). These are people who believed that reading itself was a group event; they couldn’t read without speaking aloud. (Don’t be so shy if you move your lips while reading; it’s a noble tradition!) Reading and listening were a pair. In our history we just stored more of that information in our brains and less of it in portable media. If you think surviving in a medieval village required no information, consider how many things you’d need to know how to do, how many separate “trades” a medieval woman would need to be an expert in just to feed, clothe, and sustain her family. Did she have “less” information? She certainly knew her neighbours better. She knew the details of other people’s lives, from start to finish. She knew her Bible without ever having looked at one. Her Wikipedia was inside her own head.
Today we have stopped using our brains for storage and started using them for processing power instead. Not better or worse, just different. We use media to store our knowledge and information rather than remembering it. So of course there appears to be more information: we keep dumping it outside ourselves, and everyone’s doing it.
Not to say that a complete archive of everyone’s ephemera, every thought, detail, bit of reference material ever produced by a person throughout their life wouldn’t make interesting history. I think it would, but that’s not what we think libraries are really for. We do generally respect a certain level of privacy. It would be a neat project for someone out there to decide to archive absolutely everything about themselves for a year of their lives and submit that to an archive. Temperature, diet, thoughts, recordings of conversations, television programs watched, books read, everything. If you want to harvest everything on the web, then you might as well use all those security cameras out there to literally record everything that goes on, forever, and store that in the library for future historians. Set up microphones on the street corners, in homes, in classrooms, submit recordings to the library. A complete record of food bought and consumed. Everything. That’s not what we consider “published”, no matter how public any of it is. We draw the line. Somehow if it’s in writing it’s fair game.
But that’s not what people are generally talking about when they talk about “archiving information”. I know this is true because the article ends with this:
“On the other hand, we’re producing much more information these days than we used to, and not all of it is necessary. Do we want to keep the Twitter account of Stephen Fry or some of the marginalia around the edges of the Sydney Olympics? I don’t think we necessarily do.”
There’s “good” information and then this other, random ephemera. I will bet you that Stephen Fry’s Twitter feed will be of more interest to these future historians than a record of the official Sydney Olympics webpage. And that’s the other side of this argument.
This isn’t about preserving information for those sacred future historians. This is about making sure the future sees us the way we want to be seen; not mired in debates about Survivor, or writing stacks and stacks of Harry Potter slash fanfiction, or coming up with captions for LOLcats. Not Twitter, because that is too silly, but serious websites, like the White House’s. We’re trying to shape the way the future sees us, and we want to be seen in a particular light.
I object to that process.