“Forever” is not a word we usually associate with things electronic. But MIT’s DSpace was conceived to ensure that ideas, at least — even if expressed electronically — can now truly last forever.

DSpace is an electronic archive that captures, stores, indexes, preserves and distributes the intellectual output of MIT’s faculty and researchers. Tens of thousands of pieces of digital scholarly content are produced at MIT each year for possible addition to the ground-breaking repository system. These include technical reports, conference papers, lecture notes, data sets, theses, photos, videos, software, scanned historical collections, and more. DSpace was designed jointly by the MIT Libraries and Hewlett-Packard research staff.

“A digital library or archive system is particularly important at MIT, where great faculty and great researchers are producing massive amounts of digital data,” says MacKenzie Smith, associate director for technology at the MIT Libraries and project director for DSpace.

The software running DSpace might also one day ensure your great-grandchildren’s access to the digital family photos you’re taking right now.

“Everyone these days has digital data they really rely on and would like to make part of their personal archives,” notes Smith. “But the tools, the technologies, and the understanding just haven’t been there. What we’re facing now in the research community is about to hit the individual in a huge way.”

Beyond the Box

The digital photo industry is a perfect example of why we’re worrying about information loss. Nearly every digital camera creates images in a unique format. If the digital photo software your grandson is using 10 years from now can’t recognize the family photos you’re taking with your digital camera today — which is likely — those images will be lost.

Similarly, imagine you’re a faculty member or researcher at MIT. You’re working in a digital environment. Gone are the laboratory notebooks of old, where research results were painstakingly entered by hand. Instead, everything is done by computer, and in many cases those computers — like digital cameras — are running on proprietary technologies. Further, nobody is even thinking about how your data is going to be migrated from one software generation to the next.

The result is that important documents, research results and other data are disappearing. It can happen as easily as a professor retiring or moving to another institution (and taking his laptop with him), or simply upgrading to a new computer.

“Anybody who cares about history, or culture, or the scholarly record should care about this problem,” Smith says.

Fortunately, the DSpace repository is now at work and rapidly growing. It’s expected that, one day, the archive will exceed a petabyte — one million gigabytes — in storage capacity.

Flexible Software

The technology created to drive DSpace has also emerged as a key contribution. This happened because the software code is both flexible and freely available to others for downloading — and for tinkering.

“You simply say to anyone who wants to use it, ‘You can have this for free, but if you improve on it, you should contribute those improvements so that we can all share in them. That’s the deal!’” explains Ann Wolpert, Libraries director and the catalyst for the development of DSpace. The “open source” strategy encourages collaboration among users and accelerates development of the software.

MacKenzie Smith has big plans for what that collaboration could mean.

“DSpace as a digital library of MIT’s research materials is wonderful in and of itself,” she says. “But what if every research university did the same thing? Then we’d have all the research material of the world available online. We could start to build new services that just aren’t possible now — virtual collections within a particular discipline, teaching collections. The possibilities are endless.”

DSpace has already been downloaded more than 5,000 times by a mix of academic, government, cultural heritage and business entities around the world. Fifteen research organizations already run it in full production, and more than 100 are implementing the system for their own institutions. Everyone using it has customized it for their own purposes.

So far it has been used mainly to preserve scholarly material. But Ann Wolpert says that industry is becoming increasingly interested in how the DSpace technology might be used in commercial services for consumers. This signals hope for preserving your personal digital photo collection as effectively as the family shoe box has done for generations.

“Let’s go full circle, now, to your personal computer, and your personal photographs, and the other things you’re building in your own digital space,” Wolpert says. “Our hope is that the work we’re doing here today will not only enable MIT to do the right thing by the output of its faculty, but that it might one day also be helpful to you.”