An introduction, for librarians, to long term digital preservation of the digital academic research record by academic libraries


The advent of the internet and the invention of the hypertext system have ushered in an information revolution, whereby anything that can be digitised can be reproduced and distributed at almost zero cost. In some cases, however, the information revolution has been subverted and information has become an expensive and artificially scarce commodity. The most notable case of this subversion is the publication of academic research articles. The purpose of this paper is to explain how that subversion occurred and then to suggest long-term, sustainable remedies.



In March 1989 Tim Berners-Lee circulated a paper at CERN proposing the world wide web, using the inter-networking infrastructure developed by Vinton Cerf and Robert Kahn in the 1970s and published as RFC 675. In January 1997 the HTTP 1.1 protocol was published as RFC 2068.

The original purpose of the world wide web (WWW) was to share and store information related to academic research.


In 1994 Stevan Harnad presented a paper entitled “PUBLICLY RETRIEVABLE FTP ARCHIVES FOR ESOTERIC SCIENCE AND SCHOLARSHIP: A SUBVERSIVE PROPOSAL”, in which he proposed that academic authors archive digital articles on anonymous public FTP servers.

However, this did not happen, because academic societies chose to publish via commercial publishing houses. Initially the commercial publishing houses provided cost-effective publishing services, but over the years their prices rose faster than the annual consumer price index. This rapid cost increase has put enormous pressure on academic library budgets and has led to the “serials crisis”.


In February 2002 the Budapest open access declaration was published. The Bethesda open access statement followed in 2003 (drafted in April and released in June), and in October 2003 the Berlin open access declaration was published.

Open access is seen as the remedy to the “serials crisis” and as a way to provide public access to publicly financed research. In addition, funders, both public and private, are beginning to mandate open access to the research they fund.


It is proposed that academic libraries provide platforms for storing and preserving the digital outputs of the academic research process. The platforms developed and maintained should be based on open digital technologies to ensure the best possible chance of the technology surviving for the benefit of future researchers. Digital preservation is a long term strategy that has two essential components:

  1. Preserve and maintain a digital container for the digital research objects.
  2. Preserve and maintain the digital research objects themselves.

If it is accepted that academic libraries should preserve the digital academic record, then the next question is how to implement such a service and ensure that it is available now and in the future.


The platform for storing the digital research objects is normally called an institutional repository. A digital preservation plan should therefore exist for the repository and the digital objects in it. This plan should ensure that there is ample capacity for the long-term operation and maintenance of the repository.

The capacity for the long term preservation of the repository and the digital objects contained therein can be facilitated by two distinct groups, namely:

  1. An operations team.
  2. A systems team.

The bare minimum staffing for a digital preservation group, together with the bare minimum hardware, is detailed below.

Operations Team


The operations team consists of the following persons:

  • Director
  • Manager
  • Repository Librarian and Journal Librarian

Systems Team


The systems team consists of the following persons:

  • Manager
  • Java Web App Developer
  • Ubuntu Linux System Administrator

For further details of capacity building and planning, please go to:

Proposed Business Model

From the team definitions above it is possible to determine a provisional business model. The only missing component is the cost of hardware which will be dealt with first.

Hardware Costs

These hardware costs can be amortised over 4 years, which is the normal warranty period for hardware.

Item                  Qty   Unit cost      Total
Production Server     1     R250,000.00    R250,000.00
Backup Server         2     R150,000.00    R300,000.00
Internet Connection   n/a (usually for the central IT department to provide)


Personnel Costs


Role                              Qty   Unit cost      Total
Operations and Systems Managers   2     R450,000.00    R900,000.00
Operations Librarians             2     R350,000.00    R700,000.00
Systems Technicians               2     R350,000.00    R700,000.00


Therefore annual costs are as follows:

Hardware => (R250,000.00 + R300,000.00) / 4 years = R137,500.00 annually

Personnel => R900,000.00 + R700,000.00 + R700,000.00 = R2,300,000.00 annually

Total Cost => R137,500.00 + R2,300,000.00 = R2,437,500.00 annually
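
The arithmetic can be checked with a short script. This is only a minimal sketch using the hardware and personnel figures listed in this section; the variable and function names are illustrative assumptions, and straight-line amortisation over 4 years is assumed, as stated under Hardware Costs.

```python
from decimal import Decimal

# Amortisation period stated in the text (assumed straight-line).
AMORTISATION_YEARS = 4

# (item, quantity, unit cost in rand), taken from the figures in this section.
HARDWARE = [
    ("Production server", 1, Decimal("250000")),
    ("Backup server", 2, Decimal("150000")),
]
PERSONNEL = [
    ("Operations and systems managers", 2, Decimal("450000")),
    ("Operations librarians", 2, Decimal("350000")),
    ("Systems technicians", 2, Decimal("350000")),
]


def line_total(items):
    """Sum quantity * unit cost over a list of (name, qty, unit) rows."""
    return sum(qty * unit for _, qty, unit in items)


def annual_costs(hardware, personnel, years):
    """Return (hardware per year, personnel per year, total per year)."""
    hardware_annual = line_total(hardware) / years
    personnel_annual = line_total(personnel)
    return hardware_annual, personnel_annual, hardware_annual + personnel_annual


hw, staff, total = annual_costs(HARDWARE, PERSONNEL, AMORTISATION_YEARS)
print(f"Hardware:  R{hw:,.2f} annually")
print(f"Personnel: R{staff:,.2f} annually")
print(f"Total:     R{total:,.2f} annually")
```

Keeping the quantities and unit costs in one place makes it easy for a library to substitute its own salary scales or hardware quotes and see the effect on the annual total.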


The academic publishing problem was defined and a comprehensive remedy was proposed. The following appendix has more detail regarding the open digital technologies proposed. Whether the remedy is adopted is an open question for each academic library to ponder, if it wants to move forward constructively and become a relevant role player in digital research information systems at academic research institutions.


For long term digital preservation, the use of open digital technologies is critical, because we have no reliable way of predicting what technologies will be used by future researchers. The best bet to ensure availability in the future is to use open technologies. Now that we have established the need for open digital technologies, the next step is to identify the open digital technologies available today.

File formats (bitstreams)

The first consideration should be the file formats used, which should always be uncompressed. Only file formats with open, royalty-free and patent-free published format standards and metadata schemas should be used. Examples are provided below:


The following are the recommended open document formats:


The following are the recommended open image formats:


The following are the recommended open audio formats:


The following are the recommended open video formats and containers:


The following are the recommended open database formats:

Systems Software

The next consideration regarding open digital technologies is the software used to build and maintain the repository because the repository itself must also be preserved for future use. A digital repository is built using server hardware and a server operating system on top of which is installed the actual repository software. For the server operating system and repository software it is recommended that open source software be used.

Open Source Server Operating System

The recommended open source server operating system software is Ubuntu LTS. For more details about the selection of Ubuntu as the server operating system, please go to:

Open Source Repository Software

The recommended open source repository software is DSpace. For a detailed listing of available repository software products, please go to:


The red herring that is “Article Processing Charges” or APCs in open access scholarly publishing.

Lately much discussion has revolved around APCs and hybrid journals. The concern is understandable but the course of the discussion is not. It seems that common sense regarding business models has flown out the window into cuckoo land. I can understand that the commercial publishers want to create a distraction regarding business models so that they can continue to reap glorious profits.

However, I feel I must say something.

But first… this article is an expression of my opinion and is not representative of any other organisation or person.

Ok. Now that's out of the way, what is it I want to say?

In a nutshell, the APC thing is a red herring to distract us from the core purposes of open access. Please let me explain below.

If APCs are about the economic sustainability of a particular publisher, then what happens when academics no longer publish articles with this publisher?
Or what happens when the academic society ceases to exist?



The important things in life!!

A professor stood before his Philosophy 101 class and had some items in front of him. When the class began, wordlessly, he picked up a very large and empty mayonnaise jar and proceeded to fill it with golf balls. He then asked the students if the jar was full. They agreed that it was.

So the professor then picked up a box of pebbles and poured them into the jar. He shook the jar lightly. The pebbles, of course, rolled into the open areas between the golf balls. He then asked the students again if the jar was full. They agreed it was.

The professor picked up a box of sand and poured it into the jar. Of course, the sand filled up everything else. He then asked once more if the jar was full. The students responded with a unanimous – – yes.

The professor then produced two cans of beer from under the table and proceeded to pour the entire contents into the jar effectively filling the empty space between the sand. The students laughed.

"Now," said the professor, as the laughter subsided, "I want you to recognize that this jar represents your life. The golf balls are the important things – – your family, your spouse, your health, your children, your friends, your favorite passions – – things that if everything else was lost and only they remained, your life would still be full."

"The pebbles are the other things that matter like your job, your house, your car."

"The sand is everything else – – the small stuff."

"If you put the sand into the jar first," he continued, "there is no room for the pebbles or the golf balls. The same goes for your life. If you spend all your time and energy on the small stuff, you will never have room for the things that are important to you. Pay attention to the things that are critical to your happiness. Play with your children and grandchildren. Take time to get medical checkups. Take your partner out dancing. Take riding lessons. There will always be time to go to work, clean the house, give a dinner party and fix the disposal."

"Take care of the golf balls first – – the things that really matter. Set your priorities. The rest is just sand."

One of the students raised her hand and inquired what the beer represented. The professor smiled. "I'm glad you asked. It just goes to show you that no matter how full your life may seem, there's always room for a couple of beers."


The perfect relationship – a modern day myth?

Miss K ‘the Mackrill’

he shot himself soon afterwards.

“Why do we find the darkness so threatening? What is so inviting about the light that makes us shun the dark? If we all turned and faced it, I’m convinced that we would be so relieved to finally discover that the dark is not as threatening as it appears in our periphery. Rediscovery of self is not an improbable myth, it is real and achievable by those who seek it. Thank you.”

He saw her nearing the door. Sensing his scrutiny, she paused, yet didn’t turn, no, she had tried, she would not return to the fold. It just wasn’t fair, not to her, not to him, not to those few who were enlightened by her wisdom and love.

“Fear. It thunder-creeps up behind us like a careless thug, and we freeze, memorials to aeons of ignorance and insipidity, whispers of darkness flooding our closed minds. We retract our priceless selves, present a façade of self, and call it ‘light’, hoping in vain that we have fooled fear, no, ladies and gentlemen, we have trapped the darkness within.”

The banality of it. That’s what trounced even her altruistic nature in the end. The utterly prosaic nature of those around her. Those meant to lead, followed, those who followed, led by their acceptance of the god ‘Social Order’, preached from all pulpits, everywhere, incessantly!

“We shine this tin suit like silver, abandoning our essential identities to rot. We visit the friendly psychologist to chat about the rust, the creaky joints. This modern god then grins his shark-tooth grin and expounds upon ways to care for this tacky refuge, refers us to welders, psychiatrists, who give us the tools necessary to keep the ol’ beauty running: Prozac; lithium for the really tight fit.”

The door swung shut behind her, his voice fading as she returned to reality. It had pained him, but he had understood. Her life had become a monologue really, “better a mediocre idea that inspires millions than a brilliant idea that inspires no-one” damn those wise men.

“So, how do we free ourselves from this pseudo-self, and this fear of the darkness to which we have confined it? We seem to expect the answer to this question to be easy-to-acquire, instant, and heart-warming, like coffee from our office espresso machine. When this answer proves itself to be far more complex, time-consuming, and objectionable, we then protest and rebuff the dusty niche upon which it lay. Yet we hover, uncertain, fearful and defenceless against truth.”

Her smiles hid the eyes that bled for them. She rose, before they switched the lights back on, and left, her earnest serving passion washed out in the poseur’s glare.

What went wrong with the “enlightenment”?

I am very grateful that the “enlightenment” allowed those who are admirers of open inquiry to be able to do just that… openly enquire without fear of condemnation.

But looking at things in the world now in general, I am thinking that perhaps this form of inquiry should have been more deeply entrenched in higher education during the intervening period till today.

It seems that the “dumbing down” of the American and now the South African population is successful because the faculty of “open inquiry” was not encouraged and supported during the “enlightenment”.


I have a newfound admiration for Diderot. I just wish he had been more successful than the others.