Digital Preservation
How to Navigate this Course
Great!
What is Digital Preservation?
Why?
Why Does This Matter to Me?
What are the challenges in digital preservation?
Media Failure
The Lifespan of Storage Media
- CDs/DVDs: 2-20 years(!) - in good conditions
- Floppy disks: obsolete - most computers no longer have readers
- USB flash drives: about 10,000 write cycles
- Hard Drives: 3 years, then 12% annual failure rate
- External Hard Drives: Less time due to wear and tear
Physical Loss
An example of physical loss by natural disaster.
- Poor storage environment
- Overuse
- Infrastructure failure / Inadequate maintenance
- Hardware failure or malfunction
- Natural disaster
- Human error
- Sabotage
File Format Obsolescence
- Proprietary software upgrades lead to a new format version, and the old format is no longer fully supported
- The software supporting a file format is no longer widely available
- The market is dominated by a new format (e.g., Microsoft Word's .doc/.docx has made other word processing formats obsolete)
- The format simply fails on the market, due to any number of reasons (e.g., unfixed bugs, poor usability)
Media Format Obsolescence
What YOU Can Do
Migrate Files from Old Media
Choose the Right File Format
- Is it proprietary or open? Proprietary formats often require software purchases and tend to be updated less often to fix bugs. Open formats are generally supported by freely available software and are updated by a community of users.
- Is the format widely used?
- Are older versions of the format still readable? This suggests that a format has backward compatibility.
- Does it support your needs?
Make Your Digital Files Findable
- Keep it short.
- Add structural elements that will help you locate the file
- Use only underscores (_) and hyphens (-) as separators; avoid spaces and other special characters
- Order information by importance
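The naming tips above can be sketched in a few lines of Python. This is only an illustration: the function name `build_filename`, the choice of elements (project, date, description, version), and their order are assumptions, not a scheme prescribed by the workshop.

```python
import re
from datetime import date


def build_filename(project, description, created, version, extension):
    """Assemble a short filename from structural elements, most important first.

    The element order used here is a hypothetical convention; adapt it to
    whatever matters most when you search for files.
    """

    def clean(text):
        # Collapse anything that is not a letter, digit, or hyphen into one hyphen
        text = re.sub(r"[^A-Za-z0-9-]+", "-", text.strip())
        return text.strip("-").lower()

    parts = [
        clean(project),                # what the file belongs to
        created.strftime("%Y%m%d"),    # a sortable date
        clean(description),            # what the file contains
        f"v{version:02d}",             # which revision it is
    ]
    # Underscores separate elements; hyphens separate words inside an element
    return "_".join(parts) + "." + extension.lower().lstrip(".")


print(build_filename("Garden Survey", "site plan (draft)", date(2014, 3, 12), 2, "PDF"))
# prints: garden-survey_20140312_site-plan-draft_v02.pdf
```

Writing the date as YYYYMMDD means an alphabetical sort is also a chronological one.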
Archive Your Email
Email messages often contain pertinent information about departmental and institutional work. They can contain project- or collection-related information that needs to be preserved.
- Identify messages that have long-term value
- Export selected messages: save in a recommended format such as PDF
- Organize messages for findability: save with related project or collection if possible
Ensure the Security of Your Information
- Ensure that no single user has access to all redundant copies of important digital files
- Immediately remove users after staff turnover
- Maintain a comprehensive list of users and their permissions
Managing Digital Assets
What are Digital Assets?
Digital Assets vs. Everyday Files
Just as you title your files to ensure findability, it is important to also separate the most important files, or digital assets, from the everyday files with no enduring value. This is because digital assets require different digital preservation steps and need to be easily identified. It is also important to separate them in order to prevent any accidental deletion of important files.
Create a Comprehensive Inventory
In order to keep track of your digital assets, it is important to create a comprehensive inventory of the files, their formats, and their locations.
This can be done manually, but dedicated tools, such as DROID, will carry out the inventory much more quickly. The inventory should be carried out yearly (or as needed within the institution/department).
A file-level inventory will assist in carrying out other activities related to digital asset preservation.
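For a sense of what a file-level inventory contains, here is a minimal Python sketch that walks a folder and records each file's path, format, and size. It is not a replacement for DROID: it guesses the format from the file extension only, and the function names and CSV columns are assumptions made for this example.

```python
import csv
import os
from pathlib import Path


def build_inventory(root):
    """Return one record per file found under root (including subfolders)."""
    rows = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(folder) / name
            rows.append({
                "path": str(path.relative_to(root)),
                # Extension only: a rough stand-in for real format identification
                "format": path.suffix.lower().lstrip(".") or "unknown",
                "size_bytes": path.stat().st_size,
            })
    return rows


def write_inventory(rows, csv_path):
    """Save the inventory as a CSV file that opens in any spreadsheet program."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["path", "format", "size_bytes"])
        writer.writeheader()
        writer.writerows(rows)
```

A typical run would call `build_inventory` on the folder holding your digital assets and pass the result to `write_inventory`, keeping each year's CSV so that inventories can be compared over time.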
Implement File Format Restrictions
To make your digital assets easily manageable, it is important to maintain them in only a few file formats. By implementing file format restrictions, you can also ensure that your collections follow the best practices suggested for each media type. For more information, see JISC's Digital Media guides.
To determine the best file formats to use, you can also refer to the criteria explained in the previous section.
An example of file format restrictions, as defined in a simple TXT document, called a README file, saved within the project folder.
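Once format restrictions are written down, a short script can flag files that fall outside them. The sketch below assumes a hypothetical list of allowed extensions; substitute the formats named in your own README file.

```python
from pathlib import Path

# Hypothetical restrictions for illustration; use the list from your own README
ALLOWED_EXTENSIONS = {".tif", ".pdf", ".txt", ".wav"}


def find_nonconforming(root, allowed=ALLOWED_EXTENSIONS):
    """List files under root whose extension is not in the allowed set."""
    return sorted(
        str(path.relative_to(root))
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() not in allowed
    )
```

Files reported by `find_nonconforming` are candidates for migration into one of the selected formats. Note that the check looks only at the extension, so a mislabeled file would not be caught.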
Migrate Files into Selected Formats
Maintain Redundant Copies
Check File Fixity
To ensure that files remain unchanged over time, a checksum should be generated on ingest (addition to the repository) of each digital asset. A checksum is a short, fixed-length value computed from the contents of a file; recomputing it later and comparing the two values reveals any data loss or change.
Potential fixity tools can be found here:
<http://digitalpowrr.niu.edu/tool-grid/>
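To show what a fixity check involves, here is a minimal sketch using Python's standard `hashlib` module. The choice of SHA-256 and the function names are assumptions for this example; the tools listed above may use other algorithms.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Compute a checksum by reading the file in chunks, so large files fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded on ingest."""
    return file_checksum(path, algorithm) == expected.lower()
```

On ingest, store the value returned by `file_checksum` alongside the asset. At each later check, `verify_fixity` returning `False` means the file has changed and should be restored from a redundant copy.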
Metadata
To be usable and useful, a digital asset should include as much descriptive information, or metadata, as possible. This includes the checksum described in the previous slide, as well as information like:
- original filename, location, and other administrative metadata;
- technical metadata describing the file structure; and
- copyright information.
Some administrative metadata can be automatically generated using settings on the original device. Pertinent metadata should be saved redundantly to a project log.
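One way to save metadata redundantly is to append a record for each asset to a plain-text project log. The sketch below is an illustration only: the field names, the one-JSON-record-per-line log layout, and the use of SHA-256 are all assumptions, and the only technical metadata it captures is the file size.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def describe_asset(path, copyright_statement):
    """Collect basic administrative, technical, and rights metadata for one file."""
    path = Path(path)
    data = path.read_bytes()  # reads the whole file; fine for a sketch, slow for huge files
    return {
        "original_filename": path.name,
        "original_location": str(path.parent),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "copyright": copyright_statement,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def append_to_project_log(record, log_path):
    """Add the record as one line of JSON at the end of the project log."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

Because the log is opened in append mode, earlier entries are never overwritten, which keeps a running history of what was recorded and when.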
Document Your Storage Plan
An example of a retention schedule.
Retention schedules can also include more detail, depending on the collection and its potential users.
National Digital Stewardship Alliance
Levels of Preservation
http://www.digitalpreservation.gov/ndsa/activities/levels.html
The NDSA Levels of Preservation
Self Assessment Example
How Do You Measure Up?
In Sum...
Image Credit
All images are licensed under Creative Commons.
Workshop on Digital Preservation
By Heidi Dowding
Slides adapted from a March 2014 workshop at Dumbarton Oaks.