layout | title | published |
---|---|---|
default | Sustaining Media Art | true |
Feng Mengbo Long March: Restart (2008). Video game (color, sound), custom computer software, and wireless game controller. Dimensions and duration variable. The Museum of Modern Art, New York. Given anonymously, 2008. © 2015 Feng Mengbo. Installation view, Scenes for a New Heritage: Contemporary Art From the Collection, The Museum of Modern Art, New York (March 7, 2015 - April 10, 2016). Digital image © 2015 The Museum of Modern Art, New York. Photo: Thomas Griesel.
Our aim is to provide information that is useful to anyone who is caring for their own collection of video artworks in small, medium or large organizations as well as outside of an institution. Core principles are accentuated and you will find different approaches for different collection requirements highlighted throughout the text as quotes. We invite you to fill out a [survey]({{ site.url }}/sustaining-your-collection.html#Know-Your-Collection), which will act as a tool to help you outline the needs of your collection. The results will form the basis for an overall preservation system design.
- Do to digital works as you would to any artwork: identify, catalog, describe, treat, document and track
- Be prepared to be unprepared: the necessary skills evolve constantly and will be found both within and beyond your walls
- Build storage for your present collection but lay groundwork for the future
- The budget to build collection storage is nothing without the budget to sustain it
- Digital collections require active maintenance and will not survive passive storage
The first step in planning what you need is understanding what you have and how it might grow. This will form the basis for any further decisions regarding your infrastructure, staffing needs and budget. To help you with this initial assessment, we created a survey to gather all the core information.
This survey has been designed to help you to establish an overview of your digital collection. Although these pages only address the needs of digital video, the presence of other types of digital artwork or digital components will have an impact on decisions regarding your systems, for example the needs for storage.
In order to plan for the storage and management of your collection, it is useful to define the categories of collection items and associated documentation that you hold. This might include the following:
- Masters: Master material as provided by the artist, gallery, or donor. This may be a digital file, tape, film reel, disc or other.
- Preservation Masters: Clones or derivatives of the artist’s master material made by your institution or a gallery for preservation purposes. This includes master files made through tape to file transfers.
- Exhibition and Research Copies: Derivatives of the master material created either by the artist or by your institution for access, exhibition or loan.
- Ancillary Material: Documentation created or received relating to the creation process and intended display of the artwork. This can include the artist’s installation instructions and exhibition documentation.
You will then need to decide what level of preservation is needed for each category and how the various objects need to be linked to each other. Depending on your needs, you may want to keep all of this information together in one place, or you may wish to keep it in different locations or systems which are interconnected.
![](img/image for community resources.jpg)
Expert meeting of media conservators and archivists on digital repositories and user needs hosted by the Museum of Modern Art, New York. Digital image © 2014 The Museum of Modern Art, New York.

Caring for digital files is not fundamentally different from caring for art objects in other media, and the steps needed have parallels to the ones in more traditional conservation specializations.
For example, if a collection acquires a painting it will likely collect information about its creation, history, and condition, define and maintain the best conditions to store it, and ensure that handling and exhibition are appropriately managed. Museums and other institutions have become very good at ensuring that all these things happen to an agreed standard, and have highly developed teams dedicated to that end. It is just as relevant to understand what a digital file is, how it was created, its history and condition, and ensure that storage and display are appropriately managed.
If you are taking care of your own work or if you are in a small organization, relationships with external providers may be a key way of accessing those who have the necessary skills.
To enhance or acquire the skills needed to care for your collection, consider the following:
- Collaboration: Not everyone will be able to acquire _all_ of the necessary skills, so collaboration is crucial to the successful care of your digital collection. This includes working with external specialists and facilities in the required fields, for example in video migration, but also connecting across an institution, or even sharing resources among institutions with similar requirements.
- Conferences and Workshops: Attending related conferences is a good way to keep abreast of developments, and can also be used to help existing members of staff develop the skills needed. A more targeted way of developing those skills can be to attend or organize workshops about a specific subject.
- Professional Networks: Participating in professional networks and drawing on community resources can be a very helpful way of exchanging experience and connecting with colleagues who are facing similar challenges.
- Hiring New Staff: Depending on the size of your collection it may be necessary to create new roles or hire new staff to ensure the expertise is created and shared.
“We have a full time manager of the collection who oversees all aspects of the acquisition, lending and general care of the art material. In addition, we have a support staff of part-time employees who specialize in cataloguing, art handling and general care/maintenance of the collection. As we have acquired more digital material, we have hired consultants to get us up to speed and adapt our collection management software to accommodate this material and set us up with the cloud storage and workflow for backing everything up.”
No matter if you are a private collector or a large institution, significant costs are involved in collecting and maintaining digital artworks. When calculating a budget, there are two different types of costs to consider: capital costs (one-time purchases of fundamental infrastructure or tools that will in theory last several years) and ongoing or recurring costs that will become part of a regular operating budget and may increase as your collection grows.
Drawing from the results of the [survey]({{ site.url }}/sustaining-your-collection.html#Know-Your-Collection), you can start to estimate how much storage you will need and begin costing digital storage, infrastructure and support. This can range from budgeting for hard drives to developing costs for a robust repository supported by IT staff. To establish a budget you will need to have gathered the following:
- Information about your Collection
  - What formats are present in your collection?
  - How much material will you need to store?
  - What is the total size of the files, if the material is stored as digital files?
  - What is the duration of the material, if it is on tape?
  - In what timeframe are you planning on migrating your tapes to file?
  - How fast is your collection growing?
- Influencing Decisions
  - Will you keep all the copies of an artwork (i.e. artist supplied masters, archival masters, retired masters, exhibition copies, access copies)?
  - Will you keep each component stored on its own with its own metadata or will you keep all the artwork's components together?
  - What type of metadata and documentation will you keep with your components and what will be kept elsewhere?
  - Will you be working with external facilities and contractors?
  - Will you hire new staff?
- Details about the cost of any consumables
- Information about how much of someone’s hands-on time each process will take (some processes might take many hours of computer time but only need to be monitored occasionally). These processes may include:
  - Initial assessment of material
  - Cataloguing and documentation
  - Migration from analog to digital
  - Preparing your files for ingest or transfer to storage
  - Ingest or transfer to storage
- It is also important to estimate a timeframe for implementation. If you want your budget to link to planning, it can be helpful to determine how long it will take to complete the process from start to finish for a single item and extrapolate from there.
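The questions about tape duration and number of copies can feed a rough storage estimate. The sketch below is only an illustration: the function name, the data rate and the number of copies are assumptions rather than figures from this guide, and the data rate in particular depends entirely on the file format you choose for your transfers.

```python
def estimate_storage_tb(tape_hours, mbit_per_second, copies=3):
    """Rough storage needed, in terabytes, for digitized tape material.

    tape_hours      -- total duration of the tapes to migrate
    mbit_per_second -- data rate of the target file format (an assumption
                       you must supply; it depends on codec and resolution)
    copies          -- number of copies you plan to keep
    """
    seconds = tape_hours * 3600
    bytes_per_copy = seconds * mbit_per_second * 1_000_000 / 8
    return bytes_per_copy * copies / 1e12  # 1 TB = 10^12 bytes


# Hypothetical example: 100 hours of tape at 200 Mbit/s, kept in three copies.
print(f"{estimate_storage_tb(100, 200, copies=3):.1f} TB")
```

An estimate like this covers only the migrated masters. Exhibition copies, documentation and future acquisitions would need to be added on top.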
We created an Excel sheet into which you can enter your costs and calculate total amounts, illustrated by graphs over time:
Below are three examples of different budgets and the necessary considerations to develop these budgets:
For each storage option, and depending on the size of your collection, you may need to consider:
- The initial cost to set up your digital storage
- The ongoing cost of sustaining your digital storage - depending on the option you choose this may include annual maintenance fees as well as incremental expansion costs
- The staff and vendor time involved to establish and maintain the system
A budget can be set annually or in relation to specific projects. A budget needs to be monitored so that it can be adjusted if you find your assumptions are inaccurate; for example things may be taking longer than expected, or costs change.
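To see how a one-time capital cost and growing recurring costs add up over several years, a small projection can help. This is a minimal sketch under stated assumptions: the function name, the idea of a fixed yearly growth rate and all figures are hypothetical, and real budgets rarely grow this smoothly.

```python
def project_costs(capital, first_year_recurring, annual_growth, years):
    """Return the cumulative cost at the end of each year.

    capital              -- one-time set-up cost, spent in year 1
    first_year_recurring -- recurring cost in year 1 (maintenance, staff, ...)
    annual_growth        -- assumed yearly growth of recurring costs, e.g. 0.10
    years                -- number of years to project
    """
    totals = []
    running_total = capital
    recurring = first_year_recurring
    for _ in range(years):
        running_total += recurring
        totals.append(round(running_total, 2))
        recurring *= 1 + annual_growth
    return totals


# Hypothetical figures, for illustration only.
for year, total in enumerate(project_costs(5000, 1000, 0.10, 5), start=1):
    print(f"Year {year}: {total:,.2f}")
```

Re-running the projection with the actual figures at each budget review is one simple way to check whether the original assumptions still hold.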
The collection management system is typically a database containing basic information about each artwork. It enables collection managers to perform the following essential functions:
- Maintain an inventory of all collection items
- Record acquisition details, including information about the artist, any donor or purchase information and any key provenance information
- Keep a history of exhibitions or displays, including loans
- Track the location of physical and digital parts of collection items, including the ability to track versions of digital files as they get transcoded or migrated
- Record condition information and technical information about an artwork to inform its ongoing management
Depending on your context and the size of your collection, a collection management system will look very different. For a small institution or individual collector, collection management functionality could be carried out using a database or spreadsheets, templates and standardized metadata. For a larger institution, a managed database will be necessary in order to:
- provide a central information point with a consistent level of information about an entire collection regardless of medium or type
- facilitate certain core workflows (for example an acquisition process or loans process)
- provide auditing and reporting functions
- allow for access to information and the ability to edit information to be defined at a user level
- allow for multiple users to update information and keep it current
The larger the number of users and objects, and the more complex the workflows, the more necessary a specialized database, i.e. a collection management system, will become. Software to maintain an inventory could range from an Excel spreadsheet or a database like FileMaker, through open source collection management software such as CollectiveAccess or Omeka, to the type of collection management system employed by museums, for example The Museum System (TMS), Mimsy or a bespoke system developed by your institution.
It is uncommon to find one system that will facilitate both the collections management activities outlined above, and also digital preservation. Therefore it is often the case that different systems and tools are used for serving these two different core needs – for instance the use of a digital asset management system or a digital repository in addition to a collection management system. Before building or implementing any specialized systems beyond a collection management system, however, it is important to consider how this specialized system will integrate and communicate with your existing collection management system.
In a completely manual environment, you would be able to incorporate any additional information needed for the digital repository functions (for example checksum monitoring, format registry etc) and your core collection management information (for example location tracking) into one database. For larger digital collections there are significant gains to be made in terms of time and accuracy in taking advantage of tools that automate certain functions within the workflow. This often creates information about your digital artworks which you will wish to record and maintain. Where specialist (often proprietary) databases exist, integration with any repository software or digital asset management systems can be a challenging, expensive and skilled operation.
In most cases the information about an art collection will be formed of a rich cluster of records that have been compiled and edited over time. Often only a fraction of this information is held within a central database. Many institutions will also have records management systems for this associated documentation.
The challenge here is to ensure that when you pull a digital object from storage after many years, you will have the information you need to understand what the materials are, how they can be viewed properly, their purpose in relation to the artwork to which they belong, and finally the ability to validate and prove their authenticity.
There is a certain amount of information you should make sure is held with the files, so that even if other information is lost, you still know what artwork it is and that the file itself has not changed.
- Core descriptive information: Artwork title and artist's name. If you have a collection management system the descriptive information should also include an identifier to that artwork’s record. These identifiers must be persistent, meaning they are permanent and will never change.
- Fixity: Creating checksums allows you to ensure that a file has remained unchanged. For more information, please refer to the Fixity section.
- Technical Information: Each digital file has technical information embedded in its header or wrapper. If you have a large collection, consider extracting this information and storing it within your collections management system, so that you can search across your collection.
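One way to keep this minimum information with a file is a small "sidecar" text file stored next to it. The sketch below is an assumption-laden illustration, not a standard: the function name, the JSON layout and the field names are hypothetical, and it records only the file size rather than the full technical metadata embedded in the file's header or wrapper.

```python
import hashlib
import json
from pathlib import Path


def write_sidecar(file_path, title, artist, record_id):
    """Write '<file name>.json' next to the file and return the sidecar path."""
    file_path = Path(file_path)
    digest = hashlib.sha256()
    with file_path.open("rb") as handle:
        # Read in 1 MB chunks so large video files do not fill memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    record = {
        "title": title,
        "artist": artist,
        "record_id": record_id,  # persistent identifier of the artwork record
        "file_name": file_path.name,
        "size_bytes": file_path.stat().st_size,
        "sha256": digest.hexdigest(),
    }
    sidecar = file_path.with_name(file_path.name + ".json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling `write_sidecar("master.mov", "Example Title", "Example Artist", "2008.123")` (all values hypothetical) would create `master.mov.json` in the same folder, so the description and checksum travel with the file even if the database record is lost.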
There is further information that will need to be kept in order to ensure an artwork can be preserved and displayed in the future. This can be either kept with the file or in other systems, like a database or collection management system.
Collections management systems, however, are just a part of the picture of digital stewardship, as they do not generally facilitate core digital preservation activities. To meet these needs, organizations typically employ a system, or suite of systems, referred to as a digital repository. There are many functions that can be carried out by a digital repository system in order to monitor and ensure the ongoing preservation of your collection objects. Just as for collection management systems, the way these functions are implemented will depend on individual circumstances. For a small, homogeneous collection, many of these functions could be carried out manually. For a larger and more varied collection, there are significant benefits to having automated workflows and systems in place.
This is an area which has developed very rapidly in recent years and even large organizations have struggled to develop the infrastructure to keep pace with their growing digital collections. It is therefore important to understand that the term repository refers to a number of functions which need to be carried out and which are underpinned by some of the key ideas to emerge regarding digital preservation practice.
These functions can include:
- Automated workflows for the ingest of digital objects, including the extraction of metadata and packaging of the object and metadata for storage
- Generation of a persistent identifier for each digital object and a permanent relationship to associated metadata
- Generation of normalized masters where necessary. A normalized master is a preservation copy of a file in a standardized format. There is some debate regarding when the creation of normalized copies is advisable. In general, video file formats are less vulnerable to obsolescence than tape formats. This is partly because it is easier to support software playback over time, especially given open source tools such as ffmpeg.
- Generation of access derivatives and an interface for user access to these derivatives, alongside associated metadata
- Auditing of system and user activities
- Active monitoring of file integrity
- Recording and reporting on collection characteristics, such as file formats
- Monitoring of preservation risks, e.g. file obsolescence and software dependencies
This section outlines the key elements associated with the design and build of reliable storage for long term digital preservation. The content has been arranged to address a range of situations, from an individual designing a low risk personal storage solution through to a large organization. No matter the size of your collection, the following core principles need to be considered when designing your storage:
- Geographic Redundancy - Multiple copies of data should be held at different geographical locations, and a disaster recovery plan should be in place.
- Fixity Checking - Regularly monitoring digital files in order to detect corruption or unwanted changes to your data.
- Access and Security - The speed and restriction of access to data needs to be appropriate for its intended use and the level of protection required.
- Technology Monitoring - Trends in storage technology should be monitored to assess when migration to new storage media will be necessary.
When scoping your current storage infrastructure for the purposes of digital preservation, it is helpful to understand the difference between standard storage setups and those suitable for digital preservation. Standard storage systems are designed for digital objects that are in active use and while backup procedures are usually included, they generally do not meet the more stringent requirements to ensure long term preservation of data. For example, within a normal institutional information technology set up, it is standard practice for backup tapes to be wiped and re-used after a few months. Active use storage is also unlikely to have a system in place to identify that information has changed or been lost. When data is changing all the time, it is not possible to easily detect the difference between intended and accidental changes. In contrast, preservation storage systems require the active monitoring of data in order to detect unwanted changes, such as corruption or damage. Their high level of redundancy with copies in several locations enables the data to be restored should a problem arise. Ideally, they will also have a disaster recovery plan.
The lifetime of a hard drive varies from three months to five years. If you only have one hard drive and it breaks down, data recovery is very costly, and the loss can be catastrophic. As a rule of thumb: one copy is no copy. Save three copies of your data on at least two types of media (e.g. hard drive, server, LTO tape, flash drive, cloud) and in at least two geographic locations.
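The rule of thumb above is easy to check against a written storage plan. The following is a minimal sketch: the class and function names are hypothetical, and it only counts copies, media types and locations without saying anything about the quality of each copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    medium: str    # e.g. "hard drive", "LTO tape", "cloud"
    location: str  # e.g. "studio", "home", "off-site vault"


def meets_rule_of_thumb(copies):
    """True if there are >= 3 copies on >= 2 media in >= 2 locations."""
    media = {copy.medium for copy in copies}
    locations = {copy.location for copy in copies}
    return len(copies) >= 3 and len(media) >= 2 and len(locations) >= 2


# Hypothetical plan: two hard drives plus one LTO tape, in two places.
plan = [
    StoredCopy("hard drive", "studio"),
    StoredCopy("hard drive", "home"),
    StoredCopy("LTO tape", "home"),
]
print(meets_rule_of_thumb(plan))
```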
There are several reasons for maintaining duplicate copies of files, such as ensuring high availability and the ability to recover from a disaster situation or accidental modification or deletion. The type of storage you choose for copies of your data will depend on how quickly you will need to access the data should you lose the primary copy. If speed is important, this will normally require an exact duplicate of the primary infrastructure at an off site location, with an equivalent connection to the outside world or access points. This infrastructure would contain a complete and up to date copy of your entire collection.
When maintaining a copy of data for disaster recovery purposes (for instance, in the event of fire, flood, or earthquake), the goal is simply to be able to retrieve, rebuild, and access your data. Within reason, ease of access is not a priority. An LTO tape stored off-site is one example of a suitable medium for a disaster recovery backup. There are many storage options that will support geographic redundancy. The right option for you will depend on your budget and the size of your collection.
Hardware: potential options (2016)
- 0 to 5TB in the next 5 years and small budget - RAID 1
- 5TB to 25TB - RAID 5, 6 or 7
- 25TB and higher - some kind of enterprise storage, or multiple RAID 5, 6 or 7 units that support daisy chaining
Keeping multiple copies in sync: potential options
- Manual
- Peer-to-peer file sharing (P2P)
- Cloud service (e.g. Dropbox, CrashPlan, http://www.cloudwards.net...)
“I want to keep this as simple as possible so to achieve my 3 copies and multiple geographical locations, I purchased three RAID 1 drives. One for my studio, one for my home, and one for my friend’s home. To keep my three RAID 1 drives in synchronization I manually sync my studio drive with my home drive on a weekly basis, and then manually sync with my friend’s drive twice a year.”
A duplicate copy of your data may be maintained for high availability, disaster recovery, or recovery from accidental modification or deletion (or a combination of these purposes).
High-availability: This is a redundant copy of your data maintained for the ability to provide easy access to the data with no downtime in the event of loss of the primary copy. Typically this means an exact duplicate of the primary infrastructure at an off site location, with an equivalent connection to the outside world or access points. This infrastructure would contain a complete and up to date copy of your entire collection. With high-availability copies of data, ease of access is paramount.
Disaster recovery: When maintaining a copy of data for disaster recovery purposes (for instance, in the event of fire, flood, or earthquake), the goal is simply to be able to retrieve, rebuild, and access your data. Within reason, ease of access is not a priority. An LTO tape stored off-site is one example of a suitable medium for a disaster recovery backup.
When creating a secure storage environment for your data, you will want to make sure the data itself stays safe and does not change without your knowledge. The property of a file remaining unchanged is known as fixity. In digital preservation, fixity is monitored by generating checksums for your files and re-checking them on a regular basis.
Simply put, your file is run through an algorithm (the most commonly used are MD5 and the SHA family) that produces a short alphanumeric sequence which is, for practical purposes, unique to the contents of that file. The slightest change to your file will produce a completely different checksum. With this simple process, it is possible to identify any changes to your files. The types of changes which can be identified with this method are those which indicate corruption, loss of data, or unintended manipulation. If you have an automatic monitoring system in place, it would alert you if such a change occurs.
Calculate checksums as soon as you’ve received or created a file. This could mean creating checksums as you export a file from the hard drive on which an artwork was received, or as soon as you have exported a file from an editing program or after digitizing a tape.
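As an illustration of what generating a checksum involves, the sketch below uses Python's standard `hashlib` module. The function name and the default choice of SHA-256 are assumptions made for this example. Any of the tools listed in this section would serve the same purpose.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hexadecimal checksum of a file, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `file_checksum("artwork_master.mov")` (a hypothetical file name) returns a 64-character string, and `file_checksum("artwork_master.mov", "md5")` returns the 32-character MD5 value. Reading the file in chunks keeps memory use low even for very large video files.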
RAID and enterprise storage systems provide what could be considered a basic form of fixity. Data in these systems are monitored for integrity at the block-level – meaning the smaller blocks of data that a file is composed of. If one hard drive fails, the system can restore the lost data by using the redundant data block and the checksum. This, however, only detects data corruption that occurs as the fault of the storage device itself – it would not be aware of the accidental corruption, modification, or deletion of a file as enacted by a user or piece of software. In digital preservation we use checksums at the file-level in addition to block-level checks.
A checksum enhances these block-level checks insofar as it is a portable piece of evidence that can travel with a file and be checked regardless of the type of storage system it resides on. It can prove that a file has not been modified since chain of custody was established.
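Because the checksum travels with the file, a fixity check can be repeated on any system. The sketch below assumes a plain-text manifest stored in the same folder as the files, with one `<checksum>  <file name>` pair per line, which is the layout written by common command line checksum tools. The function name and the status labels are hypothetical.

```python
import hashlib
from pathlib import Path


def verify_manifest(manifest_path, algorithm="md5"):
    """Compare files against a manifest of '<checksum>  <file name>' lines.

    Returns a dict mapping each file name to "ok", "changed" or "missing".
    """
    manifest_path = Path(manifest_path)
    results = {}
    for line in manifest_path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # some tools mark binary mode with '*'
        target = manifest_path.parent / name
        if not target.is_file():
            results[name] = "missing"
            continue
        digest = hashlib.new(algorithm)
        with target.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        matches = digest.hexdigest() == expected.lower()
        results[name] = "ok" if matches else "changed"
    return results
```

Running a check like this on a schedule, and keeping a dated log of the results, is the manual equivalent of the automatic monitoring described above.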
These tools can simply generate checksums – out of the box, they will not necessarily store the checksum value, or facilitate the verification of a checksum.
- md5 - This can be run via Terminal on Mac (on Linux the equivalent command is typically md5sum).
- shasum - This can be run on Mac, Linux and Windows.
- FCIV (File Checksum Integrity Verifier) - Windows command line tool for generating MD5 or SHA1 checksums.
These tools create checksums, store the values, and facilitate the verification of checksums after the fact.
- Checksum+ (OS X) - With this tool you can just select a file and it will create a checksum for you, stored within an .md5 file, in the same location, with the same name as your file. Double-clicking this .md5 file will run a fixity check and tell you if everything is OK. If you open it with a text editor, it will show you the MD5 checksum as well as the file it is pointing to. It is important that both the .md5 and the original file are stored in the same folder.
- Fastsum - for Windows.
- BagIt - Developed by the Library of Congress, this tool is used in a command line interface. BagIt was originally created to safely transfer files from one place to another, by packaging the original data in a 'bag' (folder) and creating checksums for each file within that bag. It also stores information about the date and software version as text files within the bag and creates checksums of all these text files, including the checksum files themselves. Find a series of tutorial videos here.
- Fixity - AVPreserve has created a tool that enables the user to identify seven directories that the program can check automatically on a monthly, weekly or daily basis.
- Archivematica - comes with a command-line based tool for running fixity checks.
- Nagios - The open source industry standard in IT infrastructure monitoring and alerting. Various checksum plugins are available.
- For more possible checksum tools click here
Depending on who needs to access your collection, you may also wish to consider storing compressed derivatives of your master files in a more accessible location. This will not only be more practical than accessing large uncompressed masters, it will also reduce the risks involved with allowing access to your masters.
For example, if your storage system incorporates LTO tapes you will need to monitor any manufacturing issues for your brand of tape, updates to the LTO standard, and the availability and compatibility of LTO drives.