Describing digital content

If you want your digital content to be stored, found, and used over time, it needs to have good file naming and associated metadata that describes what the content is, where it came from, and who can use it.

Make it Digital has a detailed Describing Digital Content guide: Metadata Resources

What is metadata?

Metadata is any information that describes digital content. It can describe the attributes and characteristics of digital content in standardised ways, or in less structured ways through the use of general descriptions and tagging. Labels, captions, and file names are all examples of metadata.

Metadata can usefully describe any kind of digital content, such as an image, video file, audio file, or text. It can also describe all sorts of things about the digital content, for instance the person or organisation that created the content, the date it was created, its length (e.g. “duration: 3:27 at 15 fps”), and technical details such as who entered the metadata and its processing history.

Metadata is not just applied to an individual digital 'object' (such as an image, video, or document). In most digital content management systems, metadata is also applied to groups of similar and related objects (e.g. a set of diaries and memorabilia of a person), and also at the level of a collection of items (e.g. the painting collection of a museum).

Almost all structured metadata used for digital content has been designed to follow particular standard formulas, or schemes.

Standards play a significant role in formulating and structuring the way that good metadata is documented. All commonly used metadata schemes follow open standards.

Benefits of structured metadata

When metadata is added to digital content, groupings, and collections in a standardised and consistent way, it can be managed and organised so users are better able to discover, share, and use that content. Metadata:

  • describes digital content and its relationship to similar content

  • enables sharing and reuse of content

  • enables management of digital content

  • serves as a record of ownership

  • makes it possible to exchange digital content

  • reduces duplication of effort

  • is an asset to an organisation or individual project.

To gain maximum benefit from metadata, it helps to consider the use and purpose for which it is intended and to plan accordingly. Choosing metadata standards that are fit for purpose and in common use is economical, reduces risk, and helps protect the future value of your descriptions and content.

Planning for metadata use

Planning for the use of metadata is important, and there are many aspects to consider. If you are creating metadata for an individual, family, or community project, ask yourself:

  • Where will metadata be stored – with the content and/or separately?

  • How will metadata be created – automatically, manually?

  • How will the metadata be maintained in the future, and by whom?

  • Which community will the metadata be shared with?

  • Which metadata standards will be used? Selecting them is one of the most important planning activities you can undertake.

If you are creating metadata for an organisation, in addition to the above, ask yourself:

  • Does the organisation require a metadata plan for the entire organisation?

  • Could various parts of the organisation have different requirements?

  • Are there areas where requirements overlap?

  • What happens now and could it be improved?

  • Is there a need for different metadata for different kinds of content, services, and activities?

Metadata for user communities

Metadata schemes

Metadata is most often pre-packaged by professional subject communities and sectors, ready to use, in what is known as a metadata scheme.

A metadata scheme provides a standard and consistent way to create, manage, and share metadata. A scheme is generally made up of:

  • a set of specifications, which can contain information about the purpose for which the scheme is intended

  • its maintenance agency (the organisation responsible for it)

  • the names of the metadata elements (also known as labels) with their meaning (semantics) and ways the elements can be used

  • recommended values for the elements themselves, such as thesauri use and encoding schemes

  • an abstract or entity-relationship model illustrating a high-level purpose or view of the scheme.


Examples of metadata schemes

  • Dublin Core Metadata Element Set version 1.1 (DCMES, ISO 15836), maintained by the Dublin Core Metadata Initiative, identifies a standard set of metadata terms (now known as properties) that can be used to describe digital content. Some of the Dublin Core properties include: title, creator, date, subject, and description. The Dublin Core scheme is a cross-domain scheme, which means that it can be used as a core scheme to map other metadata schemes to.

  • VRA Core data standard 4.0, developed by the Visual Resources Association’s Data Standards Committee, is intended for image management, particularly the management of complex visual collections.
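As a rough illustration of how the Dublin Core properties named above might be written down, the following Python sketch builds a small XML record. The namespace URI is the one used for the Dublin Core element set 1.1; the wrapper element name record, the function name build_dc_record, and all sample values are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a <record> element with one dc:* child per (property, value) pair."""
    # "record" is a hypothetical wrapper element, not part of Dublin Core.
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Invented sample values, for illustration only.
sample = [
    ("title", "Harbour at dawn"),
    ("creator", "Example, Alex"),
    ("date", "1998-04-12"),
    ("subject", "Harbours"),
    ("description", "Black-and-white photograph of fishing boats."),
]

record = build_dc_record(sample)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A real implementation would normally follow the encoding guidelines of the system the record is destined for, rather than an ad hoc wrapper like this one.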


Metadata profiles

Metadata schemes can be created for an entire domain or subject community and a metadata profile can be created based on that scheme for a specific purpose within that community.

A metadata profile further refines and interprets a metadata scheme.


Example of a metadata profile

The United States Federal Geographic Data Committee (FGDC) has created a metadata standard for digital geospatial metadata. This metadata scheme has been developed for the entire geospatial sector. It is known as the Content Standard for Digital Geospatial Metadata.

Two metadata profiles have been developed for sectors within the geospatial domain:


Application profiles

Application profiles allow for the mixing and matching of metadata schemes. A particular metadata scheme may immediately suit a metadata implementer, but on occasion, elements, vocabularies, and terms from another metadata scheme may need to be used. By developing an application profile, a metadata implementer can create metadata for their unique purpose and use.
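One way to picture an application profile is as a list of permitted terms, each recording the scheme it was borrowed from. The Python sketch below is a minimal illustration under that assumption; the profile contents, the term cropType, the "hypothetical local scheme", and the function name check_record are all invented for the example and do not reproduce any published profile.

```python
# A hypothetical application profile: each term records the scheme it comes from
# and whether the profile treats it as required.
PROFILE = {
    "title": {"scheme": "Dublin Core", "required": True},
    "creator": {"scheme": "Dublin Core", "required": True},
    "cropType": {"scheme": "hypothetical local scheme", "required": False},
}


def check_record(record, profile=PROFILE):
    """Return a list of problems found when a record is checked against the profile."""
    problems = []
    # Required terms must be present and non-empty.
    for term, rule in profile.items():
        if rule["required"] and not record.get(term):
            problems.append(f"missing required term: {term}")
    # Terms outside the profile are reported rather than silently accepted.
    for term in record:
        if term not in profile:
            problems.append(f"term not in profile: {term}")
    return problems


print(check_record({"title": "Wheat trial report", "creator": "Example, Sam"}))
print(check_record({"title": "Wheat trial report", "colour": "red"}))
```

The point of the sketch is only that a profile makes the mix of borrowed terms explicit, so records can be checked against it consistently.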


Example of an application profile

The Food and Agriculture Organisation has developed an application profile known as the AGRIS Application Profile for the International Information System on Agricultural Sciences and Technology.

It uses metadata terms from the following metadata schemes:

  • Dublin Core Elements and Qualifiers

  • Agricultural Metadata Element Set

  • Australian Government Locator Service Metadata Set

The AGRIS Application Profile is available from the Food and Agriculture Organisation.


Content standards for metadata schemes

Most metadata schemes, while specifying metadata elements and their meaning, do not contain content standards.

Content standards provide instruction on how to populate metadata elements. They provide standard and consistent ways to transcribe and describe attributes of the digital content within the metadata scheme. For example, the content standard for libraries, Anglo-American Cataloguing Rules (AACR), provides instruction on how to write the content when transcribing an author as <last name, first name>.
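To make the <last name, first name> convention concrete, here is a deliberately simple Python sketch. It assumes the last word of a name is the family name, which real cataloguing rules do not; compound surnames, particles, titles, and corporate names all need proper handling. The function name invert_name is hypothetical.

```python
def invert_name(display_name):
    """Turn 'First [Middle] Last' into 'Last, First [Middle]'.

    Simplifying assumption: the final word is the family name.
    """
    parts = display_name.split()
    if len(parts) < 2:
        # Single-word names are returned unchanged.
        return display_name.strip()
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


print(invert_name("Katherine Mansfield"))
print(invert_name("John Ronald Tolkien"))
```

Applying one rule like this everywhere is what makes author names sort and match consistently across records.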

Content standards, like metadata schemes, may also make recommendations on values for the elements themselves, such as thesauri use and encoding schemes.

Content standards are most often stand-alone documents that are tied to a metadata scheme. This is because metadata schemes and content standards have been developed within subject communities and specialties. The museums, libraries, and education communities and many of the various science sectors have developed their own metadata schemes and companion content standards.

Cultural heritage sector

The content standard Cataloguing Cultural Objects (CCO): a guide to describing cultural works and their images is used in the cultural heritage sector with VRA Core 4.0, which is published by the Visual Resources Association.

Libraries

The content standard Anglo-American Cataloguing Rules, published by the American Library Association et al., is used in the library sector with MARC21 (MAchine-Readable Cataloguing), which is published by the Library of Congress.

Archives

The content standard Describing Archives: A Content Standard (DACS), published by the Society of American Archivists, is used to describe archival materials. DACS can be used with the metadata schemes MARC21 and Encoded Archival Description (EAD). EAD, published by the Society of American Archivists and the Library of Congress, is a standard for encoding archival finding aids using eXtensible Markup Language (XML).

Types of metadata schemes

Metadata schemes are created for different purposes. A primary purpose may be the discovery of digital content on the web; another may be the management of digital content for long-term preservation, and so on.

Descriptive and discovery metadata

Descriptive and discovery metadata are created in order to:

  • describe digital content – for example, an abstract element may summarise what the digital content is all about; and

  • enable digital content to be discovered – for example, by people searching for it on the web.

Many of the metadata schemes and content standards that are available are used for discovery and description purposes. The Dublin Core Metadata Element Set is an example of a metadata scheme for discovery. It can be used by many sectors as a common layer to map their own schemes to when making digital content available on the web.
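The idea of mapping a sector's own scheme to Dublin Core as a common layer is often called a crosswalk. The Python sketch below shows the general shape of one; the local field names (object_name, maker, and so on) and the function name to_dublin_core are hypothetical, and a real crosswalk would be agreed and documented by the community concerned.

```python
# Hypothetical local field names mapped to Dublin Core properties.
CROSSWALK = {
    "object_name": "title",
    "maker": "creator",
    "date_made": "date",
    "keywords": "subject",
    "notes": "description",
}


def to_dublin_core(local_record, crosswalk=CROSSWALK):
    """Split a local record into mapped Dublin Core values and unmapped leftovers."""
    mapped, unmapped = {}, {}
    for field_name, value in local_record.items():
        if field_name in crosswalk:
            mapped[crosswalk[field_name]] = value
        else:
            # Keep fields with no Dublin Core equivalent so nothing is lost silently.
            unmapped[field_name] = value
    return mapped, unmapped


mapped, unmapped = to_dublin_core(
    {"object_name": "Carved bowl", "maker": "Unknown", "storage_bay": "B12"}
)
print(mapped)
print(unmapped)
```

Returning the unmapped fields separately reflects a common trade-off: a shared layer aids discovery, but richer local detail usually stays in the original scheme.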

The following is an example of a metadata scheme for description:

“Categories for the Description of Works of Art” published by the J. Paul Getty Trust and College Art Association

This metadata scheme, like some others, allows for both single (item) level description and multiple levels of description.

Administrative metadata

Administrative metadata is designed primarily to manage digital content. Those managing content over time need to be able to undertake activities such as:

  • archive digital content

  • track digital content and its representations

  • ensure file formats can be read and transformed

  • ensure the authenticity and integrity of digital content over time

  • identify the source of the metadata and updates.

A scheme known as Preservation Metadata: Implementation Strategies (PREMIS) is an example of a metadata scheme for administration. This scheme can be used for the long-term management of any type of digital content. Organisations managing digital repositories are its primary users.
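One common way to support the integrity checks listed above is to record a checksum when content is received and compare it later. The Python sketch below illustrates that idea with SHA-256 from the standard library. The dictionary keys and function names are hypothetical and are not PREMIS element names; the sample bytes are invented.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_fixity_record(identifier, data):
    """Build a small administrative record describing the content's checksum."""
    return {
        "identifier": identifier,  # hypothetical local identifier
        "algorithm": "SHA-256",
        "digest": sha256_of(data),
        "size_bytes": len(data),
    }


def verify(record, data):
    """Check that the content still matches the digest stored in the record."""
    return record["algorithm"] == "SHA-256" and record["digest"] == sha256_of(data)


original = b"example file contents"
fixity = make_fixity_record("item-0001", original)
print(verify(fixity, original))
print(verify(fixity, b"example file content5"))
```

In practice the bytes would be read from the stored file, and the check would be repeated on a schedule so that silent corruption is noticed.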

Rights management information is also administrative metadata. The schemes used to express rights information over content are generally known as “rights languages”.

An example of a language for digital rights management is the Open Digital Rights Language (ODRL). ODRL is an international effort to develop and promote an open standard for rights expressions covering the publishing, distribution, and consumption of digital media across all sectors and communities.

A machine-readable language for expressing rights for digital content on the web has been developed by Creative Commons, an international non-profit organisation. Creative Commons provides free copyright licences that can be applied to in-copyright works by the copyright holder.

Technical metadata

When digital content is created, whether an image, music, sound recording, or video, the resulting file contains some form of embedded technical metadata. This kind of metadata generally defines the technical characteristics or attributes of digital content. The technical metadata may contain information such as:

  • the compression, resolution and pixel dimension of an image

  • date and time a document was created

  • the number of bits of sample depth of an image

  • the number of frames per second of a video.
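To show what embedded technical metadata looks like in practice, the Python sketch below writes a short silent WAV sound recording in memory and then reads its technical characteristics back from the file header. WAV is used only because Python's standard library can read it without extra packages; it is an assumption of this example and is not one of the formats discussed in this guide. The function names are hypothetical.

```python
import io
import wave


def make_test_wav():
    """Create one second of silent mono audio as WAV bytes, for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav.setframerate(8000)   # 8000 samples per second
        wav.writeframes(b"\x00\x00" * 8000)
    return buffer.getvalue()


def read_technical_metadata(wav_bytes):
    """Read technical characteristics stored in the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }


technical = read_technical_metadata(make_test_wav())
print(technical)
```

None of these values were typed in by a person; they were written by the software that created the file, which is typical of technical metadata.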

Technical metadata is found in a range of file (MIME) types and their corresponding file formats. Three open standards that contain some technical metadata are:

  • TIFF (Tagged Image File Format) is a high-quality file format, typically stored uncompressed, used for creating archival copies of raster images and for creating derivative files. TIFF can be used by many scanners, printers, and computer display hardware. It does not favour particular operating systems, file systems, compilers, or processors.

  • JPEG2000 is a developing standard also intended for archival use.

  • MPEG-7 is an ISO/IEC standard developed by MPEG (Moving Picture Experts Group) with metadata for description, technical, and administrative use.

In 2006 the National Information Standards Organization (NISO), an American standards body, published “Technical Metadata for Digital Still Images” to define a set of non-proprietary and open technical metadata elements for digital still images.

In the library sector, the Library of Congress has developed some technical metadata formats as XML schemas for audio, video, text, and images to ensure interoperability amongst libraries:

  • AudioMD: Audio Technical Metadata Extension Schema

  • VideoMD: Video Technical Metadata Extension Schema

  • TextMD: technical metadata for text-based digital objects

  • Image (MIX): NISO Metadata for Images in XML Schema

Another type of technical metadata brings together separate component parts (e.g. the digital files of scanned pages) into one logical unit (e.g. a book). This kind of metadata is also known as structural metadata.
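A minimal Python sketch of structural metadata follows: it takes the file names of scanned pages and records the order in which they make up a book. The file-naming pattern (a page number somewhere in each name), the dictionary layout, and the function name build_structure are assumptions made for this example only.

```python
import re


def build_structure(title, filenames):
    """Describe a book as an ordered list of page files."""

    def page_number(name):
        # Assumes the first run of digits in the file name is the page number.
        match = re.search(r"(\d+)", name)
        return int(match.group(1)) if match else 0

    ordered = sorted(filenames, key=page_number)
    return {
        "type": "book",
        "label": title,
        "pages": [
            {"order": position, "file": name}
            for position, name in enumerate(ordered, start=1)
        ],
    }


book = build_structure("Diary", ["page_10.tif", "page_2.tif", "page_1.tif"])
for page in book["pages"]:
    print(page["order"], page["file"])
```

Sorting by the numeric page number rather than by plain text matters here, because an alphabetical sort would place page_10 before page_2.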

Combining and encoding metadata schemes

The various types of metadata – descriptive, administrative, and technical – as well as different metadata schemes can be combined and used together when encoded into mark-up languages.

Mark-up languages are used on the World Wide Web and structure the metadata in a consistent way, so that web technologies can use and reuse the metadata in different ways. Guidelines are available on encoding metadata schemes into mark-up languages. For example, RFC (Request for Comments) 2731, published through the Internet Engineering Task Force, covers encoding Dublin Core metadata in HTML 4.0.
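The Python sketch below gives a feel for what Dublin Core metadata encoded in HTML can look like, using meta tags with a "DC." prefix and a link element naming the scheme. The exact conventions shown here (capitalisation of property names and the namespace URI) are assumptions and should be checked against RFC 2731 and current Dublin Core guidance before use. The function name dc_meta_tags and the sample values are hypothetical.

```python
from html import escape


def dc_meta_tags(fields):
    """Return HTML <meta> tags for a list of (property, value) pairs."""
    # The link element declares which scheme the "DC." prefix refers to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for name, value in fields:
        # Escape the value so quotes and ampersands cannot break the attribute.
        safe_value = escape(value, quote=True)
        lines.append(f'<meta name="DC.{name}" content="{safe_value}">')
    return "\n".join(lines)


html_head = dc_meta_tags(
    [
        ("title", 'Ships & "Sails"'),
        ("creator", "Example, Alex"),
        ("date", "1998-04-12"),
    ]
)
print(html_head)
```

The generated lines would sit inside the head section of a web page, where search and harvesting tools can read them.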

The Metadata Encoding & Transmission Standard (METS) can encode descriptive, administrative and technical metadata.