Account of Archivematica Camp 2018 @ IISH Amsterdam

The so-called Archivematica Camp is an informal and highly informative three-day conference organized by Artefactual – the company behind Archivematica. The aim of the camps is described as follows: “to provide a space for anyone interested in or currently using Archivematica, to come together, learn about the platform and share their experiences.” An Archivematica Camp normally consists of one introduction day and two days of more in-depth knowledge exchange.

This time the Camp was organized at the International Institute of Social History (IISH) in Amsterdam. Attendance exceeded expectations – it is the largest Camp to date. Participants came – as predicted – for the most part from the Netherlands and Belgium (about 50%). The rest of the participants came from Sweden, Norway, Germany, the UK, Italy, Switzerland and even two from Canada.

Below you will find a somewhat subjective account of the Camp containing elements I found interesting and worthy of sharing. This blog is therefore by no means a complete account of the Camp.

Day 1

The Camp was officially opened by Afelonne Doek – Director of Collections and Digital Infrastructure of the IISH – after which day 1 of the Camp commenced. Evelyn McLellan – president of Artefactual – fleshed out the broader digital preservation context in which Archivematica operates. She gave a nice definition of what a digital preservation system like Archivematica actually does: “A system built from tools that perform a variety of functions to ensure the integrity and authenticity of digital content”. Think of functions like virus checking, file format identification, characterization and validation, extraction of file header information, fixity checking and normalization of file formats.  
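To make one of these functions concrete: fixity checking boils down to recording a checksum when a file enters the system and recomputing it later to see whether the file has changed. The snippet below is only a minimal sketch of that idea using Python's standard library. The function names are my own and this is not how Archivematica implements it.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, expected: str) -> bool:
    """Hypothetical helper: compare the current checksum with a recorded one."""
    return sha256_of(path) == expected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "letter.txt"
        sample.write_bytes(b"original content")
        recorded = sha256_of(sample)          # checksum stored at ingest time
        print(fixity_ok(sample, recorded))    # file is unchanged
        sample.write_bytes(b"altered content")
        print(fixity_ok(sample, recorded))    # file no longer matches
```

A preservation system would store the recorded checksum alongside the file (for example in its metadata) and repeat the comparison at regular intervals.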

Evelyn also showed the relation between Archivematica and the OAIS model using this neat display:

The black header is from the (web) interface of Archivematica – in other words, Archivematica aspires to be an OAIS-compliant preservation system. Archivematica endeavours to use as many open (preservation) standards as possible, for instance BagIt, METS, PREMIS, Dublin Core and the PRONOM registry of file formats.

In essence Archivematica is a workflow application which uses many microservices (read: external tools and scripts) to finally create an AIP (Archival Information Package) and a DIP (Dissemination Information Package):

It is important to mention that the AIPs and DIPs that Archivematica produces are “system agnostic AIPs, meaning that you do not require a particular system to store and read AIPs in the future”.

Evelyn then showed how Archivematica can integrate with storage, access and repository systems. Many integrations have already been realized. For instance with storage systems:

And with access and repository systems:

In the next session Justin Simpson, technical director of Artefactual, gave insight into the technical design of Archivematica. As mentioned above, Archivematica is a workflow tool in which many microservices are activated one by one during the processing of a digital archive, finally creating an AIP. The open-source Gearman application framework is used for this purpose. Other infrastructural tools that make up Archivematica are:

Justin also showed the Camp how, in the processing configuration (under the Administration tab), the archival workflow can be adapted and specific tools can be selected. Here the desired degree of automation of the workflow can be configured. This means, for instance, that the workflow can be set to pause, or not, for the selection of a file format identification tool. The user can also choose to store the pre-SIP in the backlog (before continuing to the ingest phase).
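The idea of a configurable degree of automation can be illustrated with a small sketch: every decision point in the workflow is either answered in advance by the configuration or left open, in which case the workflow pauses and asks the user. Everything in the snippet below (the class, the function, the decision names and the tool names) is hypothetical and only meant to show the principle. It does not reflect Archivematica's actual configuration format or code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DecisionPoint:
    """A hypothetical point in the workflow where a choice must be made."""
    name: str
    options: List[str]


def run_workflow(
    decision_points: List[DecisionPoint],
    preconfigured: Dict[str, str],
    ask_user: Callable[[DecisionPoint], str],
) -> Dict[str, str]:
    """Resolve every decision, pausing for the user only when nothing is preconfigured."""
    chosen: Dict[str, str] = {}
    for point in decision_points:
        choice = preconfigured.get(point.name)
        if choice is None:
            choice = ask_user(point)  # the workflow "stops" here
        if choice not in point.options:
            raise ValueError(f"{choice!r} is not a valid option for {point.name}")
        chosen[point.name] = choice
    return chosen


if __name__ == "__main__":
    points = [
        DecisionPoint("identification_tool", ["tool-a", "tool-b"]),  # placeholder tool names
        DecisionPoint("after_transfer", ["send to backlog", "continue to ingest"]),
    ]
    # Only the identification tool is fixed in advance; the second decision is asked.
    config = {"identification_tool": "tool-a"}
    result = run_workflow(points, config, ask_user=lambda p: p.options[0])
    print(result)
```

A fully automated workflow would simply have an entry in the configuration for every decision point, so the user is never asked.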

The preservation planning tab offers the user many possibilities to configure the preservation workflow and the associated tools:

In the second half of the day the participants were able to work with Archivematica themselves: creating a SIP and adding metadata to it, starting an ingest, normalizing files, and creating and downloading an AIP.

During the next two days more background and technical information about Archivematica was given and three in-depth technical parallel sessions were offered. As I wasn’t part of these sessions, this blog will only give an impression of the slightly less technical part of the program.

Day 2

On day 2 a lot of time was spent on the design of personal workflows, with plenty of hands-on experience. The rationale behind these specialized workflows is, for example:

My colleague Lucien van Wouw and I were also given the opportunity to tell the group a little about the implementation of Archivematica at the IISH. The core message of this presentation (which can be downloaded below) was that the IISH implementation proved to be far more than the implementation of Archivematica alone:

We also included the Archivematica automation tools in our presentation. These tools offer the possibility to automate pre- and post-ingest workflow steps.

The IISH uses these tools, at the moment, to automate the delivery of archives that are to be ingested to Archivematica. In practice this means we can FTP an archival bag to a ‘hot folder’ after which the automation tool will detect the bag and start the Archivematica transfer.
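The hot folder principle itself is simple enough to sketch in a few lines. The snippet below is not the code of the Archivematica automation tools, only an illustration of the idea: look in a folder, recognise new bags by their `bagit.txt` declaration file (which every BagIt bag contains), and hand each new bag to a function that starts the transfer. The names `find_new_bags`, `poll_once` and the `start_transfer` callback are my own and hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Set


def find_new_bags(hot_folder: Path, seen: Set[str]) -> List[Path]:
    """Return bag directories in the hot folder that have not been handled yet."""
    new_bags = []
    for entry in sorted(hot_folder.iterdir()):
        is_bag = entry.is_dir() and (entry / "bagit.txt").is_file()
        if is_bag and entry.name not in seen:
            new_bags.append(entry)
    return new_bags


def poll_once(
    hot_folder: Path,
    seen: Set[str],
    start_transfer: Callable[[Path], None],
) -> List[str]:
    """Start a transfer for every new bag and remember which bags were handled."""
    started = []
    for bag in find_new_bags(hot_folder, seen):
        start_transfer(bag)  # placeholder for the real call that starts a transfer
        seen.add(bag.name)
        started.append(bag.name)
    return started


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        hot = Path(tmp)
        (hot / "archive-001").mkdir()
        (hot / "archive-001" / "bagit.txt").write_text("BagIt-Version: 1.0\n")
        (hot / "not-a-bag").mkdir()
        handled: Set[str] = set()
        print(poll_once(hot, handled, start_transfer=lambda bag: None))
```

In a real setup the poll would run on a schedule, and it would need some way of making sure an upload has finished before the bag is picked up, for instance by uploading under a temporary name first.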

Attention was also given to the appraisal and selection functionality within Archivematica. On the appraisal tab it is possible, among other things, to deselect unwanted content and to combine pre-SIPs or change the order within them. Via an API it is also possible to connect AtoM or ArchivesSpace to the appraisal tab, so that the re-ordering of the archive can be combined with creating an archival description.

Marco Klindt of the Zuse Institute Berlin closed day 2 with a presentation about the Archivematica implementation at his organization. Marco showed that Archivematica is part of a bigger infrastructure of applications and hardware that together form the digital repository of the Zuse Institute:

Day 3

On the third and last day a lot of time was spent on exploring the AIP. Special attention was given to the PREMIS metadata and the METS file in which this metadata is wrapped. PREMIS, the standard for digital preservation metadata, is an integral part of Archivematica. The idea behind PREMIS:

An important part of the output of the preservation tools that are used within Archivematica ends up in PREMIS fields. A problem with this is that the METS file can become really big when an archive consists of many files. One of the suggested future improvements is therefore a linked data solution for the PREMIS metadata, which would help to reduce the verbosity of the METS file.
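To get a feel for why the METS file grows with every file in an archive, it helps to look at how PREMIS events sit inside it: each file gets its own administrative metadata section, and every preservation action on that file adds another wrapped PREMIS event. The sketch below parses a tiny, hand-written METS fragment (not real Archivematica output) and lists the events per section. It assumes the PREMIS 3 namespace, and the function name is my own.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "http://www.loc.gov/premis/v3",  # assumption: PREMIS 3 is used
}

# Hand-written, heavily simplified example; a real METS file contains far more.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:premis="http://www.loc.gov/premis/v3">
  <mets:amdSec ID="amdSec_1">
    <mets:digiprovMD ID="digiprovMD_1">
      <mets:mdWrap MDTYPE="PREMIS:EVENT"><mets:xmlData>
        <premis:event><premis:eventType>ingestion</premis:eventType></premis:event>
      </mets:xmlData></mets:mdWrap>
    </mets:digiprovMD>
    <mets:digiprovMD ID="digiprovMD_2">
      <mets:mdWrap MDTYPE="PREMIS:EVENT"><mets:xmlData>
        <premis:event><premis:eventType>fixity check</premis:eventType></premis:event>
      </mets:xmlData></mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
  <mets:amdSec ID="amdSec_2">
    <mets:digiprovMD ID="digiprovMD_3">
      <mets:mdWrap MDTYPE="PREMIS:EVENT"><mets:xmlData>
        <premis:event><premis:eventType>ingestion</premis:eventType></premis:event>
      </mets:xmlData></mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>"""


def events_per_amdsec(mets_xml: str) -> Dict[str, List[str]]:
    """Map each amdSec ID to the PREMIS event types found inside it."""
    root = ET.fromstring(mets_xml)
    result: Dict[str, List[str]] = {}
    for amd_sec in root.findall("mets:amdSec", NS):
        events = amd_sec.iterfind(".//premis:event", NS)
        result[amd_sec.get("ID", "")] = [
            event.findtext("premis:eventType", default="", namespaces=NS)
            for event in events
        ]
    return result


if __name__ == "__main__":
    for section, event_types in events_per_amdsec(SAMPLE_METS).items():
        print(section, len(event_types), event_types)
```

With thousands of files and several events per file, it is easy to see how all these wrapped elements add up.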

The day, and the Camp, closed with a discussion about the Archivematica community. Among other things it touched upon the (un)desirability of integrating other systems and their functionality with Archivematica, and the role the community could play in this. In this regard, the list of improvements on the Archivematica Wiki was also discussed.

