Reference rot, a digital preservation issue beyond file formats

Author #1
Author #2

Description

In the era of ‘born-digital’ electronic theses and dissertations (ETDs), librarians and institutional repository curators need to reframe our digital preservation responsibilities so that they extend beyond file formats.

Documents that reference the live web are subject to reference rot, the combination of two phenomena: link rot, where a webpage ceases to exist, and content drift, where a webpage’s content changes over time. Both phenomena threaten long-term access to scholarly content and its context on the live web. We examined PhD dissertations published in Concordia University’s Spectrum Research Repository from 2011 to 2015 for evidence of reference rot.

Our poster will show the degree to which reference rot affected 720 ETDs in our repository, and how it correlates with factors such as the age of the dissertation and its broad discipline. It will also show the results of our search for mementos in the Wayback Machine, and whether content drift has occurred.
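As an illustration only, and not the procedure used for the poster, the lookup of a memento and a check for content drift could be sketched as follows. The sketch assumes the Internet Archive's public Wayback Availability API and its JSON response shape; the function names, the 0.9 similarity threshold, and the use of plain text similarity as a stand-in for content drift are all hypothetical choices made for this example.

```python
import json
from difflib import SequenceMatcher
from urllib.parse import urlencode
from urllib.request import urlopen

# Public endpoint of the Wayback Availability API (assumed for this sketch).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp):
    """Build a request URL for the snapshot closest to timestamp (YYYYMMDD)."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": url, "timestamp": timestamp})


def closest_memento(response):
    """Return (snapshot_url, snapshot_timestamp) from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def fetch_closest_memento(url, timestamp):
    """Network call: ask the Wayback Machine for the memento nearest to timestamp."""
    with urlopen(availability_query(url, timestamp), timeout=30) as reply:
        return closest_memento(json.load(reply))


def has_drifted(cited_text, live_text, threshold=0.9):
    """Flag possible content drift when text similarity falls below the threshold.

    The threshold is an arbitrary illustrative value, not a published criterion.
    """
    return SequenceMatcher(None, cited_text, live_text).ratio() < threshold
```

A caller might pass a cited URL and the dissertation's deposit date to `fetch_closest_memento`, then compare the text of the memento with the text of the live page using `has_drifted`. Judging content drift in practice usually needs human review, since small textual changes can matter and large ones (such as a redesigned page template) may not.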

We chose Tableau Desktop for our visualizations, aiming to gain deeper insight into our results and to acquire experience in data analysis while evaluating the tool for broader library assessment use.

We will also touch on mementos, or snapshots of webpages at a point in time, as a tool to mitigate reference rot. As we look to make recommendations, we ask where the responsibility to create mementos lies: with the writer or with the publisher?

 
May 12th, 11:15 AM

Fireplace Lounge