As an archivist, one of the misconceptions I regularly encounter about my line of work is the idea that archivists are in the business of acquiring old stuff: old books, old letters, old photographs—just about anything so long as it’s old and dusty. For those who hold this idea, it may come as a surprise that I spend most of my day working with digital records, not paper ones. That’s because for the past three years the focus of my work at the Ontario Jewish Archives (OJA) has been digital preservation.
Those of you who read my last piece for Niv may recall that the OJA’s mission consists of documenting the province’s Jewish history. When the OJA was founded in 1973, that meant acquiring paper-based records of the kind many people associate with archives. That made sense, because from the earliest days of Jewish settlement in the province in the 19th century up until the later part of the 20th century, most of the records created by the community had paper as their medium.
Record-making practices started to change in the 1980s with the advent of the Digital Revolution. From this point on, a growing percentage of records generated within the Jewish community would be created using digital technology. And while paper records continue to be revered (ketubot come to mind), most records produced by the Jewish community today are born digital. If the OJA is to continue telling the story of Ontario’s Jewish communities, it will need to acquire and preserve these born-digital records.
This is where things become complicated, because preserving born-digital records presents challenges distinct from preserving other kinds of records. One challenge is the complacency people feel about preserving digital records, especially those stored online. Contrary to the received wisdom that once something is online it is there forever, online information is often quite vulnerable. Major web hosting services like GeoCities (rest in peace) can be acquired and then abandoned, for instance. And while it’s easy to scoff at GeoCities as archaic, it’s worth keeping in mind that there is no reason to think that Facebook or YouTube will not go the same way given enough time. Similarly, third-party cloud storage providers, such as Dropbox, could go out of business, leaving the digital records saved there at risk.
Based on the above, one might be tempted to think that physical storage media would offer a safe(r) alternative, but that too comes with risks. As the Digital Preservation Coalition notes in its excellent handbook, “Storage media can decay over time, leading to corrupted files. Storage media may become obsolete and unsupported by contemporary computers and the software that understands and provides access to them.” (Anyone old enough to remember floppy disks should be able to grasp this last point.)
The risk of data corruption is closely related to another threat: deliberate tampering. One of the appeals of digital records is the ease with which users may manipulate them, copy them, and distribute them to other users. Unfortunately, this very quality makes digital records potentially unreliable, since tampering is not always easy to detect, especially if it is done well. (This is only going to become a greater issue as deepfakes, digital alterations of someone’s likeness in a video or photo, become ever more realistic.) Thus, one of the goals of digital preservation must be ensuring the authenticity of digital records over time. This is especially important when one is migrating a digital record from an obsolete file format to a newer one.
Thankfully, the OJA is aware of these challenges and is taking a proactive approach to ensuring Ontario’s digital heritage is preserved for generations to come. In 2019, the Digital Preservation project of the Ontario Jewish Archives, Blankenstein Family Heritage Centre was among 52 projects to receive funding from the Government of Canada’s Documentary Heritage Communities Program, which is administered by Library and Archives Canada. The OJA used this funding to hire a digital preservation consultant/digital archivist to craft a digital preservation policy, conduct an institutional review of the OJA’s digital holdings, and help us choose a digital preservation system that would meet our requirements.
The OJA ended up choosing a digital preservation system called Preservica, which facilitates our work by automating many of the activities that would otherwise have to be performed manually by our small team of archivists. As a result, we expect that all of our digital assets, whether born-digital or digitized, will be ingested into Preservica this September, a major milestone for us.
To get a sense of what digital preservation looks like in practice, let’s take the example of the Toronto-based organization No Silence on Race.
On June 30, 2020, just a little over a month after police officer Derek Chauvin murdered George Floyd, Sara Yacobi-Harris, Akilah Allen-Silverstein, and Daisy Moriyama published “No Silence on Race: An Open Letter from Black Jews, Non-Black Jews of Colour, and Our Allies to Jewish Congregations, Federations, Foundations, Organizations, Nonprofits and Initiatives.” The letter called on the Jewish community “to uphold the tenet of justice and to commit to the creation of a truly anti-racist, inclusive and equitable Jewish community” and marked a significant intervention in the Jewish community’s internal dialogue around racism. For this reason, the OJA was keen to preserve it.
The open letter currently resides on the group’s website; however, it is not clear how long that website will remain viewable. With the group’s permission, I saved the web page in the WARC (Web ARChive) file format and ingested it into Preservica. Once we have added our own metadata to the digital assets, such as the date of acquisition and the name of the donor, Preservica characterizes the digital assets (determines their technical properties), runs a check for malware, and calculates checksums. According to the same handbook I quoted earlier, “A checksum on a file is a ‘digital fingerprint’ whereby even the smallest change to the file will cause the checksum to change completely” and “can be used to detect if the contents of a file have changed.” By comparing a freshly calculated checksum with the one recorded at ingest, one can determine whether a digital asset is still identical to the original and thus confirm its authenticity.
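For readers curious about what a checksum comparison looks like in practice, here is a minimal sketch in Python. It is only an illustration of the general idea, not a description of how Preservica works internally. The function names, the choice of the SHA-256 algorithm, and the file name are assumptions made for this example.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex checksum of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large archive files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum, algorithm="sha256"):
    """Compare a file's current checksum with one recorded earlier."""
    return file_checksum(path, algorithm) == recorded_checksum


if __name__ == "__main__":
    # Hypothetical stand-in for an archived file; the name is illustrative only.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "open-letter.warc")
        with open(path, "wb") as handle:
            handle.write(b"snapshot of the open letter")

        recorded = file_checksum(path)       # checksum taken at ingest
        print(is_unchanged(path, recorded))  # True: nothing has changed

        with open(path, "ab") as handle:
            handle.write(b".")               # alter the file by a single byte

        print(is_unchanged(path, recorded))  # False: the change is detected
```

A matching checksum shows only that the file's contents are bit-for-bit identical to what was recorded. The recorded checksum itself has to be stored somewhere trustworthy for the comparison to mean anything.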
Inevitably, the WARC file format will become obsolete, and it will become necessary to migrate the web page to a new file format. Without a digital preservation system in place, this would pose a threat to the authenticity of the digital asset. But since Preservica preserves the original and keeps a detailed audit trail of any activity undertaken on it, we can always return to it to verify that the new version faithfully reproduces its essential qualities. This, in turn, gives our users confidence that the digital records they are viewing have not been tampered with.
Although the details of digital preservation may not excite everyone, we believe that the community will benefit from the increased access to digital records that digital preservation facilitates. After all, the work that goes into preserving records, be they digital or otherwise, is meaningless if the records do not end up being used. With an intelligent digital preservation strategy in place, the OJA can make the records in its holdings available to more people in more places, all while ensuring that the records people are accessing are faithful reproductions of the originals. In this way, the OJA will continue to share back the history of Ontario’s Jewish communities for a long, long time to come.