Raisin’ Cane (and web archiving too!)

Mount Dora High School is “Cane Country.” Located in Mount Dora, Florida, this high school is one of fifty learning sites that make up the Lake County School District.[1] After just one visit to the Mount Dora High School website, you’ll probably notice two things:

1) Mount Dora’s colors are orange and white.

2) Mount Dora’s mascot is a Hurricane, and this school is proud of it!

What you might not notice, however, is the “Our School” tab toward the top of the home page. A gentle drag over this tab results in a drop-down menu displaying a few options, one being the Media Center. Go ahead, click “Media Center.” Seriously, do it 🙂 Okay, once you make it to the Media Center’s page, you’ll see a few options on the left-hand side that you might expect, like “Media Center Information,” “Reading blog,” “Research Sites and Guidelines,” etc. But look again, and you’ll probably see something that you won’t find on a typical media center website. You’ll see “Web Archiving Project.”

View Mount Dora’s awesome web archiving trailer!! (Click on the link below)


Mount Dora High School has been an Archive-It K-12 partner since October 2010. Thanks to Ms. Patricia Carlton, one of Mount Dora’s wonderfully dedicated media specialists and the sole driver of the school’s web archiving partnership with Archive-It, Mount Dora students have created thirty web archive collections and preserved over three hundred and fifty URLs. These web archive collections are diverse and range in topic from “Politics 2012” to “Modern Music” to “Social Networking.”[2] I recently had a lovely Skype conversation with Ms. Carlton to chat about her experience implementing the Archive-It K-12 program at Mount Dora High School. Here’s an overview of our great conversation:

How long have you been an educator?

Ms. Carlton has been an educator for thirty years, and she’s been a media specialist for twenty-two of those years. “I love libraries,” Ms. Carlton said. Prior to working on the library side of education, Ms. Carlton taught English and art, and she has worked with students of all ages.

How many years have you been involved in the Library of Congress/Archive-It K-12 Web Archiving Program?

Seven or eight years. 

How did you find out about the web archiving program?

Ms. Carlton became acquainted with the web archiving program after attending one of the Library of Congress’ summer institutes for educators. She had the pleasure of meeting Neme Alperstein, a teacher in a New York City public school, P.S. 174, who encouraged her to get involved with web archiving. Ms. Carlton began implementing the web archiving program with AP Geography students at Eustis High School. When she transferred to Mount Dora, she implemented the program in an English class. Ms. Carlton happily stated that “each year seems to get better and better.”

What made you interested in the program?

Ms. Carlton took her time with this question. “That has evolved,” she said. She was initially intrigued by the program because of its national connection with the Library of Congress. Once she started working with the program, however, her interest deepened. “I wanted to become better at teaching it.” Now, “the idea of preservation” is what really fascinates her. “What interests me now is this whole new way of looking at websites,” she explained. Both she and her students are “learning a lot about digital literacy” through the program while also gaining a new perspective on history. Ms. Carlton summed it up perfectly: “It’s about us saving our heritage.”   

How do students select websites?

Ms. Carlton begins by showing students examples of web archive collections from previous years and schools. She explains to her students that “archives are collections.” She tells them, “Think of your local community [and] what is important to teen culture. We are going to represent ourselves to the world.” Basically, her students “have to write what their collection is going to be and why it’s significant to them. Then they surf the web.” Ms. Carlton wants the collections to be as student-generated as possible, but occasionally she does provide guidance. “Over the years, I kept seeing the same collections—music, sports, etc.” She felt like her students could go further than that. “This past year, I had them think about global issues that might affect them personally.” Ms. Carlton found that her students began “taking their subjects seriously” and came up with sites focusing on the environment, ecology, and other social issues.

Do students catch on to the technological requirements for the project easily?

“Yes, mainly the team leaders,” Ms. Carlton said. She admitted that there are sometimes technical issues with the network, causing students to lose the metadata they created. Hence, she always encourages them to write it out first. Another technological learning experience that her students had was with “budgeting.” Ms. Carlton’s students have a specific budget of data and space they can use during the web archiving project. Some students “used up their budget very quickly because the sites they chose had a lot of embedded videos,” she said. Students had to learn that “you don’t just simply archive a website … you limit it.”  

How has web archiving benefited your students?

“They feel important,” Ms. Carlton said without a doubt. Her students love knowing they are part of a big program. In addition, Ms. Carlton noted the many educational benefits her students have received through the program. Through web archiving, students learn how to evaluate and deconstruct a website’s content; they learn that websites have messages and are created by many authors; students also learn that they have to be responsible when archiving sites because “every bit of data is space.” At a certain point, they must decide, “what’s important and what’s not.”

What challenges have you run into?

  Ms. Carlton said that getting school and community support can be a challenge. She noted, however, that working with supportive teachers helps a lot. At Mount Dora High School, one of the history teachers she worked with was extremely enthusiastic, and this “enthusiasm carried over to her students.” Ms. Carlton explained that getting support is “always an ongoing thing.”   

Are there any curriculum materials you wished you had to help you further integrate web archiving into the school curriculum?

“That would help,” Ms. Carlton affirmed. “When I introduce it, I tend to rely on the Library of Congress’ Teaching with Primary Sources.” Beyond that, “I [don’t] really have anything to go off of.” Ms. Carlton would like to see resources that provide her with another way of teaching the history of the web and the structure of the web. “I’ve pretty much learned as I’ve gone along, periodically looking at other teachers’ collections and resources.” “I would love to collaborate with other teachers,” she mentioned. Ms. Carlton also suggested the use of rubrics and tutorials to help educators make web archiving a part of the school curriculum.

Ms. Patricia Carlton and the students of Mount Dora High School have truly done an amazing job with web archiving. The collections are diverse, thoughtful, and really speak to the issues and passions that are unique to teenagers and to residents of Mount Dora, Florida. One of my favorites is the “Life in Mount Dora, Florida” collection. As someone who has never lived in Florida and who knows very little about Mount Dora, this collection gives me a peek into the lives that these students live every day. Check out this collection along with many more created by Mount Dora students by using these links:



And of course…Go Canes!!

mount dora logo

[1] “About Us,” Lake County Schools, accessed June 23, 2015, http://www.lake.k12.fl.us/domain/7745

[2] “Mount Dora High School,” Archive-It, accessed June 23, 2015, https://archive-it.org/organizations/498.


Hold up, wait a minute!

It recently came to my attention that I may have jumped the gun a little bit. In my eagerness to talk about web archiving, particularly Archive-It’s web archiving initiative for K-12 students, I failed to explain how this process actually works. After all, managing born-digital content is a relatively new venture in itself, and sometimes it can be hard to conceptualize. Hence, we’re going to take a closer look at how Archive-It renders its web archiving services. And yes, I’m sure a few inquiring minds out there want to know how in the world K-12 students are able to do this whole web archiving thing at such a young age. Don’t worry…we’ll talk about that too 🙂

What is a web archive?

Archive-It defines a web archive as a “collection of archived URLS grouped by theme, event, subject area, or web address.”[1] The goal of a web archive, however, isn’t just to make web content available. A web archive strives to recreate the same (or as close to the same) web experience a user would have gotten the day that site was archived. Basically, if you’re visiting an archived site, and it has the same background you remember, the right title, those fonts you used to love…yet when you scroll down you see a bunch of these everywhere

[Images: two broken-image icons]

that’s not good.

What is Archive-It?

Archive-It is a web archiving service powered by the Internet Archive. It was established in 2006 with the goal of working alongside partner organizations to archive their specific web content. I know what you might be thinking…“Doesn’t the Internet Archive do all of this same web archiving already?” Well, yes and no. The Internet Archive does archive the web, just on a broad scale. It’s called the “General Archive.” It’s automated, free, and it takes one snapshot of the web roughly every two months, capturing about 3 billion web pages per snapshot.[2] The downside is that there is no guarantee that any specific web content will be preserved on a regular basis. Individual organizations don’t have control over when these snapshots take place, and the General Archive collection is not easily searchable. Archive-It, on the other hand, is a user-friendly subscription service that allows organizations or individuals to maintain control over when and how their web content is preserved. The record creator becomes the archivist, selecting sites for preservation, capturing them, and creating metadata to facilitate the future use of these sites.

How does Archive-It actually archive it?

Archive-It is a web-based application, meaning partners do not need to install any software to use it. It uses open-source technology developed at the Internet Archive.[3] Here are the worker bees:

Heritrix—The web crawler that crawls and captures web pages

Umbra—Assists the crawler by loading pages in a browser, which helps capture content that only appears when scripts run

Wayback—The access tool that renders sites and lets us view them

NutchWAX—Facilitates full-text searching

Solr—Facilitates metadata searches

As an Archive-It partner, what’s my job?

All Archive-It partners get a login account. Within this account, partners can manage how their collections are preserved and accessed. Once a partner is logged in, the home site will look something like this:

Archive-It home site pic

Within a login account, partners can…

  • Create a new collection
    • Choose a name for the collection
    • Choose the frequency of crawls
    • Write metadata
    • Select topics/categories
    • Add “seeds”—seeds are the starting point URLs for the crawler (the seeds determine which sites are “in scope” for your crawl)
    • Manage the extent of the crawl
  • View summaries and reports about archived collections and other data that has been captured
  • Run test crawls
  • Start crawls
  • Manage collections and seeds
    • Can change the previous settings chosen in “create a new collection”
  • Manage access settings and searches
    • Can make access public or private for entire account, certain collections, or even specific URLs, crawls, pages, or IP addresses
    • Can browse archived content 24 hours after capture is complete through the Wayback machine
    • Full-text search is available after 7 days
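
If seeds and scope still feel a little abstract, here is a small Python sketch of the idea. Please treat it as an illustration only: the `Collection` class, its field names, the example URLs, and the rule that a page is "in scope" when it sits on the same host as a seed and under that seed's path are all simplifying assumptions of mine, not Archive-It's actual data model or scoping rules.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse


@dataclass
class Collection:
    """Hypothetical stand-in for the settings a partner chooses."""
    name: str
    crawl_frequency: str  # illustrative values, e.g. "one-time" or "weekly"
    seeds: List[str] = field(default_factory=list)

    def in_scope(self, url: str) -> bool:
        """Assumed rule: a URL is in scope if it shares a host with a seed
        and its path begins with that seed's path."""
        target = urlparse(url)
        for seed in self.seeds:
            start = urlparse(seed)
            if target.netloc == start.netloc and target.path.startswith(start.path):
                return True
        return False


# Example with made-up URLs: one seed pointing at a site's news section.
collection = Collection(
    name="Life in Mount Dora, Florida",
    crawl_frequency="one-time",
    seeds=["http://www.example.org/news/"],
)

print(collection.in_scope("http://www.example.org/news/2015/story.html"))  # under the seed
print(collection.in_scope("http://www.example.org/sports/"))               # same host, other section
print(collection.in_scope("http://other.example.com/news/"))               # different host
```

The takeaway is simply that the seed does double duty: it tells the crawler where to start, and it draws the boundary around what gets captured.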

Where is all of this data stored?

I’m glad you asked. Archive-It uses multiple methods of storage. When you create a collection through Archive-It, two copies of the archived data are stored at the San Francisco Data Center. Collections are then transferred to the General Archive as a third copy. You also have the option of obtaining a copy of the archived data on a hard drive, and you have the ability to download files from the Internet Archive server, aka the “PetaBox” storage system (shown below).[4]

IA petabox

Archive-It is also working with other digital preservation initiatives including the Stanford University-based program Lots of Copies Keep Stuff Safe (LOCKSS) and DuraCloud, a service of DuraSpace.

How on earth can K-12 students do web archiving? Isn’t it too complicated for them?

You’d be surprised. While most of the students involved in the program are in middle and high school, students as young as fifth grade have been involved in the Archive-It K-12 web archiving program and have succeeded wonderfully. Of course, their participation requires guidance from a teacher who has attended the necessary Archive-It training sessions and who is committed to helping them through the process. The students’ main responsibilities in the web archiving program include

  • Deciding on themes or topics for collections
  • Evaluating and selecting websites they want to preserve
  • Working with their teacher to use the Archive-It web application and enter URLs for preservation
  • Writing descriptive metadata about their collections
  • Using the online access interface to review their crawls and see what worked and what didn’t
  • Completing a short survey[5]

It’s really amazing to see what these students can do: https://archive-it.org/explore?fc=organizationType%3Ak12ProjectSchools

What if I want more information?

In fact, you should want more information, because I just scratched the surface here. I highly recommend that you sign up for the free informational webinar that Archive-It hosts via WebEx every few weeks. I took the webinar twice as a part of my internship, and it was wonderful both times. That’s where I got most of the information I just shared with you. The next webinar is scheduled for July 14, 2015 at 11:30 a.m. Pacific Daylight Time. If you’re interested, visit this link: https://archive-it.org/contact-us.

I hope this helps a bit!

[1] Lori Donovan and Scott Reed, “Archive-It Archiving and Preserving Web Content,” (webinar presentation, June 2, 2015).

[2] Jefferson Bailey, “Educational Partnership Training,” January 24, 2015.

[3] Donovan and Reed, “Archive-It Archiving and Preserving Web Content,” 2015.

[4] Donovan and Reed, “Archive-It Archiving and Preserving Web Content,” 2015; “Petabox,” Internet Archive, accessed June 18, 2015, https://archive.org/web/petabox.php.

[5] Archive-It, K-12 Web Archiving Program, accessed June 18, 2015, http://aitlearnmore.archive.org/files/2014/07/k12_webarchiving_overview.pdf

Show me your web search history, and I’ll show you who you are.

There are a few things in this world that I love to hate. One of them is my web search history. I hate it because it provides such a raw, unedited look into my life. While the majority of my search history is filled with the same, repeating domains (www.google.com, www.clayton.edu, www.gmail.com, www.facebook.com, www.linkedin.com, www.arl.org, www.youtube.com, www.outlook.office.365.com), it doesn’t change the fact that my search history reminds me of everything I’ve looked at online, uncouth or not (yes, I am human…I’ve made some uh…“questionable” site visits in my lifetime). Basically, my web search history is a look into my personal life that I didn’t ask to be accessible.


I can’t help but love my web search history, essentially, for the same reasons I hate it. Without asking my permission, my laptop stores a list of URLs  that it thinks are important to me based on the sites I visit each day. If someone next week, a year from now, or even 100 years from now managed to get a hold of this web search history and could still see the sites, what would they think of me? What would this web history tell them about my personality, my interests, my activities, my life?

The Archive-It K-12 Web Archiving Program asks these same questions of primary and secondary students. Today’s students are growing up in an age where the Internet and the Web are vital components of their personal, academic, and societal development. Websites facilitate learning, social interaction, shopping, recreation, creativity, and much more. Remember that saying, “show me who your friends are, and I’ll show you who you are”? Well, I think the saying, “Show me your web search history, and I’ll show you who you are” is becoming more and more appropriate. While most students enjoy interactions with the Web as a part of daily life, they may not realize that the websites they know and love will not be there forever. According to the archivist Jackie Dooley, “without periodic harvesting of all this [web content] information, the content is gone, gone, gone.”[1] As websites vanish before our eyes, the culture we’ve created around these sites becomes more and more at risk with every passing day.

The K-12 Web Archiving Program (https://archive-it.org/k12/introduction.html) began in 2008 thanks to a partnership between the Library of Congress (www.loc.gov) and Archive-It (https://archive-it.org), a web archiving service powered by the Internet Archive. Through this program, students have the opportunity to select, capture, and preserve websites for future generations of researchers. The program gives students the chance to save websites that represent their lives and that are important to them. It gives K-12 students a voice in the historical record while empowering them with knowledge and skills in digital preservation and archival literacy. Recently, I had the great pleasure of chatting with Cheryl Lederle, an Educational Resource Specialist at the Library of Congress. Ms. Lederle has played a major role in the growth, development, and promotion of the student web archiving program. I’ve included some pieces of our conversation below.

How did the K-12 Web Archiving Program come about?

“One of the biggest user groups [K-12 students] was not represented in the Archive-It collections,” Ms. Lederle said. After the pilot program in the spring of 2008, “there was a general consensus that [students] should be a part of the decision making” of web preservation.

What were some of the program’s initial goals?

According to Ms. Lederle, the program was intended to be “eye opening for many of the students” by helping them realize “how ephemeral many websites are.” Since Ms. Lederle is a former English teacher herself, she also noted the importance of allowing students to write the metadata that accompanies their archived web collections. “Even very young people—with the right support—can do it,” she said confidently.

What challenges did you foresee/experience in the program’s early days?

Ms. Lederle said it was challenging to figure out “how to make [the program] connect to curricular needs.” In fact, this is still a challenge. She explained that most schools involved in the web archiving program do so as a part of library programs or after-school programs that remain outside of “tightly prescribed” school curricula. However, Ms. Lederle mentioned that some educators, including Paul Bogush [link to: http://www.digitalpreservation.gov//multimedia/videos/k12preservingpresent.html] from James H. Moran Middle School [link to: http://www.digitalpreservation.gov/multimedia/videos/k12webarchiving.html]  in Wallingford, CT, have been able to incorporate the web archiving program into history and social studies courses. Web archiving teaches students that “who is preserving history is in fact writing history.”

How did you react when you saw the first student web archive collection?

In addition to the expected feelings of surprise and excitement, Ms. Lederle emphasized how pleased she was to look at the collections and say, “This [collection] is student work.” Ms. Lederle observed that the collections generally represent the students themselves, not the adults guiding them.

In what ways can this program benefit students and their awareness of the world around them?    

According to Ms. Lederle, web archiving “empowers students to take a long view of history” and encourages them “to be a part of the recording of history.” She also said web archiving helps students to understand that “preserving digital materials is trickier” than preserving analog ones. The web has an ephemeral nature that analog materials do not. Ms. Lederle gave the example of a box of letters. While the letters may be hidden or misplaced at some point in history and need to be found again, they will not vanish. Web archiving teaches students that digital materials are different: “this stuff might simply disappear.”

One of the best parts about the K-12 Web Archiving Program is that the student collections are easily available to the public for browsing and searching. The Archive-It website has made over two hundred student collections available for research, and these collections are searchable by subject, collector, creator, date, etc. (https://archive-it.org/explore?fc=organizationType%3Ak12ProjectSchools). Each school also has its own web archiving page for each year it participates in the program. My next few blog posts are going to take a deeper look into these K-12 collections. I’ll also include some of my discussions with teachers who have participated in the program with their students. Get excited, y’all!

[1] Jackie Dooley, “Going, going, gone: The imperative for archiving the web,” hangingtogether.org (blog), OCLC, April 22, 2015, http://hangingtogether.org/?p=5140.

Web 1.no?

I know I’m not the only one who wished for a time machine growing up. A real, bona fide, Back to the Future-rivaling time machine, mad scientist and all. Even though my parents and grandparents always tried to convince me that life “back in the day” was much harder and would have required me to walk to school, use “ma’am” and “sir” EVERY time, and manually change the television channels for adults at their every whim, I still desired to experience the past. I think what most attracted me to the idea of a time machine was the ability to see and understand what people from the past liked to do, what interested them, how they effected change, and what visions they had for the future.

Even now, at my ripe old age of twenty-four (#millennialsrock!), I shudder to think that there is an entire generation out there, breathing, eating, and tweeting, with zero knowledge of a concept that some of us Internet oldies like to call Web 1.0. In fact, Butch Lazorchak and Cheryl Lederle allude to this very phenomenon in their recent post on the Teaching with the Library of Congress blog. They state:

“If you believe the Web (and who doesn’t believe everything they read on the Web?), it boastfully celebrated its 25th birthday last year. Twenty-five years is long enough for the first “children of the Web” to be fully-grown adults, just now coming of age to recognize that the Web that grew up around them has irrevocably changed.”[1]

As a true “child of the web,”  I can attest to the fact that yes, the web has definitely changed. So much so that I feel the need to explain my prior mention of the term “Web 1.0.” When I say Web 1.0, I’m speaking of a time before you could share, like, comment, retweet, or repost anything on the web. I’m speaking of a concept commonly known as the “read-only” web. In other words, Web 1.0 is “the first implementation of the web” that only allowed users to search for information, access it, and read it—that is all.[2]

For those of us who were around to experience Web 1.0, let’s be real…we hardly remember it.  In my own experience, Web 2.0 seems to have an incredibly effective way of just wiping my memory clean of anything that came before it. Even the constant changes within Web 2.0 applications are hard to keep up with. I don’t even remember what Facebook looked like when I joined it in 2007. As we embark on Web 3.0 and the advent of the Semantic Web, I’m sure it’s just a matter of time before Web 2.0 becomes an ancient artifact.[3] But just because we may not remember our own web history does not mean that it hasn’t been preserved…

In 1996, Brewster Kahle got the feeling that this whole “web” thing might be a bit of a game changer for documenting human culture. Although the web was still in its infancy, Kahle founded the Internet Archive (www.archive.org) with the goal of building an Internet library. But the Internet Archive isn’t your average digital library. It’s bigger, like, a lot bigger. It doesn’t just contain books, movies, music, and images—it also has archived web pages. Since 1996, the Internet Archive has preserved about 480 billion web pages and over 80 million websites, making it the largest publicly available web archive in existence.[4] For those of you purists who like things in terms of bits and bytes, as of December 1, 2014, the Internet Archive’s web page collection contained almost 9 petabytes of data, which surpasses the amount of text at the Library of Congress.[5] We can view these archives using the Internet Archive’s Wayback Machine (https://archive.org/web/). If you haven’t visited the Wayback Machine, please go. You will have WAY too much fun. After all, it’s as close as we’re going to get to that time machine!
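
For anyone who wants to set the dial on that time machine by hand, here is a tiny Python sketch. It relies on my understanding of how Wayback Machine addresses are put together: a 14-digit timestamp (year, month, day, hour, minute, second) followed by the original URL, all after `https://web.archive.org/web/`. The function name `wayback_url` and the example date are my own inventions for illustration, and the sketch only builds the address; it does not check whether a capture actually exists for that moment.

```python
from datetime import datetime

# Assumed replay prefix for the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine address asking for original_url as of `when`."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # e.g. 20070601000000
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Example: ask for Facebook roughly as it looked in mid-2007.
print(wayback_url("http://www.facebook.com/", datetime(2007, 6, 1)))
```

As far as I can tell, when there is no capture at that exact second, the Wayback Machine shows the closest one it has, so an approximate date is usually good enough.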

Living out my childhood dreaming of having a time machine is all well and good, but there are a few heavier implications here as well. Preserving the web is not just something to do for fun; it is a cultural necessity. A recent study conducted by Andy Jackson at the British Library and presented at the 2015 International Internet Preservation Consortium (IIPC) General Assembly confirms this point. After ten years of studying content from the UK Web Archive, “[Jackson] found that after one year[,] half of the content was either gone or had been changed so much as to be unrecognizable. After ten years almost no content still resides at its original URL.”[6]

Web archiving is important because it preserves knowledge, culture, and a human experience that will not always be there, despite what we may think. Images of the “Occupy” movements, presidential election websites, the 2011 Egyptian Revolution—these are all examples of significant cultural and political events, documented largely on the web, and hence, at risk of eventually vanishing from societal memory if they are not actively and intentionally saved. That’s why web archiving initiatives are so imperative. The Internet Archive is certainly not the only web archiving initiative out there. The scholars Daniel Gomes and Miguel Costa reported that as of April 2013, there were at least sixty-four web archiving initiatives thriving worldwide.[7] You can find a list of these initiatives on their wiki page: http://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives.

I know, this web stuff is heavy, right? As we millennials like to say, “ish” just got real. But no need to panic 🙂 Through web archiving, we have the tools to let the same “ish” be real for someone one hundred years from now, even if by that time they are working with Web 30.0 or something. How cool is that?

1990s meme

[1] Butch Lazorchak and Cheryl Lederle, “The K-12 Web Archiving Program: Preserving the Web from a Youthful Point of View,” Teaching with the Library of Congress (blog), May 21, 2015, http://blogs.loc.gov/teachers/2015/05/the-k-12-web-archiving-program-preserving-the-web-from-a-youthful-point-of-view/.

[2] Tom Fleerackers, “Web 1.0 vs. Web 2.0 vs. Web 3.0 vs. Web 4.0 vs. Web 5.0 – A bird’s eye on the evolution and definition,” Flat World Business (blog), accessed June 5, 2015, https://flatworldbusiness.wordpress.com/flat-education/previously/web-1-0-vs-web-2-0-vs-web-3-0-a-bird-eye-on-the-definition/.

[3] I still don’t understand Web 3.0 all that well, but if you want to read more about it, there’s a good article on howstuffworks.com (http://computer.howstuffworks.com/web-30.htm). The World Wide Web Consortium also has great information about the Semantic Web (http://www.w3.org/standards/semanticweb/).

[4] Lori Donovan and Scott Reed, “Archive-It Archiving and Preserving Web Content,” (webinar presentation, June 2, 2015).

[5] “Frequently Asked Questions,” Internet Archive, accessed June 5, 2015, https://archive.org/about/faqs.php#The_Wayback_Machine.

[6] Abbey Potter, “Dodge that Memory Hole: Saving Digital News,” The Signal: Digital Preservation (blog), June 2, 2015, http://blogs.loc.gov/digitalpreservation/2015/06/dodge-that-memory-hole-saving-digital-news/.

[7] Daniel Gomes and Miguel Costa, “The Importance of Web Archiving for Humanities,” International Journal of Humanities and Arts Computing 8, no. 1 (2014), http://eds.b.ebscohost.com/eds/pdfviewer/pdfviewer?vid=7&sid=16f903a5-d7c9-41bd-9926-be84687647c8%40sessionmgr114&hid=117.

Semantically Sinking

It hasn’t taken me long to realize that I am semantically sinking in the world of digital archives. And when I say “sinking,” I mean thoroughly confused! I don’t know if anyone else feels this way or has felt this way, but if so, get ready to put on your life jacket. We’re going to float ourselves back upstream and tackle the semantic differences and similarities between some hefty, ubiquitous phrases now ruling the digital archival landscape. These phrases include digital curation, digital preservation, digital archives/archiving, and web archiving. Before I go any further, please know I’m not trying to insinuate that I hold any type of authority on how these phrases are defined or how they relate to one another. This post contains a mix of scholarship that I found helpful, but it is not comprehensive of all scholarly views out there. I’m merely a grad student/intern trying to understand this stuff to the best of my ability. Now, with that said, let’s have some fun.

Digital Curation is “hot”…

If there’s one thing I learned about digital curation during this semantic sojourn of mine, it’s that digital curation is new, hot, and everyone has a different opinion on it. As the archivists Christopher (Cal) Lee and Helen Tibbo noted in 2011, “the term ‘digital curation’ has recently come into use, reflecting the increasing confluence of previously distinct communities.”[1] Even though digital curation may be a new, shiny term, the actual word “curation” has quite a history. Lee and Tibbo’s article provides a brief look at how the word curation has evolved over its six hundred years of existence. Here’s a little breakdown:

  • 14th century: curation=healing; caretaking of one’s affairs
  • 1960s and 1970s: curation=care of scientific specimens
  • Mid 1970s: curation=term used in archaeology referring to the “anticipated performance of different activities”
  • 1980s and 1990s: curation=data curation, which referred to the management of scientific data
  • 2000 – present: digital curation=term used in the archives, library, and information science fields; “umbrella term” spanning across disciplines and activities[2]

Later in Lee and Tibbo’s article, they reference Elizabeth Yakel’s definition of digital curation from her 2007 OCLC article. She defines digital curation as “the active involvement of information professionals in the management, including the preservation, of digital data for future use.”[3] Yakel’s article also provides some different definitions of digital curation that other organizations have come up with. The organizations range from the Digital Curation Centre in the UK to good old Wikipedia. Here are some of the different definitions:

Digital Curation Centre

“Digital curation involves maintaining, preserving, and adding value to digital research data throughout its lifecycle.”[4]


Wikipedia

**Definition when Yakel wrote the article:

“Digital curation encompasses all of the actions needed to maintain digitised and born-digital objects and data over their entire life-cycle and over time for current and future generations of users. Implicit in this definition are the processes of digital archiving and digital preservation but it also includes all the processes needed for good data creation and management, and the capacity to add value to data to generate new sources of information and knowledge.”[5]

**Current definition:

“Digital curation is the selection, preservation, maintenance, collection and archiving of digital assets. Digital curation establishes, maintains and adds value to repositories of digital data for present and future use. This is often accomplished by archivists, librarians, scientists, historians, and scholars. Enterprises are starting to utilize digital curation to improve the quality of information and data within their operational and strategic processes. Successful digital curation will mitigate digital obsolescence, keeping the information accessible to users indefinitely.”[6]

e-Science Curation Report

Curation is “the activity of managing and promoting the use of data from its point of creation, to ensure it is fit for contemporary purpose, and available for discovery and re-use. For dynamic data sets this may mean continuous enrichment or updating to keep it fit for purpose.”[7]

From what I gather, digital curation is the big picture. It is the entire archival endeavor beginning from a record’s creation and extending into its archival life. A curator is involved in the creation, appraisal, selection, use, maintenance, preservation, archiving, access, promotion…everything pertaining to a body of records and its life span. A curator oversees the entire workflow of a collection and is also concerned with aggregates of collections.

…but calling “digital curation” by any other name is not

If you’re going to use the terms “digital curation” and “digital archiving” interchangeably, don’t do it when Adrian Cunningham is around. In his article, “Digital Curation/Digital Archiving: A View from the National Archives of Australia,” Cunningham makes clear his stance that curation is different from archiving, and hence, digital curation is different from digital archiving.[8] I’ll paraphrase a few points from his article:

  • Digital curation and digital archiving should not be used interchangeably
  • Curation of digital records deserves a separate term, which is “digital archiving”
  • Digital archives are not the same as digital libraries or museums
  • Digital curation of archival materials is not just about digital collection management
  • Digital preservation, digital libraries, digital archiving, and data management are all a part of digital curation[9]

This is just one opinion, but it is something to think about.

Under the umbrella…

Under this massive umbrella term, “digital curation,” we find ourselves swimming around with a bunch of other narrower terms, including digital archiving, digital preservation, and web archiving. According to the American Library Association, “digital preservation combines policies, strategies, and actions that ensure access to digital content over time.”[10] The California Digital Library defines digital preservation as “the managed activities necessary for ensuring the long-term retention and usability of digital objects.”[11] As for web archiving, the Internet Archive defines this term as “the process of collecting portions of web content, preserving the collections, and then providing access to the archives – for use and re-use.”[12]

So how does this all fit together?! Seth Shaw, one of my awesome professors at Clayton State University, gave me some advice for understanding these terms. He said:

“Consider each of the terms without “digital” (or “web”) in front of it. What then are the relationships between each of them (the denotative and connotative)? The same holds true if you restore the “digital” or “web” modifier (working with a subset of content based on form).”[13]

Whoa. Mind blown.

So maybe, just maybe, we can try something like this…

Digital curation is the biggest concept. It starts at a record’s creation and continues into its archival life. Digital curation INVOLVES digital archiving. Digital archiving includes all of the tasks that accompany traditional archiving (appraisal, selection, acquisition, arrangement, description, preservation, reference, access, outreach, management)—the only difference is that the records in question are digital, not analog. Web archiving is ONE WAY OF DOING digital archiving. Web archiving mostly focuses on selecting, capturing, and preserving web pages so they will be accessible in the future. As for digital preservation, it is ONE TASK INVOLVED in digital archiving. Digital preservation asks what it takes to preserve a digital object over time and how to ensure that the digital object is usable in the future. To come full circle, all of this is UNDER THE UMBRELLA of digital curation.

Whew, that was a lot. Hopefully you’re not still sinking, even if you don’t fully agree with some points I made or some points that the scholars made. You might still be floating in the water, but my hope is that after reading this post, you can at least see the shore.

[1] Christopher Lee and Helen Tibbo, “Where’s the Archivist in Digital Curation: Exploring the Possibilities through a Matrix of Knowledge and Skills,” Archivaria 72 (2011): 123-168, http://ils.unc.edu/callee/p123-lee.pdf.

[2] Ibid., 124-126.

[3] Elizabeth Yakel, “Digital Curation,” OCLC Systems & Services: International digital library perspectives 23, no. 4 (2007): 335,

[4] Ibid., 337.

[5] Yakel, “Digital Curation,” 337.

[6] “Digital Curation,” Wikipedia, last modified December 23, 2014, http://en.wikipedia.org/wiki/Digital_curation.

[7] Yakel, “Digital Curation,” 338.

[8] Adrian Cunningham, “Digital Curation/Digital Archiving: A View from the National Archives of Australia,” American Archivist 71, no. 2 (2008): 531, http://ezproxy.clayton.edu:2076/stable/pdf/40294529?acceptTC=true.

[9] Ibid., 530-543.

[10] “Definitions of Digital Preservation,” American Library Association, last accessed June 3, 2015, http://www.ala.org/alcts/resources/preserv/defdigpres0408.

[11] “Glossary,” California Digital Library, last accessed June 3, 2015, http://www.cdlib.org/gateways/technology/glossary.html#d.

[12] Lori Donovan and Scott Reed, “Archive-It Archiving and Preserving Web Content,” (webinar presentation, June 2, 2015).

[13] Seth Shaw (Assistant Professor of Archival Studies, Clayton State University) in email discussion with JoyEllen Freeman, June 2, 2015.
