Levels of Preservation: A Report Card

I’ve made no secret this term that our repository, the Center for the Study of Tobacco and Society, has a very… special way of going about its collection management. By this I mean there is no coherent schema that any professional collections manager would recognize. For some quick background, our institution is almost completely geared toward producing online exhibitions, so much so that we don’t have an online content management system beyond the WordPress site where we upload material for display in those exhibitions. Unsurprisingly, this is a very poor way of providing access. In fact, none of our material is indexed and searchable; the only way to get to an item is to find it in the exhibition in which it is featured.

Before this background bleeds into foreground I’ll sum it up quickly: CSTS is a mess. Now what does this have to do with the levels of preservation? Well, I was looking at this chart here from the lecture and deciding how many of these levels we actually manage to cross, whether by purposeful effort, clever negotiation, or just tripping into them, which ones we fail miserably, and how we can rectify that situation.

So let’s create a checklist then:

  • Storage and Geographic Location
  • File Fixity (Permanence) and Data Integrity
  • Information Security
  • Metadata
  • File Formats

With 4 levels all with varying conditions and directives…

  • Protect Your Data
  • Know Your Data
  • Monitor Your Data
  • Repair Your Data

So let’s make a quick assessment of how our institution does, then I will go into some detail with a few of these points, focusing on one we ace, one we fail and one we’ve either nearly made or nearly missed.

So here’s our report card: Green (+1 Point), Yellow (+.5 Point), Red (0 Point). Quick note, I’m giving us credit for Level 4 of Storage and Geographic Location, because we really need the points.

13.5/20 = 67.5%, not bad that’s almost a D+!
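For anyone who wants to check my math, here is a minimal sketch of the scoring in Python. The point values come from the key above; the individual cell counts are an assumption I made up so the total lands on 13.5, since the chart itself isn't reproduced here, and the function name `report_card_score` is just for illustration.

```python
# Point values from the report card key: Green, Yellow, Red.
POINTS = {"green": 1.0, "yellow": 0.5, "red": 0.0}


def report_card_score(grades):
    """Return (points earned, points possible, percentage) for a list of colour grades."""
    earned = sum(POINTS[grade] for grade in grades)
    possible = float(len(grades))  # each cell is worth at most 1 point
    return earned, possible, 100 * earned / possible


# Hypothetical tally: 5 categories x 4 levels = 20 cells.
# These counts are invented to match the 13.5 total; they are NOT read off the actual chart.
grades = ["green"] * 12 + ["yellow"] * 3 + ["red"] * 5

earned, possible, pct = report_card_score(grades)
print(f"{earned}/{possible:.0f} = {pct}%")
```

Any other mix of greens and yellows that adds up to 13.5 gives the same percentage.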

File Formats, Storage and Geographic Location:

We do exceedingly well at file format management, but this is a natural outgrowth of just doing our job. We don’t store things in formats they shouldn’t be kept in, and that’s a convenience for us more than an institutional policy. Furthermore, one of the aspects of my job is media capture and conversion, so I have a stake in ensuring that our converted media is kept in such a way as to be accessible and useable for a good long while to come. Do you know how boring it is to do digital capture of a VHS of a House Sub-Committee Meeting on the effects of cigarette smoking and the tobacco settlement? Do you know how mind-blowingly surreal it is to see a congressman in a video clip from the mid-1990s on one monitor in your office and, on your laptop, being interviewed about the midterms in 2018? Damn skippy I don’t want to have to do this again, so yes, file fixity, integrity, security, the works are all in play here even if we fall flat on everything else.

The geographic contingencies have more to do with how monstrously disorganized our director is. There’s a good chance anything we have here has been duplicated three or more times (hell, we’ve found a literal stack of one document copied 30 times over, all in one place, defeating the purpose, but still!). Also, our “satellite locations” include a pair of storage facilities (u-store-its) in Houston and an undisclosed location in Pennsylvania (not top secret, I just can’t remember), our Director’s house, and his office on the hill here on campus at the University Medical Center. So if one or many of these places exploded or vanished to the shadow realm we would have “contingencies”, so I’m giving us credit for those.

File Fixity, and Information Security:

We’re limited in what we can do with our file system as we are dependent on CCHS for our storage. This applies to our web storage beyond Google Drive. We have redundancy and integrity contingencies using UABox (University of Alabama’s Cloud Storage Service), Google Drive, CCHS’s Sharedrive, and a physical backup SSD that sits on my desk. Write protection is enabled by default, which is good, but it doesn’t protect files outside of application use, so that’s bad. Much of this is beyond our control and reflects very much how our collection is effectively a hobby or a luxury that UA CCHS tolerates and promotes when it suits them. There’s talk of a new facility built from the ground up, but what we need are actual archivists to help us direct what will go into this new infrastructure, as our director is again more interested in creating a museum space and space for his exhibitions, actual physical manifestations of the collection that SHOULD be the outgrowths of a properly managed collection.
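Since so much of the storage is out of our hands, one fixity check we could run ourselves is a checksum manifest: record a hash for every file, then compare later to see what changed or went missing. What follows is a rough sketch and not something we actually run; the function names are hypothetical, and it only uses Python's standard `hashlib` and `pathlib`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large video captures don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(old, new):
    """Report files whose contents changed, disappeared, or newly appeared."""
    return {
        "changed": sorted(k for k in old if k in new and old[k] != new[k]),
        "missing": sorted(k for k in old if k not in new),
        "added": sorted(k for k in new if k not in old),
    }
```

The idea would be to build a manifest for the backup SSD, keep a copy of it somewhere else, and rebuild and compare on a schedule. A renamed file shows up as one "missing" entry plus one "added" entry with the same checksum.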

Metadata

As I’ve mentioned in the past, CSTS doesn’t really have a proper CMS. We’ve had a mix of services and platforms but nothing to really help us create a consistent metadata schema and manage it properly by doing the actual data entry and processing needed to create a system that can pass muster beyond level 1. We have data fixity and security and a method of identifying what goes into our servers and onto our website, but that’s where the good news ends. Without actual metadata to speak of we can’t begin to store or transit it in level 2, nor can we store technical and descriptive metadata outside of placeholder file names that have the following format:

[Date-YYYY-MM-DD] – [Publication/Author] – [Title/Description].{file}

It’s sufficient for a placeholder and can be searched using Windows File Explorer, but it is not entirely secure, as the information can be altered simply by renaming the file. So it’s a failure on all fronts save one, and even that is a check mark with a caveat.
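Because the filename is the only metadata we have, it helps to be able to check that a name actually follows the pattern. Here is a small sketch, assuming the separator is a dash (en dash or plain hyphen) with a space on each side; the pattern, the function name, and the sample filename are all invented for illustration.

```python
import re
from datetime import date

# Date, source, and title separated by " – " (en dash) or " - " (hyphen), then an extension.
FILENAME_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) [–-] (?P<source>.+?) [–-] (?P<title>.+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_filename(name):
    """Return the filename's fields as a dict, or None if it breaks the convention."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    try:
        date.fromisoformat(fields["date"])  # rejects impossible dates such as 1994-13-45
    except ValueError:
        return None
    return fields


# Invented sample name, not a real item from the collection.
print(parse_filename("1994-04-14 – Hypothetical Gazette – Sample Headline.pdf"))
```

One limitation worth knowing: a publication name that itself contains a spaced dash will be split at the first one, so those files would need a manual look.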

Conclusion

This report card underscores the fact that the Center for the Study of Tobacco and Society is a collection, not an archive. Tangible steps have to be taken, and will be taken in the next year, to rectify these glaring issues and to reinforce the successful policies that have already been implemented.


Digital Library Interfaces: Usability and Intuition | An Evaluation of Three Sites

Part I: The Essentials

Like oh so many of those recipe blogs where the author feels like they have to indulge the reader in a trip down memory lane, either to grandmama’s kitchen or to the potentially embarrassing first date that was saved by whipping up the easy and delicious dinner 5 paragraphs and three audio-enabled ads down, I will let you all in on a harrowing tale of intrigue, despair, realization and introspection that I went on this week.

Oh, and I will also talk about what makes a digital library interface versatile, engaging, useable and intuitive, I might even decide which of these three things is most important to a digital interface… four, I meant four. Or did I?

Okay, so this week we were putting the finishing touches on our new exhibition, “The Makin’s of a Nation: Tobacco in World War I.” This exhibition draws not only on our own collection here at the Center for the Study of Tobacco and Society but also on material from the Library of Congress, the New York Public Library and the National Archives. Working on this exhibition has been something I have wanted to do since I started at the Center, and it draws on my previous experience working with military artifacts.

Our Collection Manager, who helps with proofing the exhibits and managing the exhibition key, came to me and wanted to know where I had gotten some of the images in our exhibition from. Since we are drawing from external sources, we need to verify and note when we use publicly available items from LOC, NARA or the NYPL. I looked at the images in question and realized I had not included their origin, so I would need to locate them online to verify where they came from.

Now let me talk for a moment about four concepts:

Versatility

This is a concept wherein a website can do many things without breaking a consistent flow, all available elements of the site are readily available within one domain or at the very least with a common visual theme or prompt. In more technical terms it means being able to utilize more diverse functions on one platform. These functions lead into our next point…

Engagement

Not only having galleries or databases of items for users, but interactives and media that enhance user experiences as well as forums (even, God help you, comment sections) for community engagement. Engagement can be done using CSS wizardry, making a website feel more alive and less static. A menu system that has dropdowns that appear on a mouseover is more likely to give the user confidence that the site is responsive and engaging.

Usability

Tying back into the last topic of engagement, usability is ensuring that all the functions actually work and do something. Failing in this category can be catastrophic to conveying competence and capability. Rosalie Lack, in The Importance of User-Centered Design: Exploring Findings and Methods, discusses usability in her survey findings, but mostly focuses on the ability of users to find resources. I’m discussing functionality more generally, but both are integral nonetheless.

Intuitiveness

Finally, intuitiveness is the result of how we use web resources, and this goes beyond our use of archival resources and down into how we utilize social media, streaming services, search engines and much more. The internet regularly redefines its rules of the road, as new services introduce new concepts that give us new gestures, processes and methods to access services and information. Keeping up with the modern aesthetic and the mechanics of contemporary web design is crucial.

Part II: Evaluation

To not labor this post for too long, let’s look at a site that does it right and then some that had some hiccups.

Doin’ it Right: The Library of Congress

LOC Crop
Exquisite work from the web designers at the Library of Congress

My God, it’s glorious. The Library of Congress looks like something that could be your homepage, because it is designed to look like a homepage. No seriously, look at this thing: the concept is great, it looks like a search engine homepage. Now, it doesn’t look like the homepage for Google (click that link to see what I mean), and there’s a good reason for that. Google’s homepage has become very clean, and the only information you get is maybe a scroll at the bottom and the Google Doodle where the logo is. Very clean and to the point. This has to do with how Google makes its money on the backend via searches, not on sponsored content like other search pages. In fact, LOC looks a lot like Yahoo.

You get a big lovely image from the collection and an easily found search box sharing the marquee with the LOC branding. Below that you are given links to various departments and affiliates of the Library, but look what comes up below that: a trending menu. I’d be very interested to know how the LOC determines what is trending; it would likely be by capturing popular search terms or perhaps the most accessed pages at that particular time. Below that, much like Yahoo’s homepage, you have articles on collections and materials and lastly “Your Library”, where an individual can access services and plan to go to the LOC. Beyond that are sections for Exhibitions (with one out of date), Events, and News.

The search function works very well and, most importantly, will present you with the most relevant resources first regardless of what they are, a stumbling block for some other institutions that I will highlight later.

Doin’ it Good: New York Public Library Digital Collections

screenshot-www.youtube.com-2018.10.26-14-47-58
There’s a search bar in that blank spot above “Featured Collections” not sure why it didn’t render in the screen grab.

I really dig the NYPL’s interface. It’s clean, modern, visually engaging and lets you know exactly what you are getting into. It reminds me very much of YouTube in some ways and of Bing in others. The Bing comparison has mostly to do with the emphasis on graphics and images in the interface. These Digital Collections focus primarily on photographs, so that makes the most sense. As a result, every subsequent page and interaction with the user is tailored to producing vivid, readily available items in an interface that brilliantly bridges classic functionality requirements with updated design.

The only knock I have for the interface is how interminably long the front page is; I’ve spared you its entire length by providing just a cropped screenshot. They bombard you with items from the collection, which are fascinating, but relevant links and important information and articles get buried a few laborious scrolls down toward the bottom of the page. It’s not a crippling deficiency, but it doesn’t quite come up to the level of effectiveness that the LOC exemplified.

This is covered in Lack’s article on user-centered design, where she discusses in part five of her results that navigation options need to be prominent and in an expected location, saying:

“It is easy for users to feel lost or disoriented while navigating through information-rich resources such as those on archives and library sites. It is helpful to make common navigation elements such as links to Home, Search, About, and Contact Us available and prominent on all pages.”

Using the footer is a good universal option for this but when you have so much content that the footer is dropped out of sight and far away from the user’s immediate reach the gesture is meaningless. As a web designer here’s a helpful guide for you to remember about what we call “eye-flow” or how users have been conditioned to view pages.

Web Layout Diagram

Doin’ it Not as Good: The National Archives and Records Administration

screenshot-www.archives.gov-2018.10.26-14-40-30
So far so good, but there’s some cohesion issues already.

 

Okay, so here’s what really inspired me to write this blog post on this topic. I had the most trouble finding what we were looking for at the National Archives and Records Administration. As a matter of fact, the interface was so clunky and unintuitive I never could find what we were looking for.

Now I want to give full credit to NARA, it’s a tremendous design, but there are some things that are out of place, and two in particular that I can point to immediately. The search bar is tucked away in the top right corner, where one could expect it, but it feels to me that this bar could be of more use featured out front as it was with the NYPL and the LOC. Now in fairness the LOC search bar is tucked in exactly the same place, but it is far more prominent in its presentation, whereas NARA’s search bar is more muted and subdued.

Also, NARA has a great deal going on in the content area that drags the user’s attention away immediately from the functional header. Merely flipping the featured item and the Archives News sections with the five featured item images would redirect the eye to focus on the top rather than quickly browsing all those choices and becoming disoriented. Think of it like a pyramid: the order should be 1, then 2, then 5, so that the eye flow moves naturally. In the current layout of 1, then 5, then 2, the user gets distracted and disoriented by options that may not necessarily be what they came to NARA for.

So what was my hang-up? Well, I was looking for images and could only get results for documents. I mean, I know it’s the National Archives and Records Administration, but are you telling me there are no images at all!? Well, if I hadn’t known better I would have simply assumed as much, shrugged my shoulders and left it be. No, it turns out you have to click a tab on your search results to view images.

Left: Initial Search Results, Right: The Images after clicking the inconspicuous “Images” tab.

Now you might say, “But hey Kevin, you said that the internet teaches us these practices and methods. How is that any different from Google Images? You have to click a tab to get to those, smart guy.” To which I would mutter, “How dare you do this to me.” You’re right, that is true, but I wasn’t expecting to have to do that in this interface, as I haven’t had to do this at other repositories. It’s a little thing I call resonance. Now I hear you saying, “You just made that up…” and you’re right, but I think it works here. Let me explain: resonance is this concept of cohesive flow, but expanded across other sites of like purpose and intent. Social media sites have essentially the same interfaces and options located in relatively the same areas. Now sometimes we have to consider the purpose of the website and weigh that against user needs. Perhaps this is something I can leave for you all to discuss in your responses?

The conclusion of the gripping saga of the missing photos, a Hardy Boys Mystery.

So where did those photos come from? Turns out they were from a book that was sitting on my desk. I wasted a whole bunch of time, sweat, blood and yes even tears looking for something that was right in front of me.

Hope you all have a better week than me.

Disorder & Digitization: Establishing protocols and practices in an amateur archival setting

I thought instead of reviewing one of the readings for my blog post, I might provide some insight from a unique experience in developing a digital collection as compared to a previous experience in curating a physical collection and how order can be established from seeming chaos in digital practice.

Background:

At the Center for the Study of Tobacco and Society (CSTS) I’ve had to develop, in conjunction with our collections manager, a considerably ad hoc version of an archive. The first thing to establish is that we cannot properly organize our physical collection for a host of reasons too contentious to get into here. Second, and more to the point of this article, we do not have access to any proper collections management software, and the only archival program we have access to is ArchivesSpace, which is not the digital asset management system we need. I assume someone searched for “archival software” on Google, found this program, saw it was free, and then found out its limitations and never bothered with it again.

Why digitize:

It’s my job. But more broadly, this material in a practical sense needs to be digitized so that it can be disposed of as it is (for the most part) little better than fodder for the recycler. Doubtless there are incredibly rare and valuable items in the collection but the vast majority are print outs, contemporary documents, and entire magazines often saved for a single ad. Digitizing this material would allow us to dispose of much of the material that we desperately need to get rid of pending a move out of our current facility.

Mechanics of Digitization:

The very short of how we do our capture is: any way we can. Most commonly our material is scanned using a photocopier, and that is sufficient for our purposes. More delicate items are of course handled with care when scanned; more substantial and particularly recent items (documents, etc.) are sheet-fed to expedite the process of digitization. We use cameras when needed, microphones if we are taking oral histories, a tape recording capture machine, a VHS converter, all manner of conversion and capture devices.

Metadata:

So, how do we manage our digitized material then? Short answer, poorly. Long answer, by trying to provide as much identifying information as Windows File Explorer will allow us to do using our Server space on the university’s server. To provide enough information we’ve developed a method for file naming that provides the most essential data we need.

[Date YYYY-MM-DD] – [Publication/Author] – [Title / Subject].<file>

Example:
1912-04-15 – New York Times – Titanic Sinks Four Hours After Hitting Iceberg.pdf

It’s not the most perfect solution but the hope is that by having this information in a standardized format when we do begin to translate this material over to a Content Management System it will make metadata creation easier.
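To give a sense of what that translation might look like, here is a minimal sketch that turns a list of filenames into CSV rows. The column names are my own guesses and aren't tied to any particular CMS; the function names are hypothetical too.

```python
import csv
import io

SEPARATOR = " – "  # en dash with a space on each side, as in the naming convention
COLUMNS = ["date", "source", "title", "format", "filename"]  # assumed column names


def filename_to_row(name):
    """Split one filename into metadata fields, or return None if it doesn't fit."""
    stem, dot, ext = name.rpartition(".")
    parts = stem.split(SEPARATOR, 2)
    if not dot or len(parts) != 3:
        return None
    item_date, source, title = (part.strip() for part in parts)
    return {"date": item_date, "source": source, "title": title,
            "format": ext.lower(), "filename": name}


def filenames_to_csv(names):
    """Return (CSV text, list of filenames that were skipped)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    skipped = []
    for name in names:
        row = filename_to_row(name)
        if row is None:
            skipped.append(name)  # left for a human to rename or describe
        else:
            writer.writerow(row)
    return buffer.getvalue(), skipped


csv_text, skipped = filenames_to_csv([
    "1912-04-15 – New York Times – Titanic Sinks Four Hours After Hitting Iceberg.pdf",
    "scan001.pdf",
])
print(csv_text)
```

The skipped list matters as much as the CSV: anything that doesn't follow the convention is exactly what would need hands-on description before it could go into a real system.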

Project Planning:

Content Selection – Most often our task is to digitize a physical exhibition that our director chooses. The immediate point of the website and therefore the digitization efforts of the Center are to translate these exhibitions into online formats. Since my arrival we’ve produced several exhibitions and will launch several more before the end of the year.

Rights Evaluation – The vast majority of our material is either original to the Center and the organizations associated with it, public documents, or vintage enough to fall under a comfortable fair use policy. The rest we post under the auspices of fair use for the purposes of education and research.

Digitization Strategy – This largely depends on what type of materials are included in the exhibition. Most often these are single-page documents or ads from magazines, so in general it’s Scan, Send, Crop, Process, Name, Upload, Post, Tag. There are other steps, such as designing the online exhibition pages, writing copy, coordinating with external contributors and so on. This process can take several months to complete one exhibition.

Descriptive Metadata Profile – I would offer some wisdom here but as I mentioned previously our metadata schema consists of filenames with regimented descriptors. Sorry.

Conclusion:

That’s about all I have to say about this. I wish I could provide more insight, but probably the most insightful thing I can mention is to echo what Jerell Jones said in the slides, that “there’s no ‘one way’ to digitize something properly.” Obviously our organization needs a CMS and with it more regulated methods for creating metadata, but in the main you will find as you move into different positions with varying degrees of organization that methods and practices will always vary depending on a wide range of causes.