Talk About: Ebook Preservation
Sue Polanka, Wright State University Libraries; Ken Breen, EBSCOhost; Rolf Janke, SAGE Reference
Heard a librarian once say that when she purchases ebooks, she wants them to be available for the next 500 years. But how do we do the next 5 years? Easier with print. Usability, Authenticity, Discoverability, Accessibility over the long term.
What does it mean to preserve ebooks? Only the text? The experience of any enhanced content? What format? Will the format we choose today serve us in the future? Who is responsible? If I don’t own it, is it my responsibility to archive it?
Trigger events – things are kept secure until something necessitates access. Are trigger events the same for a publisher as for an aggregator? What if I decide I don’t like the platform but I own the books? Do I have the right to take the files and put them elsewhere? Are they preserving only so something operates on their platform, or for operation on any platform? (State Library of Kansas v. OverDrive issue.)
Who pays for preservation? Is it built into the price of the book or access fees? Or the cost of doing business?
Rolf, SAGE Reference
Beginning, middle and end of content creation. He was trying to figure out what our perspective is. Thinking about his iTunes library – what happens if something goes wrong with iTunes and he can’t have his stuff anymore?
Perpetual ownership. SAGE owns the copyright, but you own the product.
Every model has limitations. Not everybody will win. Publishers have a basic responsibility to participate in as many of the initiatives as possible around digital content preservation. Make sure there’s an archive of their content “just in case.”
They have a responsibility to protect the version of record. But what is that when the content is in a dynamic form? The one that’s identical to the print? What if an online edition has a change to citation? Errata, multimedia?
Contractual challenges – we don’t read fine print either! Legal language will be the ultimate protector. [Wait, if you're not reading it either how does that work?]
Publishers need to start thinking about preservation before dissemination. Be proactive and have a plan for change. Need to be more collaborative with customers and partners. Find the resources to create an operational infrastructure around preservation.
What does it mean to preserve ebooks? Helpful definitions. Light archive vs dark archive. Light archives are basically today’s databases. Safe in the short run as long as these companies are still around. Dark archives are where the conversation is focused. Create a repository so users can still access the content. Access to these archives is tightly guarded – inventory, audit, testing, etc.
Is the text all we need to preserve? All aggregators have the same file from the publisher and then do their own things to differentiate. Ultimately the publisher owns the original file – it’s their decision as to what becomes the version of record.
Portico takes the publisher’s original file and normalizes it. Can then migrate to another format in the future if necessary. CLOCKSS takes a capture of the last user interface version of the ebook and preserves that.
Who is responsible? Publisher owns the rights. Rights get transferred over time, so maybe the publisher no longer retains them. In that case? If you bought it, your perpetual access continues, but no new copies can be sold, and the title is removed from subscription services.
What if an aggregator exits the business? Not a trigger event for a dark archive – there are other aggregators to get the item from. Publishers could make deals with aggregators – if you go out of business, our customers going through you can access their purchased titles through another aggregator.
Weeding? Platforms designed to be additive – unlimited shelf space. Need to find a way to allow you to weed. Is there a way to hide titles from users but re-enable access later? Permanently deleting? If you purchased in perpetuity and permanently deleted it, are you entitled to it if a trigger event causes an archive to release it?
Q: Aside from CLOCKSS and Portico, have you considered talking to digital archivists? A: Rolf: No. Learning curve on all of this and what these bodies represent. Scalability.
Q: Hi, I’m from CLOCKSS. In charge of ebook and ejournal preservation. Have you considered archiving your content but also the ability to render it in your original presentation? We could show it to the end user in the same way they had it when they purchased it. A: Rolf: Yes. Q: I want the engine that renders your content, don’t make us cobble together a rendition engine. A: Rolf: Can I change my answer to No? What does this involve from an infrastructure POV?
Comment: With Philly PL. I’m starting to ask for a copy when dealing with the publisher directly. They’re giving me the file or XML on disc. No idea what we’ll do but we have them in a box on my desk.
Sue: My library is doing the same thing, very difficult to normalize for our own interface. Takes a lot of work.
Q: Small private academic. Ebrary subscription. 70,000 MARC books, yay! Talking to faculty and decided if we have it in ebook, we won’t get print. Today someone emailed me saying McGraw-Hill just pulled 2,500 books out of ebrary. We’ve had it for two months. If we’re committing the money, aggregator & publisher need to commit to availability for a certain amount of time. A: Ken – decision by the publisher across all aggregators. Happened to us, too. They own the content, though. Risk of a subscription.
Q: Bookmarks, notes in the margin. Adds value, but patrons want to keep the book. Now that they’ve added stuff they want to purchase it. Could you look into this? A: Rolf: If it makes money, I like it. Just need a shopping cart. Could also do just a chunk of the book if that’s all someone wants. Thinks this is coming.
Q: Light v dark archive – academic equivalent is offsite storage. This involves bringing all the content back – what if we just want one book? Reasonable, timely manner, not an onerous process. Thoughts on that? A: Ken: That’s a good question but I don’t know the answer.