Amazon.com Web Services and Library Catalogs
Over the past few years, since I wrote the original Amazon.com module for Koha, I’ve received literally hundreds of complaints, mostly from librarians, about the legality of Koha’s use of Amazon.com’s Web Services. In fact, it’s fair to say I’ve spent considerably more time responding to these questions than I did writing the original module.
So … first of all, shocking as it may seem, Koha has the capability to use Amazon.com content in the OPAC search results and detail pages. To see this in action, feel free to visit the Athens Public Library’s OPAC:
http://search.athenscounty.lib.oh.us
It’s perfectly legal to aggregate the content in web applications such as Koha. In fact, Amazon.com expressly created the web services program so that people would write applications around it. Their business angle is no different from any other content provider’s: they expect to make money. The difference is that they don’t want to make the money from the people aggregating the content. Instead, they are hoping that the content will drive users to the Amazon.com website and that those users will purchase something.
If you have hesitations about this business model and don’t think your library should be involved in it, no problem: you can simply turn it off in your Koha installation and purchase similar services from other content providers with more traditional compensation methods. The Koha community is not trying to force you to use Amazon.com.
However, if, like many of the libraries that LibLime supports, you are on a tight budget, yet want to provide your patrons with this content, Amazon.com’s alternative service model gives you that ability. Here’s how it works and why it’s legal.
Let me preface this by noting that I’ve had extensive conversations with Amazon.com’s US legal department about Koha’s use of Web Services, and they have confirmed that Koha does not violate the terms of their agreement. This point is worth making: they want your library to use their content :-).
First off, a bit of background on Amazon.com’s Web Services Program. The basic idea is that Amazon provides machine-readable access to content about the items they have for sale. That content is indexed by ISBN, which makes it trivial to identify a relationship between an item in a library catalog and an item on Amazon.com. Web Services data includes:
- Item Jacket Cover Images;
- Item reviews by Amazon.com patrons;
- Item ratings by Amazon.com patrons;
- Professionally written item descriptions and reviews.
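Matching on ISBN only works if the ISBN pulled from the catalog record is clean. Here is a minimal sketch of that clean-up step, assuming the catalog field may carry hyphens, spaces, or a trailing note such as "(pbk.)". The function name normalize_isbn is hypothetical and is not part of Koha or of Amazon's API; the checksum rules are the standard ISBN-10 and ISBN-13 ones.

```python
import re
from typing import Optional


def normalize_isbn(raw: str) -> Optional[str]:
    """Return a bare ISBN-10 or ISBN-13, or None if it fails its checksum.

    Hypothetical helper for illustration; not taken from Koha.
    """
    # Keep only the leading run of digits, X, hyphens, and spaces,
    # which drops trailing notes like "(pbk.)".
    match = re.match(r"[\dXx\- ]+", raw.strip())
    if not match:
        return None
    isbn = re.sub(r"[\- ]", "", match.group(0)).upper()

    if re.fullmatch(r"\d{9}[\dX]", isbn):
        # ISBN-10: weights 10..1, "X" counts as 10, total divisible by 11.
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch))
            for i, ch in enumerate(isbn)
        )
        return isbn if total % 11 == 0 else None

    if re.fullmatch(r"\d{13}", isbn):
        # ISBN-13: alternating weights 1 and 3, total divisible by 10.
        total = sum(
            int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn)
        )
        return isbn if total % 10 == 0 else None

    return None
```

A record whose ISBN fails the check can simply be shown without any Amazon.com content rather than risk pairing it with the wrong item.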
Koha’s Amazon module can interact with Amazon.com’s web services program in either of two ways, in accordance with the license agreement that every Web Services user must abide by:
- Koha can be configured to periodically download content en masse, cache it locally on one of your library’s servers, and serve it to your users via the OPAC;
- Koha can download the content in real-time as a search result set or detail page is loaded.
The Web Services agreement has very specific requirements about usage and discusses both of these methods in great detail. The most relevant points to this discussion are:
- if content is cached locally, it must be updated every 24 hours;
- if you download in real-time, you can only download up to 1000 items per IP address per day;
- if you download in real-time, you cannot download more than one item per second per IP address;
- if you use their content, you must provide a link back to any Amazon.com page.
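To make the caching rule concrete, here is a rough sketch of how a periodic job might decide which cached records are due for a refresh. It assumes the cache records, for each ISBN, the UTC time its content was last fetched; the names MAX_CACHE_AGE, is_stale, and isbns_to_refresh are hypothetical and do not come from Koha's code.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

# The 24-hour figure comes from the agreement as summarized above.
MAX_CACHE_AGE = timedelta(hours=24)


def is_stale(fetched_at: datetime, now: Optional[datetime] = None) -> bool:
    """True if a cached record is 24 hours old or older."""
    now = now or datetime.now(timezone.utc)
    return now - fetched_at >= MAX_CACHE_AGE


def isbns_to_refresh(
    cache: Dict[str, datetime], now: Optional[datetime] = None
) -> List[str]:
    """Return the ISBNs whose cached content must be downloaded again.

    `cache` maps ISBN -> time of last fetch (assumed layout, for illustration).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(isbn for isbn, fetched in cache.items() if is_stale(fetched, now))
```

A nightly job could call isbns_to_refresh and re-download only those records, which keeps the cache within the 24-hour window without re-fetching everything.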
Since Koha supports both caching and real-time downloads, each library would need to determine, based on its usage patterns, which method or combination of methods works best for its situation. Keep in mind that images are downloaded by the user’s browser, not by the Koha application, so image requests don’t count against the Koha server’s limits of 1000 queries per day and one download per second per IP address.
If a library didn’t want to cache data locally, yet had more than 1000 views of its detail pages per day, it would be trivial to track the number of times that Amazon.com content was syndicated and turn it off once the day’s cap is reached. It would be similarly trivial to keep track of the number of queries to detail pages per second and only permit one per second, or to use JavaScript to download the content from the browser rather than the server.
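The counting described here could look something like the sketch below. It is only an illustration: the class name RequestBudget is hypothetical, it keeps its counters in memory for a single process, and it assumes that a "day" means a UTC calendar day, which the agreement as summarized above does not spell out.

```python
import time
from typing import Callable, Optional


class RequestBudget:
    """Tracks a daily cap and a minimum gap between real-time requests.

    Hypothetical helper; the defaults of 1000 per day and one per second
    are the limits quoted in this post.
    """

    def __init__(
        self,
        daily_cap: int = 1000,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.daily_cap = daily_cap
        self.min_interval = min_interval
        self.clock = clock  # injectable so the logic can be tested
        self.day: Optional[int] = None
        self.count = 0
        self.last_request: Optional[float] = None

    def allow(self) -> bool:
        """Return True and record the request if it fits within both limits."""
        now = self.clock()
        today = int(now // 86400)  # assumption: days roll over at UTC midnight
        if today != self.day:
            self.day = today
            self.count = 0
        if self.count >= self.daily_cap:
            return False  # cap reached: show the page without Amazon content
        if self.last_request is not None and now - self.last_request < self.min_interval:
            return False  # too soon after the previous request
        self.count += 1
        self.last_request = now
        return True
```

The OPAC would call allow() before each real-time lookup and render the page without Amazon.com content whenever it returns False. A multi-process web server would need to keep these counters somewhere shared, such as the database, instead of in memory.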
So the bottom line is that it’s not at all difficult to use Amazon’s program without abusing it. It’s up to each library to make an informed decision about whether and how to use it.
