Internal Database Training: HathiTrust

Video Recording

Notes

Overview

  • Open-access digitization of the collections of many different libraries; several tech giants were also involved in the project
  • Established in 2008 as a non-profit collaboration between academic/research libraries (Big Ten)
  • Organization provides lawful reading access to more than 17 million digitized items, and computational access to the entire HathiTrust corpus
    • 7 million in public domain
    • FHSU is not a member institution (KSU, KU and WSU are members) - members can access more than the public domain items
      • during COVID, emergency temporary access was available to member institutions
      • US Federal documents program
      • copyright review program
      • stewards collections in the scholarly interest, making works accessible online
  • Goals
    • Digital preservation repository
    • main goal is preservation
    • access is secondary priority
  • Content
    • from the initial Big Ten universities, the University of California (Cal) system, University of Virginia, Google, Internet Archive, and Microsoft as well
    • includes items in the public domain - these can be viewed by any user
    • items held in copyright are only viewable by member institutions, based upon what those institutions have on the shelf (what they've contributed)

Search

  • Ex. "Ulysses" as keyword search (defaults to full text + all fields)
  • Filter
    • Item Viewability: Full View
    • Filters for subject, place of publication, author, date, original format, contributing library
  • Advanced search lets you search by field
    • Title
    • Author
    • Subject
    • Publisher
    • Only Full Text
    • Full Text + All Fields
    • Series Title
    • ISBN
  • From results,
    • click "Catalog Record" to see the listing and which library contributed the item
      • see their MARC record if you're interested in that
      • Click "Find in Library" to take you out to WorldCat
    • Click "full view" to see the scanned book
      • scan numbering begins with the first image (the cover)
      • change view to plain text or the image
      • Change how you look at it (flip vs. scroll)
    • When viewing, "About this Item" lists the type of rights and other catalog details
    • Download options
      • PDF
      • .txt
      • JPEG
      • TIFF
      • indicate page ranges if you want to download only part of the item (if you're logged in and/or the book is public domain, you have more options)
    • Jump to Section
    • Get this Item
      • Find in a library --> WorldCat
      • Google Books
    • Collections
      • member institutions can create collections within HathiTrust, and individuals can add the item to a personal collection to stay organized
    • Share
      • permalink
      • social sharing icons
      • embed
  • If you create an account ... 
    • create personal collection (limited to things that are fully viewable)
  • Collections
    • start out with featured collections
      • book series, topics, subjects, University Press collections, etc.
  • HathiTrust is not currently enabled in Primo
    • it's a non-starter: 17 million records would inundate our catalog search results
    • about 5 years ago it was activated on the back end of Primo, and a search for Shakespeare brought up all the HathiTrust results first
    • we wouldn't add it to the Alma database because it's too many records and too much to manage
    • we need to encourage people to go to it directly, especially if they have a niche or hard-to-find topic, or need something older/fragile that couldn't go out as an ILL request
    • some items carry Creative Commons licenses

Research Center

  • Launched by Indiana University and University of Illinois
  • focused on the humanities, arts, and sciences
  • enables computational analysis of HathiTrust corpus
    • text data mining
    • freely accessible datasets
    • get visualization tools, worksets, derived data
    • could be useful for linguistics

Featured Database

Resources