The search capabilities at Librivox.org are limited, so it is better to use Internet Archive (IA) to search for Librivox audiobooks. I have gathered what I think I know about the Internet Archive search engine to assist users in their searches, and to assist Librivox volunteers who make audiobooks to get them found more often in searches.
For more information about maximizing your Librivox audiobook’s performance in searches, also see my webpage “Make Sure Your Librivox Audiobook is Heard for Years”.
Categories of Information in a Librivox Audiobook
Before a new audiobook is submitted for ‘publication’, the Book Coordinator has a ‘Project Template’ in which they fill in fields that become categories in the book’s metadata. The info they put in those fields may determine whether the book is heavily used in future, or will be largely ignored within months after its release. The fields relevant for search are:
- Title
- Author
- Topics
- Summary
- Genre
‘Genre’, the last field, was designed for use at Librivox.org, and genres don’t appear in Internet Archive searches. They are not discussed in this paper.
File Names, which are usually book chapters and sometimes include names of authors, appear in the top half of IA book pages (not in the metadata). I have not found any instances in which they affect search results, so I believe it is not possible to search for words in file names.
Reviews are sometimes submitted by listeners and posted on book pages. Words and phrases in reviews are included in certain searches, and are mentioned below.
Types of Searches at Internet Archive
Title search. Start at the IA Librivox Collection page (https://archive.org/details/librivoxaudio), and enter into the ‘Search this collection’ line just above the books: title:"your title here".
Do NOT leave a space after the colon at the end of the field name title:. Put quotation marks around the title when you know the exact title to ensure it searches for the words in the same order. Not using the quotation marks will give you books with the same words, but in any order. Searching without quotation marks is useful when you’re not sure of the exact title.
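If you build your searches in a script, the two title forms described above can be sketched in Python as follows. This is only an illustration: the function name title_query and its exact parameter are hypothetical, and the function simply assembles the text you would paste into the search line.

```python
def title_query(title: str, exact: bool = True) -> str:
    """Build the text for an Internet Archive title search.

    exact=True wraps the title in quotation marks (words in the same order).
    exact=False leaves the quotation marks off (same words, any order).
    """
    # Strip straight and curly double quotes so they cannot break the phrase.
    cleaned = title
    for mark in ('"', "\u201c", "\u201d"):
        cleaned = cleaned.replace(mark, "")
    cleaned = cleaned.strip()

    # No space after the colon in the field name "title:".
    if exact:
        return f'title:"{cleaned}"'
    return f"title:{cleaned}"


if __name__ == "__main__":
    print(title_query("Pride and Prejudice"))
    print(title_query("Pride and Prejudice", exact=False))
```

Paste the printed text into the ‘Search this collection’ line. Try the quoted form first, and fall back to the unquoted form when you are unsure of the exact title.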
Author search. At the same location (above), enter creator:Author’s name.
IA doesn’t recognize the word ‘author’; it requires ‘creator:’. I suggest trying it without quotation marks because authors’ names don’t always appear in exactly the same form. Some authors use initials, so you may need to try that as well.
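Because an author's name may be stored in full or with initials, it can help to generate both forms and try each in turn. The Python sketch below does that under simple assumptions: the helper name creator_queries is hypothetical, and the rule for forming initials (first letter of every name except the last) is only one plausible convention, not something Internet Archive prescribes.

```python
def creator_queries(full_name: str) -> list[str]:
    """Return creator: searches to try one at a time, without quotation marks."""
    parts = full_name.split()
    variants = [" ".join(parts)]

    # Assumed convention: abbreviate every name except the last to an initial.
    if len(parts) > 1:
        initials = " ".join(part[0] + "." for part in parts[:-1])
        variants.append(f"{initials} {parts[-1]}")

    return [f"creator:{variant}" for variant in variants]


if __name__ == "__main__":
    for query in creator_queries("Arthur Conan Doyle"):
        print(query)
```

Each printed line is a separate search to enter in the ‘Search this collection’ line.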
Multi-author books. The Librivox collection contains many books that are collections of short stories, poems, and essays, with many authors. In these books, authors are usually listed only in the ‘Summary’ field. See “Summary search” below.
Summary search. There are two ways to search the summaries: (1) put the field name description: in front of your search phrase, or (2) click “Search text contents” below the search field, which searches the summary and shows the search phrase in context in the results. Usually it is best to wrap your phrase in quotation marks, to get exact matches.
Subject (Topic) Search
How Subject Searches are Usually Carried Out
Internet Archive uses the term ‘Topics’ for a field of terms that describe the subjects of a book, but requires you to use the field name subject: when doing a search of topics.
I believe that currently the most common way of doing a subject search at Internet Archive for a Librivox audiobook is to use the ‘Search’ field at the Librivox Collection page to search for a word or several words, without including the field name subject:. By default, the setting under the Search line is set at “Search metadata”. Here is what the search engine does when you do NOT use field names like title:, creator:, or subject:.
- It searches only among Librivox audiobooks.
- It searches titles, authors, summaries, and reviews.
- If the search term is multi-word, it does a loose search rather than an exact-phrase search. Results include books that contain ALL the words anywhere in those four categories, regardless of whether the words are adjacent to each other.
This type of search tends to get a lot of results, many of which aren’t very relevant. The summary fields of audiobooks often contain profiles of authors and a lot of other information that is not a main subject of the book. If the search engine finds each of the words in a search phrase in different parts of the summary, or in the author or title fields, the book will appear in the search results. Even words in reviews by readers are included, adding further to the irrelevant results.
Carrying Out a Subject (Topic) Search
Getting comprehensive results from a subject search in this audiobook collection is difficult. Book Coordinators have used several different approaches when adding the topics that will be searched. The most common approach has been to use one-word topics. Many of these are not very helpful in subject searches because they are only marginally related to the main subject of the book, so search results are cluttered with books that aren’t really what the searcher is looking for.
Another frequent issue is that different Book Coordinators have used various words for the same subject. In order to find most of the books on that subject, it is necessary for the searcher to first guess all the possible words for it and search for each, often in both singular and plural.
I suggest you try searches of topics and summaries. Here are a few tips:
- If you have several possible search terms for the same subject, you can put them all in the search line and put OR (capitalized) between the words or phrases, to look for all of them in one search action.
- To use a multi-word search phrase, put quotation marks around it. This makes the search engine search for that exact phrase only, excluding cases where the words are scattered throughout the metadata.
- Try your search with subject:, and then with description:, to search the Topics field and then the Summary field. Instead of using description:, you can click “Search text contents” (below the Search field) to display the phrase in the Summary field in the search results.
- If you know authors or titles relevant to your subject, find them in Internet Archive Books, and check their entries for useful search terms.
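The first three tips can be combined in a small Python sketch: it quotes any multi-word phrase, joins alternative terms with a capitalized OR, and produces one search for the Topics field and one for the Summary field. The helper name field_any is hypothetical, and repeating the field name in front of every term is my own choice to keep each term tied to its field; it is not a documented requirement.

```python
def field_any(field: str, terms: list[str]) -> str:
    """Join alternative terms for one field with OR, quoting multi-word phrases."""
    clauses = []
    for term in terms:
        term = term.strip()
        # Quotation marks make the search engine look for the exact phrase.
        value = f'"{term}"' if " " in term else term
        clauses.append(f"{field}:{value}")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # Singular, plural, and a multi-word alternative for the same subject.
    terms = ["whale", "whales", "whale fishing"]
    print(field_any("subject", terms))      # searches the Topics field
    print(field_any("description", terms))  # searches the Summary field
```

Run the subject: search first, then the description: search, and compare the results.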
Library of Congress (LOC) Subject Headings
Although not many Librivox audiobooks have LOC subject headings yet, this will become increasingly common in future audiobooks. To find the LOC subject heading that fits your subject, search through the Internet Archive 4-million-book ‘Texts to Borrow’ collection with subject:, and note the multi-word topics used in the books in your search results. Then try them in subject searches in the Librivox collection.
How to Get Search Results Only from the Librivox Collection
You may have clicked on a topic in a Librivox audiobook and been confronted with thousands of results from IA’s entire collection of books, movies, audio, etc. I clicked on ‘Turkey’, and my search results screen showed subject:"Turkey" with 117,000 results! This was useless for me, but the issue can easily be fixed.
Add (space) AND collection:(librivoxaudio) to the Search line, after subject:"Turkey", like this: subject:"Turkey" AND collection:(librivoxaudio). Click ‘Search’. This time the results showed only Librivox audiobooks that have Turkey as a topic.
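The same fix can be applied to any search text with a few lines of Python. This is a minimal sketch; the function name librivox_only is hypothetical, and it only appends the restriction shown above when it is not already present.

```python
LIBRIVOX_FILTER = "collection:(librivoxaudio)"


def librivox_only(query: str) -> str:
    """Append the Librivox collection restriction to a search, once."""
    query = query.strip()
    if LIBRIVOX_FILTER in query:
        return query  # already restricted; leave it unchanged
    return f"{query} AND {LIBRIVOX_FILTER}"


if __name__ == "__main__":
    print(librivox_only('subject:"Turkey"'))
```

The printed text is what goes into the Search line before you click ‘Search’.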
I’m hoping that Internet Archive will limit by default all searches that take place within the Librivox collection, so the only results we see are Librivox audiobooks.
Using Advanced Search at Internet Archive
You’ll find a link to Internet Archive’s Advanced Search page by clicking inside the little search box in the top right corner of any screen.
- To limit results in Advanced Search to Librivox audiobooks, enter librivoxaudio in the ‘Collection’ field.
- Putting words into the ‘Description’ field will search the book summaries.
- To search for Topics, type subject into a ‘custom’ field.
Scroll down the Advanced Search page to find explanations of all the fields.
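For those who prefer a script to the form, the sketch below builds a link that asks Internet Archive for the same kind of search as JSON. It rests on assumptions that are not part of this page: that the advancedsearch.php address accepts the parameters q, fl[], rows, and output, and the function name advanced_search_url is hypothetical. The code only constructs the link; it does not contact Internet Archive.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names; check them against IA's own help text.
ADVANCED_SEARCH = "https://archive.org/advancedsearch.php"


def advanced_search_url(query: str, rows: int = 50) -> str:
    """Build a link for a search limited to the Librivox collection."""
    full_query = f"{query} AND collection:(librivoxaudio)"
    params = [
        ("q", full_query),
        ("fl[]", "identifier"),  # fields to return for each result
        ("fl[]", "title"),
        ("rows", rows),          # maximum number of results
        ("output", "json"),
    ]
    return f"{ADVANCED_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(advanced_search_url('description:"short stories"'))
```

Opening the printed link in a browser should show the matching identifiers and titles, provided the assumptions above hold.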