Evergreen Search Tune-Up
Kathy Lussier, MassLNC Coordinator
klussier@masslnc.org
2015 Evergreen International Conference
5/15/2015
How I Learned About Evergreen Search
- Training from Equinox on configuring bibliographic indexes.
- Blog post from Dan Scott on adjusting relevancy in Evergreen 1.6
- Worked with MassLNC partners to adjust indexes and relevancy
- Learned from several mistakes along the way
Two Key Places to Look in the Database
- config.metabib_field
- Where keyword, browse, and facet indexes are configured
- Also available in the staff client: Server Administration -> MARC Search/Facet Fields
- metabib schema
- Several tables storing the index terms for each record in the system
config.metabib_field
Proper Title configuration
Interpreting MODS
marcxml
You don't need to use MODS for your indexes. Using the marcxml format, all of our identifier indexes are based on MARC tags, subfields and/or indicators.
Metabib Entries
To see how the title is indexed for record 3183740, run the following SQL:
SELECT * FROM metabib.title_field_entry WHERE source = 3183740;
Metabib Schema
- This schema contains several tables that store index terms for each record in the database.
- Each search class has its own metabib table:
- metabib.author_field_entry
- metabib.identifier_field_entry
- metabib.keyword_field_entry
- metabib.subject_field_entry
- metabib.title_field_entry
Stock Indexes
- Title class
- Abbreviated
- Translated
- Alternate
- Uniform
- Title Proper
- Author class
- Corporate
- Personal
- Conference
- Other
- Subject class
- Geographic
- Name
- Topic
- Temporal
- All Subjects
- Series class
- Series Title
- Identifier Class
- An entry for identifiers in the record (e.g. ISBN, ISSN, UPC, TCN, etc.)
- Keyword class: The blob
Keyword blob
The world of the Hunger Games The world of the Hunger Games The world of the Hunger Games Hunger Games Egan, Kate. creator text eng print 192 p. : col. ill. ; 21 cm. A companion guide to Panem, the world in the "Hunger Games," as portrayed in the motion picture based on the novel by Suzanne Collins. Welcome to Panem -- Life in the Districts -- Life in District 12 -- People of District 12 -- Katniss Everdeen -- At home with Katniss Everdeen -- Reaping Day -- Life in the capitol -- People of the capitol -- Tributes in the capitol -- Training for the Hunger Games -- Creatures of Panem -- Perils of the Hunger Games: the Cornucopia ; Fear ; Injuries ; Alliances ; Defiance ; Rule changes ; Love ; Lies ; Last move -- The game of Love -- After the games -- The Hunger Games glossary. adolescent by Kate Egan. Hunger games (Motion picture) Hunger games (Motion picture) Juvenile literature Hunger games (Motion picture) Hunger games (Motion picture) PN1997.2.H865 E345 2012 791.43/72 Hunger Games Collins, Suzanne. 9780545425124 (trade) 0545425123 (trade) 2011945839
Adding almost everything from MODS to the keyword index does not mean we are adding all MARC tags.
Adding a New Index to Evergreen
Insert a new entry in config.metabib_field
INSERT INTO config.metabib_field (field_class, name, label, xpath, format) VALUES (
'keyword',
'kw_isbn',
'Keyword ISBN',
$$//marc:datafield[@tag='020']/marc:subfield[@code='a' or @code='z']$$,
'marcxml'
);
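To see what an XPath like this selects, here is a minimal Python sketch run against a hand-made MARCXML fragment. It is an illustration only: Evergreen evaluates the XPath inside the database during ingest, not with this code. The function name isbn_subfields, the sample record, and its 020$c price are assumptions made up for the example; the two ISBNs are the ones from the Hunger Games record shown earlier.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace; "marc" is the prefix used in the XPath.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hand-made fragment for illustration (the 020$c price is hypothetical).
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="020" ind1=" " ind2=" ">
    <subfield code="a">9780545425124 (trade)</subfield>
  </datafield>
  <datafield tag="020" ind1=" " ind2=" ">
    <subfield code="a">0545425123 (trade)</subfield>
    <subfield code="c">$7.99</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The world of the Hunger Games</subfield>
  </datafield>
</record>"""


def isbn_subfields(marcxml):
    """Return the text of every 020 subfield a or z in the record."""
    root = ET.fromstring(marcxml)
    values = []
    # ElementTree's XPath subset has no "or", so the subfield code
    # is filtered in Python instead of in the path expression.
    for sub in root.findall("marc:datafield[@tag='020']/marc:subfield", MARC_NS):
        if sub.get("code") in ("a", "z"):
            values.append(sub.text)
    return values


if __name__ == "__main__":
    for value in isbn_subfields(SAMPLE_RECORD):
        print(value)
```

Each selected subfield would become its own entry in the keyword metabib table, which is why the 245 title and the 020$c value do not show up in the result.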
Does the field need normalization?
- config.index_normalizer contains all of the normalizers used during indexing
- The Evergreen wiki contains a good description of each of these normalization functions: http://bit.ly/evgils_normalize
- In config.metabib_field_index_norm_map, you need to map your new index definition (by ID) to the ID for the normalizer(s) that should be used.
Normalization Example
We want to map our new ISBN index to the ISBN10-to-ISBN13 (and vice versa) normalizer, which has an ID of 12.
Assuming the ID of our new keyword ISBN index is 1001, we would use the following SQL to map it:
INSERT INTO config.metabib_field_index_norm_map (field, norm, pos) VALUES
(1001, 12, 2);
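For readers unfamiliar with what a 10/13 ISBN conversion involves, the standard check-digit arithmetic can be sketched in a few lines of Python. This is not Evergreen's normalizer code, only the general ISBN rule; the function names are made up for the example, and the sketch assumes clean input (digits only, with an optional X check digit on an ISBN-10).

```python
def isbn10_to_isbn13(isbn10):
    """Convert a 10-character ISBN to its 13-digit form (978 prefix)."""
    core = "978" + isbn10[:9]          # drop the old check digit
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def isbn13_to_isbn10(isbn13):
    """Convert a 978-prefixed ISBN-13 back to ISBN-10."""
    if not isbn13.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 form")
    core = isbn13[3:12]                # drop prefix and old check digit
    # ISBN-10 check digit: weights run 10 down to 2, modulo 11.
    total = sum(int(d) * w for d, w in zip(core, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


if __name__ == "__main__":
    print(isbn10_to_isbn13("0545425123"))
    print(isbn13_to_isbn10("9780545425124"))
```

The two ISBNs used here are the pair from the Hunger Games record shown earlier. With both forms indexed, a search on either number can find the record.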
Reingest
MARC tags we've added
- 020a and 020z - keyword class. When mapped correctly, provides 10/13 ISBN conversion in keyword searching
- 028a Music number - identifier class
- 086 Gov Doc number - identifier class
- 222a Key Title - title class
- 260b Publisher - keyword class
- 245c Statement of responsibility - keyword class
- 505t Contents title - title class
- 505r Contents author - author class
- 740 ind2 Title analytic - title class
Pitfalls
- When a user enters a keyword from the newly-added MARC field along with other keywords from the record, the system will not retrieve the record because the search terms are in two different metabib entries.
- If added to the keyword class, the newly-added indexes may unintentionally receive more weight in relevance ranking.
Indexing Alternate Graphic Fields
marc21expand880 format - title
marc21expand880 format - author
Pitfall of marc21expand880
It adds new metabib entries for all of your records, even if they don't contain 880 fields.
Adjusting Relevance
Cover Density Algorithm
opensrf.xml
Adding Weight to an Index
Weight is a field in config.metabib_field
Because the keyword index is one big blob, we need to add indexes with a keyword class if we want some MARC fields to be weightier than others in keyword searching.
Even if we kept the weight of those keyword indexes at 1, the fields in those new indexes would become weightier because of rank_cd (cover density)
Keyword entries for Survival Guide
Keyword entries for Free Spirit
Weighting for Stemmed/Non-Stemmed Terms
- config.metabib_class_ts_map
- Server Admin -> MARC Search/Facet Class FTS Maps
config.metabib_class
Server Admin -> MARC Search/Facet Classes
On Combined Searches
- By default, you cannot retrieve records if the user's search terms appear in different entries for a particular search class.
- Subject searches are the exception.
- For example, in our Free Spirit publisher example, the user never would have retrieved the below record if they had typed "free spirit survival" as their search terms.
Enabling Combined Searches
- Enabling Combined Search for a class in config.metabib_class will allow search terms to cross indexes.
- BUT you essentially end up turning your keyword search back into a giant blob again, eliminating the benefit of adding specific fields for weighting.
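The difference between the default behavior and combined search can be shown with a toy Python model. It is a sketch only: it ignores stemming, normalization, and ranking, it is not how Evergreen builds its queries, and the function name and the two entry strings are made up for the example.

```python
def matches(entries, terms, combined=False):
    """Return True if the record satisfies every search term.

    entries  -- index-entry strings for one record in one search class
    combined -- False: all terms must occur within a single entry
                True:  terms may be spread across the record's entries
    """
    wanted = {t.lower() for t in terms}
    token_sets = [set(e.lower().split()) for e in entries]
    if combined:
        # Pool every entry's words, as if the class were one blob again.
        return wanted <= set().union(*token_sets)
    return any(wanted <= tokens for tokens in token_sets)


if __name__ == "__main__":
    # Hypothetical keyword entries for one record: the MODS blob
    # and a separately indexed publisher (260b) entry.
    keyword_entries = ["survival guide", "free spirit"]
    search = ["free", "spirit", "survival"]
    print(matches(keyword_entries, search))                  # default behavior
    print(matches(keyword_entries, search, combined=True))   # combined search
```

In the default mode the search fails because no single entry holds all three words. Combined mode finds the record, but only by treating the separate entries as one pool of words, which is the trade-off described above.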
Future Improvements to Relevance Ranking
Further Reading
- Bibliographic Indexing in Evergreen
http://wiki.evergreen-ils.org/doku.php?id=documentation:indexing
Questions?