DLF Forum 2013


METADATA FIRST

Using Structured Data Markup & the Google Custom Search API to Outsource Your Digital Collection Search Index

Jason Clark
@jaclark

Scott Young
@hei_scott





Kenning Arlitsch
Jason Clark
Patrick O'Brien
Scott Young


Today's Talk



Creating Indexable Content

For web search



Reusing Indexed Content

For local search








Creating Indexable Content


The Inside-out Library


"The challenge is not now only to improve local systems, it is to make library resources discoverable in other venues and systems, in the places where their users are having their discovery experiences."
— Lorcan Dempsey, September 2013


Duke University Library




"Discovery Turned Inside-out"




“It is imperative for libraries to ensure access to their content through search engines by engaging in the optimization of their content for higher SERP rankings.”
- Onaifo, D. (2013). Increasing libraries’ content findability on the web with search engine optimization. Library Hi Tech, 31(1), 87–108.
 

Digital Collections + Web Discovery


Donors and funding agencies want more accountability and demonstrated value.


Over 80% of students begin their research using internet search engines.


Improving web discovery via search engines leads to increased numbers of visitors & increased downloads.



Foundations of Indexable Content



A software tool that provides:


      • Item pages at a stable resolvable URL
      • Standards-based HTML(5) markup
      • Structured Data Markup
      • Navigable architecture with clear design 




Traditional SEO




      • Title tag & <meta> description
      • Sitemaps & robots.txt alignment
      • Server responses & error pages
      • Google Analytics & Webmaster Tools
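
One way to read "Sitemaps & robots.txt alignment" is that every URL listed in the sitemap should also be crawlable under the site's robots.txt rules. The sketch below illustrates that check with the Python standard library. The function names, the example.org URLs, and the robots.txt rules are hypothetical and not taken from the talk.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def crawlable(urls, robots_txt, user_agent="*"):
    """Keep only the URLs that the given robots.txt text allows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]


def build_sitemap(urls):
    """Serialize a minimal sitemap <urlset> containing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical robots.txt rules and item URLs, for illustration only.
robots_txt = "User-agent: *\nDisallow: /admin/"
candidate_urls = [
    "https://example.org/item/1",
    "https://example.org/admin/edit",
]
sitemap_xml = build_sitemap(crawlable(candidate_urls, robots_txt))
```

Filtering before serializing keeps blocked pages out of the sitemap, so the two files never give crawlers conflicting signals.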











Beyond Traditional SEO



Structured Data

schema.org 
microdata/RDFa Lite


Semantic Components 

Linked Data 
Social Tags
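
As an illustration of schema.org microdata on an item page, here is a minimal sketch that renders one collection record as HTML5 with `itemscope`, `itemtype`, and `itemprop` attributes. The choice of the `CreativeWork` type, the record fields, and the function name are assumptions made for this example, not the markup used in the presenters' collections.

```python
from html import escape


def render_item_microdata(item):
    """Render one record as HTML5 annotated with schema.org microdata."""
    return (
        '<article itemscope itemtype="http://schema.org/CreativeWork">\n'
        f'  <h1 itemprop="name">{escape(item["name"])}</h1>\n'
        f'  <p itemprop="description">{escape(item["description"])}</p>\n'
        f'  <span itemprop="creator">{escape(item["creator"])}</span>\n'
        f'  <time itemprop="dateCreated" datetime="{escape(item["date_created"])}">'
        f'{escape(item["date_created"])}</time>\n'
        f'  <link itemprop="url" href="{escape(item["url"])}">\n'
        '</article>'
    )


# Hypothetical record, for illustration only.
record = {
    "name": "Field Notes & Sketches",
    "description": "Handwritten notebook with pencil drawings.",
    "creator": "Unknown",
    "date_created": "1910",
    "url": "https://example.org/item/1",
}
item_html = render_item_microdata(record)
```

Escaping every value before it reaches the template keeps the markup well-formed even when metadata contains characters such as `&` or quotes.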

















Reusing Indexed Content



“The scope of library discovery services continues to evolve. We might characterize the situation we are in now as full collection discovery.”
— Lorcan Dempsey, September 2013




Full Collection Discovery




Solr/Blacklight



Solr/Blacklight Advantages



Faceted Search
Flexible Results
Stable URLs
Contemporary Design


Solr/Blacklight Barriers




Development Time




Alternatives




Google Custom Search


GCS for Digital Collections


Enables local discovery by reusing web-scale index

Onramp to digital collections discovery layer

Efficient for libraries (both small and large)

GCS Advantages 




Leverages an index already optimized for web search
Integration with leading commercial search engine
Faceted Browsing
Flexible Design
Search Analytics
API

GCS Barriers




Development Time
API
Cost




Business Case for Outsourcing

GCS Efficiencies 




1. Build indexable content for bots and humans

2. Reuse index locally




GCS Implementation

http://arc.lib.montana.edu/digital-collections/
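
A rough sketch of how a local search page might reuse the index through the Custom Search JSON API: build a request URL with the `key`, `cx`, and `q` parameters, then pull the title, link, and snippet out of the `items` array in the JSON response. The API key, engine ID, and sample response below are placeholders. This is not the code behind the MSU implementation.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, start=1):
    """Build a Custom Search JSON API request URL for one page of results."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_results(response):
    """Reduce a decoded JSON response to the fields a results page needs."""
    return [
        {
            "title": item.get("title", ""),
            "link": item.get("link", ""),
            "snippet": item.get("snippet", ""),
        }
        for item in response.get("items", [])
    ]


# Placeholder credentials and a hand-written sample response.
search_url = build_search_url("yellowstone fish", "YOUR_API_KEY", "YOUR_ENGINE_ID")
sample_response = {
    "items": [
        {
            "title": "Sample item",
            "link": "https://example.org/item/1",
            "snippet": "A sample snippet.",
        }
    ]
}
results = extract_results(sample_response)
```

Keeping URL construction and response parsing separate from the HTTP call makes both easy to test without spending API quota.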







For MSU Library, we generously estimated 20,000 queries a month against our search index, which works out to a cost of about $1,200 per year.


(Summon layer = 10,000 queries a month)
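
The arithmetic behind the estimate can be sketched as follows. The rate of $5 per 1,000 queries is an assumption: it is not stated on the slides, but it reproduces the quoted $1,200 figure. The sketch also ignores any free daily quota.

```python
def annual_query_cost(queries_per_month, usd_per_thousand=5.0):
    """Estimate yearly API cost from monthly query volume.

    usd_per_thousand is an assumed rate, not a figure from the talk.
    """
    return queries_per_month / 1000 * usd_per_thousand * 12


# 20,000 queries/month at the assumed rate: 20 * $5 * 12 months.
msu_estimate = annual_query_cost(20_000)
```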


Takeaways



The foundations of digital collections are built on interoperable, indexable content (metadata first)

With rich metadata and structured markup, powerful and flexible discovery platforms such as GCS become available


[http://www.lib.montana.edu/~jason/files/digital-collections-custom-search-api.zip]




Thank you!

Questions




Jason Clark
@jaclark

Scott Young
@hei_scott

DLF Forum 2013 — Metadata First

By Scott W. H. Young
