- 2004: Created by Yonik Seeley (CNET)
- 2006: Donated to the Apache Software Foundation
- 2008-9: v1.3-1.4, usage explodes
- 2010: Merged with Lucene
- 2011: version jumps from v1.4 to v3.1 to align with Lucene's numbering
- 2014: currently at v4.1
- Search engine/server/platform/whatevs
- Written in Java
- Open source, Apache 2.0 license
- Runs as a servlet inside a container such as Jetty or Tomcat
- Provides REST-ish HTTP API (XML or JSON)
- Highly customizable via configuration, plugins
- Also embeddable via EmbeddedSolrServer (although not considered a "best practice")
- fulltext searching
- result facets
- term highlighting
- query & index analysis filters
- text extraction (from PDF, Word, etc.)
- Nice admin UI
- easy relevance tuning
- sharding, replication
- NoSQL-ish
- more-like-this, did-you-mean, auto-complete
- nested documents (as of v4.5)
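Several of these features are switched on per request with query parameters. A minimal sketch of how faceting and highlighting might be requested, using Solr's `facet`, `facet.field`, `hl` and `hl.fl` parameters; the field names (`category`, `author`, `title`) and the helper `feature_params` are hypothetical and would have to match your own schema:

```python
from urllib.parse import urlencode


def feature_params(query, facet_fields, highlight_fields):
    """Build a list of (name, value) query parameters.

    A list of pairs is used instead of a dict because facet.field
    may be repeated, once per field to facet on.
    """
    params = [("q", query), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    if highlight_fields:
        params.append(("hl", "true"))
        # hl.fl takes a comma-separated list of fields to highlight
        params.append(("hl.fl", ",".join(highlight_fields)))
    return params


# Hypothetical field names, for illustration only.
params = feature_params("solr", ["category", "author"], ["title"])
query_string = urlencode(params)
print(query_string)
# q=solr&wt=json&facet=true&facet.field=category&facet.field=author&hl=true&hl.fl=title
```

The resulting string is appended to a core's `/select` URL after a `?`.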
- Two ways to get content in
- POST XML or JSON via HTTP
- DataImportHandler: import from an external data source (e.g. a database)
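The HTTP route can be sketched as a JSON POST to a core's `/update` handler. This is a rough illustration, not a definitive client: the host, port and core name (`localhost:8983`, `collection1`), the document fields, and the helper `build_update_request` are all assumptions. The request is built but not sent, so the snippet runs without a Solr instance.

```python
import json
import urllib.request

# Assumed for illustration: a local Solr with a core named "collection1".
# commit=true asks Solr to make the new documents searchable right away.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update?commit=true"


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build (but do not send) an HTTP POST that adds docs to the index."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents; field names must exist in your schema.
docs = [
    {"id": "1", "title": "Solr 101"},
    {"id": "2", "title": "Lucene in a nutshell"},
]
request = build_update_request(docs)

# To actually send it against a running Solr:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```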
- Indexing process
- field-based
- Defined in schema.xml
- fieldType defines how field content is processed
- analysis phase: tokenize, filter, transform
- storage options: index only, index + content, term frequency, positions, normalizations, vectors
- Inverted index format: terms -> documents
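To make the analysis phase and the inverted index concrete, here is a toy sketch in plain Python. It is not how Lucene stores its index; the stopword list, the `analyze` steps and the function names are simplifications chosen for illustration.

```python
from collections import defaultdict

# Illustrative stopword list, not Solr's default.
STOPWORDS = {"a", "an", "the", "of", "in"}


def analyze(text):
    """Analysis phase: tokenize on whitespace, strip punctuation,
    lowercase, and drop stopwords."""
    tokens = []
    for raw in text.split():
        token = raw.strip(".,;:!?\"'()").lower()
        if token and token not in STOPWORDS:
            tokens.append(token)
    return tokens


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


docs = {
    1: "The quick brown fox",
    2: "A quick search in Solr",
}
index = build_inverted_index(docs)
print(index["quick"])  # [1, 2]
print(index["solr"])   # [2]
```

Looking up a term is then a single dictionary access, which is why search stays fast as the number of documents grows. The same analysis has to be applied to the query text, otherwise `Solr` would never match the indexed term `solr`.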
- via HTTP GET: "?q=term1"
- default = Lucene query syntax
- free-form: term1 term2
- fielded: foo:term1
- phrases: "term1 term2"
- wildcards: "foo:term*"
- proximity: "term1 term2"~4
- ranges: [1 TO 1000]
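Each of the query forms above has to be URL-encoded before it goes into the `q` parameter. A small sketch, assuming a local Solr core named `collection1` and a hypothetical field `foo`; `select_url` is an illustrative helper, not part of any Solr client library:

```python
from urllib.parse import urlencode

# Assumed for illustration: a local Solr with a core named "collection1".
SELECT_URL = "http://localhost:8983/solr/collection1/select"


def select_url(query, **extra):
    """Return a GET URL for the query; urlencode escapes spaces,
    quotes, colons and brackets so the query survives the trip."""
    params = {"q": query, "wt": "json"}
    params.update(extra)
    return SELECT_URL + "?" + urlencode(params)


# The query forms from the list above, as plain Lucene syntax.
examples = {
    "free-form": "term1 term2",
    "fielded": "foo:term1",
    "phrase": '"term1 term2"',
    "wildcard": "foo:term*",
    "proximity": '"term1 term2"~4',
    "range": "foo:[1 TO 1000]",
}

for name, query in examples.items():
    print(name, select_url(query, rows=10))
```

Building the URL with `urlencode` rather than string concatenation avoids the common mistake of sending an unescaped space or quote, which truncates or breaks the query.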
- ExtendedDisMax (edismax) query parser
- search across range of fields with varying "boost" factors
- title:foo^5 fulltext:bar^0.5
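With edismax the per-field boosts are commonly supplied once through the `qf` (query fields) parameter, so the user's query text can stay free-form. A minimal sketch under that assumption; the field names `title` and `fulltext` come from the example above, while `edismax_params` is a hypothetical helper:

```python
from urllib.parse import urlencode


def edismax_params(user_query, field_boosts):
    """Build request parameters that search every field in
    field_boosts, weighting matches by the given boost."""
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {
        "defType": "edismax",  # select the ExtendedDisMax parser
        "q": user_query,
        "qf": qf,
        "wt": "json",
    }


params = edismax_params("foo bar", {"title": 5, "fulltext": 0.5})
print(params["qf"])  # title^5 fulltext^0.5

# Encoded form, ready to append to a core's /select URL after "?".
query_string = urlencode(params)
```

Here a match in `title` counts ten times as much as a match in `fulltext`, which is the usual starting point for relevance tuning.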
- http://lucene.apache.org/solr
- http://www.solrtutorial.com/index.html
- https://wiki.apache.org/solr/FrontPage
- https://slides.com/jamesluker/solr-101