Elasticsearch: Advanced Query
Han Yi
May 1, 2018
Types of Advanced Query
- Aggregations
- Suggesters
- Scripts
- Search Templates
Basic Types of Aggregations
- Bucketing: group by
- Bucketing aggregations can contain nested sub-aggregations, either bucketing or metric
- Metric: calculation, like avg, sum, min, max
- Matrix: calculate numeric statistics over a set of fields
- Pipeline: chain aggregations by aggregating the output of other aggregations
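To make the bucketing/metric distinction concrete, here is a minimal Python sketch that emulates a group-by bucketing with a nested avg metric over an in-memory list. The sample documents and the function name `bucket_avg` are hypothetical; this only illustrates the idea and is not how Elasticsearch executes aggregations.

```python
from collections import defaultdict

def bucket_avg(docs, bucket_field, metric_field):
    """Group docs by bucket_field, then compute doc_count and avg per bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[bucket_field]].append(doc[metric_field])
    return {
        key: {"doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    }

# Hypothetical sample data, not taken from a real index
docs = [
    {"color": "Red", "price": 10.0},
    {"color": "Red", "price": 30.0},
    {"color": "Black", "price": 50.0},
]
result = bucket_avg(docs, "color", "price")
```

Each key of `result` plays the role of a bucket, and the `avg` entry plays the role of a metric sub-aggregation computed inside that bucket.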
Aggregations Structure
"aggregations" : {
"<aggregation_name>" : {
"<aggregation_type>" : {
<aggregation_body>
}
[,"meta" : { [<meta_data_body>] } ]?
[,"aggregations" : { [<sub_aggregation>]+ } ]?
}
[,"<aggregation_name_2>" : { ... } ]*
}
- Meta: metadata attached to an individual aggregation at request time is returned in place with that aggregation in the response
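The structure above can be assembled programmatically. The following is a small sketch, assuming the request body is built as a plain Python dict; the helper `build_agg`, the aggregation names and the `meta` content are all hypothetical.

```python
import json

def build_agg(agg_type, body, meta=None, sub_aggs=None):
    """Assemble one aggregation definition from its type, body, meta and sub-aggregations."""
    agg = {agg_type: body}
    if meta is not None:
        agg["meta"] = meta
    if sub_aggs:
        agg["aggregations"] = sub_aggs
    return agg

request = {
    "size": 0,
    "aggregations": {
        # Hypothetical names: "by_color" and "avg_price"
        "by_color": build_agg(
            "terms",
            {"field": "color"},
            meta={"chart": "pie"},
            sub_aggs={"avg_price": build_agg("avg", {"field": "price"})},
        )
    },
}
print(json.dumps(request, indent=2))
```

The printed JSON follows the layout shown in the slide: the aggregation type and its body, then the optional `meta` and nested `aggregations` keys at the same level.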
Bucket Aggregations
GET /_search
{
"aggs": {
"nested_aggs": {
"nested": {
"path":"child"
},
"aggs": {
"filtered_aggs": {
"filter": {
"bool": {
"must": [
{
"term": {
"child.color":"Red"
}
}
]
}
},
"aggs": {
"lvl1": {
"terms": {
"field": "child.category.lvl1",
"order": {
"count":"desc"
}
},
"aggs": {
"count": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
}
- Terms/Nested/Reverse Nested aggregation
- Group by nested field
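The difference between the terms bucket count and the reverse_nested count is easy to miss: inside the nested aggregation the count refers to nested (child) documents, while reverse_nested joins back to the parent documents. Below is a rough in-memory emulation in Python; the sample products and the function `count_by_category` are hypothetical, and the code mirrors the semantics only, not Elasticsearch's execution.

```python
from collections import defaultdict

def count_by_category(products, color):
    """Return {category: (child_doc_count, parent_doc_count)} for children of one color."""
    child_counts = defaultdict(int)
    parent_ids = defaultdict(set)
    for product in products:
        for child in product["child"]:
            if child["color"] != color:
                continue  # plays the role of the "filter" aggregation
            category = child["category"]["lvl1"]
            child_counts[category] += 1                # terms bucket doc_count
            parent_ids[category].add(product["id"])    # reverse_nested doc_count
    return {c: (child_counts[c], len(parent_ids[c])) for c in child_counts}

# Hypothetical parent documents with nested "child" objects
products = [
    {"id": 1, "child": [
        {"color": "Red", "category": {"lvl1": "Coats"}},
        {"color": "Red", "category": {"lvl1": "Coats"}},
    ]},
    {"id": 2, "child": [
        {"color": "Red", "category": {"lvl1": "Coats"}},
        {"color": "Blue", "category": {"lvl1": "Shoes"}},
    ]},
]
counts = count_by_category(products, "Red")
```

In this sample three red children fall into "Coats" but they belong to only two parent products, which is why the request orders the buckets by the reverse_nested sub-aggregation named "count".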
Bucket Aggregations
GET /_search
{
"aggs" : {
"price_ranges" : {
"range" : {
"field" : "price",
"ranges" : [
{ "to" : 100.0 },
{ "from" : 100.0, "to" : 200.0 },
{ "from" : 200.0 }
]
}
}
}
}
- Range aggregation
- Group by range
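In a range aggregation each range includes its `from` value and excludes its `to` value, so a price of exactly 100.0 lands in the second bucket. A minimal Python sketch of that rule follows; `range_bucket` is a hypothetical helper, and it returns only the first matching range, which is sufficient here because the ranges do not overlap.

```python
def range_bucket(value, ranges):
    """Return the index of the first range containing value, or None if no range matches."""
    for i, r in enumerate(ranges):
        lower_ok = "from" not in r or value >= r["from"]  # "from" is inclusive
        upper_ok = "to" not in r or value < r["to"]       # "to" is exclusive
        if lower_ok and upper_ok:
            return i
    return None

# Same ranges as in the request above
ranges = [
    {"to": 100.0},
    {"from": 100.0, "to": 200.0},
    {"from": 200.0},
]
boundary_bucket = range_bucket(100.0, ranges)
```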
Metric Aggregations
- Top Hits Aggregation
- Retrieve documents from bucket
GET /_search
{
"aggs": {
"nested_aggs": {
"nested": {
"path":"child"
},
"aggs": {
"filtered_aggs": {
"filter": {
"bool": {
"must": [
{
"term": {
"child.color":"Red"
}
}
]
}
},
"aggs": {
"lvl1": {
"terms": {
"field": "child.category.lvl1",
"order": {
"count":"desc"
}
},
"aggs": {
"count": {
"reverse_nested": {},
"aggs": {
"top_hits": {
"top_hits": {
"sort": [{
"price": {
"order": "desc"
}
}],
"_source": {
"includes": [ "name", "price" ]
},
"size" : 1
}
}
}
}
}
}
}
}
}
}
}
}
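Reading the result of the request above means walking the same nesting in the response. The sketch below assumes the response has already been parsed into a Python dict; the function `top_products` and the trimmed sample response are hypothetical and exist only to show the path to each bucket's top hit.

```python
def top_products(response):
    """Map each lvl1 bucket key to the _source of its first top hit."""
    buckets = response["aggregations"]["nested_aggs"]["filtered_aggs"]["lvl1"]["buckets"]
    result = {}
    for bucket in buckets:
        hits = bucket["count"]["top_hits"]["hits"]["hits"]
        if hits:
            result[bucket["key"]] = hits[0]["_source"]
    return result

# Hypothetical, heavily trimmed response
sample_response = {
    "aggregations": {"nested_aggs": {"filtered_aggs": {"lvl1": {"buckets": [
        {
            "key": "Coats",
            "doc_count": 3,
            "count": {
                "doc_count": 2,
                "top_hits": {"hits": {"hits": [
                    {"_source": {"name": "down jacket", "price": 199.0}}
                ]}},
            },
        }
    ]}}}}
}
best = top_products(sample_response)
```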
Metric Aggregations
- Max/Min/Avg Aggregation
POST /product/_search
{
"aggs" : {
"max_price" : { "max" : { "field" : "price" } }
}
}
POST /product/_search
{
"aggs" : {
"min_price" : { "min" : { "field" : "price" } }
}
}
POST /product/_search
{
"aggs" : {
"avg_price" : { "avg" : { "field" : "price" } }
}
}
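The three requests above can also be sent as one, since a single request may carry several named aggregations. Here is a sketch that builds such a body as a Python dict; the helper name `price_summary_request` is hypothetical.

```python
def price_summary_request(field="price"):
    """Build one search body carrying max, min and avg aggregations on the same field."""
    return {
        "size": 0,  # skip the hits, only aggregation results are needed
        "aggs": {
            f"{name}_{field}": {name: {"field": field}}
            for name in ("max", "min", "avg")
        },
    }

body = price_summary_request()
```

If all of these values are needed together, the `stats` aggregation is another option, since it returns min, max, avg, sum and count for a field in a single aggregation.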
Matrix Aggregations
- Statistics Aggregation
GET /products/_search
{
"size": 0,
"aggs": {
"statistics": {
"matrix_stats": {
"fields": ["price"]
}
}
}
}
//sample response
"aggregations": {
"statistics": {
"doc_count": 4553,
"fields": [
{
"name": "price",
"count": 4553,
"mean": 47.31886243291309,
"variance": 8779.347529532348,
"skewness": 19.845533881336312,
"kurtosis": 537.8599726243962,
"covariance": {
"price": 8779.347529532348
},
"correlation": {
"price": 1
}
}
]
}
}
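The sample response is internally consistent: the covariance of price with itself equals its variance, so the correlation is 1. A short Python check of that relationship, using the standard Pearson formula, is sketched below; the function `correlation` is a hypothetical helper and the dict is copied from the fields shown above.

```python
import math

def correlation(field_a, field_b):
    """Pearson correlation of two matrix_stats field entries from covariance and variances."""
    cov = field_a["covariance"][field_b["name"]]
    return cov / math.sqrt(field_a["variance"] * field_b["variance"])

# Values taken from the sample response above
price = {
    "name": "price",
    "variance": 8779.347529532348,
    "covariance": {"price": 8779.347529532348},
}
self_corr = correlation(price, price)
std_dev = math.sqrt(price["variance"])  # standard deviation derived from the variance
```

With more than one entry in "fields", the same helper can be applied to any pair of returned field entries.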
Pipeline Aggregations
- Chain of aggregations: a pipeline aggregation works on the output of other aggregations
POST /_search
{
"aggs": {
"my_date_histo":{
"date_histogram":{
"field":"timestamp",
"interval":"day"
},
"aggs":{
"the_sum":{
"sum":{ "field": "lemmings" }
},
"the_movavg":{
"moving_avg":{ "buckets_path": "the_sum" }
}
}
}
}
}
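To show what the chained step computes, here is a rough Python emulation of a simple moving average over the per-day sums. This is a sketch under stated assumptions: the function `simple_moving_average`, the window size and the sample values are hypothetical, and the exact window alignment and model used by Elasticsearch's moving_avg are not reproduced here.

```python
def simple_moving_average(values, window=5):
    """Trailing simple moving average: each output averages up to `window` values ending at that position."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical "the_sum" values, one per day bucket
daily_sums = [10, 20, 30, 40]
smoothed = simple_moving_average(daily_sums, window=2)
```

The point of the pipeline is the data flow: the date_histogram produces buckets, `the_sum` produces one value per bucket, and the moving average reads those values through `buckets_path` instead of reading documents.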
Aggregation Summary
- Traditional aggregation operations include distinct, count, average, group by, etc.
- Elasticsearch became popular because of aggregations rather than search
- Pipeline and nested aggregations are the most flexible capabilities in Elasticsearch
- Aggregations are calendar-aware and location-aware
- The keyword field type is better suited to aggregations, sorting, etc.
Suggesters
- Term and phrase suggester
- Suggest corrections based on existing documents when the input contains typos or spelling mistakes
- Completion suggester
- Predict the query term before the user finishes typing
Suggesters
- Term suggester
GET products/doc/_search
{
"_source": [],
"suggest": {
"term_suggester": {
"text": "jackat",
"term": {
"field": "name"
}
}
}
}
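The term suggester proposes indexed terms that are within a small edit distance of the input. The following Python sketch illustrates that idea with plain Levenshtein distance; it is not Elasticsearch's implementation, and the vocabulary, the function names and the `max_edits` default used here are assumptions made for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def suggest_terms(text, vocabulary, max_edits=2):
    """Return vocabulary terms within max_edits of text, closest first (exact matches excluded)."""
    scored = [(edit_distance(text, term), term) for term in vocabulary]
    return [term for dist, term in sorted(scored) if 0 < dist <= max_edits]

# Hypothetical terms standing in for the indexed "name" field
vocabulary = ["jacket", "down", "pants"]
suggestions = suggest_terms("jackat", vocabulary)
```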
Suggesters
- Phrase suggester
GET products/doc/_search
{
"_source": [],
"suggest": {
"phrase_suggester": {
"text": "donw jackat",
"phrase": {
"field": "name",
"max_errors": 2,
"collate": {
"query": {
"source": {
"match_phrase": {
"{{field_name}}": {
"query": "{{suggestion}}",
"slop": 1
}
}
}
},
"params": {
"field_name": "name"
},
"prune": false
}
}
}
}
}
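The collate block checks each candidate phrase against the index. With "prune" set to false only phrases that match at least one document are returned; with "prune" set to true all candidates are returned and each carries a collate_match flag. A minimal Python emulation of that behavior follows. The function `collate`, the sample names and the substring test standing in for match_phrase are all hypothetical simplifications.

```python
def collate(suggestions, has_match, prune=False):
    """Apply collate semantics; has_match(text) tells whether any document matches the phrase."""
    if prune:
        # Keep every candidate and annotate it
        return [{"text": s, "collate_match": has_match(s)} for s in suggestions]
    # Keep only candidates that match at least one document
    return [{"text": s} for s in suggestions if has_match(s)]

# Hypothetical indexed product names; substring search stands in for match_phrase
names = ["down jacket", "rain jacket"]

def has_match(phrase):
    return any(phrase in name for name in names)

candidates = ["down jacket", "dawn jacket"]
kept = collate(candidates, has_match)
flagged = collate(candidates, has_match, prune=True)
```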
Suggesters
- Completion suggester
- Requires a dedicated field whose type is "completion"
- copy_to is usually used to populate this separate field from an existing field
GET products/doc/_search
{
"_source": [],
"suggest": {
"my_suggestion": {
"prefix": "jack",
"completion": {
"field": "name"
}
}
}
}
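The two bullets above translate into a mapping with a completion-typed field fed by copy_to. The sketch below builds such a mapping and the matching suggest request as Python dicts. The field name `name_suggest`, both helper functions and the optional `doc_type` wrapper are assumptions for illustration; whether a mapping type is needed depends on the Elasticsearch version in use.

```python
def completion_mapping(source_field, suggest_field, doc_type=None):
    """Build an index body whose suggest_field is a completion field filled via copy_to."""
    properties = {
        source_field: {"type": "text", "copy_to": suggest_field},
        suggest_field: {"type": "completion"},
    }
    mappings = {"properties": properties}
    if doc_type is not None:
        # Older versions nest properties under a mapping type name
        mappings = {doc_type: mappings}
    return {"mappings": mappings}

def completion_request(prefix, suggest_field, name="my_suggestion"):
    """Build a suggest body that completes prefix against the completion field."""
    return {"suggest": {name: {"prefix": prefix, "completion": {"field": suggest_field}}}}

index_body = completion_mapping("name", "name_suggest", doc_type="doc")
search_body = completion_request("jack", "name_suggest")
```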
Scripts
- Extremely flexible: scripts can implement many features not supported by the existing query DSL
- painless
- expression
- mustache
- java
GET products/_doc/_search
{
"query": {
"script": {
"script": {
"lang": "painless",
"source": "doc['color'].value == 'Black'"
}
}
}
}
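Hard-coding the compared value in the script text means every distinct value produces a different script to compile. Passing the value through "params" keeps the script text constant so the compiled script can be reused. Here is a sketch that builds such a request body in Python; the helper `script_filter` and the parameter name `expected` are hypothetical, and the "source" key assumes a version that accepts it (older versions used "inline").

```python
def script_filter(field, value):
    """Build a search body that filters with a parameterized Painless script."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "lang": "painless",
                            # The value is supplied through params, not embedded in the script text
                            "source": f"doc['{field}'].value == params.expected",
                            "params": {"expected": value},
                        }
                    }
                }
            }
        }
    }

body = script_filter("color", "Black")
```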
Search Templates
- Search templates are written with the Mustache template engine
- A template is stored on the Elasticsearch server and can be called directly by its id
POST _scripts/find_product_by_name
{
"script": {
"lang": "mustache",
"source": {
"query": {
"match": {
"name": "{{ product_name }}"
}
}
}
}
}
GET products/_search/template
{
"id": "find_product_by_name",
"params": {
"product_name": "down jacket"
}
}
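What the server does with "params" can be pictured as a placeholder substitution followed by a normal search. Below is a minimal Python sketch of that substitution. The `render` function is a hypothetical stand-in that handles only plain `{{ name }}` placeholders; real Mustache supports sections, escaping and more.

```python
import json
import re

def render(template, params):
    """Replace each {{ name }} placeholder with the matching params value."""
    def substitute(match):
        return str(params[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

# Template body from the slide, serialized to a string before substitution
template = json.dumps({"query": {"match": {"name": "{{ product_name }}"}}})
rendered = json.loads(render(template, {"product_name": "down jacket"}))
```

After rendering, `rendered` is an ordinary match query on the name field, which is what gets executed against the index.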
Thanks
By hanyi8000