Clara Jiménez Recio
1 MongoDB Atlas Search
2 Search Index
3 Analyzers
4 Search Query
5 Autocomplete
6 Hacks
7 Vector Search + LLMs
1
MongoDB Atlas Search
2
Search Index
{
"name": {
"type": "String"
}
}
{
"name": {
"type": ["String"]
}
}
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"name": {
"type": "string"
}
}
}
}
string, array of strings
{
"translations": [
{
"lang": {
"type": "String"
},
"name": {
"type": "String"
}
}
]
}
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"translations": {
"dynamic": false,
"type": "embeddedDocuments",
"fields": {
"name": {
"type": "string"
}
}
}
}
}
}
array of objects
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"translations": {
"dynamic": false,
"type": "embeddedDocuments",
"fields": {
"name": {
"type": "string"
},
"lang": {
"type": "string"
}
}
}
}
}
}
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"translations": {
"type": "document",
"fields": {
"es": {
"type": "document",
"fields": {
"name": {
"type": "string"
}
}
},
"en": {
"type": "document",
"fields": {
"name": {
"type": "string"
}
}
}
...
}
}
}
}
}
{
"translations": {
"es": {
"name": {
"type": "String"
}
},
"en": {
"name": {
"type": "String"
}
}
...
}
}
dictionary
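With the dictionary layout, every language needs its own entry in the index definition. A minimal Python sketch of generating that mapping instead of writing it by hand; the helper name `dictionary_index_definition` and the language list are assumptions for illustration, not part of the original slides.

```python
from typing import Any


def dictionary_index_definition(
    languages: list[str], analyzer: str = "lucene.standard"
) -> dict[str, Any]:
    """Build a static index definition with one `name` string field per language.

    Hypothetical helper: it only assembles the same JSON shape shown above.
    """
    per_language = {
        lang: {"type": "document", "fields": {"name": {"type": "string"}}}
        for lang in languages
    }
    return {
        "analyzer": analyzer,
        "mappings": {
            "dynamic": False,
            "fields": {
                "translations": {"type": "document", "fields": per_language}
            },
        },
    }


# Example language codes; extend the list to match the stored translations.
definition = dictionary_index_definition(["es", "en"])
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the index editor.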
3
Analyzers
🔎 Pienso gatos esterilizados → Analyzer 🕵️‍♀️ → Tokens: pienso, gatos, esterilizados
{ "name": "Pienso especial para gatos ESTERILIZADOS" } → Analyzer 🕵️‍♀️ → Tokens: pienso, especial, para, gatos, esterilizados
Analyzer | Separator | Transformation | Case sensitive | Only exact matches |
---|---|---|---|---|
Standard | word boundaries (language-neutral) | lowercase | No | No |
Simple | non-letter characters | lowercase | No | No |
Whitespace | whitespaces | none | Yes | No |
Keyword | none | none | Yes | Yes |
Name | Standard | Simple | Whitespace | Keyword |
---|---|---|---|---|
Applaws | applaws | applaws | Applaws | Applaws |
True Origins | true, origins | true, origins | True, Origins | True Origins |
Forza 10 | forza, 10 | forza | Forza, 10 | Forza 10 |
Nature's Variety | nature's, variety | nature, s, variety | Nature's, Variety | Nature's Variety |
Tokens 🇦🇧🇨
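The token table above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: the regular expressions only imitate the four Lucene analyzers for simple Latin-script names like these, and the function names are made up for illustration.

```python
import re


def standard(text: str) -> list[str]:
    # Stand-in for lucene.standard: lowercase, split at word boundaries,
    # keep an apostrophe that sits between letters or digits.
    return re.findall(r"[^\W_]+(?:'[^\W_]+)*", text.lower())


def simple(text: str) -> list[str]:
    # Stand-in for lucene.simple: lowercase, split at every non-letter.
    return re.findall(r"[^\W\d_]+", text.lower())


def whitespace(text: str) -> list[str]:
    # Stand-in for lucene.whitespace: split on whitespace, keep case.
    return text.split()


def keyword(text: str) -> list[str]:
    # Stand-in for lucene.keyword: the whole value is a single token.
    return [text]


for name in ["Applaws", "True Origins", "Forza 10", "Nature's Variety"]:
    print(name, standard(name), simple(name), whitespace(name), keyword(name))
```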
🔎 Applaws
Analyzer | Tokens | Matches |
---|---|---|
Standard | applaws | ✅ |
Simple | applaws | ✅ |
Whitespace | Applaws | ✅ |
Keyword | Applaws | ✅ |
🔎 applaws
Analyzer | Tokens | Matches |
---|---|---|
Standard | applaws | ✅ |
Simple | applaws | ✅ |
Whitespace | Applaws | ❌ |
Keyword | Applaws | ❌ |
🔎 True Origins
Analyzer | Tokens | Matches |
---|---|---|
Standard | true, origins | ✅ |
Simple | true, origins | ✅ |
Whitespace | True, Origins | ✅ |
Keyword | True Origins | ✅ |
🔎 True Origins Wild
Analyzer | Tokens | Matches |
---|---|---|
Standard | true, origins | ✅ |
Simple | true, origins | ✅ |
Whitespace | True, Origins | ✅ |
Keyword | True Origins | ❌ |
🔎 Nature's Variety
Analyzer | Tokens | Matches |
---|---|---|
Standard | nature's, variety | ✅ |
Simple | nature, s, variety | ✅ |
Whitespace | Nature's, Variety | ✅ |
Keyword | Nature's Variety | ✅ |
🔎 Nature's
Analyzer | Tokens | Matches |
---|---|---|
Standard | nature's, variety | ✅ |
Simple | nature, s, variety | ✅ |
Whitespace | Nature's, Variety | ✅ |
Keyword | Nature's Variety | ❌ |
🔎 Forza 10
Analyzer | Tokens | Matches |
---|---|---|
Standard | forza, 10 | ✅ |
Simple | forza | ✅ |
Whitespace | Forza, 10 | ✅ |
Keyword | Forza 10 | ✅ |
🔎 10
Analyzer | Tokens | Matches |
---|---|---|
Standard | forza, 10 | ✅ |
Simple | forza | ❌ |
Whitespace | Forza, 10 | ✅ |
Keyword | Forza 10 | ❌ |
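The ✅/❌ columns above follow one rule: the query goes through the same analyzer as the indexed value, and a document matches when at least one query token equals an indexed token. A small Python sketch of that rule, assuming simplified regex stand-ins for the Lucene analyzers (the `ANALYZERS` table and `matches` function are illustrative names only):

```python
import re

# Simplified stand-ins for the four analyzers; good enough for these names.
ANALYZERS = {
    "standard": lambda t: re.findall(r"[^\W_]+(?:'[^\W_]+)*", t.lower()),
    "simple": lambda t: re.findall(r"[^\W\d_]+", t.lower()),
    "whitespace": lambda t: t.split(),
    "keyword": lambda t: [t],
}


def matches(analyzer: str, indexed_value: str, query: str) -> bool:
    """True when the query and the indexed value share at least one token."""
    analyze = ANALYZERS[analyzer]
    return bool(set(analyze(indexed_value)) & set(analyze(query)))


for analyzer in ANALYZERS:
    print(analyzer, matches(analyzer, "Forza 10", "10"))
```

The Simple analyzer drops digits on both sides, so the query `10` produces no tokens at all and cannot match.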
4
Search Query
Operators:
text
phrase
compound
embeddedDocument
autocomplete
{
"$search": {
"text": {
"query": "Pienso",
"path": ["name", "description"],
"fuzzy": {
"maxEdits": 1,
"maxExpansions": 10,
"prefixLength": 0
}
}
}
}
{
"$search": {
"phrase": {
"query": "Pienso gatos esterilizados",
"path": ["name", "description"],
"slop": 2
}
}
}
{
"$search": {
"compound": {
"must": [
{
"text": {
"query": "Pienso",
"path": "name",
"fuzzy": {
"maxEdits": 1,
"maxExpansions": 10,
"prefixLength": 0
}
}
}
],
"should": [
{
"phrase": {
"query": "Pienso gatos esterilizados",
"path": "description",
"slop": 2
}
}
]
}
}
}
{
"$search": {
"autocomplete": {
"query": "Pien",
"path": "name",
"fuzzy": {
"maxEdits": 1,
"maxExpansions": 10,
"prefixLength": 2
}
}
}
}
{
"$search": {
"embeddedDocument": {
"path": "translations",
"operator": {
"text": {
"path": "translations.name",
"query": "Pienso"
}
}
}
}
}
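Each `$search` document above is the first stage of an aggregation pipeline. A hedged Python sketch of wrapping one of them into a pipeline that also returns the relevance score; `text_stage` and `search_pipeline` are hypothetical helper names, and the commented-out collection at the end is an assumption.

```python
from typing import Any, Optional


def text_stage(
    query: str, paths: list[str], max_edits: Optional[int] = None
) -> dict[str, Any]:
    """Build a $search stage using the text operator, optionally fuzzy."""
    text: dict[str, Any] = {"query": query, "path": paths}
    if max_edits is not None:
        text["fuzzy"] = {"maxEdits": max_edits}
    return {"$search": {"text": text}}


def search_pipeline(search_stage: dict[str, Any], limit: int = 10) -> list[dict[str, Any]]:
    """$search must come first; then cap the results and expose the score."""
    return [
        search_stage,
        {"$limit": limit},
        {"$project": {"name": 1, "score": {"$meta": "searchScore"}}},
    ]


pipeline = search_pipeline(text_stage("Pienso", ["name", "description"], max_edits=1))

# With PyMongo and a hypothetical `products` collection this would run as:
# results = list(db.products.aggregate(pipeline))
```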
5
Autocomplete
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"name": [
{
"type": "autocomplete",
"tokenization": "edgeGram",
"minGrams": 2,
"maxGrams": 8,
"foldDiacritics": true
}
]
}
}
}
Autocomplete Index (string, array of strings)
standard: pienso, para, gatos
edgeGram: pi, pie, pien, piens, pienso, pienso[space], pienso p, pa, par, para, para[space], para g, para ga, para gat, ga, gat, gato, gatos
rightEdgeGram: os, tos, atos, gatos, [space]gatos, a gatos, ra gatos, ra, ara, para, [space]para, o para, so para, nso para, so, nso, enso, ienso, pienso
nGram: pi, pie, pien, piens, pienso, pienso[space], pienso p, ie, ien, iens, ienso, ienso[space], ienso p, ienso pa, en, ens, enso, enso[space], enso p, enso pa, enso par, ns, nso, nso[space], nso p, nso pa, nso par, nso para,... 😵
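The token lists above can be reproduced with a short Python sketch. It only mirrors the lists on this slide for an already-lowercased value with `minGrams` 2 and `maxGrams` 8; it is not the actual Lucene implementation, and the function names are illustrative.

```python
import re


def edge_grams(text: str, min_grams: int = 2, max_grams: int = 8) -> list[str]:
    """Prefixes of the remaining text, starting at the beginning of each word."""
    tokens = []
    for word in re.finditer(r"\S+", text):
        rest = text[word.start():]
        for size in range(min_grams, min(max_grams, len(rest)) + 1):
            tokens.append(rest[:size])
    return tokens


def right_edge_grams(text: str, min_grams: int = 2, max_grams: int = 8) -> list[str]:
    """Suffixes of the preceding text, ending at the end of each word."""
    tokens = []
    for word in reversed(list(re.finditer(r"\S+", text))):
        head = text[:word.end()]
        for size in range(min_grams, min(max_grams, len(head)) + 1):
            tokens.append(head[-size:])
    return tokens


def n_grams(text: str, min_grams: int = 2, max_grams: int = 8) -> list[str]:
    """Every substring of an allowed length, starting at every character."""
    tokens = []
    for start in range(len(text)):
        for size in range(min_grams, max_grams + 1):
            if start + size <= len(text):
                tokens.append(text[start:start + size])
    return tokens


name = "pienso para gatos"
print(edge_grams(name))
print(len(edge_grams(name)), len(n_grams(name)))
```

In this sketch the same three-word name yields 18 edgeGram tokens and 91 nGram tokens, which is why nGram indexes grow so quickly.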
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"translations": {
"dynamic": false,
"type": "embeddedDocuments",
"fields": {
"name": {
"type": "autocomplete",
"tokenization": "edgeGram",
"minGrams": 2,
"maxGrams": 8,
"foldDiacritics": true
}
}
}
}
}
}
Autocomplete Index (array of objects)
{
"analyzer": "lucene.standard",
"mappings": {
"dynamic": false,
"fields": {
"translations": {
"type": "document",
"fields": {
"es": {
"type": "document",
"fields": {
"name": {
"type": "autocomplete",
"tokenization": "edgeGram",
"minGrams": 2,
"maxGrams": 8,
"foldDiacritics": true
}
}
},
...
}
}
}
}
}
Autocomplete Index (dictionary)
🇪🇸🇫🇷🇩🇪🇺🇲🏴🇵🇹🇳🇱🇨🇳
😶🌫️
{
"$search": {
"autocomplete": {
"query": "Pien",
"path": ["translations.es.name", "translations.en.name", ...]
}
}
}
Not supported: autocomplete takes a single path, so this needs a compound query 😵‍💫
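One way to cover every language with the dictionary layout is a `compound` query with one `autocomplete` clause per language under `should`. A minimal Python sketch, assuming the `translations.<lang>.name` paths from the index above; the helper name and the language list are illustrative.

```python
from typing import Any


def autocomplete_any_language(query: str, languages: list[str]) -> dict[str, Any]:
    """One autocomplete clause per language; a hit in any language is enough."""
    clauses = [
        {"autocomplete": {"query": query, "path": f"translations.{lang}.name"}}
        for lang in languages
    ]
    return {"$search": {"compound": {"should": clauses, "minimumShouldMatch": 1}}}


stage = autocomplete_any_language("Pien", ["es", "en"])
```

The array-of-objects layout avoids this fan-out, because a single `translations.name` path covers all languages.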
Name | Standard | Simple | Whitespace | Keyword |
---|---|---|---|---|
Applaws | ap, app, appl, appla, applaw, applaws | ap, app, appl, appla, applaw, applaws | Ap, App, Appl, Appla, Applaw, Applaws | Ap, App, Appl, Appla, Applaw, Applaws |
True Origins | tr, tru, true, true[space], true o, true or, true ori, or, ori, orig, origi, origin, origins | tr, tru, true, true[space], true o, true or, true ori, or, ori, orig, origi, origin, origins | Tr, Tru, True, True[space], True O, True Or, True Ori, Or, Ori, Orig, Origi, Origin, Origins | Tr, Tru, True, True[space], True O, True Or, True Ori |
Forza 10 | fo, for, forz, forza, forza[space], forza 1, forza 10, 10 | fo, for, forz, forza | Fo, For, Forz, Forza, Forza[space], Forza 1, Forza 10, 10 | Fo, For, Forz, Forza, Forza[space], Forza 1, Forza 10 |
Nature's Variety | na, nat, natu, natur, nature, nature', nature's, va, var, vari, varie, variet, variety | na, nat, natu, natur, nature, nature', nature's, va, var, vari, varie, variet, variety | Na, Nat, Natu, Natur, Nature, Nature', Nature's, Va, Var, Vari, Varie, Variet, Variety | Na, Nat, Natu, Natur, Nature, Nature', Nature's |
Tokens 🇦🇧🇨
(using edgeGram)
6
Hacks
{
"$search": {
"autocomplete": {
"query": "Pienso gatos",
"path": "name"
}
}
}
{
"$search": {
"autocomplete": {
"query": "Pienso",
"path": "name"
}
}
}
{
"$search": {
"compound": {
"must": [
{
"autocomplete": {
"query": "Pienso",
"path": "name"
}
},
{
"autocomplete": {
"query": "gatos",
"path": "name"
}
}
]
}
}
}
{
"$search": {
"autocomplete": {
"query": "Pienso gatos",
"path": "name",
"tokenOrder": "sequential"
}
}
}
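The first hack, one `autocomplete` clause per typed word under `must`, can be generated from the raw input. A small Python sketch with a hypothetical helper name; it splits on whitespace only and does no other cleanup.

```python
from typing import Any


def autocomplete_all_words(query: str, path: str) -> dict[str, Any]:
    """Require every typed word to match, each through its own autocomplete clause."""
    clauses = [
        {"autocomplete": {"query": word, "path": path}}
        for word in query.split()
    ]
    return {"$search": {"compound": {"must": clauses}}}


stage = autocomplete_all_words("Pienso gatos", "name")
```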
7
MongoDB Atlas Vector Search
[Diagram: Retrieval-augmented generation (RAG) with MongoDB Atlas. Numbered steps: 1 Embedding, 2 Store, 3 Embedding of the 🔎 Question, 4 $vectorSearch, 5 Context Documents, followed by Prompt, LLM and Answer. Example vectors: [0.9, 0.02, 0.1,...] and [0.9, 0.08, 0.1,...]]
7
Vector Search + LLMs
Which dry food has the most animal-origin protein? I also want it to contain L-carnitine and taurine
composition_emb
composition
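A hedged Python sketch of the retrieval and prompt steps, assuming `composition_emb` stores the embedding of the `composition` text. The index name `vector_index`, the `name` field, the sample document and the prompt wording are assumptions; computing the question embedding and calling the LLM are left out.

```python
from typing import Any


def vector_search_pipeline(
    query_vector: list[float],
    limit: int = 5,
    num_candidates: int = 100,
    index: str = "vector_index",  # hypothetical index name
) -> list[dict[str, Any]]:
    """Find the documents whose composition embedding is closest to the question."""
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": "composition_emb",
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "name": 1,
                "composition": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


def build_prompt(question: str, context_documents: list[dict[str, Any]]) -> str:
    """Put the retrieved compositions in front of the question for the LLM."""
    context = "\n".join(f"- {doc['composition']}" for doc in context_documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


pipeline = vector_search_pipeline([0.9, 0.08, 0.1])
prompt = build_prompt(
    "Which dry food has the most animal-origin protein?",
    [{"composition": "example composition text"}],  # placeholder document
)
```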
Clara Jiménez Recio (aka Nita)