Swan's GraphQL schema is 200 000+ characters long (it's HUGE)
The server runs a standard introspection query against our GraphQL endpoint.
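An introspection query is just a POST of a GraphQL document against the endpoint. A minimal sketch in TypeScript (the query below is a trimmed-down version of the standard introspection query, and `introspect` is a hypothetical helper, not Swan's actual code):

```typescript
// Minimal introspection query: fetch every type and its fields.
// (The full standard query also requests args, enums, interfaces, directives, etc.)
export const INTROSPECTION_QUERY = `
  query Introspect {
    __schema {
      types {
        name
        description
        fields {
          name
          description
          type { name kind ofType { name kind } }
        }
      }
    }
  }
`;

// Hypothetical helper: POST the introspection query to a GraphQL endpoint.
export async function introspect(endpoint: string): Promise<unknown> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: INTROSPECTION_QUERY }),
  });
  const { data } = await res.json();
  return data.__schema;
}
```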
```graphql
type Query {
  user: User
  account: Account
}

type User {
  account: Account
}

type Account {
  iban: String
}
```

[Diagram: the schema as a graph: type nodes (Query, User, Account, String) linked by field edges (user, account, account, iban)]
To prepare for indexing, we traverse the graph with a breadth-first search (BFS) and, for each field, collect its name, its parent type, its return type, and its description, then put everything into FlexSearch, an in-memory full-text search engine.
For Account.iban, we index: iban, account, string, plus its description.
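The traversal above can be sketched as follows. This is a simplified model, assuming the introspection result has already been reduced to a plain type map, and with the FlexSearch insertion replaced by a plain array of index entries:

```typescript
// Simplified schema model: each type maps a field name to its return type name.
type Schema = Record<string, Record<string, string>>;

const schema: Schema = {
  Query: { user: "User", account: "Account" },
  User: { account: "Account" },
  Account: { iban: "String" },
};

interface IndexEntry {
  path: string;        // e.g. "user.account.iban"
  field: string;       // field name
  parentType: string;  // type that declares the field
  returnType: string;  // type the field resolves to
  depth: number;       // BFS depth from the root
}

// Breadth-first traversal from Query, emitting one entry per reachable field.
export function buildIndex(s: Schema, root = "Query", maxDepth = 3): IndexEntry[] {
  const entries: IndexEntry[] = [];
  const queue: Array<{ type: string; path: string; depth: number }> = [
    { type: root, path: "", depth: 0 },
  ];
  while (queue.length > 0) {
    const { type, path, depth } = queue.shift()!;
    if (depth >= maxDepth) continue;
    for (const [field, returnType] of Object.entries(s[type] ?? {})) {
      const fullPath = path ? `${path}.${field}` : field;
      entries.push({ path: fullPath, field, parentType: type, returnType, depth: depth + 1 });
      queue.push({ type: returnType, path: fullPath, depth: depth + 1 });
    }
  }
  return entries;
}
```

On the toy schema this yields exactly the five paths listed below: user and account at depth 1, user.account and account.iban at depth 2, user.account.iban at depth 3.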
Depth 1:

- user (Query, User)
- account (Query, Account)

Depth 2:

- user.account (User, Account)
- account.iban (Account, String)

Depth 3:

- user.account.iban (Account, String)

When the AI searches for ["IBAN"], FlexSearch returns all matching fields.
We then rank them:
score = (3*name + 2*description + parent_type)/depth
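The formula can be sketched directly. One assumption here: the three terms are treated as 0/1 match flags (does the query term appear in the field name, the description, or the parent type name); FlexSearch's own relevance scores could be plugged in instead:

```typescript
interface Hit {
  field: string;       // field name, e.g. "iban"
  description: string; // field description from the schema
  parentType: string;  // declaring type, e.g. "Account"
  depth: number;       // BFS depth of the field's path
}

// score = (3*name + 2*description + parent_type) / depth
export function score(hit: Hit, term: string): number {
  const t = term.toLowerCase();
  const name = hit.field.toLowerCase().includes(t) ? 1 : 0;
  const description = hit.description.toLowerCase().includes(t) ? 1 : 0;
  const parentType = hit.parentType.toLowerCase().includes(t) ? 1 : 0;
  return (3 * name + 2 * description + parentType) / hit.depth;
}
```

A name match on a shallow field dominates: Account.iban at depth 2 scores 3/2 = 1.5 for "iban", while a deeper copy of the same field scores lower, which is exactly the bias we want.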
Each result is then expanded into its full path from Query/Mutation to include all intermediaries.

Final touch:
Step 1: Scraping https://docs.swan.io (sitemap.xml)
Step 2: Chunking
Step 3: Embedding (vectorization)
Step 4: Semantic search
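Steps 3 and 4 come down to comparing vectors. A minimal sketch of the semantic-search step using cosine similarity (the embeddings themselves would come from an embedding model, which is out of scope here; `semanticSearch` is an illustrative name, not Swan's actual API):

```typescript
// Cosine similarity between two embedding vectors of equal length.
export function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank documentation chunks by similarity to the query embedding,
// returning the text of the top-K closest chunks.
export function semanticSearch(
  query: number[],
  chunks: Array<{ text: string; embedding: number[] }>,
  topK = 3,
): string[] {
  return chunks
    .map((c) => ({ text: c.text, sim: cosine(query, c.embedding) }))
    .sort((x, y) => y.sim - x.sim)
    .slice(0, topK)
    .map((c) => c.text);
}
```

In practice a vector database (or even a flat in-memory scan, for a corpus this size) serves the same role; the core operation is just this nearest-neighbour lookup.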