ngSummit Rome | December 2023
Natalia Venditto
microfrontend.dev - @anfibiacreativa
Image credit DALL-E3
© Austin Neil | Unsplash.com
I build
e2e cloud applications!
© Ivana Cajina | Unsplash.com
RAG?
aka anfibiacreativa
Principal JavaScript e2e DX Lead @Microsoft Azure
Google Developer Expert for Angular and Web Technologies
Author of https://microfrontend.dev
2021 Microsoft Most Valuable Professional
What's in it for me?
Model architecture used in natural language processing (NLP) that combines retrieval-based and generative approaches.
DATASET
Response
Retriever
SELECTS RELEVANT PASSAGES
FORMULATES RESPONSE
© Mahdis Musavi | Unsplash.com
Specification, contracts, architecture decisions (like patterns).
Tech stack, integrations (using best practices).
Infra and pipelines.
Or how things are shaped.
Generative Pre-trained (on massive amounts of data) Transformers: a prompt-based model that learns to predict the next word in a sequence.
It is autoregressive (one token at a time)*
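The autoregressive loop can be sketched with a toy next-token table standing in for the model (everything here is illustrative; a real GPT scores the whole vocabulary against the full context on every step):

```typescript
// Toy "model": maps the last token to the most likely next token.
const nextToken: Record<string, string> = {
  The: 'cat',
  cat: 'sat',
  sat: '<eos>',
};

function generate(prompt: string[], maxTokens = 10): string[] {
  const tokens = [...prompt];
  for (let i = 0; i < maxTokens; i++) {
    const next = nextToken[tokens[tokens.length - 1]]; // predict one token at a time
    if (!next || next === '<eos>') break;              // stop at end-of-sequence
    tokens.push(next);
  }
  return tokens;
}

console.log(generate(['The']).join(' ')); // The cat sat
```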
Large Language Models (a category that includes GPT models) are large because they can include up to trillions of parameters (weights and biases) that establish relationships.
Completion is the process of predicting how a sequence (of text) continues, based on a given input.
We can think of code completion with Copilot!
Embeddings are the vector representation of words, phrases, sentences, and so on, in a continuous vector space.
WORDS mapped to their VECTOR representation, so we can operate with them. Semantically similar words will be close together in vector space.
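"Close together in vector space" is usually measured with cosine similarity. A minimal sketch with hand-made 3-dimensional vectors (real embeddings, such as those from text-embedding-ada-002, have over a thousand dimensions):

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Toy vectors: "cat" and "kitten" point in a similar direction, "car" does not.
const cat = [0.9, 0.8, 0.1];
const kitten = [0.85, 0.75, 0.2];
const car = [0.1, 0.2, 0.9];

console.log(cosineSimilarity(cat, kitten) > cosineSimilarity(cat, car)); // true
```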
Organization of words by semantic similarity, and generation of textual descriptions.
Establishes the flow of retrieval and generation, for example 'retrieve then read'.
Defines whether retrieval is based on text similarity or vector representation.
Controls randomness.
Client library to implement OpenAI services on Azure.
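Temperature, the setting that controls randomness, is passed per request together with other generation options. A sketch of such an options object (the values are illustrative; with the @azure/openai client, options like these are typically passed as the third argument to getChatCompletions):

```typescript
// Hypothetical request options; `temperature` controls randomness:
// 0 is nearly deterministic, higher values sample more freely.
const completionOptions = {
  temperature: 0.2, // low: focused, repeatable answers
  maxTokens: 400,   // cap the length of the generated response
  topP: 0.95,       // nucleus sampling: only the top 95% of probability mass
};

// With @azure/openai this would be passed along with the request, e.g.:
// await client.getChatCompletions(deploymentId, messages, completionOptions);
console.log(completionOptions.temperature < 1); // true
```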
For JavaScript!
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

/**
 * Demonstrates how to get chat completions for a chat context.
 *
 * @summary get chat completions.
 */

const { OpenAIClient, AzureKeyCredential } = require("@azure/openai");

// Load the .env file if it exists
require("dotenv").config();

// You will need to set these environment variables or edit the following values
const endpoint = process.env["ENDPOINT"] || "<endpoint>";
const azureApiKey = process.env["AZURE_API_KEY"] || "<api key>";

const messages = [
  { role: "system", content: "You are a helpful assistant. You will talk like a pirate." },
  { role: "user", content: "Can you help me?" },
];

async function main() {
  const client = new OpenAIClient(endpoint, new AzureKeyCredential(azureApiKey));
  const deploymentId = "gpt-35-turbo";
  const result = await client.getChatCompletions(deploymentId, messages);

  for (const choice of result.choices) {
    console.log(choice.message);
  }
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

module.exports = { main };
# Chat with the bot
# Chat-app-protocol
POST {{api_host}}/chat
Content-Type: application/json
{
  "messages": [{
    "content": "How to search and book rentals?",
    "role": "user"
  }],
  "context": {
    "approach": "rrr",
    "retrieval_mode": "hybrid",
    "semantic_ranker": true,
    "semantic_captions": false,
    "top": 3,
    "suggest_followup_questions": false
  }
}
###
The Chat App Protocol as a contract between frontends and backends using the same services.
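The contract can be expressed as shared TypeScript types. A sketch derived from the request body above (the type names are hypothetical; the field names mirror the sample):

```typescript
type ChatMessage = {
  content: string;
  role: 'user' | 'assistant' | 'system';
};

type ChatRequestContext = {
  approach: string;        // e.g. 'rrr'
  retrieval_mode: string;  // e.g. 'hybrid'
  semantic_ranker: boolean;
  semantic_captions: boolean;
  top: number;             // number of passages to retrieve
  suggest_followup_questions: boolean;
};

type ChatRequest = {
  messages: ChatMessage[];
  context: ChatRequestContext;
};

// The sample request body from above, now type-checked:
const request: ChatRequest = {
  messages: [{ content: 'How to search and book rentals?', role: 'user' }],
  context: {
    approach: 'rrr',
    retrieval_mode: 'hybrid',
    semantic_ranker: true,
    semantic_captions: false,
    top: 3,
    suggest_followup_questions: false,
  },
};
console.log(request.context.top); // 3
```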
Web components that can be bootstrapped anywhere.
SEARCH SERVICE
BLOB
Frontend
AZURE AI SEARCH
Repo
INDEXER SERVICE
AZURE OPEN AI SERVICE
Data ingestion and indexing.
import { type SearchIndex } from '@azure/search-documents';
import { encoding_for_model, type TiktokenModel } from '@dqbd/tiktoken';
import { type AzureClients } from '../plugins/azure.js';
import { type OpenAiService } from '../plugins/openai.js';

// other code

export class Indexer {
  private blobStorage: BlobStorage;

  constructor(
    private logger: BaseLogger,
    private azure: AzureClients,
    private openai: OpenAiService,
    private embeddingModelName: string = 'text-embedding-ada-002',
  ) {
    this.blobStorage = new BlobStorage(logger, azure);
  }

  async createSearchIndex(indexName: string) {
    this.logger.debug(`Ensuring search index "${indexName}" exists`);
    const searchIndexClient = this.azure.searchIndex;

    const names: string[] = [];
    const indexNames = await searchIndexClient.listIndexes();
    for await (const index of indexNames) {
      names.push(index.name);
    }
    // more code
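Before embedding, documents are split into chunks that fit the embedding model's token limit; the indexer above counts tokens precisely with tiktoken. A simplified sketch that chunks on words instead of real tokens, just to show the shape of the step:

```typescript
// Simplified: splits on whitespace instead of real model tokens.
// A production indexer would count tokens with a tokenizer such as tiktoken.
function chunkText(text: string, maxTokens: number): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(' '));
  }
  return chunks;
}

console.log(chunkText('one two three four five', 2)); // [ 'one two', 'three four', 'five' ]
```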
https://js.langchain.com/docs/get_started/introduction
https://langchain.com/docs/integrations/platforms/microsoft
Frontend
<!-- Other code -->
<mat-sidenav-content>
  <div class="inner-wrapper">
    <button class="button__button settings" title="Application Settings" (click)="sidenav.toggle()">
      <img src="./assets/svg/gear-solid.svg?raw" alt="Settings" class="icon__img" />
      <span>{{ settingsDefaults.panelLabel }}</span>
    </button>
    <chat-component
      [title]="title"
      [attr.data-input-position]="inputPosition"
      [attr.data-interaction-model]="interactionModel"
      data-api-url=""
      [attr.data-use-stream]="streaming"
      [attr.data-approach]="approach"
      [attr.data-overrides]="overrides"
    ></chat-component>
  </div>
<!-- Other code -->
// other code
@customElement('chat-component')
export class ChatComponent extends LitElement {
  //--
  // Public attributes
  // --
  @property({ type: String, attribute: 'data-input-position' })
  inputPosition = 'sticky';

  @property({ type: String, attribute: 'data-interaction-model' })
  interactionModel: 'ask' | 'chat' = 'chat';

  @property({ type: String, attribute: 'data-api-url' })
  apiUrl = chatHttpOptions.url;

  @property({ type: String, attribute: 'data-use-stream', converter: (value) => value?.toLowerCase() === 'true' })
  useStream: boolean = chatHttpOptions.stream;

  @property({ type: String, attribute: 'data-overrides', converter: (value) => JSON.parse(value || '{}') })
  overrides: RequestOverrides = {};
  //
  // more code
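The converters above are what turn string attributes such as data-use-stream="true" into typed properties. Extracted as standalone functions (a sketch outside any component; the function names are hypothetical):

```typescript
// data-use-stream: "true"/"false" string attribute -> boolean property
const toBoolean = (value: string | null): boolean => value?.toLowerCase() === 'true';

// data-overrides: JSON string attribute -> object property
const toOverrides = (value: string | null): Record<string, unknown> => JSON.parse(value || '{}');

console.log(toBoolean('True'));                  // true
console.log(toOverrides('{"temperature":0.2}')); // { temperature: 0.2 }
```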
Frontend
window.postMessage(message, targetOrigin, transfer)
// other code
@HostListener('window:message', ['$event'])
onMessage(event: Event) {
  console.log('I hear you chat-component!');
  // Do something here
  // For example, make sure opened drawers and expanded items don't overlap
}
// more code
Go to demo!
NDJSON, or newline-delimited JSON, is a data interchange format that represents JSON objects in a file separated by newline characters.
Each line is a valid JSON object, although the file as a whole is not a single JSON document.
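A minimal sketch of parsing NDJSON that is already in memory, one JSON.parse per line (the app's NdJsonParserStream applies the same idea incrementally to a stream):

```typescript
const ndjson = '{"id":1,"content":"Hello"}\n{"id":2,"content":"world"}\n';

// Each line parses independently; the file as a whole is not a single JSON document.
const objects = ndjson
  .split('\n')
  .filter((line) => line.trim().length > 0)
  .map((line) => JSON.parse(line));

console.log(objects.length);     // 2
console.log(objects[1].content); // world
```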
import { NdJsonParserStream } from './data-format/ndjson.js';
import { globalConfig } from '../../config/global-config.js';

export function createReader(responseBody: ReadableStream<Uint8Array> | null) {
  return responseBody?.pipeThrough(new TextDecoderStream()).pipeThrough(new NdJsonParserStream()).getReader();
}

export async function* readStream<T>(reader: any): AsyncGenerator<T, void> {
  if (!reader) {
    throw new Error('No response body or body is not readable');
  }

  let value: JSON | undefined;
  let done: boolean;
  while ((({ value, done } = await reader.read()), !done)) {
    yield new Promise<T>((resolve) => {
      setTimeout(() => {
        resolve(value as T);
      }, globalConfig.BOT_TYPING_EFFECT_INTERVAL);
    });
  }
}

// Stop stream
export function cancelStream<T>(stream: ReadableStream<T> | null): void {
  if (stream) {
    stream.cancel();
  }
}
// more code
// this is the backend service
async *runWithStreaming(
  messages: Message[],
  context?: ChatApproachContext,
): AsyncGenerator<ApproachResponseChunk, void> {
  const { completionRequest, dataPoints, thoughts } = await this.baseRun(messages, context);
  const openAiChat = await this.openai.getChat();
  const chatCompletion = await openAiChat.completions.create({
    ...completionRequest,
    stream: true,
  });

  let id = 0;
  for await (const chunk of chatCompletion) {
    const responseChunk = {
      choices: [
        {
          index: 0,
          delta: {
            content: chunk.choices[0].delta.content ?? '',
            role: 'assistant' as const,
            context: {
              data_points: id === 0 ? dataPoints : undefined,
              thoughts: id === 0 ? thoughts : undefined,
            },
          },
          finish_reason: chunk.choices[0].finish_reason,
        },
      ],
      object: 'chat.completion.chunk' as const,
    };
    yield responseChunk;
    id++;
  }
}
All running software needs infra.
See video!
Development
Specification
Setup
Design
Integration
Delivery
Operations
DEVELOPMENT
DESIGN
DEPLOYMENT
DEV
OPS
ARCH
"When we implement AI responsibly, we can bring tremendous positive impact for individuals, industries, and society."
From our own training.
@anfibiacreativa - https://www.microfrontend.dev
Learn more
All images, except those credited to Unsplash and their respective authors, were generated with Bing Image Generator.