Practical Applications of Chatbots

Novel Network Architecture: Transformer(1/3)

Sutskever, I., Vinyals, O., & Le, Q.V. (2014). Sequence to Sequence Learning with Neural Networks. ArXiv, abs/1409.3215.

Image source 1

seq2seq model, 2014/09

Encoder-Decoder Model for NLP tasks

Text-based tasks

Encoder

Decoder

Image source

Novel Network Architecture: Transformer(2/3)

attention, 2014/09

Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural Machine Translation by Jointly Learning to Align and Translate. CoRR, abs/1409.0473.

seq2seq model, 2014/09

Attention: focus on specific parts of input while generating output

Cognitive science: selective attention

Image source

Novel Network Architecture: Transformer(3/3)

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems (pp. 5998-6008).

attention, 2014/09

seq2seq model, 2014/09

Transformer, 2017/06

Self-Attention: inputs interact with each other

Context

Image source 1, Image source 2
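The idea that inputs "interact with each other" can be sketched in a few lines. This is a minimal illustration of scaled dot-product self-attention, not the full Transformer layer: it assumes queries, keys, and values are the input vectors themselves (a real model first applies learned projections), and the three token vectors are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Simplification: learned Q/K/V projections are omitted.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this position against every position (including itself).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all input vectors.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Made-up 2-dimensional token vectors; tokens 0 and 1 point the same way.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
contextual, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence; similar tokens receive larger weights, which is how each output vector picks up its context.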

Transformer-based Language Models GPT-2 (2/5)

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners.

attention, 2014/09

seq2seq model, 2014/09

Transformer, 2017/06

GPT-2, 2019

BERT, 2018/10

decoder-only

117M Parameters

1,542M Parameters

Image source

Transformer-based LM Pre-trained + Fine-tune (3/5)

Pre-training

Fine-tuning

Transformer-based LM You Never Know What They Really Do (4/5)

Transformer-based Model

Icon by Nikita Golubev

●●●

pre-trained data sets

NN1

Task 1

NN2

Task 2

NNn

Task N

target domain data sets

●●●

LLM

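Difficulty capturing higher-level semantic concepts
Difficulty performing logical operations and causal reasoning
Lack of situational understanding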

LLMs Are More Than You Think, but Also Less Than You Expect

source1, source2, source3

Capabilities of Pattern Matching

Mirzadeh, I., Alizadeh-Vahid, K., Shahrokhi, H., Tuzel, O., Bengio, S., & Farajtabar, M. (2024). GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models.

One of the solutions: RAG

Hallucination

Prompt Engineering

Hallucination(1/4)

Maleki, N., Padmanabhan, B., & Dutta, K. (2024, June). AI hallucinations: a misnomer worth clarifying. In 2024 IEEE conference on artificial intelligence (CAI) (pp. 133-138). IEEE.

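Computer vision: labeling objects that do not exist, or localizing objects incorrectly
NLP: machine translation may produce "fluent but unrelated output", or generate content that seems plausible but lacks factual grounding (2017)
Medicine: some scholars argue that AI has no sensory perception, so the term "hallucination" does not apply

The definitions diverge!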

Hallucination(2/4)

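Machine translation: "fluent but unrelated translations" or "fluent but inadequate output"
Text summarization: "content inconsistent with the source document"
- Intrinsic hallucination
- Extrinsic hallucination
Medicine: "erroneous information that departs from established medical knowledge", which may affect diagnosis and decision-making.
Law and ethics: may lead to "inaccurate or misleading legal advice", undermining the reliability of legal work.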

Hallucination(3/4)

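Alternative terms suggested by some studies:

Fabrication: AI-generated content that appears plausible but does not actually exist in the training data.
Misinformation: AI-generated content that is incorrect or inconsistent with the facts.
Stochastic parroting: the AI merely recombines training data mechanically, without truly understanding the information.
Fact fabrication: the AI generates unverified "new facts".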

Hallucination(4/4)

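Conclusions and recommendations

Establish standardized definitions: different fields should reach a consensus so that erroneous AI output can be described more accurately.
Adopt more precise terminology: avoid words such as "hallucination" that may cause misunderstanding, and use terms closer to actual AI behavior, such as "fact fabrication" and "misinformation".
Improve AI transparency: AI developers should provide clearer mechanisms for indicating the reliability of AI-generated content and for reducing the spread of misinformation.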

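Adjustment strategy	Example
Change the retrieval strategy	If searching for "Apple Remote" returns no results, search for "Apple TV remote" instead
Revise the reasoning process	If the model infers "A is the founder of B" but no supporting data is found, revise it to "A may have been involved in B's early development"
Re-plan the order of actions	If the plan "check the drawer first, then the desktop" fails, switch to "check the desktop first, then the drawer"
Try a different knowledge source	If Wikipedia cannot provide the answer, use Google Search instead (if the environment allows)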
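Two of the strategies in the table (changing the retrieval query and trying a different knowledge source) amount to an ordered fallback loop. The sketch below is only an illustration under stated assumptions: `search_with_fallback`, the two in-memory dictionaries, and the search functions are hypothetical stand-ins for real search tools, not part of any agent framework.

```python
def search_with_fallback(query_variants, sources):
    """Try each (source, query) combination in order until one returns hits.

    query_variants: query strings, most preferred first.
    sources: (name, search_function) pairs, most preferred first.
    Returns (source_name, query, results), or (None, None, []) if all fail.
    """
    for source_name, search in sources:
        for query in query_variants:
            results = search(query)
            if results:  # a non-empty result ends the adjustment loop
                return source_name, query, results
    return None, None, []


# Hypothetical in-memory "knowledge sources" standing in for real tools.
WIKI = {"Apple TV remote": ["Article about the Apple TV remote"]}
WEB = {"Apple Remote": ["Web page about the Apple Remote"]}


def wiki_search(query):
    return WIKI.get(query, [])


def web_search(query):
    return WEB.get(query, [])


hit = search_with_fallback(
    ["Apple Remote", "Apple TV remote"],
    [("wiki", wiki_search), ("web", web_search)],
)
print(hit)
```

Here the first query finds nothing in the first source, so the loop rewrites the query before it ever switches sources; reordering the two loops would prefer switching sources first.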

LangChain ReAct Design Pattern: High Level Abstraction (3/5)

attention, 2014/09

seq2seq model, 2014/09

Transformer, 2017/06

GPT-2, 2019

BERT, 2018/10

Prompt Engineering, 2018/06

RAG, 2020/05

ReAct, 2022/10

Settings, Resources

Interactive UI for Workflow Design

https://dify.ai/

source1, source2

Image source

LlamaIndex For Efficient Indexing & Retrieval (4/5)

https://www.llamaindex.ai/

Source

RAG Research Framework (5/5)

Context
Query
Prompt

LLM

Output

Vector DB

❶ Dataset

❸ Embedding

➍ Similarity

❷

➎ Reranking algorithm

➏

source
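The flow in the diagram above (dataset, embedding, similarity search against a vector store, then a prompt built from context and query) can be sketched end to end with toy components. Everything here is an assumption made for illustration: the bag-of-words `embed` function stands in for a real embedding model, a Python list stands in for the vector DB, the three text chunks are invented, and the reranking and LLM call steps are left out.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, top_k=2):
    """Rank stored chunks by similarity to the query; keep the best top_k."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(query, contexts):
    """Combine the retrieved context and the query into one prompt string."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Context:\n{context_block}\n\nQuery: {query}"


# Dataset -> embeddings -> in-memory "vector DB".
dataset = [
    "rag models retrieve text documents with a retriever",
    "postgresql is a relational database",
    "retrieved documents are added to the prompt as context",
]
store = [(chunk, embed(chunk)) for chunk in dataset]

query = "how does rag retrieve documents"
contexts = retrieve(query, store)
prompt = build_prompt(query, contexts)
print(prompt)  # this string is what would be sent to the LLM
```

Swapping `embed` for a real embedding model and the list for a vector database changes the quality and scale of retrieval, but not the shape of the pipeline.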

RAG

web crawler

Context
Query
Prompt

LLM

Output

Vector DB

source

RAG

Care theory

Workflow icons created by Haca Studio - Flaticon

Scenario: legal aid

Accountability and compliance

Source transparency

Expert review

defining an appropriate workflow

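LlamaIndex Hello world (1/6)

1. Install the Python package

pip install llama-index

2. Prepare the knowledge base

A variety of file formats are supported (see the official documentation)

3. Set the OpenAI API key (another LLM can be substituted)

Image source 1, Image source 2, Image source 4, Image source 5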

LlamaIndex Hello world (2/6)

2. Prepare the knowledge base

data folder: 5 PDF files

Example program

Knowledge base

3. Set the OpenAI API key (another LLM can be substituted)

LlamaIndex Hello world (3/6)

3. Set the OpenAI API key (another LLM can be substituted)

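Enter the API key you applied for (LlamaIndex's OpenAI integration reads it from the OPENAI_API_KEY environment variable)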

import os

from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
)


def setCurrentWD():
    """Change the working directory to the folder containing this script."""
    abspath = os.path.abspath(__file__)
    dname = os.path.dirname(abspath)
    os.chdir(dname)


setCurrentWD()  # set the working directory so the data folder can be found

# 1. Loading & Parsing
documents = SimpleDirectoryReader("data").load_data()

# 2. Indexing & vector store
index = VectorStoreIndex.from_documents(documents)

# 3. Query
query_engine = index.as_query_engine()
response = query_engine.query("Tell me about rag")
print(response)

LlamaIndex Hello world (4/6)

response = query_engine.query("Tell me about rag")

LlamaIndex output (5/6)

RAG models leverage a retriever to retrieve text documents based on an input query and use them as additional context when generating a target sequence. These models have been shown to achieve state-of-the-art results on various tasks such as open Natural Questions, WebQuestions, CuratedTrec, MS-MARCO, Jeopardy question generation, and FEVER fact verification. RAG models generate responses that are more factual, specific, and diverse compared to baseline models like BART. The retrieval mechanism in RAG plays a key role in improving results across different tasks.

Actual response

Source

LlamaIndex Cost (6/6)

$0.12

<$0.01

Embeddings (indexes) only need to be computed once

Was the embedding model pre-trained on multilingual data? It makes a big difference in the results!
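One way to pay the embedding cost only once is to persist the index to disk and reload it on later runs. The sketch below assumes the same `data` folder as the Hello world example; the function name `load_or_build_index` and the `storage` directory are hypothetical choices for this illustration, and the imports sit inside the function so that merely defining it has no side effects.

```python
import os


def load_or_build_index(data_dir="data", persist_dir="storage"):
    """Return a VectorStoreIndex, embedding the documents only on first use."""
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if os.path.isdir(persist_dir):
        # Later runs: reuse the stored vectors instead of re-embedding.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    # First run: embed every document once, then save the result to disk.
    documents = SimpleDirectoryReader(data_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

With this in place, `load_or_build_index().as_query_engine()` can replace the loading and indexing steps of the Hello world script. The stored index must be deleted and rebuilt when the documents or the embedding model change.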

Using llama-3.2-1B as the example model

https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct

Terms of use

Model information

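Using llama-3.2-1B as the example model

pip install -U "huggingface_hub[cli]"    # if needed

huggingface-cli login

Paste the token at the prompt

Paste the token here

n

Token is valid

Or set the HF_TOKEN environment variable

export HF_TOKEN=PASTE_YOUR_TOKEN_HERE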

Install the Python packages: LlamaIndex, OpenAI, etc.

pip install llama-index llama-index-llms-openai openai

LlamaIndex core and the OpenAI client integration package (the vLLM API is compatible with the OpenAI API)

Local embedding model, using a HuggingFace model as the example

pip install llama-index-embeddings-huggingface sentence-transformers

Local embedding model (optional), using an Ollama model as the example

pip install llama-index-embeddings-ollama
# Make sure the Ollama service is running and the embedding model has been pulled,
# e.g., ollama pull nomic-embed-text
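Once the packages are installed, LlamaIndex still has to be told to use the local models instead of its OpenAI defaults. The following is a minimal sketch under several assumptions: a vLLM server is already listening on http://localhost:8000/v1, the extra package `llama-index-llms-openai-like` (not in the install lines above) is installed to provide `OpenAILike`, and the function name `configure_local_models` is hypothetical.

```python
def configure_local_models(
    base_url="http://localhost:8000/v1",
    llm_name="meta-llama/Llama-3.2-1B-Instruct",
    embed_name="sentence-transformers/all-MiniLM-L6-v2",
):
    """Point LlamaIndex's global Settings at a local LLM server and embedder."""
    from llama_index.core import Settings
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.openai_like import OpenAILike

    # LLM served through an OpenAI-compatible endpoint (assumed: vLLM).
    Settings.llm = OpenAILike(
        model=llm_name,
        api_base=base_url,
        api_key="EMPTY",     # placeholder; use the real key if the server requires one
        is_chat_model=True,  # talk to the chat completions endpoint
    )
    # Embedding model that runs locally through sentence-transformers.
    Settings.embed_model = HuggingFaceEmbedding(model_name=embed_name)
    return Settings
```

Calling `configure_local_models()` before building an index should make both indexing and querying use the local models. The default embedding model here is English-oriented, so a multilingual model is likely the better choice for Chinese documents.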

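Example code

from openai import OpenAI

# Point the OpenAI client at the local, OpenAI-compatible server
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="YOUR_API_KEY",  # placeholder; never hard-code a real token
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    messages=[
        # System prompt: always reply in idiomatic Taiwanese Traditional Chinese
        {"role": "system", "content": "一律以台灣繁體中文慣用語回覆"},
        # User prompt: "What is a language model?"
        {"role": "user", "content": "什麼是語言模型"},
    ],
    max_tokens=512,
)
print(completion.choices[0].message.content)

Exercise: switch to another model, such as https://huggingface.co/facebook/m2m100_1.2B, and examine the results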

Vector Embeddings

Central to many NLP, recommendation, and search algorithms.

Numeric values

Objects, text, images...

Source

Vector Embeddings semantic similarity

Vector Space: semantic similarity

Barančíková, P., & Bojar, O. (2019). In search for linear relations in sentence embedding spaces.
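Semantic similarity in a vector space is usually measured with cosine similarity, and a "linear relation" means that vector arithmetic lands near a meaningful point. The sketch below illustrates both with hand-made 3-dimensional vectors; the numbers are invented for the example and do not come from any trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy embeddings (axes loosely: royalty, male, female).
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

# Linear relation: king - man + woman should land near queen.
target = [
    k - m + w
    for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])
]

ranking = sorted(
    vectors,
    key=lambda word: cosine_similarity(target, vectors[word]),
    reverse=True,
)
print(ranking)  # most similar word first
```

Real embedding spaces have hundreds of dimensions and such relations hold only approximately, which is the question the cited paper investigates for sentence embeddings.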

Vector Embeddings types

Word embeddings
- used to represent words in NLP
- Word2Vec, GloVe, FastText
Sentence and document embeddings
- semantic meaning of sentences and documents.
- BERT, Doc2Vec
Graph embeddings
- nodes and edges of graphs in vector space
- link prediction, node classification.
Image embeddings
- images in a compact vector form
- image recognition, image classification.

Pavan Belagatti, Vector Embeddings Explained for Developers!

Vector Embeddings creating embeddings using Huggingface

pip install -U transformers torch

from transformers import AutoTokenizer, AutoModel
import torch


def get_huggingface_embedding(
    text, model_name="sentence-transformers/all-MiniLM-L6-v2"
):
    # Load the tokenizer and model (downloaded on first use, then cached)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    # Tokenize into PyTorch tensors, truncating to at most 512 tokens
    inputs = tokenizer(
        text, return_tensors="pt", padding=True, truncation=True, max_length=512
    )
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # You can choose how to derive the final embeddings, e.g., mean pooling
    embeddings = outputs.last_hidden_state.mean(dim=1).squeeze().numpy()
    return embeddings


# Example usage
text = "Pavan is a developer evangelist."
embedding_huggingface = get_huggingface_embedding(text)
print(embedding_huggingface)

Ranking of Vector DBMS

Source: Jayita Bhattacharyya, A Brief Comparison of Vector Databases

Open Source
Hosted Solution
Pricing
Supported Vector Lengths
Supported Distances: Similarity metrics
Nearest Neighbor Search: speed and accuracy trade-offs.
Clustering: useful for data exploration and analysis.
Filtering & Aggregation: refine search results and summarize data patterns.
Integrations: SDKs, supported languages, etc.
Cloud Providers
Developer Experience
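Two of the criteria above, supported distances and nearest neighbor search, are easy to see in miniature. The sketch below runs an exact (brute-force) search over four invented 2-dimensional vectors with two different metrics; production vector databases typically use approximate indexes instead, trading some accuracy for speed.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance: smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Inner product, negated so that smaller still means closer."""
    return -sum(x * y for x, y in zip(a, b))


def nearest_neighbors(query, items, k=2, distance=l2_distance):
    """Exact k-nearest-neighbor search by scanning every stored vector."""
    return heapq.nsmallest(k, items, key=lambda item: distance(query, item[1]))


# Invented (id, vector) pairs standing in for a stored collection.
items = [
    ("a", [0.0, 0.0]),
    ("b", [1.0, 1.0]),
    ("c", [5.0, 5.0]),
    ("d", [0.9, 1.2]),
]
query = [1.0, 1.0]

by_l2 = [name for name, _ in nearest_neighbors(query, items)]
by_ip = [
    name
    for name, _ in nearest_neighbors(query, items, distance=negative_inner_product)
]
print(by_l2)  # ['b', 'd']
print(by_ip)  # ['c', 'd']
```

The two metrics return different neighbors for the same query because the inner product favors long vectors, which is why the metric a database supports should match the one the embedding model was trained for.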

Ranking of Vector DBMS

https://www.vecdbs.com/

Installing PostgreSQL (Windows version)

1. Download and install PostgreSQL

Database management tool (GUI)

1. Stack Builder can be deselected

2. A DB server is created by default

3. You may be asked to set the [superuser] password during installation

Managing PostgreSQL using pgAdmin4

2. Open pgAdmin4 to create and manage servers

pgAdmin's master password

Enter the [superuser] password

[Superuser]: postgres

3. Create a user and set the relevant permissions

Right-click to create a user

Set the account name

Set the password

Enable this option and keep the other defaults
