Chatbot Design
Design Experience
mail@qhduan.com
wx: longinusd
段清华 Dean
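Summary
- A chatbot is a form of human-machine interaction
- It is also a form of process automation
- Does it need natural language processing?
  - In some cases, yes
  - It needs language processing (LP)
  - Not necessarily natural language processing (NLP)
- Product issues to watch for
  - Scenario, environment, users, and purpose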
CHATBOT
∈
HUMAN-MACHINE-INTERFACE
Human: User
Machine: Expert & Algorithm
human-MACHINE
OR
HUMAN-machine
human-MACHINE
- Bloomberg
- Slack
- IFTTT
Human cheap
Machine expensive
HUMAN-machine
- Xiao Ice
- Echo
Human expensive
Machine cheap (NLP cheap)
LP
HUMAN
machine
HUMAN
machine
Dialog System (Task Completion)
Question Answering (Information Retrieval)
Chitchat (Comfort)
Dialog System
Language Processing
DIALOG SYSTEM PIPELINE

Young (2000)
Task View
Definition of task completion:
collect every element the task requires
That is:
anything can be an element
Product Design
- Domains
- Frame
- Flow
Tip 1
Don't expect to design one big, all-encompassing model
Design a model you can iterate on
Tip 2
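Run integration tests on the product continuously, testing the end-to-end dialog behavior
Cons: far more integration tests are needed than for a traditional app
Pros: each integration test is much simpler to write than for a traditional app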
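One way to make Tip 2 concrete is to replay scripted conversations against the bot and compare each system action with the expected one. The sketch below is only an illustration under stated assumptions: `make_bot` is a hypothetical stand-in for the real pipeline, and the action strings are invented for the example.

```python
def make_bot():
    """Hypothetical stand-in for the full pipeline: asks for a destination, then confirms it."""
    state = {"DestCity": None}

    def reply(utterance):
        if state["DestCity"] is None:
            for city in ("北京", "上海"):  # Beijing, Shanghai
                if city in utterance:
                    state["DestCity"] = city
                    return f"CONFIRM(DestCity={city})"
            return "REQUEST(DestCity)"
        if utterance == "是":  # "yes"
            return "ORDER_TICKET"
        state["DestCity"] = None  # rejected: ask again
        return "REQUEST(DestCity)"

    return reply


def run_script(bot, script):
    """Replay (user utterance, expected system action) pairs and return the mismatches."""
    failures = []
    for turn, (user, expected) in enumerate(script):
        actual = bot(user)
        if actual != expected:
            failures.append((turn, user, expected, actual))
    return failures


HAPPY_PATH = [
    ("你好", "REQUEST(DestCity)"),              # "hello"
    ("我要去北京", "CONFIRM(DestCity=北京)"),    # "I want to go to Beijing"
    ("是", "ORDER_TICKET"),                     # "yes"
]
```

Each script is just a list of pairs, which is why these tests are cheap to write; the cost is that many scripts are needed to cover the paths users actually take.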
Tip 3
People read much faster than they listen, so treat text and voice scenarios differently
Tip 4
Limit what users imagine the bot can do; focus on the production environment
Tip 5
MORE HAPPY PATH
as MUCH as POSSIBLE
Tip 6
In text interaction:
Give more detail and guidance, and users will learn
Show more of the current state, and users will correct errors
Tip 7
EXPLAIN THE BOT'S WEAKNESSES, DON'T HIDE THEM
Target: Finish Task
HOW
Dialog Flow
Make every required slot
filled & confirmed
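The completion rule can be sketched as a check over a frame. This is a minimal sketch, assuming each slot is a dict with an optional "value" and a "confirm" flag; the helper names `missing_steps` and `is_task_complete` are hypothetical.

```python
def missing_steps(frame):
    """Return (slot name, what is missing) pairs that still block task completion."""
    todo = []
    for name, slot in frame.items():
        if "value" in slot and slot["value"] is None:
            todo.append((name, "fill"))
        elif not slot["confirm"]:
            todo.append((name, "confirm"))
    return todo


def is_task_complete(frame):
    """The task is done only when nothing is left to fill or confirm."""
    return not missing_steps(frame)
```

A slot without a "value" key (such as a final ticket confirmation) only needs confirming, so it is handled by the second branch.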
Slot Type
- Domain Shared Slot
  - Destination City
    - train, flight
  - Price
    - train, flight, hotel
- Domain Dependent Slot
  - Nearby
    - hotel
- Flow Control Slot
  - Is Destination City Confirmed?
  - Is Ticket Confirmed?
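The three slot types can be captured in a small registry. The sketch below is one possible encoding, not the author's data model: the identifiers (`SLOT_DOMAINS`, `FLOW_CONTROL_SLOTS`, `is_shared`, `slots_for`) are hypothetical, and the domain assignments are copied from the list above.

```python
# Which domains use each content slot (taken from the Slot Type list).
SLOT_DOMAINS = {
    "DestCity": {"train", "flight"},
    "Price": {"train", "flight", "hotel"},
    "Nearby": {"hotel"},
}

# Flow control slots carry dialog progress rather than user content.
FLOW_CONTROL_SLOTS = {"DestCityConfirmed", "TicketConfirmed"}


def is_shared(slot_name):
    """A slot is domain-shared when more than one domain uses it."""
    return len(SLOT_DOMAINS[slot_name]) > 1


def slots_for(domain):
    """Content slots relevant to one domain, in a stable order."""
    return sorted(name for name, domains in SLOT_DOMAINS.items() if domain in domains)
```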
Flight Frame
- Destination City
  - filled
  - confirmed
- Depart City
  - filled
  - confirmed
- Depart Date And Time
  - filled
  - confirmed
- ... / Cabin / Price / Airline / ...
- Ticket Confirmed
{
  "DestCity": {
    "value": null,
    "confirm": false
  },
  "DepartCity": {
    "value": null,
    "confirm": false
  },
  "DepartDate": {
    "value": null,
    "confirm": false
  },
  "Cabin": {
    "value": null,
    "confirm": false
  },
  "Price": {
    "value": null,
    "confirm": false
  },
  "Ticket": {
    "confirm": false
  }
}
Waterfall
(MS BotFramework)
- REQUEST(DestCity)
- CONFIRM(DestCity)
- REQUEST(DepartCity)
- CONFIRM(DepartCity)
- REQUEST(DepartDate)
- CONFIRM(DepartDate)
- CONFIRM(Ticket)
- ORDER_TICKET
- CONFIRM(Ticket)
- CONFIRM(DepartDate)
- REQUEST(DepartDate)
- CONFIRM(DepartCity)
- REQUEST(DepartCity)
- CONFIRM(DestCity)
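The step list above can be generated from the slot names rather than written by hand. The following is a minimal sketch that does not use the BotFramework API; it assumes the reverse listing means "step back one stage when the user rejects a confirmation", and `build_waterfall` and `advance` are hypothetical helpers.

```python
def build_waterfall(slot_names):
    """REQUEST then CONFIRM each slot, then confirm and order the ticket."""
    steps = []
    for name in slot_names:
        steps.append(("REQUEST", name))
        steps.append(("CONFIRM", name))
    steps.append(("CONFIRM", "Ticket"))
    steps.append(("ORDER_TICKET", None))
    return steps


def advance(steps, index, user_said_yes=True):
    """Move one step forward, or one step back when a CONFIRM is rejected."""
    kind, _ = steps[index]
    if kind == "CONFIRM" and not user_said_yes:
        return max(index - 1, 0)
    return min(index + 1, len(steps) - 1)


STEPS = build_waterfall(["DestCity", "DepartCity", "DepartDate"])
```

Rejecting CONFIRM(DestCity) returns to REQUEST(DestCity); rejecting CONFIRM(Ticket) returns to CONFIRM(DepartDate), which matches the order in which the list unwinds.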
Domain Switch
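An open problem worldwide
Suspend the current domain
Try to copy the reusable slots from the current domain
Complete the new task
Ask the user whether to return to the previous domain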
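The suspend, copy, finish, and resume steps can be sketched with a stack of suspended frames. This is an illustration under assumptions, not a solution to the problem: `SHARED_SLOTS` is borrowed from the Slot Type slide, frames are plain dicts of slot values, and the class and method names are hypothetical.

```python
SHARED_SLOTS = {"DestCity", "Price"}  # assumption: slots reusable across domains


class DomainStack:
    def __init__(self):
        self._suspended = []  # (domain, frame) pairs, most recent last
        self.domain = None
        self.frame = {}

    def switch(self, new_domain):
        """Suspend the current domain and carry filled shared slots into the new one."""
        if self.domain is not None:
            self._suspended.append((self.domain, self.frame))
        carried = {k: v for k, v in self.frame.items()
                   if k in SHARED_SLOTS and v is not None}
        self.domain, self.frame = new_domain, carried

    def finish(self):
        """Close the current task and return the latest suspended (domain, frame),
        so the caller can ask the user whether to resume it. None if nothing is suspended."""
        self.domain, self.frame = None, {}
        return self._suspended.pop() if self._suspended else None

    def resume(self, suspended):
        """Restore a suspended (domain, frame) pair after the user agrees."""
        self.domain, self.frame = suspended
```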
User Request
UserRequest(slot_name)
UserRequest(entity,slot_name)
Assume Entity
no anaphora resolution
(Flight) "它几点到?" (When does it arrive?) == "几点到?" (Arrives when?)
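Assuming the entity instead of resolving the pronoun can be sketched as a simple default. The helper name `resolve_user_request` and the `focus_entity` parameter are hypothetical; the point is only that a request without an entity falls back to whatever the dialog is currently about.

```python
def resolve_user_request(slot_name, entity=None, focus_entity=None):
    """Return (entity, slot_name), defaulting to the entity currently in focus."""
    target = entity if entity is not None else focus_entity
    if target is None:
        raise ValueError("no entity given and no entity in focus")
    return target, slot_name
```

With a flight in focus, `resolve_user_request("ArriveTime", focus_entity="flight")` and `resolve_user_request("ArriveTime", entity="flight")` produce the same request, so no anaphora resolution step is needed.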
Show Me The Code
# NLU, DialogStateTracking, DialogPolicy and NLG are the modules described on the following slides.
dialog_state = initialize_dialog()
latest_system_action = HELLO()
print(NLG(latest_system_action))  # "Hello"
while True:
    user_utterance = input('>')  # e.g. "我要去北京的机票" (I want a flight ticket to Beijing)
    domain, intent, slots = NLU(user_utterance)
    # domain = travel
    # intent = inform
    # slots = {'DestCity': '北京', 'TravelType': '飞机'}  (Beijing, plane)
    latest_user_action = domain, intent, slots
    dialog_state = DialogStateTracking(
        dialog_state,  # pass in the previous dialog_state
        latest_user_action,
        latest_system_action)
    latest_system_action = DialogPolicy(  # e.g. REQUEST('DepartCity')
        dialog_state,
        latest_user_action,
        latest_system_action)
    print(NLG(latest_system_action))  # "Where are you departing from?"
NLU
INPUT: "我要去北京的飞机" (I want a plane to Beijing)
Domain Classifier -> Travel
Intent Classifier -> Inform
Slot Filler (NER) -> DestCity: 北京 (Beijing), TravelType: 飞机 (plane)
NLU
User Intent = Domain + Intent + Slots
Domain(travel) + Intent(inform) + Slots(DestCity=北京, TravelType=飞机)
=
Domain(plane) + Intent(inform) + Slots(DestCity=北京)
=
Domain(travel) + Intent(inform_plane) + Slots(DestCity=北京)
An intent is just one arrangement of these language elements; there is no single fixed form
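Because the same meaning can be split across domain, intent, and slots in several ways, downstream modules are easier to write if one form is chosen as canonical. The sketch below folds the three encodings shown above into the first one; the mapping rules are assumptions made for this example only.

```python
def normalize(domain, intent, slots):
    """Fold alternative encodings into (travel, inform, slots with TravelType)."""
    slots = dict(slots)  # do not mutate the caller's dict
    if domain == "plane":
        # Assumption: a "plane" domain is travel with TravelType set.
        domain = "travel"
        slots.setdefault("TravelType", "飞机")
    if intent == "inform_plane":
        # Assumption: the compound intent carries the travel type in its name.
        intent = "inform"
        slots.setdefault("TravelType", "飞机")
    return domain, intent, slots
```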
Dialog State Tracking
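Hand-written rules (waterfall)
def DialogStateTracking(frame, userAction, last_system_action):
    # frame follows the Flight Frame; userAction is the NLU output
    newFrame = clone(frame)
    if newFrame.DestCity.value is None:  # not filled yet
        if userAction.slot.DestCity:
            newFrame.DestCity.value = userAction.slot.DestCity
    if last_system_action == CONFIRM("DestCity"):
        if userAction.intent == YES():
            newFrame.DestCity.confirm = True
        elif userAction.intent == NO():
            newFrame.DestCity.confirm = False
            newFrame.DestCity.value = None  # rejected: clear it so it is requested again
    # ... the same pattern for the remaining slots
    return newFrame
Dialog Policy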
def DialogPolicy(frame, latest_user_action, latest_system_action):
    if frame.DestCity.value is None:  # not filled
        # "Please tell me your destination"
        return REQUEST("DestCity")
    if not frame.DestCity.confirm:  # not confirmed
        # "Please confirm the destination is $frame.DestCity$"
        return CONFIRM("DestCity", frame.DestCity.value)
    # ...
    if latest_system_action == CONFIRM("Ticket"):
        if latest_user_action == YES():
            return ORDER_TICKET()
        elif latest_user_action == NO():
            return CONFIRM("Ticket")
Domain / Intent
Choice: Keras Classification
Pros
Higher Precision
Cons
Lower Recall
Trick
Scikit-Learn Wrapper, Grid Search
Chinese Fuzzy Enhancement
Lower Recall: Focus on domain
Slot Filling
Choice: TensorFlow Bi-LSTM CRF
Pros
Position Sensitive
Cons
Boundary mistakes
Trick
Vocabulary-based refinement
Two Types of Slot Filler
Boundary or Entity
导航到广安门 (Navigate to Guang'anmen)
我要去北京 (I want to go to Beijing)
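Vocabulary-based refinement of boundary mistakes can be sketched as snapping a predicted span to a known entity. This is one simple interpretation, not the author's implementation: `refine_span` is a hypothetical helper that picks the longest vocabulary entry overlapping the tagger's span and leaves the span unchanged when nothing overlaps.

```python
def refine_span(text, start, end, vocabulary):
    """Snap the predicted [start, end) span to the longest overlapping vocabulary entry."""
    best, best_len = (start, end), 0
    for word in vocabulary:
        pos = text.find(word)
        while pos != -1:
            word_end = pos + len(word)
            overlaps = pos < end and start < word_end
            if overlaps and len(word) > best_len:
                best, best_len = (pos, word_end), len(word)
            pos = text.find(word, pos + 1)
    return best


# The tagger cut "广安门" (Guang'anmen) short and predicted only "广安".
TEXT = "导航到广安门"
REFINED = refine_span(TEXT, 3, 5, {"广安", "广安门", "北京"})
```

Preferring the longest match is a design choice for this sketch; it fixes truncated spans such as the one above but would need extra rules where a short entity is nested inside a longer one.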
Dialog Management
Choice: BotFramework
made by Microsoft
Pros
Waterfall design
Cons
Complex rules programming
NLG
Choice: Function Template
Pros
Simple
Cons
Not Configurable
// input: system action == hello(name)
// output: natural language
function hello(name: string | null = null): string {
  if (name) {
    return `你好,${name}`  // "Hello, <name>"
  }
  return '你好'  // "Hello"
}
Everything is Function / Module
Every Module
could be
Handcrafted
or
Machine Learning
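Treating every module as a function means a handcrafted version and a learned version can be swapped behind the same signature. The sketch below shows this for NLU only; `handcrafted_nlu`, `LookupNLU`, and `run_nlu` are hypothetical, and `LookupNLU` merely stands in for a trained model.

```python
from typing import Callable, Dict, Tuple

UserAction = Tuple[str, str, Dict[str, str]]  # (domain, intent, slots)
NLU = Callable[[str], UserAction]


def handcrafted_nlu(utterance: str) -> UserAction:
    """Keyword rules: a deliberately tiny handcrafted module."""
    slots = {}
    for city in ("北京", "上海"):  # Beijing, Shanghai
        if city in utterance:
            slots["DestCity"] = city
    return "travel", "inform", slots


class LookupNLU:
    """Stand-in for a trained model: memorised utterance -> action pairs."""

    def __init__(self, table: Dict[str, UserAction]):
        self.table = table

    def __call__(self, utterance: str) -> UserAction:
        return self.table.get(utterance, ("unknown", "unknown", {}))


def run_nlu(nlu: NLU, utterance: str) -> UserAction:
    """The caller depends only on the signature, not on how the module is built."""
    return nlu(utterance)
```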
Future
Every Module is a Neural Network
+
Reinforcement Learning
=
END-to-END Model
Future
END-to-END Model
+
Reinforcement Learning
=
FUTURE MODEL
FUTURE MODEL
=
STILL DOES NOT WORK ALONE
BUT GOOD FOR BACKUP OR BASELINE
WHERE IS DATA ?
1) ARTIFICIAL
2) COLLECT
3) BUY
DATA =
TEMPLATE + VARIANCE + ENTITIES + ERROR
ARTIFICIAL
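- TEMPLATE:
  - MODE: 我要买去北京的票 (I want to buy a ticket to Beijing)
  - TEMP: 我<var1:要><var2:买><var3:去><entity-city>的<var4:票>
- VARIANCE:
  - var1: 要、想要、需要、急需 (want, would like, need, urgently need)
  - var2: 买、购买、网上购买、预订、预购 (buy, purchase, buy online, book, pre-order)
- ENTITIES:
  - 北京、上海、天津…… (Beijing, Shanghai, Tianjin, ...)
- ERRORS:
  - OMISSION
  - PINYIN INPUT
  - ASR ERROR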
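The TEMPLATE + VARIANCE + ENTITIES + ERROR recipe can be sketched with a Cartesian product over the variant lists. This is a minimal sketch using only the variants listed above; `generate` and `omit_one_char` are hypothetical helpers, and only the omission error is modelled.

```python
import itertools
import random

VAR1 = ["要", "想要", "需要", "急需"]
VAR2 = ["买", "购买", "网上购买", "预订", "预购"]
CITIES = ["北京", "上海", "天津"]


def generate():
    """Yield (utterance, slot spans) for every combination of variants and entities."""
    for v1, v2, city in itertools.product(VAR1, VAR2, CITIES):
        text = f"我{v1}{v2}去{city}的票"
        start = text.index(city)
        yield text, {"DestCity": (start, start + len(city))}


def omit_one_char(text, span, rng):
    """Omission error: drop one character outside the entity and shift the span if needed."""
    start, end = span
    candidates = [i for i in range(len(text)) if not (start <= i < end)]
    i = rng.choice(candidates)
    shift = 1 if i < start else 0
    return text[:i] + text[i + 1:], (start - shift, end - shift)


SAMPLES = list(generate())
```

Keeping the slot span alongside each utterance means the generated data can train the slot filler as well as the classifiers; errors are applied afterwards so the labels stay aligned.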
COLLECT
Weibo, forums, Tieba, Douban
WHATEVER
BUY
Annotation, crowdsourcing
Thank You
mail@qhduan.com
wx: longinusd
段清华 Dean