Chatbot Design

设计经验

mail@qhduan.com

wx: longinusd

段清华 Dean

Sumary

对话机器人是一种人机交互
- 也是一种流程自动化
是否需要自然语言处理
- 在某些情况下是的
需要语言处理（LP）
- 不一定是自然语言处理（NLP）
在产品上需要注意的问题
- 场景、环境、使用者与目的

CHATBOT

∈

HUMAN-MACHINE-INTERFACE

Human：User

Machine：Expert & Algorithm

human-MACHINE

OR

HUMAN-machine

human-MACHINE

Bloomberg
Slack
IFTTT

Human cheap

Machine expensive

HUMAN-machine

Xiao Ice
Echo

Human expensive

Machine cheap (NLP cheap)

LP

HUMAN

machine

HUMAN

machine

Dialog System (Task Complete)

Question Answering (Information Retrivel)

Chitchat (comfortable)

Dialog System

Language Processing

DIALOG SYSTEM PIPELINE

Young (2000)

Task View

任务完成的定义：

收集所有任务必要的元素

即：

任何东西都是要素

Product Design

Domains
Frame
Flow

Tip 1

不要预期能设计一个大而全的模型

设计一个可迭代的模型

Tip 2

产品随时进行集成测试，测试End-to-End对话效果

Cons：集成测试需要比传统App多的多的多

Pros：集成测试写起来比传统APP简单的多

Tip 3

人的阅读速度比听音速度快得多，区分场景

Tip 4

限制用户想象，集中在生产环境

Tip 5

MORE HAPPY PATH

as MUCH as POSSIBLE

Tip 6

文字交互时：

说更多细节与引导，用户会学习

给更多当前状态的信息，用户会纠错

Tip 7

EXPLAIN BOTS's WEAKNESS, DON'T HIDE

Target: Finish Task

HOW

Dialog Flow

Make every required slot

filled & confirmed

Slot Type

Domain Shared Slot
- Destination City
  - train, flight
- Price
  - train, flight, hotel
Domain Dependent Slot
- Nearby
  - hotel
Flow Control Slot
- Is Destination City Confirmed?
- Is Ticket Confirmed?

Flight Frame

Destination City
- filled
- confirmed
Depart City
- filled
- confirmed
Depart Date And Time
- filled
- confirmed
... / Cabin / Price / Airline / ...
Ticket Confirmed

{
    "DestCity": {
        "value": null,
        "confirm": false
    },
    "DepartCity": {
        "value": null,
        "confirm": false
    },
    "DepartDate": {
        "value": null,
        "confirm": false
    },
    "Cabin": {
        "value": null,
        "confirm": false
    },
    "Price": {
        "value": null,
        "confirm": false
    },
    "Ticket": {
        "confirm": false
    }
}

Walterfall

(MS BotFramework)

REQUEST(DestCity)
- CONFIRM(DestCity)
  - REQUEST(DepartCity)
    - CONFIRM(DepartCity)
      - REQUEST(DepartDate)
        
        CONFIRM(DepartDate)
        
        CONFIRM(Ticket)
        
        ORDER_TICKET

Domain Switch

国际难题

挂起当前领域

尝试复制当前领域可用SLOT

完成任务

尝试问用户回顾前领域

User Request

UserRequest(slot_name)

UserRequest(entity,slot_name)

Assume Entity

no anaphora resolution

(飞机) 它几点到？ == 几点到?

Show Me The Code

dialog_state = initialize_dialog()
system_action = HELLO()
system_utterance = NLG(system_action) # 你好
while True:
    user_utterance = input('>') # eg. 我要去北京的机票
    domain, intent, slots = NLU(user_utterance)
    # domain: travel
    # intent = inform
    # slots = {'DestCity': '北京', 'TravelType': '飞机'}
    latest_user_action = domain, intent, slots
    dialog_state = DialogStateTracking(
        dialog_state, # 输入老的dialog_state
        latest_user_action,
        latest_system_action)
    system_action = DialogPolicy( # eg. REQUEST('DepartCity')
        dialog_state,
        latest_user_action,
        latest_system_action)
    system_utterance = NLG(system_action) # 请问从哪里出发

NLU

INPUT： "我要去北京的飞机"

Domain

Classifier

Intent

Classifier

Slot

Filler

(NER)

Travel

Inform

DestCity: 北京

TravelType: 飞机

NLU

User Intent = Domain + Intent + Slots

Domain(travel) + Intent(inform) + Slots(DestCity=北京, TravelType=飞机)

Domain(plane) + Intent(inform) + Slots(DestCity=北京)

Domain(travel) + Intent(inform_plane) + Slots(DestCity=北京)

=

意图是语言要素的排列组合，没有定式

Dialog State Tracking

人工定义 - 瀑布流

newFrame = clone(frame)

if frame.DestCity not filled:
    if userAction.slot.DestCity:
        newFrame.DestCity.value = userAction.slot.DestCity

if last_system_action == CONFIRM("DestCity"):
    if userAction.intent == YES():
        newFrame.DestCity.confirm = True
    elif userAction.intent == NO():
        newFrame.DestCity.confirm = False

return newFrame

Dialog Policy

if frame.DestCity not filled:
    # 请告诉目的地
    return REQUEST("DestCity")

if frame.DestCity not confirmed:
    # 请确认目的地是 $frame.DestCity$
    return CONFIRM("DestCity", frame.DestCity)

# ...

if lastest_system_action == CONFIRM("Ticket"):
    if lastest_user_action == YES():
        return ORDER_TICKET()
    elif lastest_user_action == NO():
        return CONFIRM("Ticket")

Domain / Intent

Choice: Keras Classification

Pros

Higher Precision

Cons

Lower Recall

Trick

Scikit-Learn Wrapper, Grid Search

Chinese Fuzzy Enhancement

Lower Recall: Focus on domain

Slot Filling

Choice: Tensorflow Bi-LSTM CRF

Pros

Position Sensitive

Cons

Bounder mistake

Trick

Vocabulary-based re-fine

Two Type Slot Fillers

Bounder or Entity

导航到广安门

我要去北京

Dialog Management

Choice: BotFramework

made by Microsoft

Pros

Waterfall design

Cons

Complex rules programming

NLG

Choice: Function Template

Pros

Simple

Cons

Not Configurable

// input: system action == hello(name)
// output: natural language
funciton hello (name: string= null) {
  if (name) {
    return `你好，${name}`
  }
  return '你好'
}

Everything is Function / Module

Everything Module

could be

Handcraft

or

Machine Learning

Future

Every Module is Neural Network

+

Reinforcement Learning

=

END-to-END Model

Future

END-to-END Model

+

Reinforcement Learning

=

FUTURE MODEL

=

STILL NOT WORK ALONE

BUT GOOD FOR BACKUP OR BASELINE

WHERE IS DATA ?

1) ARTIFICIAL
2) COLLECT
3) BUY

DATA =

TEMPLATE + VARIANCE + ENTITIES + ERROR

ARTIFICIAL

TEMPLATE:
- MODE: 我要买去北京的票
- TEMP: 我<var1:要><var2:买><var2:去><entity-city><var3:票>
VARIANCE:
- var1:要、想要、需要、急需
- var2:买、购买、网上购买、预订、预购
ENTITIES:
- 北京、上海、天津……
ERRORS:
- OMITTING
- PINYIN INPUT
- ASR ERROR