Building Knowledge Bases from Text

[ Data Day Seattle ]

Today's Talk

Knowledge Graphs?

Building Knowledge Bases

Topic Modeling

Structured Information Extraction

About Me

Garrett Eastham

Principal Scientist - Digital Commerce

  • AI & Digital Commerce Focus
  • CS @ Stanford
  • Background in Web Analytics
  • Career in Data Science & Product Management
  • Prior: Edgecase (founder), Bazaarvoice, RetailMeNot

Knowledge Graphs

What is a Knowledge Graph?

Knowledge Graph: Graphical representation of the guiding information principles within a given domain.

Advanced Analytics

Hierarchical segmentation & rollup

Cognitive Systems

Enables tractable computational reasoning

[Figure labels: Memory, CPU, Disk Space; values shown: 500 gb, i7, 128 gb]

Why Learn Them from Text?

A pre-existing taxonomy or knowledge base does not exist.

Subject matter experts and/or taxonomists are cost-prohibitive.

Types of Knowledge Representation

Model Complexity

Increasing Creation and Maintenance Costs

Taxonomy

Ontology

Knowledge Graph

Knowledge Base

Categorical

Relational

Taxonomies

Categorical Structure

Easy, intuitive conceptual representation that (usually) maps easily to external systems.

Rollup Power

Enables easy, robust analytic power simply by providing different levels of abstraction for query rollups.
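
To make the rollup idea concrete, here is a minimal sketch that aggregates leaf-level counts at every level of a taxonomy. The category names and counts are hypothetical, and the child-to-parent dictionary is only one of several reasonable ways to store a taxonomy.

```python
from collections import Counter

# Hypothetical taxonomy: each category maps to its parent (None marks the root).
PARENT = {
    "Apparel": None,
    "Footwear": "Apparel",
    "Sneakers": "Footwear",
    "Boots": "Footwear",
    "Outerwear": "Apparel",
    "Jackets": "Outerwear",
}


def ancestors(category):
    """Yield the category itself, then each ancestor up to the root."""
    while category is not None:
        yield category
        category = PARENT[category]


def rollup(leaf_counts):
    """Aggregate leaf-level counts at every level of the taxonomy."""
    totals = Counter()
    for category, count in leaf_counts.items():
        for node in ancestors(category):
            totals[node] += count
    return totals


# Hypothetical counts per leaf category (for example, page views).
totals = rollup({"Sneakers": 120, "Boots": 45, "Jackets": 30})
print(totals["Footwear"], totals["Apparel"])
```

The same query can then be answered at the "Sneakers", "Footwear" or "Apparel" level without changing the underlying data.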

Ontologies

[Figure: diagram with nine nodes, each labeled "Attribute"]

Localized Semantics

Enables concept-specific algorithms (e.g., slot prediction) that require strongly typed inputs.

Implicit Categorical Inheritance

Allows logical inheritance of outcomes to be passed up or down the conceptual hierarchy.
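
A small sketch of how typed attributes might be inherited down a concept hierarchy. The concepts, attribute names and the `validate` helper are all hypothetical and are not taken from any particular ontology toolkit.

```python
# Hypothetical ontology: each concept names its parent and the typed
# attributes it declares locally.
ONTOLOGY = {
    "Product": {"parent": None, "attributes": {"brand": str, "price": float}},
    "Shoe": {"parent": "Product", "attributes": {"size": float}},
    "RunningShoe": {"parent": "Shoe", "attributes": {"drop_mm": int}},
}


def resolve_attributes(concept):
    """Collect attributes declared on the concept and on all its ancestors."""
    resolved = {}
    while concept is not None:
        node = ONTOLOGY[concept]
        for name, attr_type in node["attributes"].items():
            # setdefault keeps the most specific declaration if names collide.
            resolved.setdefault(name, attr_type)
        concept = node["parent"]
    return resolved


def validate(concept, record):
    """Check that every field in the record is a known, correctly typed attribute."""
    schema = resolve_attributes(concept)
    return all(
        name in schema and isinstance(value, schema[name])
        for name, value in record.items()
    )


print(sorted(resolve_attributes("RunningShoe")))
print(validate("Shoe", {"brand": "Acme", "size": 9.5}))
```

Because "RunningShoe" inherits "brand" and "price" from "Product", an algorithm written against "Product" can accept any descendant concept as a strongly typed input.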

Abstract Knowledge Graphs

Abstract Relations

Perfect for managing light or unknown categorical sub-structure while supporting high breadth.

Graph-Oriented Insights

Typically used to apply graph-based insight discovery to the output of a separate entity-processing step.

[Figure: graph with edges labeled P_{ij}]

Knowledge Bases

is_a

married_to

owns

has_property

Explicit, Relational Structure

Machine-readable data enables richer question answering via structured queries.

Enables Propositional Reasoning

Can facilitate rich symbolic reasoning through automatic logical resolvers.
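
As an illustration, the relations above (is_a, married_to, owns, has_property) can be stored as subject-relation-object triples and queried with wildcards. This is only a sketch: the facts are invented, and the transitive `is_a` check stands in for the much richer reasoning a real logical resolver would provide.

```python
# Hypothetical facts stored as (subject, relation, object) triples.
TRIPLES = {
    ("alice", "married_to", "bob"),
    ("alice", "owns", "car_1"),
    ("car_1", "is_a", "sedan"),
    ("sedan", "is_a", "vehicle"),
    ("car_1", "has_property", "red"),
}


def query(subject=None, relation=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, r, o)
        for (s, r, o) in TRIPLES
        if subject in (None, s) and relation in (None, r) and obj in (None, o)
    )


def is_a(entity, category):
    """Follow is_a edges transitively (an entity trivially is_a itself)."""
    frontier, seen = [entity], set()
    while frontier:
        current = frontier.pop()
        if current == category:
            return True
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(o for (_, _, o) in query(subject=current, relation="is_a"))
    return False


def owners_of(category):
    """Answer 'who owns a <category>?' by combining a query with inference."""
    return sorted(s for (s, _, o) in query(relation="owns") if is_a(o, category))


print(query(subject="alice"))
print(owners_of("vehicle"))
```

The last call answers a question that no single stored fact states directly: alice owns car_1, car_1 is a sedan, and a sedan is a vehicle.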

Building Knowledge Graphs

Knowledge Development Landscape

Small Data

Big Data

Structured Domain

Open Domain

General Web

Rich expression but no pretense of conceptual structure; however, a large number of samples for concept discovery.

Legal Documents

Deep, rich, repetitive domain language but little to no expected document style.

Product Reviews

Varied grammatical structure but very rich meta-data to connect into pre-existing graphical relationships.

Social Media

Rich meta-structure, varied grammar and typically small document size; however, large and ongoing samples.

Bootstrapping Knowledge Graphs

Topic Modeling

Discover latent structure among keyword concepts by enforcing categorical structure.

Structured Information Extraction

Extract precise conceptual relationships given a seed set of relation patterns.

Goal: Use unsupervised clustering to help look for implicit relations that can then be more richly codified by an SME.

Knowledge Representational Power: Captures not only facts and their connections but also why they are connected.

Data Preparation: Keyword Extraction

TopicRank: Graph-Based Topic Ranking for Keyphrase Extraction (Bougouin A., Boudin F. and Daille B. - 2013)

Data Preparation: Word Embeddings

Distributed Representations of Words and Phrases and their Compositionality (Mikolov et al. - 2013)

GloVe: Global Vectors for Word Representation (Pennington J., Socher R. & Manning C. - 2014)

Topic Modeling

Clustering Algorithms: Overview

Classical

Agglomerative Hierarchical Clustering

State of the Art

Spectral Clustering

Latent Dirichlet Allocation

Markov (Graph) Clustering

Deep Embedded Clustering

Variational Deep Embedding

Topic Modeling

Agglomerative Hierarchical Clustering

Modern hierarchical, agglomerative clustering algorithms (Müllner D. - 2011)

Spectral Clustering

On Spectral Clustering: Analysis and an algorithm (Ng A., Jordan M. & Weiss Y. - 2001)

Similarity Matrix

Word Embeddings Cosine Distance
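
A minimal sketch of building the similarity matrix from word embeddings. The three-dimensional vectors are invented for illustration (real embeddings would come from word2vec or GloVe), and clipping negative similarities to zero is an assumption made here so the matrix can serve as a non-negative affinity matrix. Cosine distance is simply one minus the cosine similarity.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
EMBEDDINGS = {
    "sneaker": [0.9, 0.1, 0.0],
    "boot": [0.8, 0.3, 0.1],
    "laptop": [0.0, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise similarities, clipped at zero (an assumption, see lead-in)."""
    keys = sorted(embeddings)
    matrix = [
        [max(0.0, cosine_similarity(embeddings[a], embeddings[b])) for b in keys]
        for a in keys
    ]
    return keys, matrix


keys, S = similarity_matrix(EMBEDDINGS)
for key, row in zip(keys, S):
    print(key, [round(value, 3) for value in row])
```

A matrix like `S` is the kind of input a spectral clustering implementation expects as a precomputed affinity.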

Latent Dirichlet Allocation

Latent Dirichlet Allocation (Blei D., Ng A. & Jordan M. - 2003)

(Semantic) Latent Dirichlet Allocation

Gaussian LDA for Topic Models with Word Embeddings (Das R., Zaheer M. & Dyer C. - 2015)

Markov (Graph) Clustering

Graph Clustering by Flow Simulation (van Dongen S. - 2000)

[Figure: keywords K_1 and K_2 joined by an edge labeled S_{K_1,K_2}]

Emergent Cluster Discovery

Flow simulation yields a natural set of cluster memberships without needing to predefine the number of clusters K.
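
The flow simulation can be sketched in a few lines of plain Python. This is a simplified, dense-matrix version of Markov clustering that alternates expansion (matrix squaring) and inflation (element-wise powering followed by column normalization). The self-loop weight, inflation value, iteration count and read-out threshold are assumptions chosen for the toy graph, not recommended settings.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Dense square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, iterations=50):
    """Simplified MCL: the number of clusters is never passed in."""
    n = len(adjacency)
    # Add self-loops, then turn edge weights into transition probabilities.
    m = normalize_columns(
        [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(iterations):
        m = matmul(m, m)                                  # expansion
        m = normalize_columns(
            [[x ** inflation for x in row] for row in m]  # inflation
        )
    # Each non-empty row belongs to an attractor; its non-zero columns
    # are the members of that attractor's cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy similarity graph: two triangles joined by one weak edge (2-3).
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = markov_cluster(ADJ)
print(clusters)
```

In the keyword setting, the edge weights would be the similarities between keyword embeddings, and raising the inflation parameter tends to produce finer-grained clusters.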

Deep Embedded Clustering

Unsupervised Deep Embedding for Clustering Analysis (Xie J., Girshick R. & Farhadi A. - 2016)

  • X_i: Keyword Embeddings
  • f(X_i ; \theta): Non-Linear Feature Transformation (deep neural network)
  • Z_i: Latent Feature Space
  • U_i: Cluster Assignment

"Soft" Cluster Assignment

Determine KL Divergence
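
For reference, the two steps above can be written out as in the DEC paper (Xie et al.). The notation here follows the paper rather than the slide: z_i is the embedded point, \mu_j is the centroid of cluster j, and \alpha is the degrees of freedom of the Student's t kernel (the paper sets \alpha = 1).

```latex
% Soft assignment of embedded point z_i to cluster j (Student's t kernel)
q_{ij} = \frac{\left(1 + \lVert z_i - \mu_j \rVert^2 / \alpha\right)^{-\frac{\alpha + 1}{2}}}
              {\sum_{j'} \left(1 + \lVert z_i - \mu_{j'} \rVert^2 / \alpha\right)^{-\frac{\alpha + 1}{2}}}

% Sharpened target distribution, with soft cluster frequencies f_j
p_{ij} = \frac{q_{ij}^2 / f_j}{\sum_{j'} q_{ij'}^2 / f_{j'}},
\qquad f_j = \sum_i q_{ij}

% Clustering objective: KL divergence between target and soft assignment
L = \mathrm{KL}(P \,\Vert\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Squaring q_{ij} puts more weight on confident assignments, so minimizing L pulls each point toward the centroid it already favors.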

KL Loss is Differentiable

\frac{\partial L}{\partial \theta}

Back Propagate Parameter Gradient

Variational Deep Embedding

Variational Deep Embedding: A Generative Approach to Clustering (Jiang Z. et al. - 2016)

Issues w/ DEC

  • 2-Step Process
  • Cannot be optimized end-to-end in a single architecture

Variational Auto Encoders

[Figure: architecture diagram built up over three slides (Decoder, Auto-Encoder, Gaussian Mixture), with the vector [0, 1, 0, 0] shown in the diagram]

Supervised-Level Generative Power

VaDE is capable of learning both cluster and feature embedding abstractions that rival traditional supervised learning techniques.

[Figure: Categorical Reconstructive Power, on a scale from High to Low]

Adaptive Sub-Categorization

Structured Information Extraction

Information Extraction Spectrum

Supervised

Regular Expressions

Unsupervised

DeepDive

ReVerb

Information Extraction

Entity Extraction from Text

Hand-coded, High-precision Patterns
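
A sketch of what a hand-coded, high-precision pattern can look like. The acquisition pattern, the company names and the sentence are all hypothetical. The pattern is deliberately narrow, so it misses most phrasings but rarely fires incorrectly.

```python
import re

# Hypothetical pattern: "<Company> acquired <Company> for $<amount> <unit>".
# Company names are approximated as runs of capitalized words.
ACQUISITION = re.compile(
    r"(?P<buyer>[A-Z][\w&]*(?: [A-Z][\w&]*)*) acquired "
    r"(?P<target>[A-Z][\w&]*(?: [A-Z][\w&]*)*) "
    r"for \$(?P<amount>\d+(?:\.\d+)?) (?P<unit>million|billion)"
)


def extract_acquisitions(text):
    """Return (buyer, relation, target, price) tuples found in the text."""
    return [
        (m["buyer"], "acquired", m["target"], f"{m['amount']} {m['unit']}")
        for m in ACQUISITION.finditer(text)
    ]


sentence = "In 2016, Acme Corp acquired Widget Works for $12.5 million."
print(extract_acquisitions(sentence))
```

Each match yields a typed relation that can be loaded into a knowledge base without further disambiguation, which is the appeal of this end of the spectrum.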

Automatic Entity Extraction: ReVerb

Identifying Relations for Open Information Extraction (Fader A., Soderland S. & Etzioni O. - 2011)

Syntactic Constraint

Confidence Function

Logistic regression on hand-tuned features trained externally.

Lexical Constraint

https://github.com/knowitall/reverb
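
The syntactic constraint can be approximated as a regular expression over part-of-speech tags. The sketch below is a simplified reading of the pattern described in the ReVerb paper (a verb, optionally followed by intervening words and a closing preposition or particle). It is not the ReVerb implementation. It assumes the input is already tagged with Penn Treebank tags, and it omits the lexical constraint and the confidence function.

```python
import re


def symbol(tag):
    """Collapse a Penn Treebank tag into one of the pattern's symbol classes."""
    if tag.startswith("VB"):
        return "V"                      # verbs
    if tag in {"IN", "RP", "TO"}:
        return "P"                      # preposition, particle, infinitive marker
    if tag.startswith(("NN", "JJ", "RB", "PRP")) or tag == "DT":
        return "W"                      # noun, adjective, adverb, pronoun, determiner
    return "O"                          # anything else breaks a relation phrase


# Simplified constraint: one or more of (V | V P | V W* P), longest match.
RELATION = re.compile(r"(?:V(?:W*P)?)+")


def relation_phrases(tagged):
    """Return the token spans whose tag sequence satisfies the constraint."""
    symbols = "".join(symbol(tag) for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in RELATION.finditer(symbols)
    ]


# Hand-tagged example sentence (tags supplied manually, no tagger is run).
tagged = [
    ("Faust", "NNP"), ("made", "VBD"), ("a", "DT"), ("deal", "NN"),
    ("with", "IN"), ("the", "DT"), ("devil", "NN"),
]
print(relation_phrases(tagged))
```

Matching the longest span keeps "made a deal with" together instead of the uninformative "made", which is the kind of incoherent extraction the constraint is meant to prevent.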

Relational Extraction: DeepDive

DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference (Niu F. et al. - 2012)

Factor Graphs

User Defined Functions

https://github.com/HazyResearch/deepdive

How to Contact Me

garrett@dataexhaust.io

  • Follow-up questions
  • Model development / implementation
  • Data science team training
  • Moral support

Slides: http://slides.com/dataexhaust

Twitter: @data_exhaust
