Building Knowledge Bases from Text
[ Data Day Seattle ]
Today's Talk
Knowledge Graphs?
Building Knowledge Bases
Topic Modeling
Structured Information Extraction
About Me
Garrett Eastham

Principal Scientist - Digital Commerce
- AI & Digital Commerce Focus
- CS @ Stanford
- Background in Web Analytics
- Career in Data Science & Product Management
- Prior: Edgecase (founder), Bazaarvoice, RetailMeNot

Knowledge Graphs
What is a Knowledge Graph?
Knowledge Graph: Graphical representation of the guiding information principles within a given domain.
Advanced Analytics
Hierarchical segmentation & rollup
Cognitive Systems
Enables tractable computational reasoning
Memory
CPU
Disk Space
500 gb
i7
128 gb
Why Learn Them from Text?

A pre-existing taxonomy or knowledge base does not exist.
Subject matter experts and/or taxonomists are cost prohibitive.

Types of Knowledge Representation
Model Complexity
Increasing Creation and Maintenance Costs
Taxonomy
Ontology
Knowledge Graph
Knowledge Base
Categorical
Relational
Taxonomies
Categorical Structure
Easy, intuitive conceptual representation that (usually) maps easily to external systems.
Rollup Power
Enables easy, robust analytic power simply by providing different levels of abstraction for query rollups.
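As a rough illustration of rollup power, the sketch below propagates leaf-level counts up a small taxonomy. The category names, the counts, and the `rollup` helper are hypothetical and only meant to show the idea.

```python
from collections import defaultdict

# Hypothetical child -> parent links; the category names are illustrative only.
PARENT = {
    "running shoes": "shoes",
    "sandals": "shoes",
    "shoes": "apparel",
    "jackets": "apparel",
}

def rollup(leaf_counts, parent=PARENT):
    """Add each category's count to itself and to every one of its ancestors."""
    totals = defaultdict(int)
    for category, count in leaf_counts.items():
        node = category
        while node is not None:
            totals[node] += count
            node = parent.get(node)  # None once we pass the root
    return dict(totals)

# Made-up leaf-level counts (for example, orders per category).
totals = rollup({"running shoes": 120, "sandals": 30, "jackets": 50})
print(totals["shoes"], totals["apparel"])
```

The same query can then be answered at any level of abstraction (leaf, "shoes", or "apparel") without changing the underlying data.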
Ontologies
[Diagram: ontology nodes, each annotated with "Attribute" labels]
Localized Semantics
Enables concept-specific algorithms (e.g., slot prediction) that require strongly typed inputs.
Implicit Categorical Inheritance
Allows logical inheritance of outcomes to be passed up or down the conceptual hierarchy.
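A minimal sketch of typed attributes and categorical inheritance, assuming a toy ontology stored as a dictionary. The concept names, attribute names, and both helper functions are hypothetical.

```python
# Hypothetical ontology: each concept names its parent and its own typed attributes.
ONTOLOGY = {
    "Device":   {"parent": None,       "attributes": {"manufacturer": str}},
    "Computer": {"parent": "Device",   "attributes": {"cpu": str, "memory_gb": int}},
    "Laptop":   {"parent": "Computer", "attributes": {"screen_inches": float}},
}

def attributes_of(concept, ontology=ONTOLOGY):
    """Collect attributes declared on the concept and inherited from its ancestors."""
    merged = {}
    while concept is not None:
        node = ontology[concept]
        for name, type_ in node["attributes"].items():
            # The more specific concept is visited first, so its declaration wins.
            merged.setdefault(name, type_)
        concept = node["parent"]
    return merged

def validate(concept, record, ontology=ONTOLOGY):
    """Return the names of attributes that are missing or have the wrong type."""
    schema = attributes_of(concept, ontology)
    return sorted(
        name for name, type_ in schema.items()
        if not isinstance(record.get(name), type_)
    )

record = {"manufacturer": "Acme", "cpu": "i7", "memory_gb": "16", "screen_inches": 13.3}
print(validate("Laptop", record))
```

Because "Laptop" inherits `memory_gb` as an integer from "Computer", the string value in the record is flagged, which is the kind of strong typing a downstream slot-prediction step could rely on.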
Abstract Knowledge Graphs
Abstract Relations
Perfect for managing light or unknown categorical sub-structure while supporting high breadth.
Graph-Oriented Insights
Typically used as a vehicle for applying graph-based insight discovery to the output of an upstream entity-processing step.
Knowledge Bases
is_a
married_to
owns
has_property
Explicit, Relational Structure
Machine-readable data enables richer question answering via structured queries.
Enables Propositional Reasoning
Can facilitate rich symbolic reasoning through automatic logical resolvers.
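To make the two points above concrete, here is a small sketch of a knowledge base held as (subject, relation, object) triples, with a wildcard query and one hand-written inference rule. The facts, the `query` helper, and the `infer_is_a` rule are all hypothetical; a real system would use a dedicated store and a logical resolver.

```python
# Hypothetical facts stored as (subject, relation, object) triples.
TRIPLES = {
    ("alice", "is_a", "engineer"),
    ("engineer", "is_a", "person"),
    ("alice", "married_to", "bob"),
    ("alice", "owns", "laptop"),
    ("laptop", "has_property", "portable"),
}

def query(triples, subject=None, relation=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (relation is None or t[1] == relation)
        and (obj is None or t[2] == obj)
    )

def infer_is_a(triples):
    """Apply is_a(x, y) and is_a(y, z) => is_a(x, z) until no new fact appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        is_a = [(s, o) for s, r, o in facts if r == "is_a"]
        for x, y in is_a:
            for y2, z in is_a:
                new_fact = (x, "is_a", z)
                if y == y2 and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

print(query(TRIPLES, subject="alice", relation="owns"))
print(("alice", "is_a", "person") in infer_is_a(TRIPLES))
```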
Building Knowledge Graphs
Knowledge Development Landscape
Small Data
Big Data
Structured Domain
Open Domain
General Web
Rich expression but no pretense of conceptual structure; however, a large number of samples for concept discovery.
Legal Documents
Deep, rich, repetitive domain language but little to no expected document style.
Product Reviews
Varied grammatical structure but very rich meta-data to connect into pre-existing graphical relationships.
Social Media
Rich meta-structure, varied grammar, and typically small document size; however, a large, ongoing stream of samples.
Bootstrapping Knowledge Graphs
Topic Modeling
Discover latent structure between keyword concepts by enforcing categorical structuring.
Structured Information Extraction
Extract precise conceptual relationships given a seed set of relation patterns.
Goal: Use unsupervised clustering to help look for implicit relations that can then be more richly codified by an SME.
Knowledge Representational Power: Captures not only actual facts and their connections but also why they are connected.
Data Preparation: Keyword Extraction


TopicRank: Graph-Based Topic Ranking for Keyphrase Extraction (Bougouin A., Boudin F. and Daille B. - 2013)
Data Preparation: Word Embeddings

Distributed Representations of Words and Phrases and their Compositionality (Mikolov et al. - 2013)
GloVe: Global Vectors for Word Representation (Pennington J., Socher R. & Manning C. - 2014)
Topic Modeling
Clustering Algorithms: Overview
Classical
Agglomerative Hierarchical Clustering
State of the Art
Spectral Clustering
Latent Dirichlet Allocation
Markov (Graph) Clustering
Deep Embedded Clustering
Variational Deep Embedding
Agglomerative Hierarchical Clustering

Modern hierarchical, agglomerative clustering algorithms (Müllner D. - 2011)
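For orientation, a naive average-linkage version of agglomerative clustering is sketched below. It is a simple cubic-time illustration, not one of the efficient algorithms from the cited paper; the function name, the toy points, and the distance function are assumptions.

```python
def agglomerative(points, num_clusters, distance):
    """Naive average linkage: merge the closest pair until num_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(distance(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest average distance.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical one-dimensional data with two obvious groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
clusters = agglomerative(points, 2, lambda a, b: abs(a - b))
print(clusters)
```

Recording the order of merges instead of stopping at a fixed count would yield the full hierarchy (dendrogram), which is what makes this family attractive for taxonomy discovery.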


Spectral Clustering
On Spectral Clustering: Analysis and an algorithm (Ng A., Jordan M. & Weiss Y. - 2001)


Similarity Matrix
Word Embeddings Cosine Distance
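A small sketch of building the similarity matrix from word embeddings. It computes cosine similarity (one minus cosine distance), which is the form an affinity-based method such as spectral clustering would consume. The three-dimensional vectors are made up for illustration.

```python
import math

# Hypothetical 3-dimensional embeddings; real word2vec or GloVe vectors
# typically have tens to hundreds of dimensions.
EMBEDDINGS = {
    "shoe": [0.9, 0.1, 0.0],
    "boot": [0.8, 0.2, 0.1],
    "laptop": [0.0, 0.1, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(embeddings):
    """Return the word order and the pairwise cosine similarity matrix."""
    words = list(embeddings)
    matrix = [
        [cosine_similarity(embeddings[a], embeddings[b]) for b in words]
        for a in words
    ]
    return words, matrix

words, sim = similarity_matrix(EMBEDDINGS)
for word, row in zip(words, sim):
    print(word, [round(x, 3) for x in row])
```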
Latent Dirichlet Allocation
Latent Dirichlet Allocation (Blei D., Ng A. & Jordan M. - 2003)

(Semantic) Latent Dirichlet Allocation
Gaussian LDA for Topic Models with Word Embeddings (Das R., Zaheer M. & Dyer C. - 2015)

Markov (Graph) Clustering
Graph Clustering by Flow Simulation (van Dongen S. - 2000)

Emergent Cluster Discovery
Flow simulation produces a natural set of clusters without needing to predefine the number of clusters K.
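The flow simulation can be sketched as alternating expansion (squaring the transition matrix) and inflation (element-wise powering followed by column renormalization). The version below is a minimal dense, pure-Python sketch; the function names, the inflation value, the thresholds, and the example graph are assumptions, and a practical implementation would use sparse matrices and pruning.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def markov_cluster(adjacency, inflation=2.0, max_iterations=100, tol=1e-6):
    n = len(adjacency)
    # Add self-loops, then turn the adjacency matrix into transition probabilities.
    m = normalize_columns(
        [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iterations):
        expanded = matmul(m, m)  # expansion: flow spreads along two-step walks
        inflated = normalize_columns(  # inflation: strong flows get stronger
            [[x ** inflation for x in row] for row in expanded]
        )
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = {frozenset(j for j, x in enumerate(row) if x > 1e-3) for row in m}
    return sorted(sorted(c) for c in clusters if c)

# Hypothetical graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
BRIDGED = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(BRIDGED))
```

Note that the number of clusters is never passed in; it falls out of where the flow settles, and the inflation parameter only controls how fine-grained the result is.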
Deep Embedded Clustering
Unsupervised Deep Embedding for Clustering Analysis (Xie J., Girshick R. & Farhadi A. - 2016)


Keyword Embeddings
Non-Linear Feature Transformation
(deep neural network)
Latent Feature Space
Cluster Assignment
"Soft" Cluster Assignment

Determine KL Divergence
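The soft assignment and KL objective from the DEC paper can be written out directly. The sketch below uses the paper's Student's t kernel for the soft assignment Q, its sharpened target distribution P, and the loss KL(P || Q). The embedded points and centroids are made-up values, and in the real method the points would come from the trained encoder.

```python
import math

def soft_assign(points, centroids, alpha=1.0):
    """q[i][j]: Student's t similarity between point i and centroid j, row-normalized."""
    q = []
    for z in points:
        weights = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q

def target_distribution(q):
    """p[i][j] proportional to q[i][j]^2 / f[j], where f[j] is the soft cluster size."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(len(row))]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )

# Hypothetical 2-D latent points and two cluster centroids.
points = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(points, centroids)
p = target_distribution(q)
print(kl_divergence(p, q))
```

The target P puts more weight on assignments that are already confident, so minimizing the divergence pulls each point toward its most likely centroid.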
KL Loss is Differentiable
Back Propagate Parameter Gradient
Variational Deep Embedding
Variational Deep Embedding: A Generative Approach to Clustering (Jiang Z. et al. - 2016)



Issues with DEC
2-Step Process
Cannot be optimized end-to-end within a single architecture
Decoder
Variational Auto Encoders
[0, 1, 0, 0]
Auto-Encoder
Variational Auto Encoders
[0, 1, 0, 0]
Gaussian Mixture
Variational Auto Encoders
Supervised-Level Generative Power
VaDE is capable of learning both cluster and feature embedding abstractions that rival traditional supervised learning techniques.
High
Low
Categorical Reconstructive Power
Adaptive Sub-Categorization
Structured Information Extraction
Information Extraction Spectrum
Supervised
Regular Expressions
Unsupervised
DeepDive
ReVerb
Information Extraction
Entity Extraction from Text


Hand-coded, High-precision Patterns
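A hand-coded pattern of this kind might look like the sketch below, which extracts a single married_to relation with a regular expression. The pattern, the function name, and the example sentence are hypothetical; the pattern favors precision and will miss most paraphrases.

```python
import re

# Hypothetical high-precision pattern: "<Capitalized Name> is married to <Capitalized Name>".
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
MARRIED_TO = re.compile(
    r"\b(?P<subject>" + NAME + r") is married to (?P<object>" + NAME + r")"
)

def extract_married_to(text):
    """Return (subject, 'married_to', object) triples found in the text."""
    return [
        (m.group("subject"), "married_to", m.group("object"))
        for m in MARRIED_TO.finditer(text)
    ]

text = "Ada Smith is married to Ben Jones. Ben Jones owns a bakery."
print(extract_married_to(text))
```

Each new relation or phrasing needs its own pattern, which is why this approach sits at the supervised, hand-built end of the spectrum.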
Automatic Entity Extraction: ReVerb
Identifying Relations for Open Information Extraction (Fader A., Soderland S. & Etzioni O. - 2011)

Syntactic Constraint

Confidence Function
Logistic regression on hand-tuned features trained externally.
Lexical Constraint
https://github.com/knowitall/reverb
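The syntactic constraint in the ReVerb paper is a part-of-speech pattern of the form V | V P | V W* P. The sketch below approximates it with a regular expression over one-letter symbols derived from Penn Treebank tags. It is not the ReVerb implementation: the tag mapping, the helper names, and the assumption that the input arrives already POS-tagged are all simplifications, and the lexical constraint and confidence function are omitted.

```python
import re

def symbol(tag):
    """Map a Penn Treebank tag to a one-letter symbol (simplified, hypothetical mapping)."""
    if tag.startswith("VB"):
        return "V"  # verb
    if tag == "RP":
        return "R"  # particle
    if tag.startswith("RB"):
        return "A"  # adverb
    if tag.startswith("NN"):
        return "N"  # noun
    if tag.startswith("JJ"):
        return "J"  # adjective
    if tag.startswith("PRP"):
        return "O"  # pronoun
    if tag == "DT":
        return "D"  # determiner
    if tag == "IN":
        return "P"  # preposition
    if tag == "TO":
        return "T"  # infinitive marker
    return "x"      # anything else breaks a relation phrase

# V = verb particle? adverb?; W = noun, adjective, adverb, pronoun, or determiner;
# P = preposition, particle, or infinitive marker. Greedy matching prefers the longest phrase.
RELATION = re.compile(r"VR?A?(?:[NJAOD]*[PRT])?")

def relation_phrases(tagged_tokens):
    """Return candidate relation phrases from a list of (word, tag) pairs."""
    symbols = "".join(symbol(tag) for _, tag in tagged_tokens)
    return [
        " ".join(word for word, _ in tagged_tokens[m.start():m.end()])
        for m in RELATION.finditer(symbols)
    ]

sentence = [("Faust", "NNP"), ("made", "VBD"), ("a", "DT"), ("deal", "NN"),
            ("with", "IN"), ("the", "DT"), ("devil", "NN")]
print(relation_phrases(sentence))
```

The noun phrases on either side of each matched phrase would then become the arguments of the extracted relation.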
Relational Extraction: DeepDive
DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference (Niu F. et al. - 2012)


Factor Graphs

User Defined Functions
https://github.com/HazyResearch/deepdive
How to Contact Me
garrett@dataexhaust.io
- Follow up questions
- Model development / implementation
- Data science team training
- Moral support
Slides: http://slides.com/dataexhaust
Twitter: @data_exhaust
Building Knowledge Bases from Text
By Garrett Eastham