Knowledge Graphs Part I


Extropolis AI

KG fundamentals

  • Modeling and Representing KGs
  • KG construction
  • KG completion
  • Reasoning/retrieval and Question Answering

What is a KG?

Graphical representation of information

  • Entities
  • Relationships
  • Events
  • Attributes
  • etc.

The KG Pipeline

High-level view

  • Upstream
    • Domain Discovery
    • Named Entity Recognition
    • Web Info Extraction
    • Relation Extraction
  • Representation
    • RDF/RDFS
    • Wikidata
    • Property-centric
    • Ontology vs. Open
  • Transformation
    • Instance matching
    • Stat Relational Learning
    • Representation Learning
  • Downstream
    • Reasoning
    • Retrieval
    • Structured Querying
    • Question Answering

KG modeling & representation

Representation

  • RDF/RDFS
  • Wikidata
  • Property-centric
  • Ontology vs. Open

RDF/RDFS

  • Collection of triples <s,p,o>
  • Uses URIs
  • Designed for the semantic web
  • Example: :kalin_ovtcharov foaf:name 'Kalin Ovtcharov'

Wikidata

  • Richer than RDF
  • Items/Properties/Statements
  • Statements: claims, references, qualifiers
  • Think Wikipedia infoboxes

Property-centric

  • Local identifiers (no URIs)

Ontology vs. Open

  • This is critical
  • Domain-specific?
  • User-focused?
  • Open?
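
To make the difference between an RDF triple and a Wikidata-style statement concrete, here is a minimal sketch in plain Python. It is an illustration, not a real RDF or Wikibase library: the `Triple`, `Statement`, `expand` and `match` names are made up for this example, the empty prefix maps to a placeholder namespace, and the `Q_...`/`P_...` identifiers are placeholders rather than real Wikidata IDs.

```python
from dataclasses import dataclass, field
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    s: str
    p: str
    o: str


# Prefix map used to expand names such as foaf:name into full URIs.
# The FOAF namespace is the real one; the empty prefix is a placeholder.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "": "http://example.org/people/",
}


def expand(name: str) -> str:
    """Expand a prefixed name into a full URI."""
    prefix, _, local = name.partition(":")
    return PREFIXES[prefix] + local


@dataclass
class Statement:
    """Wikidata-style statement: a claim plus qualifiers and references."""
    item: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)
    references: list = field(default_factory=list)


def match(graph, s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t.s == s)
        and (p is None or t.p == p)
        and (o is None or t.o == o)
    ]


# The triple from the slide, with prefixed names expanded to URIs.
graph = {Triple(expand(":kalin_ovtcharov"), expand("foaf:name"), "Kalin Ovtcharov")}

# The same kind of fact in the richer model: the claim carries extra context.
stmt = Statement(
    item="Q_PERSON",
    prop="P_EMPLOYER",
    value="Q_COMPANY",
    qualifiers={"P_START_TIME": "2020"},
    references=["https://example.org/source"],
)

names = [t.o for t in match(graph, p=expand("foaf:name"))]
print(names)
```

The triple has no room for provenance or validity periods, so in plain RDF those would need extra triples; the statement model keeps them attached to the claim itself.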

KG Construction

Upstream (construction)

  • Domain Discovery
  • Named Entity Recognition
  • Web Info Extraction
  • Relation Extraction

Domain Discovery

  • All about crawlers
  • Lexical term matching, semantics, HMM-based
  • Usually requires an ontology
  • Using raw text is kind of old-school

Named Entity Recognition

  • Extract instances of a predefined set of concepts from text (considered hard)
  • The predefined concepts come from an ontology
  • No ontology? See Open IE (even harder)

  • NER pipeline: CRFs, LSTMs, etc.
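
To show what "instances of a predefined set of concepts" means in practice, here is a deliberately simple sketch. It swaps in dictionary (gazetteer) lookup for the CRF/LSTM models named on the slide, because a learned tagger would not fit in a few lines. The `ONTOLOGY` contents and the `tag_entities` helper are hypothetical.

```python
import re

# Hypothetical mini-ontology: concept -> known surface forms.
ONTOLOGY = {
    "Person": ["Kalin Ovtcharov"],
    "Organization": ["Extropolis AI"],
    "KnowledgeGraph": ["Wikidata", "DBpedia", "YAGO"],
}


def tag_entities(text):
    """Return (start, end, surface, concept) for every ontology match in text."""
    found = []
    for concept, surface_forms in ONTOLOGY.items():
        for surface in surface_forms:
            # Whole-word match of the exact surface form.
            pattern = r"\b" + re.escape(surface) + r"\b"
            for m in re.finditer(pattern, text):
                found.append((m.start(), m.end(), m.group(), concept))
    return sorted(found)  # order mentions by position in the text


mentions = tag_entities("Kalin Ovtcharov compared Wikidata with DBpedia.")
for start, end, surface, concept in mentions:
    print(f"{surface!r} [{start}:{end}] -> {concept}")
```

A lookup like this only finds names it already knows. Sequence models such as CRFs or LSTMs are used precisely because they can generalize to unseen mentions from context.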

Web Info Extraction

  • IE from HTML pages, basically
  • Rule-based, hybrid, heuristics
  • Works with structured data too
  • Might not be relevant for Extropolis
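
As an illustration of rule-based extraction from HTML, the sketch below applies a single hand-written rule: inside a table row, the header cell is the property and the data cell is the value. It uses only the standard library's `html.parser`. The page fragment, the subject name and the `InfoboxParser`/`extract_triples` names are made up for this example.

```python
from html.parser import HTMLParser


class InfoboxParser(HTMLParser):
    """Rule: inside a table row, <th> holds the property and <td> the value."""

    def __init__(self):
        super().__init__()
        self.pairs = []     # extracted (property, value) pairs
        self._cell = None   # "th" or "td" while inside such a cell
        self._key = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._key, self._value = "", ""   # start a fresh row
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_data(self, data):
        # Text inside nested tags (e.g. links) still counts as cell content.
        if self._cell == "th":
            self._key += data
        elif self._cell == "td":
            self._value += data

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr" and self._key.strip() and self._value.strip():
            self.pairs.append((self._key.strip(), self._value.strip()))


def extract_triples(subject, html):
    """Turn every property/value row into a (subject, property, value) triple."""
    parser = InfoboxParser()
    parser.feed(html)
    parser.close()
    return [(subject, prop, value) for prop, value in parser.pairs]


# Made-up page fragment in the style of a Wikipedia infobox.
HTML = """
<table class="infobox">
  <tr><th>Founded</th><td>1900</td></tr>
  <tr><th>Location</th><td>Springfield, <a href="/wiki/Exampleland">Exampleland</a></td></tr>
</table>
"""

triples = extract_triples("Example University", HTML)
for triple in triples:
    print(triple)
```

Rules like this are brittle across sites with different markup, which is where the hybrid and heuristic approaches come in.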

Relation Extraction

  • Think classification: given entities A and B, which relation is most likely?
  • Based on text, needs an ontology
  • No ontology? Clustering
  • Models: SVMs, CNNs/PCNNs, LSTM/BERT?
  • Bootstrapping/Distant Supervision
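
Distant supervision is the least obvious item in the relation extraction notes, so here is a minimal sketch of the labeling step only. The assumption is that any sentence mentioning both entities of a known triple expresses that triple's relation. The seed triple, the sentences and the `distant_labels` helper are illustrative, and matching by substring is a simplification.

```python
# Seed KG used as the source of (noisy) labels; contents are illustrative.
SEED_TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
]

SENTENCES = [
    "Marie Curie was born in Warsaw in 1867.",
    "Marie Curie returned to Warsaw several times.",
    "Marie Curie won a Nobel Prize.",
]


def distant_labels(sentences, triples):
    """Label a sentence with relation r if it mentions both s and o of (s, r, o)."""
    examples = []
    for sentence in sentences:
        for s, r, o in triples:
            if s in sentence and o in sentence:
                examples.append(
                    {"sentence": sentence, "head": s, "tail": o, "label": r}
                )
    return examples


training_set = distant_labels(SENTENCES, SEED_TRIPLES)
for example in training_set:
    print(example["label"], "<-", example["sentence"])
```

The second sentence gets the `born_in` label even though it does not express that relation. That kind of label noise is the known cost of the approach. A classifier (SVM, CNN/PCNN, LSTM/BERT) would then be trained on these examples.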

KG modeling & representation

Upstream (construction)

Open IE

  • Domain-independent extraction
  • Good for heterogeneous corpora
  • Main focus on relations
  • Domain-agnostic concepts exist: DBpedia, Wiki classes, YAGO

Challenges:

  • Inherently unsupervised
  • Suffers in multilingual settings
  • Granularity? Events?

Techniques:

  • Self-supervised ML (distant supervision)
  • Rule-based
  • Clause-based
  • + Context
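
As a toy version of the rule-based technique, the sketch below extracts triples with no predefined relation set. Capitalized spans are treated as arguments and the lowercase words between them become the relation phrase. This is an assumption-heavy illustration (it leans on English capitalization and simple sentences), not how production Open IE systems work. The `NP`, `PATTERN` and `open_extract` names are made up.

```python
import re

# A "noun phrase" here is just one or more capitalized words in a row.
NP = r"[A-Z]\w*(?: [A-Z]\w*)*"

# Argument, then a run of lowercase words (the relation phrase), then argument.
PATTERN = re.compile(rf"({NP}) ((?:[a-z]+ )+)({NP})")


def open_extract(text):
    """Return (arg1, relation phrase, arg2) for every pattern match in text."""
    return [
        (m.group(1), m.group(2).strip(), m.group(3))
        for m in PATTERN.finditer(text)
    ]


triples = open_extract(
    "Marie Curie discovered Polonium. Toronto is located in Canada."
)
for triple in triples:
    print(triple)
```

The relation phrases ("discovered", "is located in") come straight from the text, which is what makes the extraction domain-independent. The reliance on capitalization also hints at why such rules suffer in multilingual settings.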

KG transformation

Transformation

  • Instance matching
  • Stat Relational Learning
  • Representation Learning

Instance matching

  • Resolve instances that refer to the same entity
  • + ENS and Swoosh
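
Resolving instances that refer to the same entity can be sketched as a match-and-merge loop. The version below loosely follows the generic match/merge formulation associated with Swoosh, but it is a simplified illustration rather than the published algorithm. The string-similarity threshold, the record layout and all function names are assumptions.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.85):
    """Case-insensitive string similarity check (threshold is an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match(r1, r2):
    """Two records match if any pair of their names is similar enough."""
    return any(similar(a, b) for a in r1["names"] for b in r2["names"])


def merge(r1, r2):
    """Merged record keeps every name and every source of both inputs."""
    return {
        "names": r1["names"] | r2["names"],
        "sources": r1["sources"] | r2["sources"],
    }


def resolve(records):
    """Repeatedly match and merge until no two resolved records match."""
    pending = list(records)
    resolved = []
    while pending:
        current = pending.pop()
        partner = next((r for r in resolved if match(current, r)), None)
        if partner is None:
            resolved.append(current)
        else:
            # Merged records go back to pending: they may now match others.
            resolved.remove(partner)
            pending.append(merge(current, partner))
    return resolved


# Made-up instances coming from three different sources.
records = [
    {"names": {"Acme Corp"}, "sources": {"crawl"}},
    {"names": {"ACME Corp."}, "sources": {"infobox"}},
    {"names": {"Globex"}, "sources": {"crawl"}},
]

resolved = resolve(records)
for record in resolved:
    print(sorted(record["names"]), sorted(record["sources"]))
```

Re-queuing the merged record matters: after a merge, the combined names can match a record that neither original matched on its own.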

Stat Relational Learning

  • Infer relationships probabilistically
  • Why??
  • Markov Logic
  • PSL
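
For a feel of how PSL scores a rule, here is a small sketch of its soft-logic arithmetic: truth values live in [0, 1], conjunction uses the Lukasiewicz t-norm, and a rule is penalized by its weighted distance to satisfaction. The rule, the truth values and the weight are hypothetical, and this shows only the scoring of one grounded rule, not inference.

```python
def soft_and(a, b):
    """Lukasiewicz t-norm: soft truth of (a AND b) for values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body, head):
    """How far the rule body -> head is from being satisfied (0 = satisfied)."""
    return max(0.0, body - head)


# Hypothetical rule: Colleague(a, b) AND WorksAt(a, org) -> WorksAt(b, org)
weight = 2.0        # rule weight (assumed)
colleague = 0.9     # Colleague(a, b)
works_at_a = 0.8    # WorksAt(a, org)
works_at_b = 0.3    # WorksAt(b, org), the relationship being inferred

body = soft_and(colleague, works_at_a)
penalty = weight * distance_to_satisfaction(body, works_at_b)
print(f"body truth: {body:.2f}, penalty: {penalty:.2f}")

# Raising the inferred value up to the body's truth removes the penalty.
satisfied = weight * distance_to_satisfaction(body, body)
print(f"penalty once satisfied: {satisfied:.2f}")
```

Inference then amounts to choosing the unknown truth values that minimize the total penalty over all grounded rules, which is one answer to the "Why??" on the slide: uncertain, partially conflicting extractions can be reconciled jointly instead of one at a time.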

Representation Learning

  • Embeddings galore
  • To be continued...

Thoughts

  • Do we use fragments of existing KGs or build our own?
    • Both, depending on the task?
  • One KG for all downstream tasks vs. several KGs
    • If several, are they disjoint? Consistency?
  • Which public KGs are useful for distant supervision in our case?
  • Open vs. fixed ontology
    • Idea: can a "golden" ontology help guide questions from Eve?

Knowledge Graphs

By zpoulos
