Knowledge Graphs, Part I
Extropolis AI
KG fundamentals
- Modeling and Representing KGs
- KG construction
- KG completion
- Reasoning/retrieval and Question Answering
What is a KG?
A graphical representation of information:
- Entities
- Relationships
- Events
- Attributes
- etc.
The KG Pipeline
High-level view
Upstream
- Domain Discovery
- Named Entity Recognition
- Web Info Extraction
- Relation Extraction
Representation
- RDF/RDFS
- Wikidata
- Property-centric
- Ontology vs. Open
Transformation
- Instance matching
- Stat Relational Learning
- Representation Learning
Downstream
- Reasoning
- Retrieval
- Structured Querying
- Question Answering
KG modeling & representation
Representation
RDF/RDFS
Wikidata
Property-centric
Ontology vs. Open
RDF/RDFS:
- Collection of triples <s, p, o>
- Uses URIs
- Designed for the Semantic Web
- Example: :kalin_ovtcharov foaf:name 'Kalin Ovtcharov'
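To make the triple model concrete, here is a minimal sketch of a triple store in plain Python. The `FOAF` namespace is the real FOAF vocabulary; the `EX` namespace, the `Triple` type, the `match` helper, and the second fact are assumptions made up for illustration, not part of any RDF library.

```python
from typing import NamedTuple, Optional

# FOAF is the real FOAF vocabulary namespace.
# EX is a hypothetical namespace standing in for the ":" prefix.
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/"


class Triple(NamedTuple):
    s: str  # subject URI
    p: str  # predicate URI
    o: str  # object: a URI or a literal value


graph = {
    Triple(EX + "kalin_ovtcharov", FOAF + "name", "Kalin Ovtcharov"),
    # Hypothetical second fact, added only to make the query non-trivial.
    Triple(EX + "kalin_ovtcharov", FOAF + "knows", EX + "some_colleague"),
}


def match(triples, s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t.s == s)
        and (p is None or t.p == p)
        and (o is None or t.o == o)
    ]


# Query: all foaf:name values in the graph.
names = [t.o for t in match(graph, p=FOAF + "name")]
print(names)
```

The wildcard pattern is roughly what a single triple pattern in a structured query does; a real deployment would more likely use an RDF library and a query language rather than a Python set.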
Wikidata:
- Richer than RDF
- Items / Properties / Statements
- Statements: claims, references, qualifiers
- Think Wikipedia infoboxes
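A rough sketch of the item/property/statement idea, assuming a simplified structure where each statement carries a main claim plus optional qualifiers and references. The class names and every identifier below (`Q_EX1`, `P_EX_employer`, and so on) are hypothetical placeholders, not real Wikidata IDs or the actual Wikidata data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    prop: str                    # property identifier of the claim
    value: str                   # main value of the claim
    qualifiers: Dict[str, str] = field(default_factory=dict)  # context for the claim
    references: List[str] = field(default_factory=list)       # sources backing it


@dataclass
class Item:
    item_id: str
    label: str
    statements: List[Statement] = field(default_factory=list)

    def values_of(self, prop: str) -> List[str]:
        """Return the main values of all statements using the given property."""
        return [s.value for s in self.statements if s.prop == prop]


# All identifiers and facts below are hypothetical placeholders.
person = Item("Q_EX1", "Example Person")
person.statements.append(
    Statement(
        prop="P_EX_employer",
        value="Q_EX2",
        qualifiers={"P_EX_start_time": "2020"},
        references=["https://example.org/source"],
    )
)
print(person.values_of("P_EX_employer"))
```

The point of the sketch is the extra structure: a bare triple would keep only `(Q_EX1, P_EX_employer, Q_EX2)`, while the statement also records when the claim holds and where it came from.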
Property-centric:
- Local identifiers (no URIs)
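A minimal sketch of a property-centric (property graph) layout, assuming nodes and edges that both carry key-value properties and are addressed by local integer IDs rather than URIs. The `Node`, `Edge`, and `neighbors` names and the sample data are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    node_id: int                 # local identifier, meaningful only inside this graph
    labels: List[str]
    props: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    src: int
    dst: int
    rel_type: str
    props: Dict[str, Any] = field(default_factory=dict)  # edges can hold properties too


# Hypothetical sample data.
nodes = {
    1: Node(1, ["Person"], {"name": "Alice"}),
    2: Node(2, ["Organization"], {"name": "Example Org"}),
}
edges = [Edge(1, 2, "WORKS_AT", {"since": 2020})]


def neighbors(node_id: int, rel_type: str) -> List[Node]:
    """Follow outgoing edges of one type from a node."""
    return [nodes[e.dst] for e in edges if e.src == node_id and e.rel_type == rel_type]


print([n.props["name"] for n in neighbors(1, "WORKS_AT")])
```

Because the IDs are local, merging two such graphs needs an explicit matching step, whereas URI-based models can at least share identifiers by convention.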
Ontology vs. Open:
- This choice is critical
- Domain-specific?
- User-focused?
- Open?
KG Construction
Upstream (construction)
Domain Discovery
Named Entity Recognition
Web Info Extraction
Relation Extraction
Domain Discovery:
- All about crawlers
- Lexical term matching, semantics, HMM-based
- Usually requires an ontology
- Using raw text is somewhat old-school
Named Entity Recognition:
- Extract instances of a predefined set of concepts from text (considered hard)
- The concepts come from an ontology
- No ontology? See Open IE (even harder)
- With an ontology, a standard NER pipeline applies: CRFs, LSTMs, etc.
Web Info Extraction:
- Essentially IE from HTML pages
- Rule-based, hybrid, heuristics
- Works with structured data too
- Might not be relevant for Extropolis
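As an illustration of the rule-based flavor, here is a small sketch that pulls header/value pairs out of an infobox-style HTML table using only the standard library's `html.parser`. The single rule (a `<th>` followed by a `<td>` in the same row), the sample HTML, and the `ex:ExampleOrg` subject are assumptions for the example; real pages need far more robust rules.

```python
from html.parser import HTMLParser


class InfoboxParser(HTMLParser):
    """Collect (header, value) pairs from <tr><th>..</th><td>..</td></tr> rows."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._cell = None   # "th" or "td" while the parser is inside such a cell
        self._header = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new row starts: reset the buffers.
            self._header, self._value = "", ""
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_data(self, data):
        if self._cell == "th":
            self._header += data
        elif self._cell == "td":
            self._value += data

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr" and self._header.strip() and self._value.strip():
            # Rule: keep a row only when it has both a header and a value.
            self.pairs.append((self._header.strip(), self._value.strip()))


# Hypothetical page fragment.
HTML = """
<table class="infobox">
  <tr><th>Founded</th><td>1999</td></tr>
  <tr><th>Headquarters</th><td>Example City</td></tr>
</table>
"""

parser = InfoboxParser()
parser.feed(HTML)

# Turn each row into a candidate triple about a hypothetical subject.
triples = [("ex:ExampleOrg", key, value) for key, value in parser.pairs]
print(triples)
```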
Relation Extraction:
- Think classification: given entities A and B, which relation is most likely?
- Based on text; needs an ontology
- No ontology? Clustering
- Models: SVMs, CNNs/PCNNs, LSTM/BERT?
- Bootstrapping / distant supervision
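The distant supervision idea can be sketched in a few lines: any sentence that mentions both entities of a known KB fact is assumed (noisily) to express that relation, which yields training data for a relation classifier. The toy `KB`, the sentences, and the `distant_label` helper are assumptions for illustration; matching is plain substring search, with no entity linking.

```python
# Toy knowledge base: (entity A, entity B) -> relation.
KB = {
    ("Ada Lovelace", "Charles Babbage"): "collaborated_with",
    ("Paris", "France"): "capital_of",
}

SENTENCES = [
    "Paris is the capital of France.",
    "Ada Lovelace worked with Charles Babbage on the Analytical Engine.",
    "Paris hosted a large conference last year.",
]


def distant_label(sentences, kb):
    """Label a sentence with a relation when it mentions both entities of a KB fact."""
    labeled = []
    for sent in sentences:
        for (a, b), rel in kb.items():
            # Naive mention detection: both entity strings occur in the sentence.
            if a in sent and b in sent:
                labeled.append((sent, a, b, rel))
    return labeled


training_data = distant_label(SENTENCES, KB)
for row in training_data:
    print(row)
```

The third sentence mentions only one entity of a known pair, so it stays unlabeled. The labels are noisy by construction: a sentence can mention both entities without expressing the relation, which is one reason the choice of public KG matters.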
Open IE
- Domain-independent extraction
- Good for heterogeneous corpora
- Main focus on relations
- Domain-agnostic concepts exist: DBpedia, Wiki classes, YAGO
Challenges:
- Inherently unsupervised
- Suffers in multilingual settings
- Granularity? Events?
Techniques:
- Self-supervised ML (distant supervision)
- Rule-based
- Clause-based
- + Context
KG transformation
Transformation
Instance matching
Stat Relational Learning
Representation Learning
Instance matching:
- Resolve instances that refer to the same entity
- See also ENS and Swoosh
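A simplified sketch of match-and-merge entity resolution, loosely in the spirit of the R-Swoosh algorithm from the Swoosh family. The `match` rule (shared email or shared name), the `merge` rule (set union), and the sample records are all assumptions; this is not the reference implementation.

```python
def match(r1, r2):
    """Hypothetical match rule: two records share an email or a name."""
    return bool(r1["emails"] & r2["emails"]) or bool(r1["names"] & r2["names"])


def merge(r1, r2):
    """Hypothetical merge rule: union the attribute sets of both records."""
    return {
        "names": r1["names"] | r2["names"],
        "emails": r1["emails"] | r2["emails"],
    }


def resolve(records):
    """Repeatedly match and merge until no two resolved records match."""
    pending = list(records)
    resolved = []
    while pending:
        current = pending.pop()
        partner = next((r for r in resolved if match(current, r)), None)
        if partner is None:
            resolved.append(current)
        else:
            # The merged record goes back to pending: it may now match
            # records that neither of its parts matched on its own.
            resolved.remove(partner)
            pending.append(merge(current, partner))
    return resolved


# Hypothetical records; the first and third match only via the second.
records = [
    {"names": {"j. doe"}, "emails": {"j@example.org"}},
    {"names": {"jane doe"}, "emails": {"j@example.org"}},
    {"names": {"jane doe"}, "emails": {"jane@example.com"}},
    {"names": {"someone else"}, "emails": {"other@example.org"}},
]

entities = resolve(records)
print(len(entities))
```

Feeding merged records back into the work list is what lets the first and third records end up in the same entity even though they share neither a name nor an email directly.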
Stat Relational Learning:
- Infer relationships probabilistically
- Why?
- Approaches: Markov Logic, PSL (Probabilistic Soft Logic)
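To give a feel for the PSL side, here is a sketch of how a single soft rule is scored. PSL relaxes logic to truth values in [0, 1] using the Lukasiewicz conjunction, and a rule `body -> head` is penalized by its distance to satisfaction, `max(0, I(body) - I(head))`. The rule, its weight, and the truth values below are hypothetical; a real PSL model optimizes over many grounded rules at once, which this sketch does not do.

```python
def l_and(*vals):
    """Lukasiewicz conjunction over soft truth values in [0, 1]."""
    return max(0.0, sum(vals) - (len(vals) - 1))


def distance_to_satisfaction(body, head):
    """How far a grounded rule body -> head is from being satisfied (0 means satisfied)."""
    return max(0.0, l_and(*body) - head)


# Hypothetical rule with hypothetical weight:
#   worksAt(A, O) & worksAt(B, O) -> knows(A, B)
weight = 2.0
body_truths = [0.9, 0.8]   # soft truth of worksAt(A, O) and worksAt(B, O)
head_truth = 0.4           # current soft truth of knows(A, B)

d = distance_to_satisfaction(body_truths, head_truth)
penalty = weight * d       # linear penalty; squared penalties are also common
print(round(d, 3), round(penalty, 3))
```

Raising the truth value of `knows(A, B)` lowers the penalty, which is how inference nudges uncertain relations toward values consistent with the rules. One common motivation for this kind of scoring is that extracted facts are noisy and hard constraints would be too brittle.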
Representation Learning:
- Embeddings galore
- To be continued...
Thoughts
- Do we use fragments of existing KGs or build our own?
  - Both, depending on the task?
- One KG for all downstream tasks vs. several KGs?
  - If several, are they disjoint? How do we keep them consistent?
- Which public KGs are useful for distant supervision in our case?
- Open vs. fixed ontology
  - Idea: can a "golden" ontology help guide questions from Eve?
Knowledge Graphs
By zpoulos