Concept Graphs for Learning and Search



Ashish Dubey
B.Tech CSE, 2013-14

What is a concept graph?

Representation of semantic relationships in natural language text.

  • Concepts in a document become nodes
  • Relationships become the edges connecting concepts.
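
As a minimal sketch of this idea, a concept graph can be held as an adjacency map built from (concept, relation, concept) triples. The sample triples and the `build_graph` helper are illustrative assumptions, not the project's actual code.

```python
def build_graph(triples):
    """Build an adjacency map: concept -> list of (relation, concept) edges."""
    graph = {}
    for source, relation, target in triples:
        graph.setdefault(source, []).append((relation, target))
        graph.setdefault(target, [])  # make sure the target is a node too
    return graph

# Hypothetical triples, made up for illustration.
triples = [
    ("tree", "has", "trunk"),
    ("tree", "produces", "oxygen"),
    ("trunk", "supports", "branches"),
]
graph = build_graph(triples)
```

Each key of `graph` is a node, and each list entry is a labelled edge leaving that node.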

Example


Extracted from an article about trees. The graph clearly shows the interesting attributes.

How can they be used?

  • Easy understanding of an article.
  • Concept discovery
  • Helps in ideation
  • Natural language search - answer finding

Existing variants

  • Google's Knowledge Graph
  • Facebook Graph Search
  • Luma7
  • Mindmap tools

Project aim

  • Build a concept graph engine
  • Should be re-usable
  • Build applications on top of it - Visualization and Search

Existing work

  • Many manual tools for concept map construction
  • Not many automatic tools
  • No re-usable frameworks

Existing research

  • A survey of concept map mining techniques by Zubrinic, K. et al. (2012)

  • A semi-automatic concept map extraction and evaluation framework by Jorge J. Villalon and Rafael A. Calvo (2010)

Approach

  • Use efficient information extraction techniques to extract concepts and relationships.
  • Store the extracted concepts and relations.
  • Use the data to build apps like search.

Information extraction

  • The most challenging aspect of the project
  • Overall efficiency relies on the underlying techniques
  • Several state-of-the-art NLP libraries are available, such as OpenNLP and Stanford CoreNLP
  • No single library satisfies the complete requirement

Information extraction (cont.)

  • A <subject - relation - object> triple can be treated as a <concept - relation - concept> triple
  • Options:
    • Syntactic extraction
      • POS tags
      • NLTK
    • Statistical parsers
      • Treebanks
      • MaltParser, Stanford Parser
  • Challenges:
    • Anaphora resolution, enablers, etc.
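
The syntactic option can be sketched as a pattern match over POS tags. The example below assumes the sentence has already been tagged with Penn Treebank-style tags (hand-tagged here; in practice a tagger such as NLTK's would supply them), and the `extract_triples` helper is hypothetical. It is a deliberately naive noun-verb-noun matcher, not the technique the project finally used.

```python
def extract_triples(tagged_tokens):
    """Extract (noun, verb, noun) triples from one POS-tagged sentence.

    Scans left to right: remembers the latest noun as the subject, the
    latest verb after it as the relation, and emits a triple at the next noun.
    """
    triples = []
    subject = None
    relation = None
    for word, tag in tagged_tokens:
        if tag.startswith("NN"):
            if subject is not None and relation is not None:
                triples.append((subject, relation, word))
                relation = None
            subject = word  # this noun may start the next triple
        elif tag.startswith("VB") and subject is not None:
            relation = word
    return triples

# Hand-tagged input, assumed for illustration.
sentence = [("Trees", "NNS"), ("produce", "VBP"), ("oxygen", "NN")]
with_pronoun = [("They", "PRP"), ("need", "VBP"), ("water", "NN")]

found = extract_triples(sentence)
missed = extract_triples(with_pronoun)
```

The second sentence yields nothing because its subject is a pronoun, which is one reason anaphora resolution appears among the challenges.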

Information extraction (cont.)

Solutions to the problem:
  • Information extraction libraries from NLP researchers at the University of Washington, such as OLLIE
    • Based on MaltParser
    • Outputs relationship triples
  • Anaphora resolution:
    • Stanford CoreNLP's coreference resolution system
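
To illustrate how resolved anaphora could feed back into the triples, the sketch below rewrites pronoun arguments using a mention-to-antecedent map. The map is hand-written and keyed by word, which is a simplification; a real coreference system resolves mentions by position, and nothing here is the Stanford CoreNLP or OLLIE API.

```python
def resolve_triples(triples, antecedents):
    """Replace subject/object mentions with their antecedent when one is known."""
    resolved = []
    for subject, relation, obj in triples:
        resolved.append((
            antecedents.get(subject.lower(), subject),
            relation,
            antecedents.get(obj.lower(), obj),
        ))
    return resolved

# Hypothetical extractor output and coreference result.
triples = [("Trees", "produce", "oxygen"), ("They", "need", "water")]
antecedents = {"they": "Trees"}

resolved = resolve_triples(triples, antecedents)
```

After resolution both triples attach to the same concept node instead of leaving a stray "They" node in the graph.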

Post-extraction

  • Concepts and relations are stored in a graph database
    • Neo4j
  • Data exposed through a web API
  • Application front-ends can interact with the API to get data.
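
A rough sketch of what an API response for one concept might look like. The in-memory triple list stands in for the graph database, and the `neighbors_payload` helper and the JSON shape are assumptions made for illustration, not the project's actual API.

```python
import json

def neighbors_payload(triples, concept):
    """Return a JSON string describing every edge that touches `concept`."""
    edges = [
        {"source": source, "relation": relation, "target": target}
        for source, relation, target in triples
        if concept in (source, target)
    ]
    return json.dumps({"concept": concept, "edges": edges}, sort_keys=True)

# Hypothetical stored triples.
triples = [
    ("tree", "has", "trunk"),
    ("tree", "produces", "oxygen"),
    ("trunk", "supports", "branches"),
]
payload = neighbors_payload(triples, "trunk")
```

A visualization front-end could draw the returned edges directly as a node-link diagram.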

Concept visualization app

(DEMO)

Future: semantic search

  • Natural language queries - often questions
  • Extraction of answers from the concept graph
  • Search modules
    • Query processing
    • Graph data adapters
    • Ranking of results
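
The three modules can be sketched together in a toy form: query processing strips stopwords, a plain triple list plays the role of the graph data adapter, and ranking counts overlapping terms. All names, the stopword list, and the scoring rule are assumptions; a real system would need stemming and a far better ranking function.

```python
import re

STOPWORDS = {"what", "which", "who", "does", "do", "is", "a", "an", "the", "of"}

def query_terms(question):
    """Query processing: lowercase, tokenize, and drop stopwords."""
    words = re.findall(r"[a-z]+", question.lower())
    return {word for word in words if word not in STOPWORDS}

def search(question, triples):
    """Rank triples by the number of query terms they contain."""
    terms = query_terms(question)
    scored = []
    for triple in triples:
        words = set(re.findall(r"[a-z]+", " ".join(triple).lower()))
        score = len(terms & words)
        if score:  # keep only triples that match at least one term
            scored.append((score, triple))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [triple for _, triple in scored]

# Hypothetical graph contents.
triples = [
    ("tree", "has", "trunk"),
    ("tree", "produces", "oxygen"),
    ("trunk", "supports", "branches"),
]
results = search("Which tree produces oxygen?", triples)
```

The best-matching triple comes first, so the answer to the question can be read from the top result.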

Thank you
