Semantic Graphs are for Everybody


Michael Grove
Chief Software Architect - Clark & Parsia
@mikegrovesoft

About Us


  • Founded in 2005
  • Offices in Washington DC and Boston
  • Customers in US gov't, banking, energy, health/bio, retail
  • Strong academic partnerships in US, UK, Europe, and Mexico
  • Leader in Semantic Technologies: graph databases, reasoning, planning, services

Semantic Graphs are for Everybody


  • We agree that graphs are great
    • Fantastic alternative to traditional systems
    • Superior for many tasks
  • Semantics are a natural complement
    • An asset for information integration & analysis problems

Semantic Graphs


  • Organizations employ people other than programmers
    • Not everything needs to be in code
    • Especially business logic
    • You can't just teach everyone programming either
  • So why not semantics?
    • Declarative, formal descriptions of nodes, edges and relationships
  • A little bit of semantics goes a long way
    • Yields a powerful combination that is well suited for information integration tasks

Free Your Logic


  • Adding semantics to our graphs gives meaning to our data
  • Better yet, it lets us free our business logic from our codebase
    • By encoding it in the graph
  • Still better? It frees it from programmers (sorry!)
  • Can be written in a high-level declarative language
    • By programmers and non-programmers
    • Also lets us talk about the same thing while using different vocabularies



So what can we do with semantic graphs?

Enterprise Semantics


  • Complex organizations face challenging IT problems
  • Enterprise Semantics: model-driven information integration for decision support & analysis
    • Lets an enterprise better utilize what it already has
    • Data is an underutilized asset
    • Leverage semantic graphs
  • Smart data, not strictly big data
    • Scale is not a necessary condition for utility
    • Problems are not always solved with more data

    Smart Data


    • Help deal with information overload
      • Not just overload, but also management
      • Semantics adds meaning to the data so the computer can help
    • Isn't this another silo?
      • Consumers of the data simply issue queries
      • But the data can live in place
        • Key if it changes frequently
        • Easier to bring new data online
      • Domains use their own vocabulary; semantics provides the glue
        • Facilitates cross-cutting views for information analysis & accessibility






    How do we make this happen?

    Put Business Logic in its Place


    • Business logic is naturally a first class object to be curated, validated, versioned, etc.
    • Encode your business logic in a language with a formal semantics
      • Capturing expertise
      • Inference rules & queries
      • Let the computer do the work
    • No programming required
      • Can write programs, but that's not easier or more maintainable
      • Actual experts in the business logic can implement it, not a programmer
        • Lets non-programmers do complex information processing without having to write code

    Reasoning


    • Make implicit information explicit
      • Consistency checking
        • Find contradictions in your logic
      • Entailment
        • Answer queries using reasoning
      • Integrity Constraint Validation
        • In some use cases, data consistency is crucial
      • Explanations
        • Logical debugging
    • Can be as complex as your domain requires
      • Expressivity broken down into profiles based on computational characteristics
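A minimal sketch of entailment, assuming the graph is a set of (subject, predicate, object) triples and only two rules apply: subClassOf is transitive, and instances inherit the superclasses of their type. The predicate names and data are hypothetical, and a real reasoner covers far more than this toy fixpoint loop.

```python
def entail(triples):
    """Apply two simple rules repeatedly until no new triples appear."""
    facts = set(triples)
    while True:
        derived = set()
        for a, p, b in facts:
            if p != "subClassOf":
                continue
            for x, q, y in facts:
                if q == "subClassOf" and x == b:
                    # a subClassOf b, b subClassOf y  =>  a subClassOf y
                    derived.add((a, "subClassOf", y))
                elif q == "type" and y == a:
                    # x type a, a subClassOf b  =>  x type b
                    derived.add((x, "type", b))
        if derived <= facts:
            return facts
        facts |= derived


# Hypothetical data: only three facts are stated explicitly.
data = {
    ("Manager", "subClassOf", "Employee"),
    ("Employee", "subClassOf", "Person"),
    ("alice", "type", "Manager"),
}
inferred = entail(data)
is_person = ("alice", "type", "Person") in inferred
```

Nothing in the data says alice is a Person; the fact is implicit and the loop makes it explicit, so a query for all Persons can return her.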

    Reasoning at work

    • For example, enforcing security (ACLs)

    • You want to see if Bob can access Resource1
      • This can get hairy quickly
      • Finding this in a single query, or in multiple related queries, is not easy

    Bob is-a Admin OR Bob created Resource1 OR (Bob hasRole ?x AND ?x canAccess Resource1) OR ... 
      • It doesn't get better writing a program
    • We can leverage reasoning to make this easier

      Queries get easier


      • But if you encode your security policy in semantic graphs
        • Bob canAccess Resource1
        • That's it!
      • Allows a more concise and maintainable query
      • Shields you from evolution in the security policies in your application
      • Try updating that query (or program) for each change! 
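The idea can be sketched in a few lines of Python, assuming the policy is exactly the three rules from the Bob example (admins, creators, and role-based grants) and the graph is a set of triples. The helper names and the "is-a Resource" typing are assumptions made for illustration; in a semantic graph database the rules would live in the graph, not in code.

```python
def infer_access(triples):
    """Derive canAccess facts from a (hypothetical) three-rule policy."""
    facts = set(triples)
    resources = {s for s, p, o in facts if p == "is-a" and o == "Resource"}
    while True:
        derived = set()
        for s, p, o in facts:
            if p == "is-a" and o == "Admin":
                # Admins can access every known resource.
                derived |= {(s, "canAccess", r) for r in resources}
            elif p == "created":
                # Creators can access what they created.
                derived.add((s, "canAccess", o))
            elif p == "hasRole":
                # Users inherit whatever their role can access.
                derived |= {(s, "canAccess", r)
                            for x, q, r in facts
                            if x == o and q == "canAccess"}
        if derived <= facts:
            return facts
        facts |= derived


def can_access(facts, user, resource):
    """The query the application asks; it never mentions the policy."""
    return (user, "canAccess", resource) in facts


graph = infer_access({
    ("Resource1", "is-a", "Resource"),
    ("Resource2", "is-a", "Resource"),
    ("Bob", "hasRole", "Editor"),
    ("Editor", "canAccess", "Resource1"),
    ("Carol", "created", "Resource2"),
    ("Dave", "is-a", "Admin"),
})
bob_ok = can_access(graph, "Bob", "Resource1")
```

If the policy gains a fourth rule, only infer_access changes; can_access and every caller stay the same.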

      Seat Belts Required


      • Data is key for solving problems with Enterprise Semantics
      • As I said, smart data
        • In some use cases, data consistency is crucial
      • So we need to be able to enforce integrity constraints over our data
        • Age must be a positive integer
        • Supervisors may not supervise more than 10 employees
        • All US Citizens must have one Social Security Number and it must be unique
        • ...
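To make the three example constraints concrete, here is roughly what they check, written by hand in Python over a list of triples. The predicate names (age, supervises, citizenOf, ssn) and the data are hypothetical. In the approach these slides describe, the constraints would be stated declaratively rather than coded like this.

```python
from collections import defaultdict


def validate(triples):
    """Return a list of human-readable constraint violations."""
    values = defaultdict(list)  # (subject, predicate) -> list of objects
    for s, p, o in triples:
        values[(s, p)].append(o)

    violations = []
    for (s, p), objs in sorted(values.items(), key=lambda kv: kv[0]):
        if p == "age":
            # Age must be a positive integer (bool is excluded on purpose).
            for age in objs:
                if not (isinstance(age, int) and not isinstance(age, bool) and age > 0):
                    violations.append(f"{s}: age {age!r} is not a positive integer")
        elif p == "supervises" and len(set(objs)) > 10:
            # Supervisors may not supervise more than 10 employees.
            violations.append(f"{s}: supervises {len(set(objs))} employees (max 10)")

    # US citizens must have exactly one SSN.
    citizens = {s for (s, p), objs in values.items() if p == "citizenOf" and "US" in objs}
    for person in sorted(citizens):
        count = len(set(values.get((person, "ssn"), [])))
        if count != 1:
            violations.append(f"{person}: has {count} SSNs (expected exactly 1)")

    # SSNs must be unique across people.
    owners = defaultdict(set)
    for (s, p), objs in values.items():
        if p == "ssn":
            for ssn in objs:
                owners[ssn].add(s)
    for ssn in sorted(owners):
        if len(owners[ssn]) > 1:
            violations.append(f"SSN {ssn} is shared by {sorted(owners[ssn])}")
    return violations


sample = [
    ("ann", "age", 34), ("ann", "citizenOf", "US"), ("ann", "ssn", "111"),
    ("ben", "age", -5), ("ben", "citizenOf", "US"), ("ben", "ssn", "111"),
    ("cat", "citizenOf", "US"),
]
problems = validate(sample)
```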

      Integrity Constraints


      • Custom logic to validate incoming data can be complex & time-consuming
        • Not to mention telling a user what they did wrong
        • No need to enforce some in the application layer, some in the database ...
      • Semantic Graphs make this easier
        • Again, no reason to write more code
        • Can use same language as reasoning
          • A rich modeling language for integrity constraints
            • The high-level expressive syntax makes authoring complex validations easier
          • Can be authored by the experts
      • Reasoning determines whether your constraints are satisfied or violated

      Houston, we have a problem...


      • Things go wrong, it's inevitable
        • People fat-finger stuff when doing data entry
        • Errors modeling the business logic
      • You can sit around and debug them
        • Debugging is fun, right?
        • Some Programming Required
      • Or you can have the computer tell you what is wrong
        • In fact, it will tell you why

      Explanations


      • Tells you why something is the case
        • Inference from business logic you don't understand?
        • Did someone violate an integrity constraint?
        • You get a proof
      • Aids traceability & debugging
        • Pinpoints the exact issue
        • Can reference other proofs
        • Even tells you how to fix it
          • Or in some cases, fixes it for you
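One way to picture a proof is to record, for every inferred fact, the rule that produced it and the facts it depended on. The sketch below assumes a single hypothetical rule ("a user can access whatever one of their roles can access") and made-up data; it illustrates the idea and is not how any particular reasoner represents explanations.

```python
def infer_with_proofs(asserted):
    """Forward-chain one rule, remembering why each fact holds."""
    proofs = {fact: ("asserted", ()) for fact in asserted}
    changed = True
    while changed:
        changed = False
        for s, p, o in list(proofs):
            if p != "hasRole":
                continue
            for x, q, r in list(proofs):
                if x == o and q == "canAccess":
                    conclusion = (s, "canAccess", r)
                    if conclusion not in proofs:
                        premises = ((s, p, o), (x, q, r))
                        proofs[conclusion] = ("role-grants-access", premises)
                        changed = True
    return proofs


def explain(proofs, fact, depth=0):
    """Return the proof of a fact as indented lines, premises nested below."""
    rule, premises = proofs[fact]
    lines = ["  " * depth + " ".join(fact) + f"  [{rule}]"]
    for premise in premises:
        lines.extend(explain(proofs, premise, depth + 1))
    return lines


proofs = infer_with_proofs({
    ("Bob", "hasRole", "Editor"),
    ("Editor", "canAccess", "Resource1"),
})
proof_lines = explain(proofs, ("Bob", "canAccess", "Resource1"))
```

Printing proof_lines shows the conclusion on the first line and the two asserted facts it rests on indented beneath it, which is the traceability the slide refers to.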

      Oh, they're still graphs


      • Semantic graphs are still graphs
      • Adding semantic capabilities does not diminish their graph-ness
      • You can pointer chase if you want
      • But all the normal algorithms still apply
        • PageRank
        • Clustering
        • Vertex Degree Scoring
        • Betweenness Centrality
        • Shortest path
        • and so on...
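As a small illustration that ordinary graph algorithms still apply, the sketch below treats each triple as an undirected edge between its subject and object, then computes a vertex degree and a breadth-first shortest path. The data and function names are hypothetical.

```python
from collections import defaultdict, deque


def adjacency(triples):
    """Build an undirected adjacency map, ignoring edge labels."""
    adj = defaultdict(set)
    for s, _p, o in triples:
        adj[s].add(o)
        adj[o].add(s)
    return adj


def degree(adj, node):
    """Number of distinct neighbours of a node (0 if unknown)."""
    return len(adj.get(node, ()))


def shortest_path(adj, start, goal):
    """Breadth-first search; returns a list of nodes or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adj.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


adj = adjacency([
    ("alice", "knows", "bob"),
    ("bob", "worksFor", "acme"),
    ("carol", "worksFor", "acme"),
])
path = shortest_path(adj, "alice", "carol")
```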

      Did I mention the standards?


      • All of this is based on W3C standards
      • Data format, query language, semantics; none are proprietary
      • Don't like your current vendor?  Just switch
        • Standard semantics means queries won't return different answers
        • No ETL required to move your data
      • Makes interoperability easier
      • Promotes re-use
        • Semantics does not change, so data retains its original meaning

      Who uses this?


      • VIACOM
      • Best Buy
      • SAIC
      • NASA
      • Wolters Kluwer
      • JPMorgan Chase
      • inQuest Technologies
      • IBM

      A Quick Example: POPS


      • NASA is a big, interesting organization
        • 100k+ employees across 12 centers throughout the US
        • They have a universe of data
      • And one simple problem, finding experts
        • COTS solutions were not working
        • All the information was already in-house, but collecting dust
      • Enter POPS
        • Created a model for their existing data
        • Built a simple web interface to let them use their own data
        • Saved NASA $38M a year!

      Stardog


      • The leading semantic graph database
      • Design influenced by our approach to Enterprise Semantics
        • Optimized support for reasoning, integrity constraints & explanations
          • These resolve to query answering
          • So our main focus was on a query planner
        • Built to support information integration & analysis applications
          • But general enough to support any semantic graph use case
          • Also wanted best out of box experience possible

      The Next Step


      • Make it easier for all developers to contribute
        • Organizations don't always have experts in semantics or graphs readily available
      • Provide a tool that abstracts away these details
        • Rely on web development standards
          • HTML, JavaScript, CSS, etc.
        • Start building an application right away without focusing on learning the semantics or the graphs
          • Because the value is in solving the problem

      Conclusions


      • Graph databases offer advantages over traditional systems for many uses
        • Semantics complement those advantages
      • More than just programmers work at organizations
        • Not everything needs to be code
        • Key components, such as business logic, can live separately
        • And thus, be created & maintained by non-programmers
      • So using semantic graphs we retain advantages of graph databases, while adding features valuable to the enterprise
        • Promotes adoption & interoperability
        • Facilitates smart data & Enterprise Semantics



      Questions?



      Thanks!

      http://clarkparsia.com
      http://stardog.com
      @mikegrovesoft