The Tree Problem

An overview

A Tree?

Public vs Private

Disagreements

I Also Want

  • To run my own stack
  • Proper Performance/Scaling
  • Expressive ad-hoc queries
  • To run online AND offline

Property Graph

A collection of nodes, relationships, and properties

Nodes and Relationships

Storing Data in Properties

Naive Model - Basic

Naive Model - Persona

Existing Property Graphs

  • Neo4j
  • Titan
  • TinkerPop Blueprints
  • Many more

vGraph

A distributable versioned property graph

Here is a Regular Graph

Each Node and Edge Contains

  • An ID (UUIDv4)
  • A Label (Immutable String)
  • A Repo Identifier
  • A Hash (SHA-1 of the properties)
  • Properties (booleans, ints, strings, etc.)
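
The fields listed above can be sketched as a small data structure. This is only an illustration, not vGraph's actual implementation: the class name `Element`, the helper `hash_properties`, and the choice to hash a sorted-key JSON encoding of the properties are all assumptions made for the example.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass, field


def hash_properties(properties: dict) -> str:
    """SHA-1 over a canonical JSON encoding of the properties.

    Sorting keys is an assumption made here so that the same properties
    always produce the same hash regardless of insertion order.
    """
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Element:
    """A node or edge (hypothetical shape, for illustration only)."""
    label: str                      # immutable string label
    repo: str                       # identifier of the owning repo
    properties: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # UUIDv4

    @property
    def props_hash(self) -> str:
        # Hash of the properties only, not of the id, label, or repo.
        return hash_properties(self.properties)
```

Two elements with the same properties get different IDs but the same hash, which is what makes the hash useful for detecting whether properties changed.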

Let's Add UUIDs

Let's Partition into 2 Repos

A Closer Look at Boundaries

Our Graph

Make Changes

Commit Contains

Notes

  • You may not edit Boundary Nodes.
  • An existing boundary node that is added to a Commit points to the original Repo.
  • There is no "update" in a Commit, only Create/Delete.
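
One way to read these notes is sketched below. It is a guess at the shape of a commit, not the vGraph format: the `Commit` class, the `record_property_change` helper, and the idea that a property change is recorded as a delete of the old element plus a create of the new one are all assumptions. A boundary node is modelled here simply as an element whose repo differs from the commit's repo.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Hypothetical commit: only 'created' and 'deleted' lists, no updates."""
    repo: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created: list = field(default_factory=list)
    deleted: list = field(default_factory=list)


def record_property_change(commit, element_id, element_repo, old_props, new_props):
    """Record a change as delete-then-create (assumed convention)."""
    if element_repo != commit.repo:
        # Elements owned by another repo are boundary nodes and are read-only.
        raise ValueError("boundary nodes may not be edited")
    commit.deleted.append({"id": element_id, "properties": old_props})
    commit.created.append({"id": element_id, "properties": new_props})
```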

dCap

Storing and querying vGraph commits

dCap Store

  • Store the commits and retrieve them by id, date, author, repo, etc.
  • Elasticsearch works well as a backend
  • A commit may exist in more than one store. This is fine because the ID is a UUID.
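
The store contract described above can be sketched with an in-memory stand-in. This is not the dCap Store API: the class `InMemoryStore`, its method names, and the commit-as-dict shape are assumptions for illustration, and a real deployment would sit on a backend such as Elasticsearch.

```python
class InMemoryStore:
    """Toy commit store keyed by commit ID (illustration only)."""

    def __init__(self):
        self._commits = {}

    def save(self, commit: dict) -> None:
        # Saving is idempotent: the UUID is the key, so saving the same
        # commit twice (or in several stores) never creates a conflict.
        self._commits[commit["id"]] = commit

    def get(self, commit_id: str):
        return self._commits.get(commit_id)

    def find(self, **criteria) -> list:
        # Return commits whose fields match every criterion,
        # e.g. find(repo="repo-a", author="ann").
        return [
            c for c in self._commits.values()
            if all(c.get(key) == value for key, value in criteria.items())
        ]

    def __len__(self) -> int:
        return len(self._commits)
```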

dCap Server

  • A simple REST server that saves and retrieves commits from one or more dCap Stores.
  • RESTish HTTP API
  • Handles authentication
  • Spec is language-agnostic

dCap Client

  • Handles routing when saving and retrieving commits.
  • A commit may be saved to multiple stores (locally and/or remotely)
  • Retrieves a commit based on the repo it was made in.

Example dCap Deployment

nPipes

A cross-repo graph traversal framework

Blueprints Traversal

Cross Repo Queries

Moving On...

We just don't have the time to look at how this works in depth right now.

pTree

A set of standard models

Person

(No Properties - Just an Identifier)

Person with conclusions

Birth

Marriage

Properties

Why make everything first class?

  • More granularity in versioning, sharing, and referencing
  • More expressiveness when querying and searching
  • More extensibility when adding custom elements and properties

What

This is what happened or what was. For instance, a person was born, lived a while, and died. These are referred to as "conclusions" or "facts" in other systems.

Why

This is why you believe that a particular set of "whats" happened. Personas, sources, and notes are all information about why you believe something happened.

What vs. Why

Why Connectors

What do we gain with Whys?

  • Multiple types of Whys - multiple research methodologies
  • Namespaced custom extensions - use your own and still share
  • Can create the Why(s) before you commit to the What(s)

Standard vs. Micro-Standard

Trepo

Trepo is

Distributed

Trepo is

Scalable

Trepo is

Extensible

Trepo is

Cross Platform

Trepo is

Open Source

The Tree Problem

By John Clark