Enabling Machine-Actionable Semantics for Comparative Analyses of Trait Evolution
Project Meeting Oct 2017
RENCI
Architecture
API-first
Consequences of API first
Most reporting through query answering, not web UI
Report analysis through client-side tools
Opportunity for literate programming platforms
Jupyter notebooks
Rmarkdown documents
Opportunity for QC automation
Automatic testing
Continuous integration
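As a minimal sketch of what automatic testing against an API could look like, the check below validates the shape of a JSON response. The payload, the field names (`results`, `total`, `id`, `label`) and the function `check_response` are assumptions made for illustration; they are not taken from the project's actual API.

```python
import json

# Hypothetical response payload; all field names are illustrative assumptions.
SAMPLE_RESPONSE = json.dumps({
    "results": [
        {"id": "http://example.org/taxon/1", "label": "Taxon A"},
        {"id": "http://example.org/taxon/2", "label": "Taxon B"},
    ],
    "total": 2,
})


def check_response(payload):
    """Return a list of QC problems found in a JSON response (empty if none)."""
    data = json.loads(payload)
    results = data.get("results")
    if not isinstance(results, list):
        return ["'results' is missing or not a list"]
    problems = []
    # The reported total should agree with the number of returned items.
    if data.get("total") != len(results):
        problems.append("'total' does not match number of results")
    # Every item needs a non-empty identifier and label.
    for i, item in enumerate(results):
        for field in ("id", "label"):
            if not item.get(field):
                problems.append(f"result {i} lacks '{field}'")
    return problems


if __name__ == "__main__":
    print(check_response(SAMPLE_RESPONSE))
```

A check of this kind can run in a continuous integration job on every commit, so that regressions in the API surface before any client-side report depends on them.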
Deliverable I:
Cross-study matrix synthesis and calibration
Ontotrace
Ontotrace works because the problem is highly bounded
Number of character states := 2
State values = { "present", "absent" }
Character = <entity>: <amount>
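The bounded structure above can be sketched as a small data model: every character is one entity's amount, and every cell holds one of two states. This only illustrates the shape of the resulting matrix, not how Ontotrace infers the states; the taxa, entities and function names are hypothetical.

```python
PRESENT, ABSENT = "present", "absent"

# Hypothetical assertions as (taxon, entity, state); names are illustrative only.
assertions = [
    ("Taxon A", "pelvic fin", PRESENT),
    ("Taxon B", "pelvic fin", ABSENT),
    ("Taxon A", "adipose fin", ABSENT),
]


def build_matrix(assertions):
    """Build taxon -> {character: state}; each character is '<entity>: amount'."""
    matrix = {}
    for taxon, entity, state in assertions:
        if state not in (PRESENT, ABSENT):
            raise ValueError(f"unexpected state: {state!r}")
        character = f"{entity}: amount"
        matrix.setdefault(taxon, {})[character] = state
    return matrix


def render(matrix):
    """Return the sorted characters and one row per taxon (1, 0, or ? if unknown)."""
    characters = sorted({c for row in matrix.values() for c in row})
    symbols = {PRESENT: "1", ABSENT: "0"}
    lines = []
    for taxon in sorted(matrix):
        cells = [symbols.get(matrix[taxon].get(c), "?") for c in characters]
        lines.append(f"{taxon}\t{''.join(cells)}")
    return characters, lines


if __name__ == "__main__":
    characters, lines = render(build_matrix(assertions))
    print(characters)
    print("\n".join(lines))
```

Because the state space is fixed at two values, the matrix needs no state consolidation step: the only open question per cell is present, absent, or unknown.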
Character inference, schematically
Unconstrained character and state synthesis is a combinatorial problem
To a first approximation:
\left| \bigcup_{E \in M} S(E) \right| \times \left| \bigcup_{Q \in M} S(Q) \right|
\left| \bigcup_{E \in M} S(E) \right| \times \left| \bigcup S(A) \right|
There can be hundreds of states subsumed by a synthetic character.
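To make the combinatorial growth concrete, the sketch below evaluates the entity-by-quality product on a toy ontology. It assumes that S(x) denotes the set of subsumers of x (the term itself plus all its ancestors) and that M is the set of entity/quality pairs in a matrix; the slides do not define these symbols, so both readings are assumptions, as are all term names and the function names.

```python
def subsumers(term, parents):
    """Return the term plus all of its ancestors in a child -> parents mapping."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


# Toy ontologies (hypothetical): each term maps to its direct parents.
ENTITY_PARENTS = {
    "pectoral fin": ["paired fin"],
    "pelvic fin": ["paired fin"],
    "paired fin": ["fin"],
    "dorsal fin": ["median fin"],
    "median fin": ["fin"],
}
QUALITY_PARENTS = {
    "round": ["shape"],
    "elongated": ["shape"],
    "absent": ["amount"],
    "shape": ["quality"],
    "amount": ["quality"],
}

# A matrix M reduced to its (entity, quality) pairs.
matrix_characters = [
    ("pectoral fin", "round"),
    ("pelvic fin", "absent"),
    ("dorsal fin", "elongated"),
]


def candidate_space(characters, entity_parents, quality_parents):
    """Return (|union of S(E)|, |union of S(Q)|, their product)."""
    entities = set().union(*(subsumers(e, entity_parents) for e, _ in characters))
    qualities = set().union(*(subsumers(q, quality_parents) for _, q in characters))
    return len(entities), len(qualities), len(entities) * len(qualities)


if __name__ == "__main__":
    print(candidate_space(matrix_characters, ENTITY_PARENTS, QUALITY_PARENTS))
    # → (6, 6, 36)
```

Even three source characters over two shallow hierarchies yield 36 candidate combinations; with real anatomy and quality ontologies the product grows far faster than the number of source characters.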
Using statistics and machine learning to constrain character inference and state consolidation
Use semantic similarity-derived statistics to tell "good" from "bad" matrices?
What is a desirable "semantic information content" as an objective function?
Quantify the semantic coherence of (consolidated) character states
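One possible way to quantify the coherence of a set of consolidated states is sketched below. It uses corpus-based information content, IC(t) = -log2 p(t), and Resnik similarity (the IC of the most informative common subsumer), averaged over all pairs of states. The slides do not name a measure, so Resnik similarity is swapped in here as one common choice; the ontology, the annotation counts and the function names are all hypothetical.

```python
import math
from itertools import combinations

# Toy quality ontology (hypothetical): each term maps to its direct parents.
PARENTS = {
    "round": ["shape"],
    "elongated": ["shape"],
    "shape": ["quality"],
    "absent": ["amount"],
    "amount": ["quality"],
}
# Hypothetical number of annotations made directly to each term.
ANNOTATION_COUNTS = {"round": 4, "elongated": 4, "absent": 8}


def ancestors(term):
    """Return the term plus all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def information_content():
    """IC(t) = -log2 of the fraction of annotations subsumed by t."""
    total = sum(ANNOTATION_COUNTS.values())
    freq = {}
    for term, count in ANNOTATION_COUNTS.items():
        for subsumer in ancestors(term):
            freq[subsumer] = freq.get(subsumer, 0) + count
    return {term: -math.log2(count / total) for term, count in freq.items()}


def resnik(a, b, ic):
    """IC of the most informative common subsumer of a and b."""
    return max(ic[t] for t in ancestors(a) & ancestors(b))


def coherence(states, ic):
    """Mean pairwise Resnik similarity of a set of character states."""
    pairs = list(combinations(states, 2))
    if not pairs:
        raise ValueError("need at least two states")
    return sum(resnik(a, b, ic) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    ic = information_content()
    print(coherence(["round", "elongated"], ic))            # both are shapes
    print(coherence(["round", "elongated", "absent"], ic))  # mixes shape and amount
```

In this toy setting the state set that mixes shape and amount scores lower than the pure shape set, which is the kind of signal an objective function could use to separate coherent from incoherent synthetic characters.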