Tarje Sælen Lavik

RDF crash course

Semantic web

ANYONE can say ANYTHING about ANYTHING

 

Open World Assumption

"The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries." (W3C)
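Under the Open World Assumption, a statement that is missing is unknown rather than false. A minimal Python sketch of that reading, assuming a plain set of tuples as the store; the `holds` helper is hypothetical and not part of any RDF library:

```python
# Known statements; anything not listed here is simply not known.
KNOWN = {
    ("http://data.uib.no/tarje", "http://schema.uib.no/knows", "http://data.uib.no/per"),
}

def holds(triple, known):
    """Return True if the triple is stated, otherwise None (unknown), never False."""
    return True if triple in known else None

stated = holds(
    ("http://data.uib.no/tarje", "http://schema.uib.no/knows", "http://data.uib.no/per"),
    KNOWN,
)
unstated = holds(
    ("http://data.uib.no/per", "http://schema.uib.no/knows", "http://data.uib.no/tarje"),
    KNOWN,
)
print(stated, unstated)
```

Nobody has said that Per knows Tarje, so the answer is "unknown", not "no".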

SOME terms

* *RDF* = Resource Description Framework
* *Triples* = subject -> predicate -> object
* *Triplestore* = database for triples
  - Virtuoso (NRK)
  - Fuseki (UB)
  - Stardog
  - GraphDB
  - ...
* *Data model / ontology / schema* = classes and relations
  - foaf:     -> friend of a friend
  - dct:      -> Dublin Core Terms
  - schema:   -> Schema.org (Google, Microsoft, Yahoo and Yandex)
  - crm:      -> CIDOC-CRM
  - bibframe: -> The new data model for libraries
* *Vocabulary* = controlled set of terms

LINKED data

  1. Use URIs to name (identify) things.
  2. Use HTTP URIs so that these things can be looked up (interpreted, "dereferenced").
  3. Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL, etc.
  4. Refer to other things using their HTTP URI-based names when publishing data on the Web.
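Rules 2 and 3 amount to an ordinary HTTP GET that asks for RDF. A minimal sketch with Python's standard library, assuming the server honours an `Accept: text/turtle` header (whether a given server does is an assumption); the request is only prepared, not sent:

```python
from urllib.request import Request

def build_lookup(iri, media_type="text/turtle"):
    """Prepare an HTTP GET for the IRI that asks for RDF via the Accept header."""
    return Request(iri, headers={"Accept": media_type}, method="GET")

request = build_lookup("http://data.ub.uib.no/instance/photograph/ubb-bs-ok-19362")
# urllib.request.urlopen(request) would perform the lookup; left out to stay offline.
print(request.get_method(), request.full_url)
```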

RDF 1.1

Resource Description Framework

# The basis is triples
<subject>   <predicate>   <object>
   iri          iri      iri|literal
   ?s           ?p          ?o

#N-TRIPLES
<http://data.uib.no/tarje> <http://schema.uib.no/name> "Tarje Lavik" .
<http://data.uib.no/tarje> <http://schema.uib.no/knows> <http://data.uib.no/per> .
IRI = Internationalized Resource Identifier
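The `?s ?p ?o` row can be read as a pattern with three slots. A minimal sketch in Python, assuming triples are stored as plain tuples of strings; the `match` helper is hypothetical and `None` plays the role of a variable:

```python
# Minimal in-memory triple set; plain strings stand in for IRIs and literals.
TRIPLES = {
    ("http://data.uib.no/tarje", "http://schema.uib.no/name", "Tarje Lavik"),
    ("http://data.uib.no/tarje", "http://schema.uib.no/knows", "http://data.uib.no/per"),
}

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts like ?s, ?p or ?o."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

known = match(TRIPLES, p="http://schema.uib.no/knows")
print(known)
```

Calling `match(TRIPLES)` with no arguments corresponds to the fully open pattern `?s ?p ?o` and returns every triple.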

Serialization of RDF

#N-TRIPLES
<http://data.uib.no/tarje> <http://schema.uib.no/name> "Tarje Lavik" .
#TURTLE/N3
@prefix uib: <http://schema.uib.no/> .

<http://data.uib.no/tarje> uib:name "Tarje Lavik" .
#RDF/XML
<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uib="http://schema.uib.no/">
  <rdf:Description rdf:about="http://data.uib.no/tarje">
    <uib:name>Tarje Lavik</uib:name>
  </rdf:Description>
</rdf:RDF>
#JSON-LD
[{
    "@id":"http://data.uib.no/tarje",
    "http://schema.uib.no/name":[{   
        "@value":"Tarje Lavik"
    }]
}]
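The same triple can be written out in more than one syntax. A rough sketch that produces the N-Triples and Turtle forms from one tuple, assuming the literal needs no escaping and the local name is valid in a prefixed name; the helper names are made up for illustration:

```python
PREFIXES = {"uib": "http://schema.uib.no/"}

def shorten(iri, prefixes):
    """Return prefix:local if a namespace matches, else the IRI in angle brackets."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return f"<{iri}>"

def to_ntriples(s, p, text):
    """One N-Triples line; assumes the literal needs no escaping."""
    return f'<{s}> <{p}> "{text}" .'

def to_turtle(s, p, text, prefixes):
    """Turtle: @prefix declarations followed by the statement."""
    header = "\n".join(f"@prefix {k}: <{v}> ." for k, v in prefixes.items())
    return f'{header}\n\n{shorten(s, prefixes)} {shorten(p, prefixes)} "{text}" .'

triple = ("http://data.uib.no/tarje", "http://schema.uib.no/name", "Tarje Lavik")
nt = to_ntriples(*triple)
ttl = to_turtle(*triple, PREFIXES)
print(nt)
print(ttl)
```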

ADVANTAGES OF EACH SERIALIZATION

# Turtle is easy for humans to read

# N-TRIPLES can be split into several files line by line,
# since every line is one statement

<http://data.uib.no/tarje> <http://schema.uib.no/name> "Tarje Lavik" .
<http://data.uib.no/tarje> <http://schema.uib.no/father>  <http://data.uib.no/kjartan> .
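Because every N-Triples line is a complete statement, splitting a file is just splitting on line breaks. A small sketch, assuming the data contains no blank nodes (blank node labels are local to one document, so statements sharing a label should stay in the same file); `split_ntriples` is a hypothetical helper:

```python
NT = """\
<http://data.uib.no/tarje> <http://schema.uib.no/name> "Tarje Lavik" .
<http://data.uib.no/tarje> <http://schema.uib.no/father> <http://data.uib.no/kjartan> .
"""

def split_ntriples(text, lines_per_chunk):
    """Split N-Triples text into chunks of whole lines; each chunk parses on its own."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return [
        "\n".join(lines[i:i + lines_per_chunk]) + "\n"
        for i in range(0, len(lines), lines_per_chunk)
    ]

chunks = split_ntriples(NT, 1)
print(len(chunks))
```

Concatenating the chunks again gives back the original data, since the order of statements carries no meaning.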
# JSON-LD is handy because you can take ordinary JSON and add
# an @context (plus an @id) to turn it into RDF
{
  "@context": {
    "name": "http://schema.uib.no/name"
  },
  "@id": "http://data.uib.no/tarje",
  "name": "Tarje Lavik"
}
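What the `@context` does can be sketched in a few lines of Python: keys are looked up in the context and replaced by IRIs. This is not a JSON-LD processor; it assumes a flat document with string values only, and `to_triples` is a hypothetical helper:

```python
import json

DOC = json.loads("""
{
  "@context": {"name": "http://schema.uib.no/name"},
  "@id": "http://data.uib.no/tarje",
  "name": "Tarje Lavik"
}
""")

def to_triples(doc):
    """Map plain JSON keys to IRIs via @context; handles only flat string values."""
    context = doc.get("@context", {})
    subject = doc["@id"]
    return [
        (subject, context[key], value)
        for key, value in doc.items()
        if not key.startswith("@") and key in context
    ]

triples = to_triples(DOC)
print(triples)
```

Keys that are missing from the context are dropped here, which matches how JSON-LD ignores terms it cannot map to an IRI.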

(for those who like XML, RDF/XML works nicely with ordinary XML tools)

ONLY IRI|LITERALS?

# RDF datatypes follow the XSD types, such as
# xsd:date, xsd:dateTime, xsd:gYear, and so on

<http://data.uib.no/tarje> uib:selfDelusion "73.1"^^xsd:decimal .

Datatypes are used with RDF literals to represent values such as strings, numbers and dates.
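A typed literal is a lexical string plus a datatype IRI, and an application turns it into a native value. A minimal sketch for a few XSD types, assuming dates carry no timezone; the converter table and `to_python` are illustrative, not an existing API:

```python
from datetime import date
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Datatype IRI -> function that parses the lexical form (a small, assumed subset).
CONVERTERS = {
    XSD + "decimal": Decimal,
    XSD + "integer": int,
    XSD + "date": date.fromisoformat,
    XSD + "string": str,
}

def to_python(lexical, datatype):
    """Convert a lexical form to a Python value; unknown datatypes stay strings."""
    return CONVERTERS.get(datatype, str)(lexical)

value = to_python("73.1", XSD + "decimal")
day = to_python("2016-04-01", XSD + "date")
print(value, day)
```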

UiB-related example

#Turtle
@prefix :     <http://data.uib.no/> .
@prefix uib:  <http://schema.uib.no/> .
@prefix fs:   <http://fs.no/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

:ms1 a uib:MasterThesis ;
    uib:title "The democratic problem of Boaty McBoatface or why Attenborough sucks" ;
    uib:author uib:st10938 ;
    uib:file uib:ms1-pdf .

uib:st10938 a uib:Student ;
    uib:name "Tarje Sælen Lavik" ;
    uib:cristinNr "46759" ;
    owl:sameAs <http://orcid.org/0000-0002-1191-6474> .

uib:ms1-pdf a uib:PDF ;
    uib:url "http://mam.uib.no/0000000010101100101.pdf" .

fs:1 a uib:GradeStatement ;
    uib:assessmentOf :ms1 ;
    uib:grade uib:A ;
    uib:passed true .

SCHEMA

Data model

#Turtle
@prefix :     <http://data.uib.no/> .
@prefix uib:  <http://schema.uib.no/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

:ms1 a uib:MasterThesis ;
    uib:title "The democratic problem of Boaty McBoatface"@en ;
    uib:author uib:st10938 .
# UiB schema definitions

uib:MasterThesis a rdfs:Class ;
    rdfs:label "Master thesis"@en ;
    rdfs:label "Masteroppgave"@no .

uib:title a rdf:Property ;
    rdfs:label "Tittel"@no .

uib:author a rdf:Property ;
    rdfs:label "Forfatter"@no ;
    rdfs:range uib:Student ;
    rdfs:domain uib:MasterThesis .


uib:Person a rdfs:Class ;
    rdfs:label "Person"@en .

uib:Student a rdfs:Class ;
    rdfs:label "Student"@en ;
    rdfs:subClassOf uib:Person .
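The schema lets a reasoner derive types that were never stated: `rdfs:domain`, `rdfs:range` and `rdfs:subClassOf` each produce new `rdf:type` triples. A naive sketch of those three rules, assuming prefixed names as plain strings and no literal objects; `infer` is a hypothetical helper, not a full RDFS reasoner:

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

DATA = {
    (":ms1", "uib:author", "uib:st10938"),
    ("uib:author", RANGE, "uib:Student"),
    ("uib:author", DOMAIN, "uib:MasterThesis"),
    ("uib:Student", SUBCLASS, "uib:Person"),
}

def infer(triples):
    """Apply the domain, range and subclass rules until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))    # subject gets the domain class
                elif p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))    # object gets the range class
                elif p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))    # instance of a subclass
        if new <= inferred:
            return inferred
        inferred |= new

result = infer(DATA)
print(sorted(t for t in result if t[1] == TYPE))
```

From a single `uib:author` statement this derives that `:ms1` is a `uib:MasterThesis` and that `uib:st10938` is both a `uib:Student` and a `uib:Person`.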

SPARQL

SPARQL Protocol and RDF Query Language

http://sparql.ub.uib.no/

ASK (true|false), SELECT (table),
DESCRIBE (graph), CONSTRUCT (new graph)
#SPARQL
PREFIX : <http://data.uib.no/> 
PREFIX uib: <http://schema.uib.no/> 
PREFIX fs: <http://fs.no/> 

CONSTRUCT {
    :best uib:topTopThesis ?ms .
    :best a uib:Collection .    
}
WHERE {
    ?s uib:assessmentOf ?ms ;
       uib:grade uib:A ;
       uib:passed true .
    ?ms a uib:MasterThesis .
}
#Turtle
@prefix : <http://data.uib.no/> .
@prefix uib: <http://schema.uib.no/> .
@prefix fs: <http://fs.no/> .

:best a uib:Collection ;
    uib:topTopThesis :ms1 .
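What the CONSTRUCT does can be mimicked by hand: find every binding of `?s` and `?ms` that satisfies the WHERE pattern, then fill in the template. A sketch over a set of tuples, assuming prefixed names as strings; the second thesis `:ms2` with grade B is invented here to show that it is filtered out:

```python
DATA = {
    ("fs:1", "uib:assessmentOf", ":ms1"),
    ("fs:1", "uib:grade", "uib:A"),
    ("fs:1", "uib:passed", True),
    (":ms1", "rdf:type", "uib:MasterThesis"),
    # Hypothetical second thesis, graded B, so it should not be selected.
    ("fs:2", "uib:assessmentOf", ":ms2"),
    ("fs:2", "uib:grade", "uib:B"),
    ("fs:2", "uib:passed", True),
    (":ms2", "rdf:type", "uib:MasterThesis"),
}

def construct_best(triples):
    """Build the template triples for every ?ms that matches the WHERE pattern."""
    out = set()
    for s, p, ms in triples:
        if p != "uib:assessmentOf":
            continue
        if ((s, "uib:grade", "uib:A") in triples
                and (s, "uib:passed", True) in triples
                and (ms, "rdf:type", "uib:MasterThesis") in triples):
            out.add((":best", "uib:topTopThesis", ms))
            out.add((":best", "rdf:type", "uib:Collection"))
    return out

best = construct_best(DATA)
print(sorted(best))
```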

SPARQL UPDATE

#SPARQL UPDATE - Insert/delete triples
PREFIX : <http://data.uib.no/> 
PREFIX uib: <http://schema.uib.no/> 
PREFIX fs: <http://fs.no/> 

INSERT {
    :best uib:topTopThesis ?ms .
    :best a uib:Collection .    
}
WHERE {
    ?s uib:assessmentOf ?ms ;
       uib:grade uib:A ;
       uib:passed true .
    ?ms a uib:MasterThesis .
}
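An update is sent to the triplestore over HTTP. In the SPARQL 1.1 Protocol one option is a form-encoded POST with an `update` parameter. A sketch that only builds the request; the endpoint URL is a made-up local address, and real stores usually require authentication for updates:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the update URL of your own triplestore.
UPDATE_ENDPOINT = "http://localhost:3030/dataset/update"

UPDATE = """\
PREFIX : <http://data.uib.no/>
PREFIX uib: <http://schema.uib.no/>

INSERT {
    :best uib:topTopThesis ?ms .
    :best a uib:Collection .
}
WHERE {
    ?s uib:assessmentOf ?ms ;
       uib:grade uib:A ;
       uib:passed true .
    ?ms a uib:MasterThesis .
}
"""

def build_update_request(endpoint, update):
    """Prepare a form-encoded POST carrying the update string; nothing is sent."""
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

request = build_update_request(UPDATE_ENDPOINT, UPDATE)
print(request.get_method(), len(request.data))
```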

GRAPH

# A triplestore has one DEFAULT GRAPH and zero or more NAMED GRAPHS.
# SPARQL queries against sparql.ub.uib.no must include GRAPH

CONSTRUCT {
  <http://data.ub.uib.no/instance/photograph/ubb-bs-ok-19362> ?p ?o .
  ?o ?p2 ?o2 .
  ?o a ?type .
}
WHERE {
  GRAPH ?g {
   <http://data.ub.uib.no/instance/photograph/ubb-bs-ok-19362> ?p ?o .
   OPTIONAL { ?o ?p2 ?o2 .
      ?o a ?type .
      FILTER isLiteral(?o2) }
  }
}

# NAMED GRAPHS in the dataset at sparql.ub.uib.no
# Marcus data    - <http://data.ub.uib.no/dataset/marcus-bs>
# The data model - <http://data.ub.uib.no/ontology/ubbont>
# Extra data     - <http://data.ub.uib.no/dataset/extra>
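A dataset with named graphs can be pictured as a mapping from graph name to a set of triples, with `GRAPH ?g` searching the named graphs only. A sketch under that assumption; which triple lives in which graph, and the `owl:Class` statement, are illustrative guesses rather than facts about the endpoint:

```python
DATASET = {
    None: set(),  # the default graph, empty in this sketch
    "http://data.ub.uib.no/dataset/marcus-bs": {
        ("http://data.ub.uib.no/instance/photograph/ubb-bs-ok-19362",
         "dct:title", "[Byggingen av Haukeland Sykehus]"),
    },
    "http://data.ub.uib.no/ontology/ubbont": {
        ("ubbont:Photograph", "rdf:type", "owl:Class"),  # assumed for illustration
    },
}

def match_in_graphs(dataset, s=None, p=None, o=None):
    """Like GRAPH ?g { s p o }: search named graphs, return (graph, triple) pairs."""
    return sorted(
        (graph, t)
        for graph, triples in dataset.items()
        if graph is not None
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

hits = match_in_graphs(DATASET, p="dct:title")
print(hits)
```

Each hit carries the graph name, which is what the `?g` variable would be bound to.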

DESCRIBE and JSON-LD

JSON-LD, the newcomer among the serializations

# Problem: indexing data with nested elements in Elasticsearch

<http://data.ub.uib.no/instance/photograph/ubb-bs-ok-19362>
        a                         ubbont:Photograph ;
        dct:identifier            "ubb-bs-ok-19362" ;
        dct:isPartOf              <http://data.ub.uib.no/instance/collection/00c02500-1b04-41bc-8183-4015c5e7a757> ;
        dct:relation              <http://data.ub.uib.no/instance/organization/62757af6-0510-4824-b560-4fc6feca1126> ;
        dct:subject               <http://data.ub.uib.no/topic/dd509891-f2a4-4008-817b-2644bad3a8b4> ;
        dct:title                 "[Byggingen av Haukeland Sykehus]" .

<http://data.ub.uib.no/instance/organization/62757af6-0510-4824-b560-4fc6feca1126>
        a               foaf:Organization ;
        dct:identifier  "62757af6-0510-4824-b560-4fc6feca1126" ;
        skos:altLabel   "Haukeland Sykehus" ;
        foaf:name       "Haukeland Universitetssykehus" .

<http://data.ub.uib.no/instance/collection/00c02500-1b04-41bc-8183-4015c5e7a757>
        a               bibo:Collection ;
        dct:identifier  "00c02500-1b04-41bc-8183-4015c5e7a757" ;
        dct:title       "Byggingen av Haukeland Sykehus, 1908 - 1911" .

<http://data.ub.uib.no/topic/dd509891-f2a4-4008-817b-2644bad3a8b4>
        a                          skos:Concept ;
        dct:identifier             "dd509891-f2a4-4008-817b-2644bad3a8b4" ;
        skos:prefLabel             "Sykehus" .
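One way to get a nested document for a search index is to replace each reference with the resource it points to, roughly what embedding in JSON-LD framing achieves. A sketch over a hand-built dictionary holding two of the resources above; `embed` is a hypothetical helper, not a JSON-LD library call, and the depth limit guards against cycles:

```python
import json

PHOTO = "http://data.ub.uib.no/instance/photograph/ubb-bs-ok-19362"
TOPIC = "http://data.ub.uib.no/topic/dd509891-f2a4-4008-817b-2644bad3a8b4"

# Flat view: every resource keyed by IRI, references written as {"@id": ...}.
RESOURCES = {
    PHOTO: {
        "@type": "ubbont:Photograph",
        "dct:title": "[Byggingen av Haukeland Sykehus]",
        "dct:subject": {"@id": TOPIC},
    },
    TOPIC: {
        "@type": "skos:Concept",
        "skos:prefLabel": "Sykehus",
    },
}

def embed(iri, resources, depth=1):
    """Return the resource with references replaced by the referenced resource."""
    node = {"@id": iri, **resources.get(iri, {})}
    if depth == 0:
        return node
    for key, value in list(node.items()):
        if isinstance(value, dict) and "@id" in value:
            node[key] = embed(value["@id"], resources, depth - 1)
    return node

document = embed(PHOTO, RESOURCES)
print(json.dumps(document, ensure_ascii=False, indent=2))
```

The resulting document carries the label "Sykehus" inside `dct:subject`, so it can be searched without a second lookup.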

RDF crash course

By Tarje Lavik
