ARCHE provides metadata in RDF, which is not the most intuitive format for most programmers. The aim of this guide is to help you avoid the most common pitfalls of dealing with RDF.
This document boils down to two rules:
application/n-triples for machine processing (for performance reasons).
RDF represents graphs with directed and labelled edges, which have a fairly intuitive object mapping:
There are three kinds of graph nodes in RDF:
So far this doesn’t sound bad, but the devil is in the details.
The most common pitfalls are discussed below.
The bottom line is that you should always use a dedicated RDF library to deal with RDF, because handling it by hand is too complex and error-prone.
The main problem is that RDF edge labels, which are mapped to object properties, are URIs. This brings two issues:
myObject.http://some.namespace/someProperty.
A solution to this problem is called compacting and is discussed here.
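As a rough illustration of the idea, the sketch below assumes that compacting means replacing a known namespace with a short prefix. The prefix map, the compact and expand helpers and the sample object are all hypothetical and are not part of any ARCHE API. Properties are kept in a dictionary keyed by URI, which sidesteps the property-access syntax problem.

```python
# Hypothetical prefix map: the "ns" namespace is the made-up one used above.
PREFIXES = {
    "ns": "http://some.namespace/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def compact(uri: str, prefixes: dict = PREFIXES) -> str:
    """Return 'prefix:local' for the longest matching namespace, else the URI unchanged."""
    best = None
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace) and (best is None or len(namespace) > len(best[1])):
            best = (prefix, namespace)
    if best is None:
        return uri
    return f"{best[0]}:{uri[len(best[1]):]}"


def expand(short: str, prefixes: dict = PREFIXES) -> str:
    """Inverse of compact(): turn 'prefix:local' back into a full URI."""
    prefix, sep, local = short.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return short


# Properties live in a dict keyed by URI, so no special syntax is needed.
my_object = {"http://some.namespace/someProperty": "some value"}
compacted = {compact(key): value for key, value in my_object.items()}
print(compacted)  # {'ns:someProperty': 'some value'}
```

In practice an RDF library should do this for you; the sketch only shows what the transformation amounts to.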
Identifiers of RDF graph nodes - mapped to objects - are also URIs:
Another complexity is brought by literals.
Literals have a compound structure with a value, a datatype and a language tag. Because of that they must be modeled as objects. The important thing is that these objects have to be immutable.
Let’s consider the following RDF graph:
Resource1(someUri) --someProperty----\
                                       Literal1(someValue, someDatatype, someLang)
Resource2(otherUri) --someProperty---/
with two objects (Resource1 and Resource2) sharing a reference to a common Literal1 object.
After doing something like the following (this is pseudocode, not any particular programming language)
resource1.someProperty.setValue(NEWVALUE)
the graph should look as follows:
Resource1(someUri) --someProperty--> Literal1(NEWVALUE, someDatatype, someLang)
Resource2(otherUri) --someProperty--> Literal2(someValue, someDatatype, someLang)
and not:
Resource1(someUri) --someProperty----\
                                       Literal1(NEWVALUE, someDatatype, someLang)
Resource2(otherUri) --someProperty---/
By the way, this means the
resource1.someProperty.setValue(NEWVALUE)
syntax should not exist in the first place. Either the right syntax should be
resource1.someProperty = resource1.someProperty.withValue(NEWVALUE)
or, while the RDF graph is being parsed into objects of your programming language, separate Literal1 objects should be created for the resource1.someProperty value and the resource2.someProperty value.
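The first option can be sketched in Python with a frozen dataclass. The Literal class, its with_value method and the dictionaries standing in for resources are assumptions made for this illustration only; they are not taken from any particular RDF library.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Immutable literal: value, datatype and language tag (illustrative model)."""
    value: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

    def with_value(self, value: str) -> "Literal":
        # Returns a new Literal and leaves the original untouched.
        return replace(self, value=value)


# Both resources share a reference to the same literal object.
shared = Literal("someValue", "someDatatype", "someLang")
resource1 = {"someProperty": shared}
resource2 = {"someProperty": shared}

# The update rebinds resource1's property to a new object.
resource1["someProperty"] = resource1["someProperty"].with_value("NEWVALUE")

print(resource1["someProperty"].value)  # NEWVALUE
print(resource2["someProperty"].value)  # someValue
```

Because the dataclass is frozen, an in-place assignment such as shared.value = "x" raises an error instead of silently changing the value seen by both resources.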
RDF is an abstract data model which can be expressed in many (far too many, to be honest) different serialization formats. To name just the most important:
What you should remember is that a given RDF data set may have virtually countless valid serializations even in a single serialization format.
This is particularly important for the RDF/XML and JSON-LD serializations, where you may feel tempted to assume the data structure is stable and can be used directly by your app. It can’t.
The only safe way to go is to use a dedicated RDF parsing library which will deal with all those ambiguities.
Let’s take a simple RDF graph:
Resource(http://foo/1) --http://bar/1--> Resource(http://foo/2)
     |                                        |
http://bar/1                              http://baz
     |                                        |
     v                                        v
Blank(_:b1) ---------http://baz--------> Literal(baz)
and consider a small subset of its possible serializations.
Remember, all examples below represent exactly the same RDF graph!
N-Triples (the lines below may appear in any order):
<http://foo/1> <http://bar/1> <http://foo/2> .
<http://foo/1> <http://bar/1> _:b1 .
<http://foo/2> <http://baz> "baz" .
_:b1 <http://baz> "baz" .
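Why the line order does not matter can be sketched in Python by treating a graph as a set of triples. The tuples below are written out by hand and stand in for what an RDF parsing library would return; the same_triples helper is hypothetical.

```python
# Triples as (subject, predicate, object) tuples, standing in for parser output.
triples_a = [
    ("http://foo/1", "http://bar/1", "http://foo/2"),
    ("http://foo/1", "http://bar/1", "_:b1"),
    ("http://foo/2", "http://baz", '"baz"'),
    ("_:b1", "http://baz", '"baz"'),
]
# The same triples listed in a different order.
triples_b = list(reversed(triples_a))


def same_triples(a, b) -> bool:
    """Compare two triple collections as sets, ignoring order and duplicates."""
    return set(a) == set(b)


print(triples_a == triples_b)              # False: the lists differ in order
print(same_triples(triples_a, triples_b))  # True: the sets of triples are equal
```

This sketch assumes both inputs use the same blank node labels. Such labels are local to a document, so two serializations of the same graph may name them differently, which is one more reason to leave the comparison to an RDF library.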
Turtle:
@prefix foo: <http://foo/> .
foo:1 <http://bar/1> foo:2 ,
                     [<http://baz> "baz"] .
foo:2 <http://baz> "baz" .

Turtle (another valid variant):
@prefix foo: <http://bar/> .
<http://foo/1> foo:1 <http://foo/2> ,
                     _:b1 .
<http://foo/2> <http://baz> "baz" .
_:b1 <http://baz> "baz" .
JSON-LD: