While using metadata read modes other than `resource` you can easily end up downloading massive amounts of metadata. You should be aware that the metadata serialization format heavily affects the API performance.
If you just want to know the best solution: use `application/n-triples`. It is definitely the fastest to serialize on the ARCHE side and most probably the fastest to parse on your application side (the only exception being JS, where `application/ld+json` is probably the fastest to parse).
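The serialization format is chosen through standard HTTP content negotiation, i.e. the `Accept` request header. A minimal sketch (the repository URL and resource id below are placeholders, not a real resource):

```python
import urllib.request

# Hypothetical ARCHE resource metadata endpoint - substitute your own.
url = "https://arche.acdh.oeaw.ac.at/api/12345/metadata"

# Request the fastest serialization format via the Accept header.
req = urllib.request.Request(url, headers={"Accept": "application/n-triples"})

print(req.get_header("Accept"))
```

Sending the request is omitted here; any HTTP client works as long as it sets the `Accept` header.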
A few tests were performed on ACDH’s ARCHE production instance. The repository contained 85k resources and 2.6M triples.
(Results for `application/ld+json` are the same for ARCHE 1.9 and 1.10.)
As we can see:
- `application/n-triples`. Since ARCHE 1.10.0 it has a constant memory footprint, so there is no risk of an out-of-memory error on the server side. On the test machine it was able to serialize up to around 0.5M triples within PHP's standard 30 s script execution time limit.
- `text/turtle`. It takes roughly twice as much time to generate as an `application/n-triples` response (as of ARCHE 1.10), and it is worth noticing that it has a much higher server-side memory footprint (here up to around 100 MB at 150k triples).
- `application/ld+json` is significantly slower (more than twice as slow as `text/turtle` and more than four times as slow as `application/n-triples`) and has a much bigger server-side memory footprint (almost 350 MB at 150k triples, compared to around 100 MB for `text/turtle` and 2 MB for `application/n-triples`). As there are `application/n-triples` parsers for JS, in large-triple-count scenarios it is more performant and safer (no risk of out-of-memory errors and minimal risk of execution timeouts) to use `application/n-triples` as the data interchange format and parse it on the client side.
- `application/rdf+xml` is definitely the worst format, as the serialization time grows exponentially with the number of triples. Please just avoid it.
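The client-side approach recommended above is cheap because N-Triples is line-oriented. A minimal parsing sketch (handling only IRIs and plain literals; the vocabulary URL is hypothetical, and a real client should use a proper N-Triples parser library):

```python
# One triple per line, terms separated by whitespace, line terminated by " .".
# This deliberately skips blank nodes, datatypes, language tags and escapes.
def parse_ntriples(data: str):
    triples = []
    for line in data.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        subj, pred, obj = line.rstrip(" .").split(" ", 2)
        triples.append((subj, pred, obj))
    return triples

sample = (
    '<https://id.acdh.oeaw.ac.at/foo> '
    '<https://vocabs.acdh.oeaw.ac.at/schema#hasTitle> "Foo" .\n'
)
print(parse_ntriples(sample))
```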
- `application/n-triples` should be the fastest and has a negligible server memory footprint, as every triple read from the database can be immediately streamed to the output. The amount of processing is also minimal and comes down to some basic escaping only.
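The per-triple work can be sketched as follows (a simplification covering only literal objects; the subject and property URLs are hypothetical). The key point is that each line is written out as soon as it is produced, so nothing accumulates in memory:

```python
import io

# Basic N-Triples literal escaping: backslash, quote and line breaks.
def escape_literal(value: str) -> str:
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

# Stream each triple straight to the output - constant memory footprint.
def stream_ntriples(triples, out):
    for subj, pred, obj_value in triples:
        out.write(f'<{subj}> <{pred}> "{escape_literal(obj_value)}" .\n')

buf = io.StringIO()
stream_ntriples(
    [("https://id.acdh.oeaw.ac.at/foo",
      "https://vocabs.acdh.oeaw.ac.at/schema#hasTitle",
      'A "quoted"\ntitle')],
    buf,
)
print(buf.getvalue())
```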
- `text/turtle` is more demanding. First, the whole data must be inspected to create the list of prefixes. Second, the data must be either buffered or initially ordered by subject, so that triples can be grouped by subject (and property). It is still pretty fast, but as the data has to be buffered, the `text/turtle` memory footprint grows linearly with the number of returned triples.
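The buffering requirement can be illustrated with a toy grouping step (prefix handling omitted; this is not ARCHE's actual serializer, just a sketch of why the whole result set must be held in memory before any Turtle can be written):

```python
from collections import defaultdict

# All triples are buffered into a subject -> predicate -> objects map
# before the first byte of Turtle output can be produced.
def group_by_subject(triples):
    grouped = defaultdict(lambda: defaultdict(list))
    for subj, pred, obj in triples:
        grouped[subj][pred].append(obj)
    return grouped

triples = [
    ("<s1>", "<p1>", '"a"'),
    ("<s1>", "<p1>", '"b"'),
    ("<s1>", "<p2>", '"c"'),
    ("<s2>", "<p1>", '"d"'),
]
grouped = group_by_subject(triples)
for subj, preds in grouped.items():
    # One Turtle "sentence" per subject; objects joined per predicate.
    body = " ;\n    ".join(f'{p} {", ".join(objs)}' for p, objs in preds.items())
    print(f"{subj} {body} .")
```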
- `application/ld+json` suffers from the same problems as `text/turtle` (we also expect a linear relation between the number of triples and both the serialization time and the server memory footprint), but the library stack used to generate it in ARCHE is less performant (EasyRdf and ml/json-ld, used for `application/ld+json`, introduce a bigger overhead than the ARCHE-embedded preprocessing plus pietercolpaert/hardf used for `text/turtle`).
- `application/rdf+xml` uses the EasyRdf XML serializer, which is just slow.