Documentation

LiveCmdiMetadata
in package
implements MetadataInterface

Creates <metadata> element by filling in an XML template with values read from the repository resource's metadata.

Required metadata format definition properties:

  • uriProp - metadata property storing resource's OAI-PMH id
  • idProp - metadata property identifying a repository resource
  • labelProp - metadata property storing repository resource label
  • schemaProp - metadata property storing resource's CMDI profile URL
  • resolverNmsp - regular expression uniquely matching the resource's identifier namespace for which content type negotiation is available. Required for the format template attribute to generate proper URLs.
  • templateDir - path to a directory storing XML templates; each template should have exactly the same name as the CMDI profile id, e.g. clarin.eu:cr1:p_1290431694580.xml
  • defaultLang - default language to be used when the template doesn't explicitly specify one

Optional metadata format definition properties:

  • propNmsp[prefix] - an array of property URLs namespaces used in the template
  • idNmsp[prefix] - an array of id URIs namespaces used in the template
  • schemaDefault provides a default CMDI profile (e.g. clarin.eu:cr1:p_1290431694580.xml) to be used when a resource's metadata don't contain the schemaProp or none of its values correspond to an existing CMDI template. If schemaDefault isn't provided, resources which don't contain the schemaProp in their metadata are automatically excluded from the OAI-PMH search.
  • schemaEnforce if provided, only resources with a given value of the schemaProp are processed.
  • valueMaps[mapName] - value maps to be used with the valueMap template attribute. Every map should be an object with source values being property names and target values being property values.
  • timeout - if the template generation takes longer than the given time (in seconds), a run of whitespace is emitted to signal to the client that processing is ongoing and to keep the connection alive
  • cache - sets up LiveCmdiMetadata internal cache (separate from the global OAI-PMH cache). Just skip this configuration property to avoid using the internal cache.
    • perResource (true/false) should a clean cache be used for every OAI-PMH resource? Takes effect only for the GetRecords OAI-PMH verb. A shared cache is likely to speed up response generation but can also significantly increase memory usage, so use it with caution and consider combining it with skipClasses/includeClasses. A per-resource cache, on the other hand, pays off only for complex template structures in which resources and/or subtemplates are likely to be used more than once within a single top-level metadata record.
    • skipClasses (array of URIs) repository resources of given RDF classes will be excluded from caching
    • includeClasses (array of URIs) only repository resources of given RDF classes will be cached
    • statistics (true/false) should cache usage statistics be appended to each generated metadata record? This feature is useful for tuning the cache settings, especially for selecting the skipClasses and includeClasses filters, but it shouldn't be used in production as it alters the output metadata schema.

XML tags in the template can be annotated with the following attributes:

  • val="valuePath" specifies how to get the value. Possible valuePath variants are:
    • /propUri - get a value from a given metadata property value
    • /propUri[key] - parse given metadata property value as YAML and take the value at the key key
    • @propUri1/propUri2 - get another resource URL from the propUri1 metadata property value, then use the propUri2 metadata property value of this resource. If the inverse of propUri1 is needed, prepend it with a caret: @^propUri1/propUri2.
    • @propUri in a tag having the ComponentId attribute - inject the template indicated by the ComponentId attribute, taking the resource to which propUri points as the template's base resource. If the inverse of propUri is needed, prepend it with a caret: @^propUri.
    • NOW - current time
    • URL, URI - resource's repository URL
    • METAURL - resource's metadata repository URL
    • OAIID - resource's OAI-PMH identifier
    • OAIURL - URL of the OAI-PMH GetRecord request returning a given resource metadata in the currently requested metadata format
    • RANDOM - a random number from 0 to 2^31
  • dateFormat="FORMAT" - when present, causes the value to be interpreted as a date and formatted according to the given format. Formatting is applied before any further processing such as match/replace/aggregate/count. Values which can't be parsed as dates are skipped. The FORMAT description can be found at https://www.php.net/manual/en/datetime.format.php#format
  • match="regular expression" - when present, only values matching a given regular expression are processed. It is applied to the values list returned according to the val (and, if specified, dateFormat) attribute and before aggregate attribute is applied.
  • replace="regular expression replace" - works with the match attribute. Provides a way to adjust the matched value. Match placeholders use the backslash syntax (\1 matches the first regex capture group, etc.)
  • aggregate="min or max" - when present out of all values passing the match/replace step only a single value (minimum or maximum) is taken. match and replace are applied to values before aggregating them.
  • valueMap="mapName" - name of the value map (defined in the metadata format config) to be applied to the value(s) denoted by the val attribute. The value map name can be preceded with a *, - (default) or +:
    • * keep both original and mapped values, don't care if an original value doesn't have a mapping
    • - use only mapped values, if original value doesn't have a mapping, discard it
    • + use only mapped values but if original value doesn't have a mapping, keep the original value instead
  • valueMapProp="RDFpropertyURL" maps the value indicated by the val attribute using external RDF data. First, the value indicated by the val attribute is treated as a URL. Its content is downloaded and parsed as RDF. Then all values of the valueMapProp RDF property in the parsed RDF graph are taken as the actual template values. This mechanism makes it possible to resolve e.g. external SKOS vocabulary concepts, provided they are published in a way that allows the concept definition to be downloaded as RDF.
  • valueMapKeepSrc="false" if present, removes the original value fetched according to the val attribute and returns only values fetched according to the valueMapProp attribute. Taken into account only if valueMapProp provided and not empty.
  • count="N" (default 1)
    • when "*" and metadata contain no property specified by the val attribute the tag is removed from the template;
    • when "*" or "+" and metadata contain many properties specified by the val attribute the tag is repeated for each metadata property value
    • when "1" or "+" and metadata contain no property specified by the val attribute the tag is left empty in the template;
    • when "1" and metadata contain many properties specified by the val attribute first metadata property value is used
  • format="FORMAT" - for all values being RDF resources or the OAIID, generates a URL requesting the response to be returned in the given format (e.g. image/jpeg or text/turtle)
  • URLEncode="prefix" - when present, the final value after all transformations defined by other attributes is URL-encoded and appended to the given prefix.
  • lang="true" if present and a metadata property value contains information about the language, the xml:lang attribute is added to the template tag
  • asXML="true" if present, value specified with the val attribute is parsed and added as XML
  • replaceXMLTag="true" if present, value specified with the val attribute substitutes the XML tag itself instead of being injected as its value.
  • asAttribute="targetAttribute" if present, value specified with the val attribute is stored as a given attribute's value. Takes precedence over replaceXMLTag and forces asXML="false".
  • ComponentId specifies a template to substitute a given tag with. The template is processed with the base resource(s) defined by the val attribute. The attribute value should match the template file name without the .xml extension. If a template file {ComponentIdValue}_{BaseResourceRdfClass}.xml exists, it's used instead of the {ComponentIdValue}.xml template. This allows for using different templates for different target resources. When ComponentId is used, the actual tag in the template is not important because it's replaced by the component's root tag anyway. When ComponentId is used with asAttribute and without val, the component template selected as described above is evaluated in the context of the current resource and the final XML text value is then stored as the attribute value.
  • id if it has the value '#', it is filled in with a globally unique sequence
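
To make the attribute list above more concrete, here is a minimal sketch of a template fragment. It is an illustration only, not a valid CMDI profile: the element names, the https://example.org/vocab# property URIs, the licences value map and the contact component are all hypothetical, and writing properties as full URIs (rather than through a propNmsp prefix) is an assumption.

```xml
<!-- Hypothetical template fragment; element names and property URIs are made up -->
<Resource>
  <!-- id="#" is filled in with a globally unique sequence -->
  <ResourceProxy id="#">
    <!-- special value: the resource's repository URL -->
    <ResourceRef val="URL"/>
  </ResourceProxy>
  <!-- count="1" (default): first value is used, tag is left empty when there is none -->
  <Identifier val="OAIID"/>
  <!-- count="+": repeated for every value, left empty when there is none;
       lang="true" adds xml:lang when the value carries language information -->
  <Title val="/https://example.org/vocab#hasTitle" count="+" lang="true"/>
  <!-- count="*": repeated for every value, removed when there is none -->
  <Keyword val="/https://example.org/vocab#hasKeyword" count="*"/>
  <!-- dateFormat is applied first, then aggregate keeps only the smallest value -->
  <FirstYear val="/https://example.org/vocab#hasDate" dateFormat="Y" aggregate="min"/>
  <!-- "+" prefix: use mapped values, keep the original when no mapping exists;
       assumes a valueMaps[licences] entry in the metadata format config -->
  <Licence val="/https://example.org/vocab#hasLicence" valueMap="+licences"/>
  <!-- the tag is replaced by the root of contact.xml (found in templateDir),
       evaluated once for every resource the property points to -->
  <Contact ComponentId="contact" val="@https://example.org/vocab#hasContact" count="*"/>
</Resource>
```

All template attributes (val, count, lang and so on) are meant to steer the substitution only; the removeTemplateAttributes() method listed below suggests they are stripped from the generated record.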
Tags
author

zozlak

Table of Contents

Interfaces

MetadataInterface
Interface for different metadata providers.

Constants

FAKE_ROOT_TAG  = 'fakeRoot'
STATS_TAG  = 'debugStats'
VALUEMAP_ALL  = '*'
VALUEMAP_FALLBACK  = '+'
VALUEMAP_STRICT  = '-'

Properties

$cacheHits  : array<string, int>
$depth  : int
$format  : MetadataFormat
Metadata format descriptor
$idSeq  : int
Sequence for id generation
$mapper  : ValueMapper|null
Value mapping cache
$rdfCache  : array<string, RepoResourceDb>
$res  : RepoResourceDb
Repository resource object
$template  : string
Path to the XML template file
$timeout  : int
$xmlCache  : array<string, DOMDocument>

Methods

__construct()  : mixed
Creates a metadata object for a given repository resource.
extendSearchDataQuery()  : QueryPart
Allows extending a search query with additional clauses specific to the given metadata source.
extendSearchFilterQuery()  : QueryPart
Applies metadata format restrictions.
getXml()  : DOMElement
Creates resource's XML metadata
appendCacheStats()  : void
cmdiComponentAsAttribute()  : bool
Handles situation where a component's text value should be made template node's attribute value
collectMetaValue()  : void
Extracts metadata value from a given EasyRdf node
extractMetaValues()  : array<string, array<int, mixed>>
Extracts metadata values from a resource
getRdfResource()  : RepoResourceDb
getResourcesByPath()  : resource
Prepares fake resource metadata allowing to resolve inverse and/or recursively targeted resources.
getXmlCacheId()  : string
insertCmdiComponents()  : void
Inserts a value by injecting an external CMDI template.
insertValue()  : bool
Injects metadata values into a given DOM element of the CMDI template.
insertValues()  : bool
maintainRdfCache()  : void
maintainXmlCache()  : void
parseVal()  : array<string, mixed>
Parses the `val` attribute into components and returns them as an array.
processElement()  : bool
Recursively processes all XML elements
processValues()  : array<string, array<string|int, mixed>>
removeTemplateAttributes()  : void
replacePropNmsp()  : string
shouldBeCached()  : bool

Constants

Properties

$cacheHits

private static array<string, int> $cacheHits = ['rdf' => 0, 'xml' => 0]

Methods

__construct()

Creates a metadata object for a given repository resource.

public __construct(RepoResourceDb $resource, object $searchResultRow, MetadataFormat $format) : mixed
Parameters
$resource : RepoResourceDb

a repository resource object

$searchResultRow : object

SPARQL search query result row

$format : MetadataFormat

metadata format descriptor describing this resource

extendSearchDataQuery()

Allows extending a search query with additional clauses specific to the given metadata source.

public static extendSearchDataQuery(MetadataFormat $format) : QueryPart

Remark! PHP doesn't consider static methods as an interface part, therefore the existence of this method in classes implementing this interface is not enforced.

Parameters
$format : MetadataFormat

metadata format descriptor

Return values
QueryPart

getXml()

Creates resource's XML metadata

public getXml([int $depth = 0 ][, bool $cache = true ]) : DOMElement

If the template's root element has a val attribute, a fake root element is introduced to the template to ensure it will be valid XML after the substitution (XML documents have to have a single root element).
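
The need for the fake root can be illustrated with a small sketch. The template root, its property URI and the three values are hypothetical; the wrapper name is taken from the FAKE_ROOT_TAG constant, and how the wrapper is treated after substitution is not described here.

```xml
<!-- Template root (hypothetical): <Keyword val="/https://example.org/vocab#hasKeyword" count="*"/> -->
<!-- With three property values the tag is repeated three times, so the
     substituted document needs a single wrapping element to stay well-formed: -->
<fakeRoot>
  <Keyword>corpus</Keyword>
  <Keyword>speech</Keyword>
  <Keyword>dialect</Keyword>
</fakeRoot>
```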

Parameters
$depth : int = 0

subtemplates insertion depth

$cache : bool = true

should cache be used?

Return values
DOMElement

appendCacheStats()

private appendCacheStats(DOMDocument $doc) : void
Parameters
$doc : DOMDocument

cmdiComponentAsAttribute()

Handles situation where a component's text value should be made template node's attribute value

private cmdiComponentAsAttribute(DOMElement $el) : bool
Parameters
$el : DOMElement
Return values
bool

collectMetaValue()

Extracts metadata value from a given EasyRdf node

private collectMetaValue(array<string|int, string> &$values, Literal|Resource $metaVal, string|null $subprop, string|null $dateFormat, string|null $format) : void
Parameters
$values : array<string|int, string>
$metaVal : Literal|Resource
$subprop : string|null
$dateFormat : string|null
$format : string|null

extractMetaValues()

Extracts metadata values from a resource

private extractMetaValues(Resource $meta, string $prop, string|null $subprop, string|null $extUriProp, string|null $dateFormat, string|null $format) : array<string, array<int, mixed>>
Parameters
$meta : Resource
$prop : string
$subprop : string|null
$extUriProp : string|null
$dateFormat : string|null
$format : string|null
Return values
array<string, array<int, mixed>>

getResourcesByPath()

Prepares fake resource metadata allowing to resolve inverse and/or recursively targeted resources.

private getResourcesByPath(string $prop, bool $recursive, bool $inverse) : resource
Parameters
$prop : string
$recursive : bool
$inverse : bool
Return values
resource

insertCmdiComponents()

Inserts a value by injecting an external CMDI template.

private insertCmdiComponents(DOMElement $el, resource $meta, string $component, string $prop) : void
Parameters
$el : DOMElement
$meta : resource
$component : string
$prop : string

insertValue()

Injects metadata values into a given DOM element of the CMDI template.

private insertValue(DOMElement $el) : bool
Parameters
$el : DOMElement

DOM element to be processed

Return values
bool

should $el DOMElement be removed from the document

insertValues()

private insertValues(DOMElement $el, array<string, array<string|int, mixed>> $values) : bool
Parameters
$el : DOMElement
$values : array<string, array<string|int, mixed>>
Return values
bool

maintainXmlCache()

private maintainXmlCache(DOMDocument $doc) : void
Parameters
$doc : DOMDocument

parseVal()

Parses the `val` attribute into components and returns them as an array.

private parseVal(string $val) : array<string, mixed>

The components are:

  • prop the metadata property to be read
  • recursive should the property be followed recursively?
  • subprop the YAML object key (null if the property value should be taken as it is)
  • extUriProp if prop value points to a resource, metadata property which should be read from the target resource's metadata
  • inverse boolean value indicating if extUriProp points to the external resource (false) or if the external resource is pointing to the current one (true)
Parameters
$val : string
Return values
array<string, mixed>

processElement()

Recursively processes all XML elements

private processElement(DOMElement $el) : bool
Parameters
$el : DOMElement

DOM element to be processed

Return values
bool

processValues()

private processValues(DOMElement $el, string|array<string, array<string|int, mixed>> $values, string $formatPrefix) : array<string, array<string|int, mixed>>
Parameters
$el : DOMElement
$values : string|array<string, array<string|int, mixed>>
$formatPrefix : string
Return values
array<string, array<string|int, mixed>>

removeTemplateAttributes()

private removeTemplateAttributes(DOMElement $ch) : void
Parameters
$ch : DOMElement

replacePropNmsp()

private replacePropNmsp(string $prop) : string
Parameters
$prop : string
Return values
string

shouldBeCached()

private shouldBeCached([Resource|null $meta = null ]) : bool
Parameters
$meta : Resource|null = null
Return values
bool

        