Constants

MATCH

SKIP

SKIP_NONE

SKIP_NOT_EXIST

SKIP_EXIST

SKIP_BINARY_EXIST

VERSIONING_NONE

VERSIONING_ALWAYS

VERSIONING_DIGEST

VERSIONING_DATE

PID_KEEP

PID_PASS

Properties

$debug

$debug : boolean

Turns debug messages on

Type

boolean

$paths

$paths : array

File system paths where resource children are located

It is a concatenation of the container root path coming from the class settings and the location path properties of the FedoraResource.

They can also be set manually using the setPaths() method.

Type

array

$filter

$filter : string

Regular expression for matching child resource file names.

Type

string

$filterNot

$filterNot : string

Regular expression for excluding child resource file names.

Type

string

$flatStructure

$flatStructure : boolean

Determines whether children are attached directly to the FedoraResource (true) or whether each subdirectory results in a separate collection resource containing its children (false).

Type

boolean

$uploadSizeLimit

$uploadSizeLimit : integer

Maximum size of a child resource (in bytes) resulting in the creation of binary resources.

For child resources bigger than this limit, an "RDF only" Fedora resource will be created.

The special value of -1 means "import all files no matter their size".

Type

integer

$fedoraLoc

$fedoraLoc : string

Fedora path in the repo where imported resources are created.

Type

string

$collectionClass

$collectionClass : string

URI of an RDF class assigned to indexed collections.

Type

string

$binaryClass

$binaryClass : string

URI of an RDF class assigned to indexed binary resources.

Type

string

$depth

$depth : integer

How many subsequent subdirectories should be indexed.

Type

integer

$autoCommit

$autoCommit : integer

Number of resources that automatically triggers a commit (0 means no auto commit).

Type

integer

$includeEmpty

$includeEmpty : boolean

Should resources be created for empty directories.

Ignored if $flatStructure is true.

Type

boolean

$skipMode

$skipMode : integer

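Determines whether files are skipped based on whether they already exist (or do not exist) in Fedora. Takes one of the SKIP_NONE, SKIP_NOT_EXIST, SKIP_EXIST or SKIP_BINARY_EXIST constants.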

Type

integer

$versioningMode

$versioningMode : integer

Determines whether a new version is created when a binary resource already exists in Fedora (if not, the existing resource is simply overwritten).

Type

integer

$pidPass

$pidPass : integer

Should PIDs (epic handles) be migrated to the new version of a resource during versioning.

Type

integer

$metaLookupRequire

$metaLookupRequire : boolean

Should files without external metadata (provided by the `$metaLookup` object) be skipped.

Type

boolean

$commitedRes

$commitedRes : array

Collection of resources committed during the ingestion. Used to handle errors.

Type

array

$indexedRes

$indexedRes : array

Collection of indexed resources

Type

array

Methods

containerDir()

containerDir() : string

Returns standardized value of the containerDir configuration property.

Returns

string

__construct()

__construct(\acdhOeaw\fedora\FedoraResource  $resource = null) 

Creates an indexer object for a given Fedora resource.

Parameters

\acdhOeaw\fedora\FedoraResource $resource

setFedora()

setFedora(\acdhOeaw\util\Fedora  $fedora) 

Sets the repository connection object

Parameters

\acdhOeaw\util\Fedora $fedora

setParent()

setParent(\acdhOeaw\fedora\FedoraResource  $resource) 

Sets the parent resource for the indexed files

Parameters

\acdhOeaw\fedora\FedoraResource $resource

setPaths()

setPaths(array  $paths) : \acdhOeaw\util\Indexer

Overrides file system paths to look into for child resources.

Parameters

array $paths

Returns

\acdhOeaw\util\Indexer

setAutoCommit()

setAutoCommit(integer  $count) : \acdhOeaw\util\Indexer

Controls the automatic commit behaviour.

Even when you use auto commit, you should commit your transaction after Indexer::index(). The only exception is setting auto commit to 1, which forces every resource to be committed separately, but you probably do not want to do that for performance reasons.

Parameters

integer $count

number of resources that automatically triggers a commit (0 means no auto commit)

Returns

\acdhOeaw\util\Indexer
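
The reason a final commit is still needed can be seen from the counting alone. The sketch below is a minimal model, not the library's implementation: `simulateAutoCommit()` is a hypothetical helper that only counts how many commits a given auto commit threshold would trigger and how many resources would remain uncommitted afterwards.

```php
<?php
// Hypothetical helper (not part of the Indexer API): models only the counting
// behind auto commit, assuming a commit fires as soon as $autoCommit resources
// have accumulated since the previous commit.
function simulateAutoCommit(int $resources, int $autoCommit): array
{
    $commits = 0;
    $pending = 0;
    for ($i = 0; $i < $resources; $i++) {
        $pending++;
        // A threshold of 0 disables auto commit entirely.
        if ($autoCommit > 0 && $pending >= $autoCommit) {
            $commits++;
            $pending = 0;
        }
    }
    return ['commits' => $commits, 'pending' => $pending];
}

// 10 resources with a threshold of 4: two auto commits, two resources left over.
$result = simulateAutoCommit(10, 4);
```

Under these assumptions only a threshold of 1 is guaranteed to leave nothing pending, which matches the exception described above.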

setSkip()

setSkip(integer  $skipMode) : \acdhOeaw\util\Indexer

Defines if (and how) resources should be skipped from indexing based on their existence or non-existence in Fedora.

Parameters

integer $skipMode

mode either Indexer::SKIP_NONE (default), Indexer::SKIP_NOT_EXIST, Indexer::SKIP_EXIST or Indexer::SKIP_BINARY_EXIST

Returns

\acdhOeaw\util\Indexer
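
One plausible reading of the four skip modes is sketched below. Everything here is an assumption made for illustration: the constants are string stand-ins (this reference does not list the real values of Indexer::SKIP_*), `shouldSkip()` is a hypothetical function, and SKIP_BINARY_EXIST is interpreted as "skip only when the existing resource already has binary content".

```php
<?php
// Stand-ins for Indexer::SKIP_*; the real constant values are not given here.
const SKIP_NONE = 'none';
const SKIP_NOT_EXIST = 'not_exist';
const SKIP_EXIST = 'exist';
const SKIP_BINARY_EXIST = 'binary_exist';

// Hypothetical decision function; $exists and $hasBinary describe the state
// of the corresponding resource in the repository.
function shouldSkip(string $mode, bool $exists, bool $hasBinary): bool
{
    switch ($mode) {
        case SKIP_NOT_EXIST:
            return !$exists;
        case SKIP_EXIST:
            return $exists;
        case SKIP_BINARY_EXIST:
            // Assumed semantics: skip only if binary content is already there.
            return $exists && $hasBinary;
        default:
            // SKIP_NONE: never skip.
            return false;
    }
}
```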

setVersioning()

setVersioning(integer  $versioningMode, integer  $migratePid = self::PID_KEEP) : \acdhOeaw\util\Indexer

Defines if new versions of binary resources should be created or if they should be simply overwritten with a new binary payload.

Parameters

integer $versioningMode

mode either Indexer::VERSIONING_NONE, Indexer::VERSIONING_ALWAYS, Indexer::VERSIONING_DIGEST or Indexer::VERSIONING_DATE

integer $migratePid

should PIDs (epic handles) be migrated to the new version - either Indexer::PID_KEEP (default) or Indexer::PID_PASS

Throws

\BadMethodCallException

Returns

\acdhOeaw\util\Indexer
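
The versioning modes can be pictured as a small decision function. This is only a sketch under stated assumptions: the constants are string stand-ins for Indexer::VERSIONING_*, `needsNewVersion()` is hypothetical, and the meaning of the digest and date modes (compare checksums, compare modification times) is inferred from their names rather than confirmed by this reference.

```php
<?php
// Stand-ins for Indexer::VERSIONING_*; the real constant values are not given here.
const VERSIONING_NONE = 'none';
const VERSIONING_ALWAYS = 'always';
const VERSIONING_DIGEST = 'digest';
const VERSIONING_DATE = 'date';

// Hypothetical helper: decides whether an existing binary resource should get
// a new version (true) or simply be overwritten (false).
function needsNewVersion(
    string $mode,
    string $localDigest,
    string $repoDigest,
    int $localMtime,
    int $repoMtime
): bool {
    switch ($mode) {
        case VERSIONING_ALWAYS:
            return true;
        case VERSIONING_DIGEST:
            // Assumed: a new version only when the content checksum differs.
            return $localDigest !== $repoDigest;
        case VERSIONING_DATE:
            // Assumed: a new version only when the local file is newer.
            return $localMtime > $repoMtime;
        default:
            // VERSIONING_NONE: overwrite in place.
            return false;
    }
}
```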

setCollectionClass()

setCollectionClass(string  $class) : \acdhOeaw\util\Indexer

Sets default RDF class for imported collections.

Overrides the setting read from the cfg::indexerDefaultCollectionClass configuration property.

Parameters

string $class

Returns

\acdhOeaw\util\Indexer

setBinaryClass()

setBinaryClass(string  $class) : \acdhOeaw\util\Indexer

Sets default RDF class for imported binary resources.

Overrides the setting read from the cfg::indexerDefaultBinaryClass configuration property.

Parameters

string $class

Returns

\acdhOeaw\util\Indexer

setFilter()

setFilter(string  $filter, integer  $type = self::MATCH) : \acdhOeaw\util\Indexer

Sets file name filter for child resources.

You can choose if file names must match or must not match (skip) the filter using the $type parameter. You can set both match and skip filters by calling setFilter() two times (once with $type = Indexer::MATCH and second time with $type = Indexer::SKIP).

Filter is applied only to file names but NOT to directory names.

Parameters

string $filter

regular expression conformant with preg_replace()

integer $type

decides if $filter is a match or skip filter (can be one of Indexer::MATCH and Indexer::SKIP)

Returns

\acdhOeaw\util\Indexer
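
The interaction of a match filter and a skip filter can be sketched with plain PCRE calls. `passesFilters()` is a hypothetical helper, not the Indexer's own code, and it assumes that an empty filter string means "no filtering of that kind".

```php
<?php
// Hypothetical helper: a file name passes when it matches the match filter
// (if one is set) and does not match the skip filter (if one is set).
function passesFilters(string $fileName, string $match = '', string $skip = ''): bool
{
    if ($match !== '' && preg_match($match, $fileName) !== 1) {
        return false;
    }
    if ($skip !== '' && preg_match($skip, $fileName) === 1) {
        return false;
    }
    return true;
}

// Keep TIFF files, but leave out hidden files whose names start with a dot.
$keep = passesFilters('scan_001.tif', '/\.tif$/i', '/^\./');
```

With both filters set, a name must satisfy the match filter and avoid the skip filter, so the order of the two setFilter() calls should not matter under this reading.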

setFlatStructure()

setFlatStructure(boolean  $ifFlat) : \acdhOeaw\util\Indexer

Sets whether child resources are attached directly to the indexed FedoraResource (`$ifFlat` equal to `true`) or whether a separate collection Fedora resource is created for each subdirectory (`$ifFlat` equal to `false`).

Parameters

boolean $ifFlat

Returns

\acdhOeaw\util\Indexer

setFedoraLocation()

setFedoraLocation(string  $fedoraLoc) : \acdhOeaw\util\Indexer

Sets a location where the resource will be placed.

Can be absolute (but will be sanitized anyway) or relative to the repository root.

Given location must already exist.

Note that this parameter is used ONLY if the resource DOES NOT EXIST. If it already exists, its location is not changed.

Parameters

string $fedoraLoc

fedora location

Returns

\acdhOeaw\util\Indexer

setUploadSizeLimit()

setUploadSizeLimit(integer  $limit) : \acdhOeaw\util\Indexer

Sets the size threshold for uploading child resources as binary resources.

For files bigger than this threshold, a "pure RDF" Fedora resource will be created containing full metadata but no binary content.

Parameters

integer $limit

maximum size in bytes; 0 means no files are uploaded, the special value of -1 (default) means all files are uploaded no matter their size

Returns

\acdhOeaw\util\Indexer
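
The threshold rules can be summarised as a small predicate. `shouldUploadBinary()` is a hypothetical helper written for illustration; in particular, treating a file whose size equals the limit as still uploadable is an assumption based on the word "maximum", not something this reference states explicitly.

```php
<?php
// Hypothetical helper: true means the file is uploaded as a binary resource,
// false means a "pure RDF" resource without binary content is created.
function shouldUploadBinary(int $sizeBytes, int $limit): bool
{
    // -1 is the special value meaning "upload everything".
    if ($limit === -1) {
        return true;
    }
    // 0 disables uploads; otherwise files up to the limit are uploaded
    // (the inclusive boundary is an assumption).
    return $limit > 0 && $sizeBytes <= $limit;
}

// A 5 MB file against a 1 MB limit becomes a metadata-only resource.
$upload = shouldUploadBinary(5 * 1024 * 1024, 1024 * 1024);
```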

setDepth()

setDepth(integer  $depth) : \acdhOeaw\util\Indexer

Sets maximum indexing depth.

Parameters

integer $depth

maximum indexing depth (0 - only initial Resource dir, 1 - also its direct subdirectories, etc.)

Returns

\acdhOeaw\util\Indexer
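
To make the depth numbering concrete, the sketch below walks an in-memory directory tree instead of the file system. `visitedDirs()` and the `$tree` array layout are hypothetical and exist only to show which directories a given depth would reach.

```php
<?php
// Hypothetical helper: returns the names of directories that would be
// visited for a given maximum depth (0 = only the initial directory).
function visitedDirs(array $dir, int $maxDepth, int $level = 0): array
{
    $visited = [$dir['name']];
    if ($level < $maxDepth) {
        foreach ($dir['dirs'] as $sub) {
            $visited = array_merge($visited, visitedDirs($sub, $maxDepth, $level + 1));
        }
    }
    return $visited;
}

// root/ contains a/ (which contains a1/) and b/.
$tree = [
    'name' => 'root',
    'dirs' => [
        ['name' => 'a', 'dirs' => [['name' => 'a1', 'dirs' => []]]],
        ['name' => 'b', 'dirs' => []],
    ],
];

$depth0 = visitedDirs($tree, 0); // only the initial directory
$depth1 = visitedDirs($tree, 1); // plus its direct subdirectories
```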

setIncludeEmptyDirs()

setIncludeEmptyDirs(boolean  $include) : \acdhOeaw\util\Indexer

Sets if Fedora resources should be created for empty directories.

Note that this setting is ignored when $flatStructure is set to true.

Parameters

boolean $include

should resources be created for empty directories

Returns

\acdhOeaw\util\Indexer
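
How this setting combines with the flat structure setting can be sketched as follows. `collectionsToCreate()` is a hypothetical helper, and the input format (subdirectory name mapped to its file count) is an assumption chosen to keep the example self-contained.

```php
<?php
// Hypothetical helper: lists the subdirectories for which a collection
// resource would be created.
function collectionsToCreate(array $subdirs, bool $flat, bool $includeEmpty): array
{
    // With a flat structure no per-directory collections exist at all,
    // so the empty-directory setting has nothing to act on.
    if ($flat) {
        return [];
    }
    $result = [];
    foreach ($subdirs as $name => $fileCount) {
        if ($fileCount > 0 || $includeEmpty) {
            $result[] = $name;
        }
    }
    return $result;
}

$subdirs = ['scans' => 12, 'drafts' => 0];
$withEmpty = collectionsToCreate($subdirs, false, true);
$withoutEmpty = collectionsToCreate($subdirs, false, false);
```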

setMetaLookup()

setMetaLookup(\acdhOeaw\util\metaLookup\MetaLookupInterface  $metaLookup, boolean  $require = false) : \acdhOeaw\util\Indexer

Sets a class providing metadata for indexed files.

Parameters

\acdhOeaw\util\metaLookup\MetaLookupInterface $metaLookup
boolean $require

should files lacking external metadata be skipped

Returns

\acdhOeaw\util\Indexer

index()

index() : array

Performs the indexing.

Returns

array —

a list of FedoraResource objects representing indexed resources

performUpdate()

performUpdate(\DirectoryIterator  $iter, \acdhOeaw\schema\file\File  $file, string  $parent, boolean  $upload) : \acdhOeaw\fedora\FedoraResource

Performs file upload taking care of versioning.

Parameters

\DirectoryIterator $iter
\acdhOeaw\schema\file\File $file
string $parent
boolean $upload

Returns

\acdhOeaw\fedora\FedoraResource

__index()

__index() : array

Performs the indexing.

Returns

array —

a two-element array: the first element contains a collection of indexed resources and the second contains a collection of committed resources

indexEntry()

indexEntry(\DirectoryIterator  $i) : array

Processes single directory entry

Parameters

\DirectoryIterator $i

Returns

array

isSkippedExisting()

isSkippedExisting(\acdhOeaw\schema\file\File  $file) : boolean

Checks if a given file should be skipped because it already exists in the repository while the Indexer skip mode is set to SKIP_EXIST or SKIP_BINARY_EXIST.

Parameters

\acdhOeaw\schema\file\File $file

file to be checked

Throws

\acdhOeaw\util\metaLookup\MetaLookupException

Returns

boolean

isSkipped()

isSkipped(\DirectoryIterator  $i) : boolean

Checks if a given file system node should be skipped during import.

Parameters

\DirectoryIterator $i

file system node

Returns

boolean

handleAutoCommit()

handleAutoCommit() : boolean

Performs autocommit if needed

Returns

boolean —

true if an auto commit was performed