# File System Indexer and Mount Configuration

The file system indexer (FSIndexer), or file indexer, maps one or more file systems to the database. It regularly checks whether files have been added, deleted, or updated in the file system and updates the database accordingly. The mount point is the start folder for the mapping.

The following are supported:

  • up to 20,000,000 files and folders
  • read-only file systems
  • the nodeId of the file system, so that moved files are recognized
  • cluster installations
  • multiple mount points

Also note:

  • Mount points are configured as relative paths
  • Hidden files and folders will not be imported
  • Files with "Combining Characters" will not be imported

To speed up synchronization and import, imported folders are distributed across several so-called index parts, which are processed in parallel.
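As a rough illustration of this idea, the following Python sketch splits a list of folders into index parts and processes the parts in parallel. The function names, the simple chunking rule, and the thread pool are assumptions made for this example; the partitioning logic actually used by the file system indexer is not described here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_index_parts(folders, max_part_size):
    """Split folders into consecutive parts of at most max_part_size items."""
    if max_part_size < 1:
        raise ValueError("max_part_size must be at least 1")
    return [folders[i:i + max_part_size]
            for i in range(0, len(folders), max_part_size)]


def process_part(part):
    """Hypothetical per-part work; here it only counts the folders."""
    return len(part)


def process_in_parallel(folders, max_part_size, workers=4):
    """Process all index parts concurrently and return one result per part."""
    parts = split_into_index_parts(folders, max_part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_part, parts))


if __name__ == "__main__":
    folders = [f"folder_{n}" for n in range(400)]
    print(process_in_parallel(folders, max_part_size=150))
```

With 400 folders and a maximum part size of 150, the sketch produces three parts (150, 150, and 100 folders).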

File system indexer specifications can be configured in the mount configuration admin snap-in.

# Mapping Data with the Database

When started, the file system indexer compares the data in the file system with the database.

During synchronization, files are sorted into a positive list (files found or changed in the file system) and a negative list (files not found in the file system):

  • The positive list is built by searching for files as follows:

    • by relative_path, mount and file_id
    • by relative_path, mount and fs_hash
    • by mount and file_id (except hardlink)
    • by relative_path, mount and name
  • The negative list consists of all files that were not found in the file system. They will be deleted from the database.

Please note: Without an enabled mount, it is not possible to find a file during the synchronization.
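The lookup described above can be sketched in Python as follows. This is a simplified model, not the indexer's implementation: records are plain dictionaries, the key combinations are tried in the listed order with the first match winning, and the `hardlink` flag is a hypothetical field used to model the hardlink exception. Handling of new files that have no database record yet is left out.

```python
# Key combinations from the list above. Trying them in this order and
# stopping at the first match is an assumption of this sketch.
LOOKUP_KEYS = [
    ("relative_path", "mount", "file_id"),
    ("relative_path", "mount", "fs_hash"),
    ("mount", "file_id"),
    ("relative_path", "mount", "name"),
]


def find_record(entry, db_records):
    """Return the first database record matching a file system entry, or None."""
    for keys in LOOKUP_KEYS:
        # The (mount, file_id) lookup is skipped for hard links.
        if keys == ("mount", "file_id") and entry.get("hardlink"):
            continue
        for record in db_records:
            if all(entry.get(k) is not None and entry.get(k) == record.get(k)
                   for k in keys):
                return record
    return None


def synchronize(fs_entries, db_records):
    """Split database records into a positive and a negative list."""
    positive, matched = [], set()
    for entry in fs_entries:
        record = find_record(entry, db_records)
        if record is not None and id(record) not in matched:
            matched.add(id(record))
            positive.append(record)
    negative = [r for r in db_records if id(r) not in matched]
    return positive, negative


if __name__ == "__main__":
    db = [
        {"db_id": 1, "relative_path": "a/x.jpg", "mount": "data",
         "file_id": "100", "fs_hash": "h1", "name": "x.jpg"},
        {"db_id": 2, "relative_path": "a/y.jpg", "mount": "data",
         "file_id": "101", "fs_hash": "h2", "name": "y.jpg"},
        {"db_id": 3, "relative_path": "a/z.jpg", "mount": "data",
         "file_id": "102", "fs_hash": "h3", "name": "z.jpg"},
    ]
    fs = [
        {"relative_path": "a/x.jpg", "mount": "data",
         "file_id": "100", "fs_hash": "h1", "name": "x.jpg"},
        # y.jpg was moved to another folder but kept its file ID.
        {"relative_path": "b/y.jpg", "mount": "data",
         "file_id": "101", "fs_hash": "h9", "name": "y.jpg"},
    ]
    positive, negative = synchronize(fs, db)
    print([r["db_id"] for r in positive], [r["db_id"] for r in negative])
```

In this example the moved file is still found through the combination of mount and file_id, while the record for z.jpg ends up in the negative list because no matching file exists.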

# Cache

Information about all imported folders is stored in the folder cefs/{folder module name}/file_system_indexer/{mount_name}. Deleting the cache files causes the file indexer to perform a complete synchronization.
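A minimal Python sketch for locating and clearing the cache of one mount might look like this. The `base_dir` parameter (the directory that contains `cefs`) and both function names are assumptions; adjust them to your installation. By default the function only lists the files it would delete.

```python
from pathlib import Path


def cache_dir(base_dir, folder_module, mount_name):
    """Build the cache folder path for one mount below an assumed base_dir."""
    return Path(base_dir) / "cefs" / folder_module / "file_system_indexer" / mount_name


def clear_cache(base_dir, folder_module, mount_name, dry_run=True):
    """Return the cache files of a mount; delete them unless dry_run is set."""
    directory = cache_dir(base_dir, folder_module, mount_name)
    if not directory.is_dir():
        return []
    files = sorted(p for p in directory.iterdir() if p.is_file())
    if not dry_run:
        for path in files:
            path.unlink()
    return files
```

Calling `clear_cache(base_dir, "folder", "data")` lists the cache files for the mount "data"; passing `dry_run=False` removes them.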

# Mount Configuration

A configured and enabled mount is required for the file system indexer to work.

Each mount configuration is stored as an XML file in the file system under file/mounts, e.g. data.xml for the default mount point.

To configure the default mount "data" or create a new mount, go to the admin snap-in DAM/Mount configuration:

| Field | XML | Description |
|---|---|---|
| Mount-ID | | Assign an identifying name for this mount, e.g. "data" (max. 28 characters). This name becomes the name of the XML file. Please note: Because multiple mount points are supported, each must have a unique name. |
| Enable mount | enabled | Enable or disable this mount in the system. If set to "false", the mount is shown, but files or folders cannot be created, edited, deleted, copied, or moved. |
| Mount read-only | read_only | If set to "true", the file system permission is set to "read only". Files and folders cannot be edited or deleted (but are still imported). |
| File module name | file_module | Read-only. Set to "Files" by default. |
| Folder module name | folder_module | Read-only. Set to "Folder" by default. |
| Root directory | base_path | Define the root directory here. The file structure is imported from this directory, e.g. /4allportal/data/data. |
| Event handler | change_handlers | Define the Java classes that react to changes in the file structure, e.g. by creating or changing a database entry for folders and/or files. Default: com.cm4ap.ce.fsi.handler.FileChangeHandler and com.cm4ap.ce.fsi.handler.LogOutputHandler. |
| Ignore folders (regex) | exclude_folders | Define all folders that should be ignored during the import process. They are excluded from the DAM (see Default Regex Pattern). |
| Ignore files (regex) | exclude_files | Define all files that should be ignored during the import process. They are excluded from the DAM (see Default Regex Pattern). |
| Execution interval (seconds) | period | Define the interval, in seconds, at which the file system indexer scans and synchronizes. Default: 43,200 seconds (12 hours). To make uploads available faster, choose a shorter interval, for example 60 seconds. |
| Use File-IDs | use_fileid | If set to "true", unique file IDs (Windows)/inodes (Linux) of files and folders are used (see File-IDs and Volume-IDs). |
| Use Volume-IDs | use_volumeid | If set to "true", a volumeID is used to identify the mount, so the file system indexer indexes the correct mount even after the network drive has been renamed or moved (see File-IDs and Volume-IDs). |
| Minimum index part size | min_index_part_size | Minimum number of folders in one index part. |
| Maximum index part size | max_index_part_size | Maximum allowed number of folders in one index part. |
| Use milliseconds for file modification time | | If set to "true", milliseconds of the file modification time are taken into account when indexing files (recommended). Please note: When using a Docker image, be aware that Docker trims milliseconds when packaging files. |
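The Mount-ID constraints (at most 28 characters, unique across all mounts) can be checked before a mount is created. The following Python sketch is one hypothetical way to do this; the function name is invented for the example, and comparing IDs as exact, case-sensitive strings is an assumption.

```python
MAX_MOUNT_ID_LENGTH = 28  # limit stated for the Mount-ID field


def validate_mount_ids(mount_ids):
    """Return a list of human-readable problems found in the given mount IDs."""
    problems = []
    seen = set()
    for mount_id in mount_ids:
        if not mount_id:
            problems.append("empty mount ID")
        elif len(mount_id) > MAX_MOUNT_ID_LENGTH:
            problems.append(f"'{mount_id}' is longer than {MAX_MOUNT_ID_LENGTH} characters")
        if mount_id in seen:
            problems.append(f"'{mount_id}' is used more than once")
        seen.add(mount_id)
    return problems


if __name__ == "__main__":
    print(validate_mount_ids(["data", "archive", "data", "x" * 29]))
```

An empty result list means that all IDs satisfy both constraints.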

# Default Regex Pattern

By default, a predefined "exclude" pattern always applies to folders and files:

```
^([.~].*)|(.*[\\/][.~]).*$
```

Files and folders matching this regex pattern will be ignored during the import process or, if already imported, not be shown in the 4ALLPORTAL.
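To see which paths the default pattern covers, it can be tried out in Python. This sketch assumes that the pattern is applied to the relative path and has to match the whole path; how the indexer applies it internally is not documented here, and the sample paths are invented.

```python
import re

# Default exclude pattern: names starting with "." or "~", either at the
# beginning of the path or directly after a path separator ("/" or "\").
DEFAULT_EXCLUDE = re.compile(r"^([.~].*)|(.*[\\/][.~]).*$")


def is_excluded(relative_path):
    """Return True if the whole relative path matches the default pattern."""
    return DEFAULT_EXCLUDE.fullmatch(relative_path) is not None


if __name__ == "__main__":
    samples = [".git", "~lock.docx", "images/.cache/thumb.png",
               "images/photo.jpg", "notes.v2.txt"]
    for path in samples:
        print(f"{path}: {'excluded' if is_excluded(path) else 'imported'}")
```

The first three sample paths are excluded because a path segment starts with "." or "~". The last two are imported, since a dot inside a file name does not trigger the pattern.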

# File-IDs and Volume-IDs

Unique file IDs/inodes are required for SMB and WebDAV connections. Not all file systems support them. Before setting use_fileid to "true", check whether your file system supports unique file IDs. If it does not, the file system indexer stops.

If use_fileid is set to "true", a .4apmount file is created in the root directory of the mount on the first import process.
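The following Python sketch illustrates why file IDs help to recognize moved or renamed files: it reads the device and inode numbers reported by `os.stat` before and after a rename and compares them. It is only a quick local experiment with invented helper names, not the check the file system indexer performs, and a positive result on one directory does not prove that a network share behaves the same way.

```python
import os
import tempfile


def file_id(path):
    """Return the (device, inode) pair that os.stat reports for path."""
    info = os.stat(path)
    return (info.st_dev, info.st_ino)


def id_survives_rename(directory):
    """Create a file, rename it, and report whether its ID stayed the same."""
    old = os.path.join(directory, "before.txt")
    new = os.path.join(directory, "after.txt")
    with open(old, "w") as handle:
        handle.write("sample")
    before = file_id(old)
    os.rename(old, new)
    return before == file_id(new)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("ID stable after rename:", id_survives_rename(tmp))
```

To test a specific mount, pass a writable directory on that mount to `id_survives_rename` instead of a temporary directory.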

A volumeID is stored on the connected network drive (mount) so that the file system indexer can clearly identify the mount. If the shared network drive has no volumeID or a wrong one, the file system indexer stops.

# Default Mount XML

The default mount XML in file/mounts looks like this:

```xml
<mount enabled="true">
	<file_module>file</file_module>
	<folder_module>folder</folder_module>
	<base_path>/4allportal/data/data</base_path>
	<min_index_part_size>15</min_index_part_size>
	<max_index_part_size>150</max_index_part_size>
	<read_only>false</read_only>
	<use_fileid>true</use_fileid>
	<use_volumeid>true</use_volumeid>
	<exclude_folders> <!-- regex pattern to exclude folder -->
		<exclude_folder>(^|.+[/])layout($|[/].*$)</exclude_folder>
	</exclude_folders>
	<exclude_files> <!-- regex pattern to exclude file -->
		<exclude_file>4ap_fileimport[.]xml</exclude_file>
		<exclude_file>Thumbs[.]db</exclude_file>
	</exclude_files>
	<period>43200</period> <!-- repeat in n seconds -->
	<change_handlers>
		<change_handler>com.cm4ap.ce.fsi.handler.FileChangeHandler</change_handler>
		<change_handler>com.cm4ap.ce.fsi.handler.LogOutputHandler</change_handler>
	</change_handlers>
</mount>
```
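For scripting or auditing, a mount XML of this shape can be read with Python's standard library. The sketch below parses a shortened copy of the default mount XML into a dictionary; the `load_mount` helper and the selection of fields are assumptions for this example and not part of the product.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the default mount XML shown above.
MOUNT_XML = """<mount enabled="true">
    <base_path>/4allportal/data/data</base_path>
    <read_only>false</read_only>
    <exclude_files>
        <exclude_file>4ap_fileimport[.]xml</exclude_file>
        <exclude_file>Thumbs[.]db</exclude_file>
    </exclude_files>
    <period>43200</period>
    <change_handlers>
        <change_handler>com.cm4ap.ce.fsi.handler.FileChangeHandler</change_handler>
        <change_handler>com.cm4ap.ce.fsi.handler.LogOutputHandler</change_handler>
    </change_handlers>
</mount>"""


def load_mount(xml_text):
    """Parse a mount XML string into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "enabled": root.get("enabled") == "true",
        "base_path": root.findtext("base_path"),
        "read_only": root.findtext("read_only") == "true",
        "period_seconds": int(root.findtext("period")),
        "exclude_files": [e.text for e in root.iter("exclude_file")],
        "change_handlers": [e.text for e in root.iter("change_handler")],
    }


if __name__ == "__main__":
    mount = load_mount(MOUNT_XML)
    print(mount["base_path"], mount["period_seconds"] / 3600, "hours")
```

The default period of 43200 seconds corresponds to 12 hours.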
