Quick Start User Guide
Once your Federation Services environment has been successfully installed, you are ready to start making connections and executing jobs. This guide will walk you through the process. For this example, we will be using Google Drive as our source (repository) location and Dropbox as our target (output) location.
- Step 1. Log in to your Federation Services environment once all the required components have been installed.
- Step 2. Create the repository connection for your source data that you want to transfer.
- Step 3. Create your output connection for your target location you want to move your data to.
- Step 4. Create a new integration job to sync your two new connections.
- Step 5. Create a task to filter your results before moving your data.
- Step 6. Map your fields to tell Federation Services where you want the data in each field moved to.
- Step 7. Run your new job to begin the data transfer process.
- Step 8. View your results to confirm the transfer was successful.
Step 1. Log in to your Federation Services environment
- Install Federation Services - To begin using Federation Services, make sure you have all the required software components installed.
- Log in to Federation Services Admin - Once Federation Services is installed, you are ready to log in and begin setting up jobs to transfer your data.
- Sub-step 1: Create Authentication Connections - Before you begin, note that some connections will require authentication before you can create an Integration Connection. The authentication process can vary from connector to connector. For details on the specific authentication steps for your connectors, see the full Authentication Connection List.
- Sub-step 2: Create a Discovery Instance - Once authenticated, you will set up your discovery instance to allow Federation Services to find the field properties in your connected repository for mapping in Step 6. For steps on setting up this instance, see Discovery Schema Instances.
NEXT STEP: Create your integration connections.
Step 2. Create your Integration Connections.
Your integration connections are the places you want to move your files to and from. First, set up the integration connection for the source repository you are pulling your files from; then set up one for the output repository you are moving your files to.
- Select Connections from the navigation section on the left
- Select Integration Connections
- Click the Create Integration Connection button
- Add the following:
- Connection Name
- Connection Description (Optional)
- Connection Type
- Click Save to continue editing your integration connector
- Edit your new Integration Connection
- Add the Authentication Connection you created for this connector.
NEXT STEP: Create your Output Connection.
Step 3. Create your Output Integration Connection:
Repeat Step 2 for your Output Integration Connection. This will be the location you want to move your files to.
- Select Connections from the navigation section on the left
- Select Integration Connections
- Click the Create Integration Connection button
- Add the following:
- Connection Name
- Connection Description (Optional)
- Connection Type
- Click Save to continue editing your integration connector
- Edit your new Integration Connection
- Add the Authentication Connection you created for this connector.
NEXT STEP: Create your Integration Job
Step 4. Create a new integration job to sync your two new connections.
This process will allow Federation Services to connect the two systems and move your data from one location to another.
- Click Integration in the navigation
- Select List Jobs
- Click the Create New Job button
- Add the following:
- Give your job a descriptive Job Name
- Select your Repository Connection
- Select your Output Connection
- Leave the Job type as Simple Migration
- Click Save
- View your new job edit page.
- Edit the start and end dates and times to filter for the documents in your repository connection that were created within that time period.
- Configure Repository Specifications: This tab will vary depending on the connector.
- Click on the Query Tab if there is one and add the Google Drive folder ID. This can be found after the last slash ("/") in the folder's url; for example, in https://drive.google.com/drive/folders/<folder ID>, the ID is everything after /folders/.
- Leave Query blank and leave process folders unchecked.
- Output Specifications: This tab will vary depending on the connector
- Click on the Output Specifications Tab and enter your Target Folder.
NEXT STEP: Add Tasks
Step 5. Create a task
Among other things, tasks allow you to filter your documents so that you only bring over the data you need. You can also use tasks to check for duplicates, among other useful operations.
- In your saved job edit page click the TASKS tab.
- Select the task you would like to add from the list drop-down.
- Click the green plus sign to add the task. Fill in the necessary configurations for that specific task.
- Click DONE to add the task to the job.
NEXT STEP: Job Mapping
Step 6. Map your fields
This will tell Federation Services which data property fields from one location match the property fields in the other location.
- From the edit job page click the mappings tab
- Select Manual in the Mapping Type drop-down and add the following.
- Select your Repository Discovery
- Select drive_file as the Repository Type
- Select your Output Discovery
- Select file as your output type
- Select the source field
- Select the target field
- Click on the blue plus button to add to the mapping
- Click Save at the top of the page
NEXT STEP: Run Job
Step 7. Run your new job
- From the navigation section on the left, click Integration to expand the section and select Run and Monitor Jobs
- Click the Play button to the left of your new job to begin transferring files.
- Click the REFRESH button to see the progress of your integration. You can also set the auto-refresh interval to 30 seconds, 1 minute, or 5 minutes.
NEXT STEP: View your results
Step 8. View your Integration results
- Once the integration job is complete, you can view your results to confirm that all the files have been integrated from one system to the other. If you log into your target repository, you will see that these files have been added successfully.
Congratulations! You have successfully created your first Integration Job.
Discovery and Analytics
Discovery is the second step in the integration process. Create an instance and select a connector, and a schema will be produced based on the connector's configuration. The schema includes a list of all object types as well as the attributes stored in each.
Discovery Schema Instances
Discovery connectors are responsible for discovering the schema, or the metadata about the repository. This will include content types and their associated properties.
A discovery schema can then be used to aid in creating Job Mappings or to simply find out what’s in the repository. Running a discovery instance will create a new version of the schema. Whatever version you select will be used when creating mappings.
Schema Instances
Viewing the schema instances will show you a table of all available Discovery Instances. This table can be sorted by Name, Type, and Available Versions.
Here you can view the most recently created schema instances as well as create new ones.
- Name: Unique Name for the Discovery Connection to identify it in the UI.
- Type: The Discovery Type refers to the type of the repository (i.e. CMIS, SharePoint, Documentum etc.).
- Version: Each Discovery Schema can have multiple versions. Selecting this version will set which is used in Job Mappings as well as which schema will be shown when examining a schema.
New Discovery Connection Instance
To create a new Discovery Connection, click the Create New Discovery Instance button at the bottom of the Schema Instances Page. Fill in the following fields.
- Name: Unique Name for the Discovery Connection to identify it in the UI.
- Discovery Type: The Discovery Type refers to the type of the repository
- Authentication Connection: A predefined connection for authentication.
- Ignore Types: Comma delimited list of types to ignore.
- Additional Fields: Some connections require additional information to search for types. Click on a link in the table below for more details on setting up the Discovery Connection for a particular software platform.
Discovery Schema View
Once Discovery finishes running Federation Services will store the information. Discovery only needs to be run once, unless the schema of the source repository is changed.
To view the report, click the document icon to the left of the instance. You can click into each attribute section by category.
Data Collection
Data Collection is a scheduling system that allows the user to schedule job runs. It can be turned on and off in the Data Collection index page.
Data collection schedules can be run for a job group or an individual job. For data collection jobs, it is suggested you use the Reporting Connector and leave Include Binaries unchecked in the Details tab; Federation Services can create the necessary auditing data without the actual file content.
How to use Data Collection
The Set Info Pane will show you basic information about the various collections in the data set, as well as allowing you to delete individual runs.
The data collection service runs in the background based on properties set either in simflofy-global.properties or in the Admin Properties page. It checks each data set (created once you hit Set Collection Schedule) to see if that job group needs to be run based on whether Collect Data is ‘on’.
- Collection Statuses: Sets can have one of four statuses
- Unsaved - The collection hasn't been saved yet
- Off - The service will not collect data for this group
- Waiting - The job group is between collections
- Running - The jobs in the job group are currently running
- Time until next run: The time remaining before the next scheduled run. If this runs to zero, it will not refresh unless the schedule is closed and reopened. Displays as NA while a run is in progress.
- Collect Data: If on, the collection will run based on the collection dates.
- Run jobs in parallel: Run all jobs in the set simultaneously. The default is to run them in the prescribed order set on the job group.
- Collection dates: Dates over which data will be collected.
- If the “From” time is set to a value before the current date/time, the run will begin immediately.
- The “To” time and Interval are checked upon completion of a collection run to determine whether the set should run again.
- Interval: The interval between collections will dictate how often the jobs are going to run.
- It is suggested you determine the general length of the jobs you wish to run to avoid overlap. The Set Info pane provides an average run time for the data set.
- Setting the collection interval to Once will turn off collection after a single run.
- Intervals below 3 minutes are not allowed, as they cause unpredictable behaviour.
- Choose between a job group or individual job
- If selecting a job group, you can choose the run order by dragging. The top job is run first.
Example Duplicate Checking
If you wish to collect information on duplicate data, use the Duplication Check Task. As of the inclusion of Data Collection, you do not need to include a binary for duplication checks. If you declare a field other than the md5 field as your "hash", you can set it in the JavaScript Task using the rd.setMetaHash(String) method.
Discovery Report Views
A view which charts data from a Content Report generated through Data Collection
Once you have finished with your Data Collection, Federation Services allows users to visualize their data in a Discovery Report View in the Discovery UI
In this report view users will find:
- An interactive chart that adjusts to selected repositories
- Content Past Date
- Content Summary
- Storage Summary
- Total number of files
Generate Reports
This tab initially only contains a drop-down of data sets that have job runs associated with them. It also displays their collection status.
Upon selecting a data set you should see the following:
- Included in the table are the date of the run, number of job runs included in the dataset, as well as some general stats on the document counts. The table is sortable on the various values.
- Clicking the eye icon next to the job run count will open a pop-over with the individual job run ids.
- Clicking one of these ids will take you to a detailed view of the job run. For content reporting, you are currently only allowed to select one dataset.
- In the future a “Content Growth Report” option will be made available, allowing you to select multiple job runs to monitor increases in storage usage.
Filter Dataset
There are a number of filters that can be applied to the dataset during report generation. These include:
- Filters for Obsolete Date
- Checked against the Last Modified Date, or Created date if the former is unavailable
- Filters for certain file type extensions.
- Extracted from filename
- Filters for file size
- (New) Filters for type. Document types can also be filtered out.
These filters are processed in the order listed; upon selecting them, you will have the option to reorder them.
Available Reports
Upon completion of report generation you should be taken to the Available Reports tab.
This tab shows the generated reports as well as some additional information:
- Obsolete Data: The report contains data on content that was past the filter date
- Duplicate Data: The report has information on duplicates
- Content Type Data: The report contains a content type breakdown for all included repositories.
- Multiple repositories: The report contains data for each of the jobs run as part of the dataset, as well as a merged report containing the combined data.
- Actions:
- Download the raw report in JSON format
- Generate New Version of this Report
- Delete Report with the additional option to delete all versions of the report
Schedule Reports
The schedule reports tab allows you to automate the creation of reports after creating the initial one.
- Automatically Generate This Report: Enables the scheduling of this report's creation
- Generate Reports Every: The time interval to use for generating this report.
- Starting On and Ending On: How long the schedule lasts. If the Starting On time is set before the current time, generation will occur immediately upon setting the schedule.
- Update Associated Report Views: Each Discovery Content View is associated with a single report series. If set to On, the view will be updated to use the most current version each run.
- Which views: A list of associated views to update.
- Use Data From the Following Set or Most Recent: If checked, the new report will always use the next data collection chronologically from the previous. Otherwise the most recent data set will be used. Useful if you wish to perform QA on the most recent runs before creating a new report.
- Send email Notifications: A simple yes/no. Email must be enabled for the application.
- Comma delimited list of emails: As the tooltip says.
Pii Scan
The Pii Scan feature allows the user to select a configured repository connector and job, then scan the repository for Personally Identifiable Information (Pii).
This can include phone numbers, social security numbers, proper names, and addresses.
A temporary job is created using the repository configuration of the selected job.
Appropriate tasks and mappings are added to the job, then the job is run.
Upon completion, the temporary job and records are deleted, leaving only Pii Data.
Federated Search
Federation Services allows you to manage data throughout multiple repositories without having to move the content from one location to another. With Federated Search users can view, organise, report on, update and maintain all of their documents, files and records in one easy to use highly customisable space.
Content Search
Content View Connectors are used by Content Views in Federation to query indexed documents and populate the results within a federated search view. Federation can then take action against the results such as preview, download, check in and out and edit the file’s properties.
Content Search Connection Configuration
Available Content View Connection Types
- Elasticsearch Connector
- MongoDB GridFS Search Connector
- MongoDB Search Connector
- SOLR Search Connector (End of Life)
Basic Configuration
Tip: Connector ids are how Federation Services identifies the individual connector when receiving calls from other sources, such as Federation Services. This value must be usable as part of a url. Use the description field if you need more than a few letters/numbers to describe the connection. The description shows up with its connectorId across the product.
- Connector ID: A unique identifier for this connection, i.e. simflofy_demo (alphanumeric, dashes and underscore characters only).
- Description: The text that will be displayed on drop-downs etc. to identify this connection.
- Type: The type of Search Connection (Solr, Mongo, Elastic etc.)
- Keep Connection Alive: Federation Services will cache the connection for a given amount of time before discarding it.
- Keep Alive in Milliseconds: How long to keep the connection alive before discarding it (300000 is 5 minutes).
- Security Mode: This is how to authenticate with the back-end search.
- Authentication Connection: The most common method is to use the appropriate authentication connection.
- User Pass-through Credentials: Authenticates with whatever credentials the user used to log in to Federation Services. Only supported in rare cases.
Result Links
Result Links are under the Search Configuration tab, but work universally.
If Download is selected, the file names in your view will call a document download, much like the Widget
If External is selected, you will need to add Result Links. When you click the Result Links button a modal should appear. It takes three arguments
- Content Service Connector: External link configurations are grouped by content service connector.
- This allows documents from different repositories to form different links.
- Link Field: The document field that contains relevant information for building the link.
- Link Url: The content of the link field will be appended to this url to create the link.
Link Field With Semicolons: Due to how certain ids with versions are handled, the value of the link field will be cut off after the first instance of a semicolon (;).
Search Configuration
Refer to the documentation for each connector to see which fields are available.
Search Security
- Filter: The authenticated user's group ids and login will be added to each search request. Requires content to be indexed using the Index User Group Task.
- Restrict: Only users from the selected groups will be able to use this search connection, regardless of role. If this connection is used as part of a Content View, users outside these groups will not be able to see it in their View List in Discovery.
Connection Configuration
Individual connectors might have specific fields here that were not general enough to be in Search Configuration.
Additionally, this is where you can use the “Add Custom Parameter” button to set any default query values for the connection.
A query_fq configuration parameter lets you define facet queries behind the scenes. This is done to provide limited views or subsets of data in the search. Essentially, you could create any number of views on the same data, but each view would display different results. This can also be used in a role-based system where you have views set up for specific user roles. Unless the fq is already encoded, you will need to wrap it in the encode() function, where it will be URL-encoded as UTF-8.
The syntax is:
query_fq for the first one
query_fq1 for the second
etc.
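As an illustration only (the field names here are hypothetical and the exact fq syntax depends on your search engine), two custom parameters added with the Add Custom Parameter button might look like this:
query_fq = encode(sourceRepository:"contracts")
query_fq1 = encode(mimetype:"application/pdf")
Each parameter is applied behind the scenes to every query made through the connection, so a view built on it only ever shows the matching subset of documents.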
Content Services
Content Service Connections are used to interact with content in Federated Search views. Actions set up in the Content Services API or the Discovery UI web application will be performed against specified repositories and documents through these Content Service connectors.
Content Service Connections
Federation Services’s Content Service Connections offer public REST endpoints that allow for integration with external applications. Actions in the Content Services API or the Discovery web application perform actions against specific repositories through these connectors.
Commonly Supported Operations
Note: The capabilities of each Content Service Connector are limited to operations allowed by the repository. Additionally, not all methods are available for all connectors. View individual connector pages for a list of capabilities.
- Creating content
- Retrieving file content
- Showing content properties
- Listing folder items
- Listing file versions
- Updating file content and properties
- Deleting files and folders
- Showing repository capabilities
- Retrieving the Root Folder ID
- Managing Permissions
Basic Configuration
Tip: Connector ids are how Federation Services identifies the individual connector when receiving calls from other sources, such as Federation Services. This value must be usable as part of a url. Use the description field if you need more than a few letters/numbers to describe the connection. The description shows up with its connector ID across the product.
- Connector ID: A unique identifier for this connection, i.e. simflofy_demo (alphanumeric, dashes and underscore characters only).
- Description: The text that will be displayed on drop-downs etc. to identify this connection.
- Type: The type of Content Service Connection (Alfresco, Box, CMIS, etc.).
- Keep Connection Alive: Federation Services will cache the connection for a given amount of time before discarding it.
- Keep Alive in Milliseconds: How long to keep the connection alive before discarding it (300000 is 5 minutes).
- Security Mode: This is how to authenticate with the back-end system.
- Authentication Connection: The most common method is to use the appropriate authentication connection.
- User Pass-through Credentials: Authenticates with whatever credentials the user used to log in to Federation Services. Only supported in rare cases.
- None: Only usable with the Filesystem Content Service Connector
Connection Configuration
Different connectors might require additional configuration. These values will populate after selecting a Type.
Content Service Mapping
You can also add mappings or mapping groups to your Content Service Connections which will allow you to map custom parameters to properties in the destination system.
Add Mappings to a Content Service Connector
- Create a mapping.
- (Optional) Add your mappings (and other mappings) to a Mapping Group.
- Click the edit icon for your Content Service connector.
- In the drop-down under Mapping Type select Single Mapping or Group Mapping
- In the drop-down under the mapping type you chose, select the Mapping or Mapping Group you want to add to this connection.
- Save the Content Service connector.
Creating Mappings for Content Services
Content Service mappings will use the source as the parameter name for the content service call. The target field should match a field name in the destination repository. For easier use, you can leverage the schema discovery for your connectors to populate the output field names when mapping.
For example, a mapping whose source is the parameter name and whose target is the Title field allows a content service call to populate the Title field in the connected repository.
Here is an example call using the mapping above:
http://localhost:8080/3sixty-admin/repo/spo2/update?
fileId=s5dba1525-44a6-45ed-a42e-4a155a3f0539&name=second_level_22
The above call will update the SharePoint document by setting the Title to ‘second_level_22’.
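For reference, the same call can be made with curl using the basic-auth style shown in the REST examples later in this guide; the credentials, host, and use of PUT here are assumptions carried over from those examples:
curl -u admin:admin -X PUT "http://localhost:8080/3sixty-admin/repo/spo2/update?fileId=s5dba1525-44a6-45ed-a42e-4a155a3f0539&name=second_level_22" | json_pp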
Creating A Content Service Connection
Following is an example of how to create a new content service connection.
Step 1: New Connection
To create a content service connector instance, select Content Service Connections under the Connections header, then click Create New Content Service Connection.
You will be brought to the generic Content Connection page.
Step 2: Basic Configuration
- Connector ID: This uniquely identifies this instance of the connector. This must be unique and will be used when you index content and metadata.
- Description: Describe your connection. Be descriptive as you may not be the only one using this connection.
- Type: Select from a drop-down list of Content Service Connectors installed in your instance of Federation Services. Once you select a connector type you will be presented with further options which will be documented on the connector page for that type. Don’t see your connector listed here? Contact your Federation Services Administrator or Federation Services Support.
- Keep Connection Alive: If checked, the connection object is cached so the user doesn't have to re-connect for every action they want to perform. If unchecked, each API action will result in a login to the third-party system.
- Keep Alive in Milliseconds: How long to keep the session in cache.
- Connection URL: The URL or Path to connect to.
- Security Mode: Authentication Connection: This uses an authentication connector; an example may be a Box OAuth Connector. These can be a little more complex than the other options and provide customisation by allowing you to create your own auth connectors.
Step 3: Connection Configuration
Add Configuration Parameter (Button): Some connectors allow for optional parameters that can be passed via key/value pairs. Most of the time you'll have a form to fill out, but for more advanced features that are situational, adding a key/value pair makes sense, as it won't clutter up the user interface.
List of Repositories supported by Content Services
- Alfresco
- CMIS
- Amazon S3
- Azure Content Service Connector
- Box
- MongoDB (GridFS)
- Documentum (DFC)
- DocuShare
- Dropbox
- Elasticsearch
- File-system
- FTP
- Google Drive
- IBM FileNet
- iManage
- Microsoft Exchange
- Microsoft OneDrive
- Nuxeo
- Objective
- Salesforce
- ServiceNow
- SharePoint Online
- Webdav
- Zendesk
Content Service REST Endpoints
GET FILE
GET /api/repo/<connectorid>/file?id=<id>
Description:
- Returns an input stream that contains the document’s content.
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: Source repository id. Differs depending on repository.
With CURL
curl -u admin:admin -X GET "localhost:8081/3sixty-admin/api/repo/box/file?id=384896487495" | json_pp
GET OBJECT ID BY PATH
GET /api/repo/<connectorid>/idbypath?fileName=<fileName>&folderPath=<folderPath>
Description:
- Searches for a file and returns its unique id
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- fileName: Name of the file to find
- folderPath: Parent folder path of the file
Returns:
{
"results": {
"id": "38489648749"
},
"success": true
}
With CURL
curl -u admin:admin -X GET "localhost:8081/3sixty-admin/api/repo/box/idbypath?fileName=testFolder&folderPath=/TestFolder/" | json_pp
POST FILE
POST /api/repo/<connectorid>/file?fileName=<fileName>&folderId=<folderId>&type=<type>
Description:
- Uploads a file to the target repository. Must be a multipart/form-data request with the file as the body. Multipart form boundaries will need to be set. See the Postman tutorial for correct settings.
Body Parameters:
- file: Attach your multipart file as the file parameter
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- fileName: The name of the file.
- folderId: The repository id of the folder where the file will live.
- type: The document type. Varies greatly depending on repository. Use types endpoint.
Returns:
{
"results": {
"id": "629425696136"
},
"success": true
}
With CURL
curl -u admin:admin -F "file=@/Users/simflofy/CaterpillarDEUAjax.pdf" "localhost:8081/3sixty-admin/api/repo/box/file?fileName=Caterpillar%20DEU%20Ajax.pdf&folderId=105965269305&type=document" | json_pp
UPDATE FILE
PUT /api/repo/<connectorid>/update?
fileId=<fileId>&folderId=<folderId>&fileName=<fileName>&type=<type>&property1=<value1>&property2=...
Description:
- Uploads a file's content or properties to the target repository as an update. Content can be attached as a binary body to the request. If no content is supplied, Federation Services will try to update properties instead.
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- fileId: The ID of the file you wish to update, from the repository.
- fileName: The name of the file. Use if supplying folder ID.
- folderId: The repository id of the folder where the file you are updating lives. Use if supplying fileName.
- type: The document type. Varies greatly depending on repository. Use types endpoint.
- property(n): Properties that you wish to update with this call can be appended to the query
Returns:
{
"results": {
"id": "629425696136"
},
"success": true
}
With CURL
curl -u admin:admin -X PUT -F "file=@/Users/simflofy/CaterpillarDEUAjax.pdf" "localhost:8081/3sixty-admin/api/repo/box/update?fileName=Caterpillar%20DEU%20Ajax.pdf&folderId=105965269305&type=document" | json_pp
UPDATE FILE CONTENT
PUT /api/repo/<connectorid>/updateContent?fileId=<fileId>
Description:
- Will update the content of the file using the supplied request body
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- fileId: The ID of the file you wish to update, from the repository.
Returns:
{
"results": {
"id": "804939960448"
},
"success": true
}
With CURL
curl -X PUT -u admin:admin --data-binary '@/Users/simflofy/CaterpillarDEUAjax-newversion.pdf'
'localhost:8081/3sixty-admin/api/repo/box/updateContent?fileId=804939960448'
| json_pp
UPDATE FILE PROPERTIES
PUT /api/repo/<connectorid>/updateProperties?fileId=<fileId>&<param1>=<value1>&<param2>=<value2>
Description:
- Will update the properties of the file using the supplied parameters
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- fileId: The ID of the file you wish to update, from the repository.
- paramX: Name of the parameter to update
- valueX: New value of the parameter
Returns:
{
"results": {
"id": "804939960448"
},
"success": true
}
With CURL
curl -X PUT -u admin:admin
'localhost:8081/3sixty-admin/api/repo/box/updateProperties?fileId=804939960448&textfield=newvalue'
| json_pp
CHECKIN FILE
POST /api/repo/<connectorid>/checkin?id=<id>&comment=<comment>
Description:
- Checks in the target file with the specified comment
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: Source repository id of the file
- comment: Optional: The check-in message.
POST /api/repo/cmis/checkin?id=5dba1525-44a6-45ed-a42e-4a155a3f0539&comment=Spacing%20Fix
Returns:
{
"results": {
"id": "5dba1525-44a6-45ed-a42e-4a155a3f0539"
},
"success": true
}
With CURL
curl -u admin:admin -X POST "localhost:8081/3sixty-admin/api/repo/cmis/checkin?
id=5dba1525-44a6-45ed-a42e-4a155a3f0539&comment=Spacing%20Fix" | json_pp
CHECK OUT FILE
POST /api/repo/<connectorid>/checkout?id=<id>
Description:
- Checks out the target file
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: Source repository id of the file
POST /api/repo/cmis/checkout?id=5dba1525-44a6-45ed-a42e-4a155a3f0539
Returns:
{
"results": {
"id": "5dba1525-44a6-45ed-a42e-4a155a3f0539"
},
"success": true
}
With CURL
curl -u admin:admin -X POST "localhost:8081/3sixty-admin/api/repo/cmis/checkout?
id=5dba1525-44a6-45ed-a42e-4a155a3f0539" | json_pp
GET FILE BATCH (3.1.1 and older)
GET /api/repo/__BATCH__/file?id[x]=<idx>&conn[x]=<connx>&fname[x]=<fnamex>
Description:
- Retrieves the requested files and returns them as a zip file.
Query Parameters:
Note: [X] EQUALS AN INTEGER
- idx: ID of the file to retrieve
- connx: Connector id of the file
- fnamex: The name to give the file when zipped.
GET /api/repo/__BATCH__/file?
id0=/home/user/simflofy/test.txt&conn0=localFS&fname0=test.txt&id1=629425696136&conn1=box&fname1=Caterpillar%20DEU%20Ajax.pdf
With CURL
curl -u admin:admin -X GET "localhost:8081/3sixty-admin/api/repo/__BATCH__/file?
id0=/home/user/simflofy/test.txt&conn0=localFS&fname0=test.txt&id1=629425696136&conn1=box&fname1=Caterpillar%20DEU%20Ajax.pdf"
GET FILE BATCH (3.1.2+)
POST /api/repo/batch
Description:
- Batches files into a zip file and returns a download link item, whose id can be used with the File Service. Any errors will be reported in the file errors.txt, included with the export.
Body Parameters:
A JSON Object in the following format:
{
"batch":[
{
"fileName": "myfile.txt",
"connId": "fs",
"id": "/Users/user/Documents/myfiles.txt"
},
{
"fileName": "boxfile.txt",
"connId": "box",
"id": "635846060210"
}
]
}
Returns:
{
"success": true,
"results": {
"internal": true,
"expires": 1646759571044,
"fileName": "BATCH_1646757913.zip",
"createdBy": "admin",
"connectorId": "3Sixty Admin",
"docId": "6213e0f0f7558f0cc7458e5e",
"created": 0,
"id": "$2a$04$3vXhjvIAX7lQZY4OSUa8L.tY6263QBw61eye9OYaEiAGp9enNX.lW",
"used": 0
}
}
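The other endpoints in this section include a curl example; a comparable call for this endpoint, assuming the JSON body is sent with a Content-Type of application/json header, might look like:
curl -u admin:admin -X POST -H "Content-Type: application/json" -d '{"batch":[{"fileName":"boxfile.txt","connId":"box","id":"635846060210"}]}' "localhost:8081/3sixty-admin/api/repo/batch" | json_pp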
GET FILE VERSIONS
GET /api/repo/<connectorid>/listversions?id=<id>
Description:
- Returns a JSON Object listing the versions of an object.
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: Source repository id. Differs depending on repository.
GET /api/repo/cmis/listversions?id=5dba1525-44a6-45ed-a42e-4a155a3f0539
Returns:
{
"results": [
{
"docId": "5dba1525-44a6-45ed-a42e-4a155a3f0539;1.0",
"lastModified": 1583870808000,
"name": null,
"versionLabel": "1.0",
"versionLength": 409303,
"versionSeriesId": "512"
},
{
"docId": "5dba1525-44a6-45ed-a42e-4a155a3f0539;2.0",
"lastModified": 1583870809000,
"name": null,
"versionLabel": "2.0",
"versionLength": 409303,
"versionSeriesId": "1024"
}
],
"success": true
}
With CURL
curl -u admin:admin -X GET "localhost:8081/3sixty-admin/api/repo/spo/listversions?
id=5dba1525-44a6-45ed-a42e-4a155a3f0539" | json_pp
DELETE FILE OR FOLDER
Delete Requests:
Description: Deletes a file based on source id.
DELETE /api/repo/<connectorid>/delete?id=<id>&allversions=<allversions>
Description: Deletes a folder based on path.
DELETE /api/repo/<connectorid>/delete?path=<path> //Only for Filesystem
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: The source repository id of the file.
- allversions: Delete all versions of the file, if any exist. Default is true.
- path: The path of the folder to delete.
DELETE /api/repo/box/delete?id=629425696136&allversions=false
Returns:
{
"results": {
"id": "629425696136"
},
"success": true
}
With CURL (remember to encode spaces and quotes)
curl -u admin:admin -X DELETE "localhost:8081/3sixty-admin/api/repo/box/delete?
id=629425696136&allversions=false" | json_pp
GET FOLDER ITEMS
Request:
GET /api/repo/<connectorid>/folderitems?id=<id>
Description:
- Returns a JSON Object listing the folder's contents.
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: Source repository id. Differs depending on repository.
GET /api/repo/box/folderitems?id=384896487495
Returns:
{
"results": {
"/All Files/3Sixty/credentials.csv": {
"BaseType": "document",
"CheckedOut": "false",
"CheckinComment": null,
"ContentLength": "201",
"CreatedBy": "",
"CreatedDate": "1573473832000",
"Description": "",
"Id": "556063296648",
"ItemPath": "/All Files/3Sixty/credentials.csv",
"MimeType": "text/csv",
"ModifiedBy": "",
"ModifiedDate": "1573473832000",
"Name": "credentials.csv",
"3SixtyType": "document",
"SitePath": "",
"Type": "file",
"VersionCount": "-1",
"VersionLabel": null,
"VersionSeriesId": ""
},
"/All Files/3Sixty/test copy 24.txt": {
"BaseType": "document",
"CheckedOut": "false",
"CheckinComment": null,
"ContentLength": "40",
"CreatedBy": "",
"CreatedDate": "1573473831000",
"Description": "",
"Id": "556069315644",
"ItemPath": "/All Files/3Sixty/test copy 24.txt",
"MimeType": "text/plain",
"ModifiedBy": "",
"ModifiedDate": "1573473831000",
"Name": "test copy 24.txt",
"3SixtyType": "document",
"SitePath": "",
"Type": "file",
"VersionCount": "-1",
"VersionLabel": null,
"VersionSeriesId": ""
},
"/All Files/3Sixty/test copy.txt": {
"BaseType": "document",
"CheckedOut": "false",
"CheckinComment": null,
"ContentLength": "40",
"CreatedBy": "",
"CreatedDate": "1573473831000",
"Description": "",
"Id": "556069266751",
"ItemPath": "/All Files/3Sixty/test copy.txt",
"MimeType": "text/plain",
"ModifiedBy": "",
"ModifiedDate": "1573473831000",
"Name": "test copy.txt",
"3SixtyType": "document",
"SitePath": "",
"Type": "file",
"VersionCount": "-1",
"VersionLabel": null,
"VersionSeriesId": ""
}
},
"success": true
}
With CURL
curl -u admin:admin -X GET "localhost:8081/3sixty-admin/api/repo/box/folderitems?id=384896487495" | json_pp
CREATE FOLDER
POST /api/repo/<connectorid>/folder?path=<path>
Description:
- Creates a folder at the specified location.
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- path: Absolute path of the folder to create.
POST /api/repo/box/folder?path=/3Sixty/testfolder
Returns:
{
"results": {
"id": "109374560508"
},
"success": true
}
With CURL
curl -u admin:admin -X POST "localhost:8081/3sixty-admin/api/repo/box/folder?path=/3Sixty/testfolder" | json_pp
GET ITEM PROPERTIES
GET /api/repo/<connectorid>/properties?id=<id>
Description:
- Returns a JSON Object listing the object's properties.
Path Parameters:
- connectorid: The connector id of your content service connector
Query Parameters:
- id: Source repository id. Differs depending on repository.
GET /api/repo/box/properties?id=384896487495
Returns:
{
"results": {
"simflofy_created": {
"DisplayName": "[2019-11-11T07:03:46]",
"QueryName": "Created"
},
"simflofy_last_modified": {
"DisplayName": "[2019-11-15T18:46:31]",
"QueryName": "Last Modified"
},
"simflofy_filename": {
"DisplayName": "[3Sixty]",
"QueryName": "File Name"
},
"simflofy_content_type": {
"DisplayName": "[application/octet-stream]",
"QueryName": "Mime Type"
},
"simflofy_length": {
"DisplayName": "[16249882]",
"QueryName": "Content Length"
}
},
"success": true
}
With CURL
curl -u admin:admin -X GET "localhost:8081/3sixty-admin/api/repo/box/properties?id=384896487495" | json_pp
Discovery
Discovery is Federation Services' user interface for federation services. It is a highly configurable platform which uses Views and Widgets to organise, view, and manage your data in the most effective ways possible.
Federation Configuration
- tsearch.simflofy.services.url= The full url of the Federation Services-admin webapp.
- tsearch.security.allow.expired= Allows for expired ssl certificates when connecting to simflofy admin
- tsearch.simflofy.services.auth.strategy= Represents how to authenticate when performing queries. Note that this has no effect on how users log in. Options are service or passthrough.
- passthrough - The currently logged-in user is used to perform queries. This is the default, recommended setting.
- service - Queries will be performed as a single service user. This is not recommended, as many security features require the querying user's username to filter results.
- tsearch.simflofy.admin.username= service user username
- tsearch.simflofy.admin.password= service user password
- tsearch.logo.file= The location of the logo to use for certain views. Default is tsearch/resources/styles/images/logo-simflofy-search.png
- tsearch.favicon.file= Location of the favicon file. Favicon is the symbol that appears in tabs. Default is /resources/styles/images/favicon.ico
Many of these can be accessed by Discovery Admins at tsearch/config, which is available via the user navigation menu (top right corner) under Configuration.
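As a minimal example (the values shown are illustrative assumptions; the host, port, and webapp name simply match the curl examples elsewhere in this documentation), a Discovery properties block might look like:
tsearch.simflofy.services.url=http://localhost:8080/3sixty-admin
tsearch.security.allow.expired=false
tsearch.simflofy.services.auth.strategy=passthrough
tsearch.logo.file=tsearch/resources/styles/images/logo-simflofy-search.png
tsearch.favicon.file=/resources/styles/images/favicon.ico
The last two lines restate the documented defaults; only override them if you are supplying your own branding.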
Widgets
Widget Instances
A Widget Instance is an instance of a Widget Definition that can be placed in a Content View. Many instances of the same definition can exist in a content view, and each instance has a unique id. Federation Services comes preloaded with a number of common widget instances. If you don't find the type you need in your list, found under Federation > Widget Instances, you will need to create one.
In the Widget Instance list you can search and sort by name, widget ID, label and field.
Widget Instance List
Following is the list of widgets you will see upon installing Discovery with Federation Services, depending on the template used when building a content view.
Top Menu All (Templates)
- My Views
- Create New View
- Bulk Download
- Upload Documents
- Saved Searches
- User Management
Left Navigation (Search and Dashboard Templates)
- Source Repository
- File Name Search
- Content Search
- Content Type
- Date Search
- File Size
- Has Pii
- Declared as Record
Content Action (Search Template)
- Edit Properties
- Edit Metadata
- View Metadata
- Version Information
- Preview
- Download File
- Check In File
- Check Out File
Customisation
To the left of each widget instance there are edit and delete buttons.
To make changes to a widget instance click the edit button. Make any necessary updates and click save. You will need to republish any templates that have the widget in use in order to see any changes reflected in your Discovery environment.
You cannot delete a widget instance that is being used by any content view template. To delete a widget instance you must first remove it from any templates it’s being used on. Once this is done, you can click the delete button and click ok to confirm that you want to delete the instance. This action cannot be reversed.
Create a New Widget Instance
- In Federation Services Admin go to Federation > Widget Instances
- Click the Plus button.
- Select the Widget Definition for the new Widget Instance and click the Continue button
- Name: The name of the Widget Instance that will be displayed in Federation Services Admin when creating Federation Services Views.
- Field: The facet field associated with this Widget Instance. You can put na if this Widget does not require a field, but there must be a value.
- Label: The label that will be displayed in the Federation Services View when this Widget is displayed (only if the Widget shows a label).
Options is a custom field; see below.
Info: Custom Fields are also defined as part of the Widget Definition and act as additional configuration for the widget. Common examples include sort order, chart size, and whether to allow the widget to perform bulk actions. In the above case, the user is defining what facets to search on and how to label them in a drop-down. When developing a widget, all custom fields should have a default value.
Click Save to finish creating your new Widget Instance.
Content Views
Content Views are how users can interact with indexed content in Discovery. They also allow for administrators to control who can access the view, as well as who can use specific widgets within the view. Content views can be found under the Federation menu in Federation Services Admin.
Creating a View
Views can be created under Federation > Content Views. The following information is needed to create a new view.
- Short Name: The key for the view. Only accepts alphanumeric characters, underscores and dashes (URL legal characters). This will be used as part of the url.
Tip: For example: if the shortname is ‘test’ and the template is ‘search’, the url will be
/Federation Services-discovery/view/search/test
- Template: Which view template to use. Federated Search or Reporting Dashboard Templates.
- Display Name (Optional): Enter the name you want displayed for this view in the views drop-down and in the views list in the left navigation menu. If no display name has been entered Discovery will use the short name.
- Select a Search Connection: Select the Content Search Connection the view will use to perform searches. This connector should be set up prior to creating a Content View, as you will not be able to save the view without one.
Configuring Views
View Builder
- Sections: The sections of the view, as defined by the template. The default SearchTemplate has three sections: Left Sidebar, Body, and Top Menu.
- Available Widgets: The widget instances of types allowed for the section.
- Section Widgets: The widgets in the section.
Controls
- +: Add the widget to the section
- x: Remove the widget from the section
- i: Widget information panel. It allows you to update the properties of the instance.
- Double arrow: Widgets can be reordered by dragging up and down. This changes the order in which the widgets are loaded
Configuration
- Short Name: The short name of the view.
- Display Name: Display name of the view.
- Connector: Search connector the view will use
- Search on page load: The view will perform its base query (as configured on the search connection) when the page is first loaded.
- Add Records Management: Only available with the Records Management module. Will be grouped with Records Management views on the federation dashboard.
Security
View Users: Users and groups who can access this view. Leave blank to allow all users to see this view.
This list applies to both the Admin and Discovery apps. The SIMFLOFY_ADMIN role will not bypass this list.
Content View Examples
Discovery can be customised in various ways using the content view builder. Following are some examples of the types of views you can build for different result sets.
Basic Search View: Gives you the search results of the files in the indexed repositories and allows users to filter these results in various ways: by filename, file content, connector, and content type. It also gives users the option to perform actions on the selected files, such as checking files in and out, editing and exporting file properties, previewing and downloading content, uploading files, saving searches, and creating new views based on search results.
Simple Dashboard View: Lets users create a visual representation of their data by adding charts based on indexed file properties. For example, you can create charts for the number of files by connector, file size, or file type. You can filter the results using the panels on the left-hand sidebar. Users can also remove and add counts from the charts by selecting the value labels in the chart legend.
Connector Report Dashboard: Provides a report on the search connector in use. The report gives you a line graph of searches per day as well as search terms over the previous 4 days.
Clustering: Provides a visual representation of the data using clusters that the user can drill into by content tags.
Integrations
Federation Services has a standard process for performing integrations or indexing, as demonstrated by the steps below.
- Create connector instances - This is part of the installation and requires the Admin to create Integration and Discovery connections for use with the source and target systems. If setting up Federation, Content Service and View connections will also be required.
- Discovery - Discovery is the 2nd step in the integration process. Here you will create a discovery schema instance, based on the connector’s configuration. It includes a list of all object types as well as attributes stored in each. Run discovery to get the schema/content type information from both the source and destination systems.
- Create and run jobs - This involves mapping metadata, configuring tasks for any kind of processing that is required (a pre-requisite), and any targeted configuration for the source and destination. There are several ways to run jobs, and depending on the amount of data, the Federation Services Admin will set these up accordingly.
Job Setup and Configurations
The final step in the integration process includes mapping fields from each repository, then creating and running the final job.
Event Configuration
Event Rules let you trigger jobs from content service calls. For example, if a file is updated in a source system, an event can be set up to trigger an index.
Event Job Configurations
Event jobs, in the abstract, are meant to be triggered by some external action. For a basic event job, we use the push event endpoint, which tells Federation Services which document to get and which job to run it with.
Event Configurations enhance Event Jobs and require:
simflofy.event.queue=true
(which is the default setting)
It allows them to be triggered through content service actions. So, to add to that tutorial, we would need to add a scenario in which we:
- Set up an event configuration
- Add it to an Event Job
- Trigger that job through a call to a content service endpoint.
Field Options
- Name: The name of the document.
- Content Length: The size of the document.
- Content Type: The content type (mimetype) of the document.
Important: The previous three field options will currently only work for the following endpoints. This may change in future releases.
/file
/properties
/contentStream
- Content Connector: The connectorId which was used. Can trigger on any endpoint.
- Event Type: The type of action. See the table below for the action for each endpoint.
Creating an Event Job
This tutorial will walk you through setting up an event job that will process one or more documents. After we create the event job, we will push a document to the event queue using an API call. The API call will also start the event job and copy our document from a source folder to a target folder. To get started we will need to create a source folder, target folder and a test document.
Getting Started
- Create a source folder for your document. For example, C:\SourceDocuments.
- Place a test document into this folder.
- Create a target folder. For example, C:\TargetFolder.
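For example, from a Windows Command Prompt (the folder names match the examples above, and sampledoc.pdf stands in for whatever test document you have on hand):
mkdir C:\SourceDocuments
mkdir C:\TargetFolder
copy sampledoc.pdf C:\SourceDocuments\
These commands only prepare the sample folders and file; the rest of the setup happens in the Federation Services Admin tool.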
Creating a Federation Services Event Job
Step 1. Create a File System Connection
You will need a connection to your source and target repositories. Because our source and target repositories are the same, we only need to create one connection.
- Log into the Federation Services Admin tool (http://(your server)/Federation-Services-admin).
- On the left-hand side expand Connections and choose Integration.
- At the top left corner of the screen, click the button Create Integration Connection. Fill out the popup screen and click Save. This creates a connection to the File System that you can use as both a source and a target connection.
After you click Save, another screen shows the connection; click Save again.
Step 2. Create an Event Job
A Federation Services Job is the process of moving or syncing content (including versions, ACLs, and metadata) from one CMS (content management system) to another. Since Federation Services Jobs are specifically engineered for content management systems, moving content and metadata is just point and click. There are many Job Types as well as Job Tasks that can handle anything from data validation and cleansing to duplication detection.
In the Federation Services Admin tool:
- Expand Integration on the left-hand menu and choose List Jobs.
- On the right-hand side, click the button Create New Job.
- Enter the data into the popup screen and click Save. The Repository Connection is the source system and the Output Connection is the target system.
- For Job Type, select Event
- Click Save on the popup, and you are presented with the edit screen for your job.
- The source file path should be set to include the location of the files you want to push. In our case we will set it to C:\SourceDocuments. Any files in SourceDocuments or its subfolders will be able to be pushed.
Info: It is possible to set the File Path to a root drive such as C:\. We do not recommend doing this as it would allow any file on that drive to be pushed.
- Next we need to enter the target folder. Click on the FileSystem Connection-Output tab and enter the target folder where the file(s) will be copied (migrated) to: C:\TargetFolder.
- We will leave the default values for everything else and click Save.
This saves our event job, and we are ready to execute it.
Step 3. Execute the Event Job (Using an API call)
We have to tell the event job which document we want to process. This requires making an API call and passing two parameters, jobId and documentId.
- Find the job ID: To get the jobId, open the job and on the Details page look for the ID field. Edit or view the job we just created to see the job id.
- Find the document ID: To get the documentId, use the full path of the document, i.e. C:/SourceDocuments/sampledoc.pdf. In the SourceDocuments folder we will be copying sampledoc.pdf. Therefore, the documentId will be the full path: C:/SourceDocuments/sampledoc.pdf. Note the forward slashes.
- Construct an API url and call it.
Construct the API URL:
We will now call the API to push the document into the queue for processing.
URL format:
http://{HOST}:{PORT}/3sixty-admin/api/event/service/pushevent?jobId=1631569493226&documentId=C:/SourceDocuments/sampledoc.pdf
Example curl command:
curl -u username:password "http://localhost:8080/3sixty-admin/api/event/service/pushevent?jobId=1631569493226&documentId=C:/SourceDocuments/sampledoc.pdf"
Once the API is called, the document will be placed in a queue and the event job we created above will be started.
The source file will be copied to the destination folder, TargetFolder. (Note that the full path of the source document is copied to the target folder.)
You can continue to call the API with different document IDs, and they will be processed as soon as they are inserted into the queue.
Also, note that the event job will continue running until you kill the job.
To kill a job, on the left-hand menu under Admin, select Active Jobs, then select the trash can icon to kill the job.
Event Types and Endpoints
Event Type | Endpoint | Method |
---|---|---|
FILE_DOWNLOAD | /file | GET |
FILE_DOWNLOAD | /inlinefile | GET |
CREATE_DOCUMENT | /file | POST |
UPDATE_DOCUMENT | /update | PUT |
UPDATE_CONTENT | /updateContent | PUT |
UPDATE_PROPERTIES | /updateProperties | PUT |
CREATE_FOLDER | /folder | POST |
DELETE_OBJECT | /delete | DELETE |
STREAM_CONTENT | /contentStream | HEAD |
STREAM_CONTENT | /contentStream | GET |
FIND_PROPERTES | /properties | GET |
DELETE_ACL | /acls | DELETE |
GET_ACL | /acls | GET |
EDIT_ACL | /acls | POST |
VERSION_DOWNLOAD | /version/content | GET |
FIND_VERSION_PROPERTIES | /version/properties | GET |
CREATE_VERSION | /version | PUT |
REVERT_VERSION | /version/revert | POST |
DELETE_VERSION | /version | DELETE |
LIST_VERSIONS | /listversions | GET |
Job Mappings
Federation Services Mapping gives you the ability to map your content types and metadata from one system to another. Before you start, be sure you are familiar with creating jobs and discovery. Both are integral to mapping.
Overview
In this tutorial we will explain how mapping works in Federation Services: where you can create your type, aspect, field, and calculated field mappings; how to update mappings en masse using Mapping Templates; how to export and import mappings; and complex scenarios such as mapping type hierarchies.
Creating Mappings
Federation Services Mapping gives you the ability to map your content types and metadata from one system to another. There are two places in Federation Services where you can set up mappings:
Reusable Mappings
From the Integration menu on the left-hand side of the Admin Page select Job Mappings. The Job Mappings menu item lets you create reusable mappings that you can then add to a job. This is especially useful when you have a lot of jobs with the same mappings. This way, you can edit one Job Mapping, and it will then be updated for all jobs that use that mapping. You can also download the JSON representation of the mapping to use for mappings import on another system.
Mappings Tab
The Mappings tab in a job is great when you need a one-off mapping that other jobs won’t use or if your mapping only applies to one job.
Mapping Types
There are several things you can map as seen below:
- Field Mappings allow you to map metadata fields from the source to the target. The names and data types can change from the source to the target by setting the Target type.
- Aspect Mappings allow you to map aspects for output systems that support them. Some systems, like OpenText Content Management System, call these Categories. Aspects represent a collection of metadata fields. You can typically have more than one aspect on a document.
- Type Mappings allow you to map the source type to a target type. These do not have to be the same name.
- Calculated Fields allow you to set constant values by surrounding them in single quotes (‘) or using the Federation Services Expressions Language.
Info: Calculated fields are processed before tasks and field/aspect mappings. If you wish to further use the product of a calculated field, you will need to create a separate mapping that uses the calculated field's output as the source.
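For example, a calculated field can be a hard-coded constant or a small expression (the values below are hypothetical):
'ACME Legal'
concat('#{rd.filename}', '-archived')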
Discovery Within Mappings
Mappings make heavy use of discovery information. Discovery allows you to pick your types and fields from drop-down lists instead of typing them out. It also knows the data types of each field. This allows for a much faster and less error-prone method of mapping types, aspects, and fields.
Audit Mappings
Audit mappings allow you to add field names and values to the Federation Services audit. These can later be used to create reports for analytics and/or validation. By default, Federation Services selects normal file system attributes to audit, including file id, file name, created/modified dates, and author. A red eye icon means the field is not being audited; a green eye icon means it is. Clicking the eye toggles whether the field is audited.
Calculated Fields
Calculated fields allow you to hard code values or use functions. If you find the function calls getting too complex, we strongly suggest using a Job Task instead. Calculated fields are designed to be simple; expressions with multiple levels of nested function calls are prone to error, and tasks are designed for complex logic.
Importing and Exporting Mappings
To export a mapping, go to the Job Mapping page and select the export icon with the down arrow on it. Click Export Mappings to save the mappings as a JSON file. To import mappings, go to the Mappings tab in the integration job and click the green Import button.
You can import multiple mappings into the same job: click the Import button and paste the JSON from the export into the text area provided.
Below is an example of the JSON format for importing mappings under the Mappings tab in the job configuration.
{"mappings":[
{"sourceType":"TEXT","watch":false,"mappingType":"TYPE_MAPPING","targetType":"TEXT","source":"Email","position":0,"target":"duotAF9"}
,
{"sourceType":"TEXT","watch":false,"mappingType":"ASPECT_MAPPING","targetType":"TEXT","source":"Email","position":1,"target":"cAF739"}
,
{"sourceType":"TEXT","watch":false,"mappingType":"FIELD_MAPPING","targetType":"TEXT","source":"Email.Subject","position":2,"target":"tAF723"}
,
{"sourceType":"TEXT","watch":false,"mappingType":"FIELD_MAPPING","targetType":"TEXT","source":"Email.To","position":3,"target":"tAF722"}
,
{"sourceType":"TEXT","watch":false,"mappingType":"FIELD_MAPPING","targetType":"CHECKBOX","source":"hasPii","position":15,"target":"tAF701"}
]}
Mapping Groups
Mapping groups allow you to group integration job mappings into logical groups that you can then manage together.
Creating Mapping Groups
Creating a mapping group is easy. You just give it a name, click the Add Mapping Group button, and then select mappings from the list. You can use the search box to filter the list.
Using Mapping Groups
In the job’s Mappings tab your mapping groups will show up in the Select Additional Mappings drop-down. Read Only versions of the mappings will appear at the bottom of the page, so you won’t have to leave to confirm the content of each mapping in the group.
Properties File Discovery
If you ever have the situation where Discovery can’t automatically find your fields, but you know they will be there, then you can create your own mappings with a properties file.
In some systems, like email headers or iManage, fields can be dynamic and so Discovery can’t find them. To add your own fields, do the following:
- Create a properties file that will contain your fields. An example can be found in your installation:
tomcat/webapps/3sixty-admin/WEB-INF/classes/propertiesFileDiscoverySample.properties
- Edit this file, or the one you created if it's different:
# Field Types
# CHECKBOX,DATETIME,TEXT,TEXTAREA,INTEGER,LONG,DECIMAL,DOUBLE,URI,READONLY,BINARY,ANY
#
# Format
# type.field=field type
# If the key has a space in it, replace the space with the unicode value \u0020:
# Historical\u0020Documents.field=CHECKBOX
document.name=TEXT
document.description=TEXTAREA
document.createddate=DATETIME
folder.name=TEXT
folder.isversion=CHECKBOX
- Create a properties file discovery instance and point it to your properties file. Run it and your types and fields should show up, and you can then use them in job mappings.
Federation Services Expression Language
Federation Services uses a version of the JEval project to handle expressions. Expressions can be used in many places in Federation Services. Specifically they are used for calculated fields on the mapping page, but tasks and connectors can use them in their configuration pages too. The filter task is another good example.
An expression allows you to create values based on some formula or set of function calls. This can be used in a host of use cases to solve many problems including:
- Formatting data
- Value based on result of a decision
- Setting default values
- Creating data from other data
Expression Basics
An expression is just a series of Strings, Numbers, Function Calls, and Operators put together by the administrator resulting in a value.
Strings or Numbers or Booleans
Strings are surrounded with single quotes (' '). If something is not surrounded with quotes, it is treated as a number, so an unquoted string will cause an error.
- 'Mark' is a string.
- Mark is treated as a number and will cause an error.
- '0' is a string.
- 0 is the number 0.
Booleans result in 1.0 or 0.0, where 1.0 is true and 0.0 is false. For example:
- 'mark' == 'mark' results in 1.0 (true).
- 'mark' == 'nathan' results in 0.0 (false).
Variables
Variables are either part of the Repository Document, the Version History, or Fields from a Repository. Info: You access a variable using the following format: '#{rd.path}'
Tasks search for #{ when detecting calculated fields. The first portion of the variable identifies where to find the data:
- Repository Document == rd
- Version History == version
- Fields == field (or no prefix at all)
Examples:
#{rd.mimetype}
#{version.label}
Where document is the type:
#{document.myfield}, which is exactly the same as #{field.document.myfield}
See below for a full reference of all rd and version variables available.
Function Calls
You call functions without quotes, for example:
now()
Most function names are camel case: the first letter is lower case and the first letter of each subsequent word is upper case, for example toLowerCase(), startsWith(), etc.
String Functions
Full list of String functions with example:
startsWith('test.pdf', 'test', 0)
substring('test.pdf', 3, length('test.pdf'))
toUpperCase('test.pdf')
indexOf('test.pdf', '.', 0)
length('test.pdf')
replace('test.pdf', 'p', 'x')
concat('test.pdf', 'test.xml')
equals('test.pdf', 'test.PDF')
equalsIgnoreCase('test.pdf', 'test.PDF')
compareToIgnoreCase('test.pdf', 'test.xml')
compareTo('test.pdf', 'test.xml')
charAt('test.pdf', 1)
endsWith('test.pdf', '.xml')
toLowerCase('Hello World!')
trim('abc ') + 'd'
lastIndexOf('abcabcabc', 'abc', 8)
eval(1 + 2)
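These functions can be nested and combined. As a sketch (assuming a file name such as report.pdf, and that arguments can themselves be expressions), the following extracts the file extension from the file name:
substring('#{rd.filename}', indexOf('#{rd.filename}', '.', 0) + 1, length('#{rd.filename}'))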
Math Functions
Full list of math functions with example:
asin(1)
atan(2.2)
atan2(2.2, 2.3)
ceil(2.2)
cos(2.1)
exp(2.2)
floor(2.2)
IEEEremainder(2, 6)
log(2)
max(2.2, 3.5)
min(2.2, 6.6)
pow(2, 5)
random(6)
rint(2)
round(2)
sin(2)
sqrt(9)
tan(2)
toDegrees(45)
toRadians(44)
abs(-1)
Custom Functions
If the first argument is blank, use the second argument.
isblank('#{rd.path}', '/my/default/path')
Returns the date at the time the job is run
now()
If the first date string is not a valid date, it will use the second date string as the default. This can be now(), another hard coded String, or some other function or set of functions.
toDate('Date String', 'Default Date')
Creating your own custom functions
Operators
The normal Boolean, number, and string operators for Java work with Federation Services Expressions:
+
-
&&
!
||
/
==
>
<
>=
<=
%
*
!=
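Operators and functions can be combined into a single Boolean expression, for example in a filter. The sketch below assumes the FileLength variable resolves to a number when left unquoted:
endsWith('#{rd.filename}', '.pdf') && #{rd.FileLength} > 0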
Repository Document Variables
All variables with Federation Services Variable Resolvers are case-insensitive.
Repository Document
CreatedDate
ModifiedDate
FileName
FileLength
ID
MimeType
ObjectType
Path
3SixtySourceRepositoryID
Version
totalVersions
label
seriesId
isLatest
isMajor
listId
Examples
#{rd.CreatedDate}
#{rd.ModifiedDate}
#{rd.FileName}
#{rd.FileLength}
#{rd.ID}
#{rd.MimeType}
#{rd.ObjectType}
#{rd.Path}
#{rd.3SixtySourceRepositoryID}
#{version.totalVersions}
#{version.label}
#{version.seriesId}
#{version.isLatest}
#{version.isMajor}
#{version.listId}
Custom fields need to be prepended with field:
#{field.myObject.myCustomField}
You can also concatenate variables and strings, for example:
#{rd.filename} + '_' + #{version.label}
#{mytype.claimid} + '_' + #{mypolicy.policynumber}
Filters expect a Boolean result (1.0 or 0.0), so you can do things like:
'#{rd.mimeType}' == 'text/html'
Nested Functions
You can also nest functions. A good example is toDate:
toDate('#{rd.createddate}', now())
This results in rd.createddate being formatted into UTC format; if for some reason Federation Services detects an invalid date, it will use the result of now() instead.
Migration Manager Variables
- Variable name: job.lastrundate
- Description: This variable is substituted with the last successful job run date. If the job is run for the first time, it is substituted with 1st January, 1990.
- Example: SELECT * FROM mydatabase.events where created_at > CONVERT(DATE_FORMAT('#{job.lastrundate}', '%Y%m%d'), char);
Schedule Jobs
Federation Services allows users to schedule integration jobs and job groups to run at a time that may be more effective for the servers running them. Jobs can be scheduled to run at set times as well as at set intervals (minutes, hours, days, weeks, etc.).
Creating a Scheduled Job
To create a scheduled job:
- Select Schedule Jobs from the navigation menu under the Integration section.
- Select the Create New Job Schedule button at the top of the page.
- Give your scheduled job a name.
- Fill out the Configuration Fields (descriptions below).
- Click the Update Configuration button once you are done configuring the scheduled job for it to start running at the set time and interval.
Note: If the schedule on a running job changes, that job will be aborted and the new schedule will take effect immediately.
Configuration Fields
- Schedule Name: Name of the schedule.
- Run Schedule: Sets the schedule to run.
- Run Jobs in Parallel: If using a Job Group, run all the jobs simultaneously.
- Schedule Begin Date: The date for this schedule to begin running. The date range still needs to be set on jobs.
- If the date is before the current date and time, the schedule will run immediately if Run Schedule is set to On.
- Schedule End Date: The date for this schedule to stop running. The date range still needs to be set on the jobs.
- If the date is before the current date and time, the schedule will inform you and set Run Schedule to Off.
- Interval Between Runs: How often the schedule will run. An interval of fewer than 3 minutes is not recommended.
- Job/JobGroup: Sets whether to run a Job or a Job Group. If Job Group is selected, you will be able to choose the run order for jobs in the group.
Job Status
At the top of the scheduled job you can see the current status.
- Current Status: Unsaved until the Update Configuration button is pressed; then OFF, WAITING, or RUNNING.
- Time until next run: As it reads. If this runs to zero, it will not refresh unless the schedule is closed and reopened. Displays as NA if performing a single run (interval set to 0) or if the job is currently running.
List Jobs
A Federation Services Job is the process of moving or syncing content (including versions, ACLs, and metadata) from one CMS (content management system) to another. Since Federation Services Jobs are specifically engineered for content management systems, moving content and metadata is just point and click. There are many Job Types as well as Job Tasks that can handle anything from data validation and cleansing to duplication detection.
The List jobs page shows the jobs created. The available Job Types are:
- Sync
- Simple Migration
- Incremental Migration
- Event
- Polling
Job Types
There are several Job Types available depending on the use case.
- Simple Migration: Unlike Sync Jobs, deletions are not processed and content state is not recorded locally other than normal Job Auditing. Simple Migration uses manually entered timestamps whereas Incremental Migration will increment timestamps for each job run.
- Incremental Migration: Same as Simple Migration, except that at the start of each job run the query timestamps will be incremented automatically to read documents that were added or modified since the last run started.
- Manage In Place: Creates a shortcut to the file, allowing users to access files in source repositories from Alfresco or Objective ECM without having to move them, so all content can be managed in place. This type of migration is only available for the Objective ECM Connector and Alfresco Connector at this time.
- Polling: A simple migration that runs continuously. The job processes at a specified interval and then sleeps until the next processing time. While it sleeps, the status of this job will be Waiting.
- Event: This type of job acts as a listener for certain Content Service events, in order to run individual documents through a job without starting and stopping the job. They can be triggered through a push event or by an event configuration. While not running, these jobs will show the status Ready.
Job Groups
To access Job Groups, go to the group section in the left sidebar and select Job Groups. From here you can view and create Job Groups. When creating a job, users are given an option to either create a new Job group or assign the job to an existing job group.
- To create a new job group, check the Create a new job group for the job check box and enter a name for the new job group. If no name is entered, Federation Services will default to the name of the job as the Job Group name.
- To add the job to an existing group, select the group from the Job Group drop-down. This will add the job to that group once created.
Creating a Job Group
- Click Create New Job Group.
- Fill in the group’s name.
- Select which jobs you want included in the group. You can use the search box to filter the available jobs list.
- Click Save.
Job Configuration Fields
Standard Options
- ID: The unique identity of this Job. If running jobs using the RESTful interface or command line, this is the Job ID to enter.
- Name: Descriptive Name of the Job that will be displayed throughout the UI.
- Content Service Connector: If using a Content Service Connector, set it here, and it will be appended to every record processed. This field is required for Federation Integration jobs.
- Repository Connection: The Repository Connection used for this Job.
- Output Connection: The Output Connection used for this Job.
- Type: The type of Federation Services Job.
- Job Tags: Another option for separating jobs into groups. If any jobs have tags, they will appear as a filtering option when listing jobs. Tags are case sensitive.
- Include documents modified after this date and time: Referred to as “Start Time”, this time will be used to check the last modified date of files before processing them.
- Include documents modified before this date and time: Referred to as “End Time”, this time will be used to check the last modified date of files before processing them.
START AND END TIME: These values are only set by users for Simple Migrations. For Incremental Migrations, they are calculated based on when the job was last run. Event and Polling jobs don't use them at all.
Notifications
Notifications will only function if email is enabled. This allows the user to set conditions under which the listed email addresses will receive status reports when a job completes.
Auditing
- Audit: Where to write audit data to. Note that Job Restart and Document Search will only work if Mongo is selected.
- Mongo: Audit to the Mongo Database or SQL Database if running in SQL mode.
- Log4j: Audit to the 3sixty-admin.log
- None: Auditing is turned off.
- Record Audit When: The audit level or what to audit.
- All (Read, Written, Skipped, Deleted and Failed): Full Auditing
- Failures Only: Audit only those records that fail during a job run.
- Failed and Deleted: Audit a record that failed during the job run or was deleted from a Sync Job.
- Failed and Skipped: Audit a record that failed during the job run or was skipped.
- Written, Failed and Deleted: Audit a record that failed during the Job run or was deleted from a Sync Job or written by the Output Connection.
- Written, Failed, Skipped and Deleted : Audit a record that failed during the Job run or was skipped or deleted from a sync job or written by the Output Connection.
Advanced Options
- Repository Connection Thread Count: The number of individual workers which will process and queue documents. Implementation may vary in certain systems.
- Output Connection Thread Count: The number of individual workers which will process and post documents. Implementation may vary in certain systems.
- Batch Size: If greater than 0, documents will be assigned a batch ID. Batch IDs are a combination of the job id and the job run id (the timestamp of when the job was started), plus the batch number. The batch ID will also be set as the highest parent folder for the document. (For example, /home/simflofy will become {batchId}/home/simflofy.) Some connectors have their own batch configuration, which is compatible with this field.
- Max Queue Size: Limits the number of documents that can be queued at once, making repository workers wait. This can slow down output speeds, which may be useful if your output repository can be throttled.
- Max Errors Allowed: Maximum errors before the job stops. Leave 0 for unlimited errors.
- Process Relationships: Check this to tell the Target Connector to process Document Relationships. This feature only works for the CMIS Connector.
- Include Hash: Check this to tell the Repository Connector to create a hash of the content if possible.
- Repository and Output Time Zone: These values will be used to adjust date fields (Created, Modified, and Mapped) to the appropriate timezone for their destination.
- Chain Job: Select a job to run after this one completes.
- Chain Job Wait Interval: Amount of time to wait after job completes before starting next job in format: 5s 5m 5h 5d
- History Retention: How long to keep the Job Run History for this job before the Job History Clean up Service clears the content.
Job Tasks
Job tasks provide a processing pipeline for documents and metadata. This means doing things like:
- Filtering content out or in based on rules, such as metadata values.
- Cleaning up file names or file paths.
- Re-Parenting based on rules.
- Extracting metadata from paths.
- Adding metadata from third parties such as Databases, REST APIs, OCR Engines, etc.
- Transforming documents from one mime type to another, such as Word to PDF.
- Transforming metadata.
- PII - Personally Identifiable Information - detection.
- De-Duplication.
Adding Tasks to Jobs
There are over 30 tasks available out of the box and Federation Services is adding more all the time. This is the most popular extension point in Federation Services.
You can choose and configure job tasks on a per-job basis from the Tasks tab in any given Job.
You can re-order the pipeline by dragging and dropping the tasks. You can delete tasks or add new tasks.
The default recommended thread count is 1.
These tasks run in order, meaning if something is changed with the documents or metadata in one task, then the following tasks will see those changes.
Task Groups
Users can create task groups for tasks they will use often so they do not have to constantly recreate them for every job. When you select "Yes" under Use a Task Group, a second box will pop up.
Run Group Task Options:
Select Before or After to tell Federation Services when to run the tasks in the group. Selecting Before will run the group tasks prior to any additional tasks listed in the job, while selecting After will run them after the others.
In the drop down provided, select the Task group you would like to run with this job. Once selected you will see a list of all the tasks in the group. Click on the eye icon to view the details of each task.
Run and Monitor Jobs
The Run and Monitor Jobs page is where you can monitor the progress of your integration jobs. From this page you can see the status of your jobs, how many files have been migrated, when the job was last run, and any errors that might have occurred during the integration.
Note: When migrating documents with multiple versions, if a version in that series fails for any reason, the subsequent versions will not be processed. They will instead be ‘Skipped’ and previously uploaded versions will be cleaned up/deleted.
How to Run and Monitor Jobs
Under the Integration section in the navigation menu, click Run and Monitor Jobs. On this screen, jobs can be stopped, started, and rerun on errors.
Jobs can be filtered by Job Groups if you wish to focus on a specific set of integration jobs. See the following article for instructions on how to set the Job Groups up and use the drop-down to filter the job list by the group selected.
Set Refresh times for job runs. The page will refresh in the time period selected to show the updated job status.
Use the Search field to filter the job list for specific jobs.
Use the Show entries drop-down to select how many jobs get listed on each page.
The list of jobs displayed will tell the user the job status, how long the job took to run, and how many files have been read, written, removed, or skipped, as well as how many had errors.
Reviewing Job Errors
If any errors occurred during the job run, you can click on the error count to get to the error report page. From the error report you will be able to see the cause of the errors and download them as a csv if needed. For additional debugging instructions see the following article: Federation Services Logging Instructions.
Resuming and Restarting Jobs
- Resuming a job: Usable when a job is aborted. Reruns the job, but a special task is added that will skip documents which had errors in the previous run.
- Run Errors: Available when a job fails or is aborted. Only the errored documents will be processed.
Advanced Configurations
PII Detection and Reporting
While doing a document migration or synchronisation it may be prudent to check the document, or its metadata, for PII. You can also crawl your content simply for PII by using our reporting output connector with any repository connector that we support.
The PII Scan feature allows the user to select a configured repository connector and job, then scan the repository for Personally Identifiable Information (PII).
This can include phone numbers, social security numbers, proper names, and addresses.
A temporary job is created using the repository configuration of the selected job.
Important: File size limit is 95MB
Running the PII Scan
Under Analytics select PII Scan and complete the PII Scan page:
- Select the repository connection you want to scan
- Select the Job to scan for PII
- Optional
- Start Date: Select a date to start searching for PII. If left blank the Job start time will be used
- Exclude Binaries
- Define list of properties. The default is all properties.
- Notify of completion by email.
- Select the Scan for PII button
Viewing PII Data
Once the PII scan runs, users can view the PII report in the Discovery UI by either:
- Clicking on the link provided in the Scan complete notification box.
- Selecting PII Data Viewer in the left navigation pane on the home page of the Discovery UI.
Document Duplication Detection and Reporting
There may be a need to identify duplicate documents in your enterprise and Federation Services allows you to identify these duplicate documents in a variety of ways.
One way to identify duplicate documents is by using the Duplication Check Job Task which allows you to log, skip or fail documents that are duplicates. This works well for large scale integrations when combining a number of legacy source systems into one new enterprise content management system.
Another way to identify duplicate documents would be by leveraging Federation Services’s Reporting Output Connector. Using the reporting output connector you can read content in from any source system Federation Services supports and report on the content that is found. One of these reports is a hash of each document seen. Using this hash plus MongoDB’s aggregation framework we can generate a CSV or JSON reports of all duplicate records. You can obtain the hash of a document by including the Hash Generator Job Task in your Job Tasks.
After crawling your source system and outputting to the Federation Services Reporting Connector, you can run the following commands against MongoDB. Start by typing mongo in your terminal.
If the crawl found a large number of documents (more than 100,000), you may need to add a docHash index:
db.tsRecordProcessed.createIndex( { docHash: 1 } )
Next we group by docHash and output to a new collection named duplicates (you can name the new output collection anything you like):
db.tsRecordProcessed.aggregate([{$group:{_id:"$docHash",docs:{$push:"$doc_id"},doc_names:{$push:"$doc_name"}}},{$project:{docs:1,doc_names:1,numDocs:{$size:"$docs"}}},{$match:{numDocs:{$gt:1}}},{$out:"duplicates"}])
You can now export the duplicates collection to CSV or JSON using mongoexport:
mongoexport --db simflofy --collection duplicates --fields _id,docs,doc_names
--username user --password "pass" --type=csv --out duplicates.csv
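With the aggregation above, each document in the duplicates collection groups the matching files under their shared hash and will look roughly like this (hypothetical values):
{ "_id" : "9e107d9d372bb6826bd81d3542a419d6", "docs" : [ "doc-101", "doc-245" ], "doc_names" : [ "contract.pdf", "contract-copy.pdf" ], "numDocs" : 2 }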
Stuck Jobs
If a job is running and the Abort does not stop the job in an appropriate amount of time, the job can be killed manually through an Admin page. See Active Jobs for more information.
Federation Services Job Flow
Task List
Tasks are run for each document that has been read. A number of tasks can be set to remove documents from the queue if they don’t fit certain criteria. Documents removed this way will not be processed by any following tasks. It should be noted that Calculated Fields are calculated before tasks, while Fields Mappings are handled after tasks.
Commonly Used
Duplication Detection
This task checks the chosen field to find duplicate documents during the job run. It takes the selected actions against the document if a duplicate one exists. To see how each repository handles duplicates and versioning check the individual connector page.
When the duplication check is run multiple times, the original file will not be marked as duplicate.
Configuration
Field to Compare: The field whose value will be used to check for duplicates. If this value is found in any other document it will be considered a duplicate. The default is the File Content Hash.
- File Content Hash
- Document Type
- Document Source Id
- Document URI
- Version ID
- Version Series Id
If you wish to compare file hashes (a sort of fingerprint for a document), you will need to precede this task with a Hash Value Generator Task.
Duplication Check Scope:
- Run Job: Check the documents associated with this job run only
- Job: Check all documents ever ran for that job
- Enterprise: Check all documents ever processed through 3Sixty
Action: What to do if a document is found.
- Audit and continue
- Skip the document
- Fail the job
Tagging Duplicate Documents
Metadata can be added through mapping to tag documents discovered as duplicates when the Action field is set to Audit and Continue.
The fields that can be added are:
- isDuplicate: true (if duplicate found) or false (if not). Important: When using the Duplication Detection task in a job, the user must make sure that if they are mapping the "isDuplicate" field, they set the Target type to String and not Boolean; otherwise they will receive an error that text cannot be changed to boolean. If this error is received, the user has to drop the index and run the job again.
- baseParentID: doc ID of the original document
- duplicationParentID: comma separated list of doc IDs of the documents that it found duplicates against - blank if no duplicate detected
- duplicationScope: the scope used for the duplication check; blank if no duplicate detected
- duplicationCriteria: which fields the duplicate was considered against; depends on what was selected below - blank if no duplicate detected
Filename Cleanse
This task uses regex (Regular Expressions) to alter filenames.
Common use cases are clearing unwanted characters, such as whitespace, or non-alphanumeric characters.
- Regex to Match: The regex pattern to search for in the filename.
The default value matches any character that isn’t a letter, number, space or period an unlimited number of times.
- Replacement: Replacement for matches.
EXAMPLE - ALPHANUMERICS
The pattern for alphanumeric characters is [a-zA-Z0-9], or \w if you wish to include underscores.
To select non-alphanumeric characters, add a caret (^) inside the brackets at the start of the class, so [^a-zA-Z0-9]. Inside a character class, the caret simply translates to "Not", so it negates whatever follows it.
EXAMPLE - CLEARING UNWANTED SPACES
The pattern \s is regex shorthand for whitespace, which covers spaces, tabs, line breaks, and so on. Add a plus (+) or an asterisk (*) after the pattern to match whole runs of whitespace at once.
Add this as your Regex to Match and set the Replacement to '' to strip the unwanted whitespace.
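For instance, a pattern consistent with the default described above (hypothetical configuration values):
Regex to Match: [^a-zA-Z0-9 .]+
Replacement: (leave empty)
With this configuration, a file named report#1 (final)!.pdf would become report1 final.pdf.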
Filename Extraction
Extracts the file name from another field using regex (Regular Expressions). It will set the file name to the value that matches the regex.
- Regex to match: The regex to match and convert to and set as the file name.
- Data Field: The repository document field to match the regex on.
EXAMPLE - ALPHANUMERICS
The pattern for alphanumeric characters is [a-zA-Z0-9], or \w if you wish to include underscores.
To match non-alphanumeric characters, add a caret (^) inside the brackets at the start of the class, so [^a-zA-Z0-9]. Inside a character class, the caret simply translates to "Not", so it negates whatever follows it.
EXAMPLE - MATCHING WHITESPACE
The pattern \s is regex shorthand for whitespace, which covers spaces, tabs, line breaks, and so on. Add a plus (+) or an asterisk (*) after the pattern to match whole runs of whitespace at once.
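For instance, to use an eight-digit number embedded in another property as the file name (hypothetical field and pattern):
Regex to match: [0-9]{8}
Data Field: document.description
If document.description contains Invoice 20230415 - signed copy, the file name would be set to 20230415.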
Folder Path Cleanse
This task uses Regular Expressions (regex) to alter folder names. It is functionally identical to the Filename Cleanse task, except that it changes the document's parent path.
Regex to Match: The regex pattern to search for in the Folder Name.
The default value matches any character that isn’t a letter, number, space or period an unlimited number of times.
Replacement: Replacement for matches.
Obsolete Detection
The obsolete task can be used to identify obsolete documents when reading them from the source repository. It can be used in the federation view to identify documents that are obsolete based on the definition set in the task for the job, either by adding metadata or by skipping/processing the obsolete documents. You define what obsolete means for your organisation: based on the date created or date updated, content older than the chosen timeframe is considered obsolete.
Note: The Obsolete Detection tasks uses the system time of the 3Sixty server when calculating date/time scenarios.
Fields
- Before when will the files be considered obsolete
- Before
- Custom Date
- Action
Metadata
- isObsolete - Yes/No
- obsoleteField - Date Created or Date Updated
- obsoleteBefore - the cutoff date; content from before this date is considered obsolete
Override Filename
This task uses the 3Sixty Expression Language to override the file name of each document. The functionality of this task is identical to the Override Folder Path task.
Configuration
- Pattern: List the fields you want to use to rename the file
- Deep Change: If the file has versions, update their names as well.
Override Folder Path
Add the Override Folder Path task from the drop-down on your job.
A sample pattern has been provided for you by default. You can also leverage the 3Sixty expression language when modifying your path. More information on the Expression Language can be found here. Click the Done button when you have finished modifying your job task.
The example pattern is:
'/' + '#{rd.filename}' + '/simflofy'
Using the Expression Language, you can see that rd. fields are internal 3Sixty fields.
However, you may want to use metadata from the source system to generate your path.
Use field in place of rd to accomplish this, or leave off the prefix entirely.
The best way to put your path together is to know what fields are available and what the field values look like. We suggest running the BFS output with no mappings and Include Un-Mapped Properties set to True. This will generate an xml file such as:
<properties>
<entry key="document.name">Alfresco Ingestion.pptx</entry>
<entry key="type">document</entry>
<entry key="folderpath">test3Sixty_Partners</entry>
<entry key="separator">,</entry>
<entry key="document.Culture">en-US</entry>
<entry key="document.CustomerId">123</entry>
<entry key="document.Category">legal</entry>
<entry key="document.lastindex">23</entry>
</properties>
Now let’s say we want the actual folder path to be a combination of folderpath + Culture + Category + Customer ID. To do that we just reference each field like:
'/' + '#{rd.path}' + '/' + '#{document.Culture}'+'/' +
'#{document.Category}'+'/' + '#{document.CustomerId}'
PII Detection
The PII Detection Job Task uses regular expressions to detect PII in any document or metadata passing through 3Sixty. The regular expressions are stored in the form of a .properties file.
Caution: File size limit is 95MB.
PII FLAG
This task will always add the boolean field hasPii for the purposes of mapping and analysis.
DEFAULT FILE LOCATION
The default file is located at 3sixty-admin/WEB-INF/classes/simflofy-pii-detection.properties:
- Field To Mark: The output metadata property to store detected PII. The value of this field will be a map, for example:
{
"PhoneNumber": 20,
"Names": 200
}
- Break up PII data into individuals fields: Instead of adding the PII as a map, 3Sixty will break it up as individual fields for easier mapping/processing.
- Prefix for PII fields: If breaking up PII data, the prefix to use for each field. If left blank ‘pii’ will be used.
- Fields To Check: Source properties and/or document to check for PII. Use ALL_PROPS to check all properties, BINARY to check the document (extracted via Tika) or individual property names.
In this case, the above fields will come across as
pii.phonenumber and pii.names
Rename File On Duplicate File Path
Functions similarly to the Duplication Check task, except if it finds a duplicate, it will rename the file using the supplied pattern.
Tika Text Extractor
Apache Tika is an open-source tool used to extract text from documents. 3Sixty most commonly uses it to extract text during indexing for federated search.
For this feature to work on larger files, a memory pool setting of 4GB is required, with 8GB recommended. This can be updated in the Java tab of your Apache Tomcat Properties window.
Caution: File size limit is 95MB.
- Tika Content Field: The field where the task will put the extracted content.
- Max Content Length (B): Set the max content length which is checked before processing. The job will not process documents over this size. Set to 0 to process documents of any length.
- File Extensions to Extract: Comma delimited list of file extensions to process, or leave blank to process all. The extensions are checked at the same time as content length.
- Fail Document on Extraction Error: Fail the document if there is an extraction error during processing.
- Remove Content After Extraction: Remove the content from the documents. This will happen even if the document exceeds the maximum length.
- Stage on Filesystem: Stage content on the filesystem for extracting text or set to false to use in memory.
Trivial Detection
The trivial task can be used to identify content that is trivial in nature (i.e. it holds no importance from a corporate knowledge perspective). The definition of trivial depends on your organisation. You can filter on documents of certain sizes, or filter out files with certain extensions or document types. For example, dmg and exe files are installers and may hold no corporate importance, so you may want to skip such content from being registered as a record; you can do so by adding those extensions to the filtered list.
Configuration
You can add several filters to include documents that meet all the stated criteria.
- Filter on document’ below a specific size
- Filter files below size (bytes)
- Filter on document’s above a specified size
- Filter files above size (bytes)
- Filter on file extension
- Filter on document type
Once the filters are selected you can then determine what action should be taken with the files that meet the selected criteria.
- Audit and Continue
- Skip the files
- Fail the job
Metadata
The trivial detection task also includes the following default set of metadata:
- isTrivial - Yes/No
- ignoreSizeBelow - content ignored below size in bytes
- ignoreSizeAbove - content ignored above size in bytes
- ignoreExtensions - comma separated list of extensions that were listed in the criteria
- ignoreDocTypes - comma separated list of doc types that were listed in the criteria
General
Basic JDBC
This task is for the scenario where you have extra metadata in an external source repository, such as a JDBC database; 3Sixty will merge that data into your documents as they are being read.
Configuration
- User Name: JDBC User Name
- Password: JDBC Password
- Driver Class: The driver for the JDBC database. A number of possible drivers are listed here. The Database Driver Class supports:
- com.microsoft.sqlserver.jdbc.SQLServerDriver
- oracle.jdbc.driver.OracleDriver
Caution: Ensure that your driver jar file is in the 3sixty-admin/WEB-INF/lib folder at start-up.
- JDBC Url: Url to connect to the database. Each type of database uses a different format. Refer to the linked table for formats.
JDBC Url examples:
- jdbc:sqlserver://localhost:1433;encrypt=false;databaseName=Objdemo
- jdbc:oracle:thin:@dos1029ecm01.eng12.ocl:1521:ora
- ID Field: The field (without table name) which will be used to name binaries queried from the database.
- Query: The query to execute. All results will be added to the document in the format [tableName].[fieldName]
Note: The query can include expressions, and it will only return one document.
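As a sketch, a query that uses an expression might look like this (the table, columns, and metadata field are hypothetical):
SELECT customer_name, region FROM customers WHERE customer_id = '#{document.CustomerId}'
The returned values would be added to the document as customers.customer_name and customers.region.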
Buffer Binaries to File System
This task will buffer files to a temporary directory, in this case the value of the System Property java.io.tmpdir. The task takes a timeout, which will fail the document if it does not complete staging in the given amount of time. The default is 10 seconds, but for larger files (100+ MB) we recommend 60 seconds.
Convert Array Value To String
Converts a property on a repository document from an array to a String.
Configuration
- Delimiter: The delimiter to place in between values of the Array. If left empty will default to space
- Property Name: The array repository document property to convert into a String. The new String property will have the same name. This field cannot be an expression.
Convert String to Boolean
The String to Boolean Task is designed to check the value of a string and return true if it matches the expected value. This is a quick way to convert a string field into a Boolean and is used to manipulate data as it is being processed. You also have the option to specify whether the match should be case-sensitive or not.
Here's an example: we are going to check the field dc:availability. If the string value of that field equals available (the Expected Value), the dc:availability field will be updated to a boolean true. As we elected not to use case sensitivity, the string AvailAble would also return true. Otherwise, a boolean false value will be stored.
Convert To UTC
This task will convert date fields to the given format, with the given offset.
Configuration
- Date Fields to Convert: A comma delimited list of repository fields or the output of a calculated field.
- If you do not supply field names, the task will attempt to process all the Job’s Date/Time mappings. Note: only calculated field mappings are available to job processors.
- Offset: Timezone offset.
- Date Time Format: Final format for the field you’re converting.
- Define Date Time Format Check box: Check to define an output format, uncheck to return default format (i.e. 2019-11-05T16:00:00Z)
Date Based Folder Path
This task takes one of the date fields on the Repository Document and uses it to generate the parent folder path for the document. This mimics how Alfresco stores its data in the filesystem.
Configuration
- Repository Document Field: the field to use for the path - can be any date field
- Pattern: The pattern used to break down the date into a folder path
The default is /yyyy/M/dd/kk/mm/ss/ (in this case, kk is used for hours).
Example
For the document file.txt, if you set the field to use the modified date '#{rd.modifieddate}' and the date is 1999-04-20T12:01:23, then your path with the default pattern will be:
/1999/4/20/12/01/23/file.txt
Field Lookup
This task is intended to allow users the ability to perform a look-up operation and update the matching fields of the repository documents. This task was originally created for integrations in which certain field values could only be obtained by importing them from an external source.
- Task Name: Name of the Lookup Job Task
- Delimiter: Denotes the delimiter character if values are not in list format.
- Property Field: Which value to evaluate against. Options are:
- Path: Uses preset configurations to evaluate values against the path
- File Name: Uses preset configurations to evaluate values against the file name
- Other Property: Evaluates a property entered by the user against the given values
- Property Name: A specific property entered by the user. This property will be used only if Other Property is selected from the Property Field drop-down list.
- Values: Key=Value listing to be used as the source data for the lookup operation.
- Look for an exact match: If set to true, properties that have an exact (whole case) match will be updated. If set to false, properties that contain the entered value(s) will be updated. For instance, if set to true the value 123 will only have a match with 123. If false, a match will be identified if 123 is present in any part of the field and only that portion will be updated with the new value.
3Sixty can utilise any properties that are associated with a repository document. We recommend running a BFS output job with no mappings and Include Un-Mapped Properties set to True. This will generate an xml file similar to the example below and allow you to see what properties are available for your documents:
<properties>
<entry key="document.name">3Sixty Overview.doc</entry>
<entry key="type">document</entry>
<entry key="folderpath">test3Sixty_Setup</entry>
<entry key="document.Culture">en-US</entry>
<entry key="document.Customer">123456</entry>
<entry key="document.Category">training</entry>
</properties>
In the example above you will notice that the document.Customer field has a numerical value associated with it, but not the actual customer name. The task will allow you to import a list of customer IDs along with their associated customer names and update those values as desired.
The updated values will then be available in subsequent job tasks. For instance, you could then use override folder path task to build out a folder structure with the updated values. Using the scenario above, you could build out a folder for each customer by name instead of numerical value.
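For the customer scenario above, the Values field might contain entries like these (hypothetical IDs and names):
123456=Acme Corporation
234567=Globex Industries
With Other Property selected and document.Customer entered as the Property Name, a document whose document.Customer value is 123456 would have that value updated to Acme Corporation.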
HTTP
The purpose of this task is to execute a GET or POST HTTP call for each repository document. If the status is any of the following, processing will continue; if not, the document will be skipped.
200 OK, 201 Created, 202 Accepted, 203 Non-Authoritative Information, 204 No Content, 205 Reset Content, 206 Partial Content, 207 Multi-Status (WebDAV), 208 Already Reported (WebDAV), 226 IM Used (HTTP Delta encoding)
Configuration
- Username/Password: Credentials to access the endpoint
- Action URL: The endpoint being contacted, without parameters
- POST or GET: Selects the HTTP Method
- Request parameters: In the form of param1=value1&param2=value2. The values can be dynamically created using the Expression Language.
- Send body with request: The body can be raw text and can use the Expression Language. If the text begins with the symbols [ or {, then the body will need to be valid JSON.
- Action to take if the call does not return 200 (OK): Continue, Skip or Fail are the options
- Timeout in milliseconds: How long to wait for the call to execute before attempting to continue.
- If the timeout is reached, the action selected will be used.
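As a sketch, a simple GET configuration might look like the following (the endpoint and parameter names are hypothetical; the values are built with the Expression Language):
Action URL: https://example.com/api/register
POST or GET: GET
Request parameters: docId=#{rd.ID}&docName=#{rd.FileName}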
JavaScript Processing
This task is a custom operation that allows the user to ask 3Sixty to do things that are not already available in the software. 3Sixty’s JavaScript Task allows you to run JavaScript against a repository document during processing. The task uses the ECMA 5.1 specification to execute the JavaScript against the native Java API. Currently, only one root object is exposed called rd (repository document). Any method that can be called against the repository document can be called using JavaScript syntax.
Fields that are also available:
Variable Name | Variable |
---|---|
The Job ID | jobId |
The Job Run ID | jobRunId |
The variable result is automatically declared in each run of the JavaScript task, and its return value can be used to control which documents continue processing.
The task only supports single quotes for strings. Double quotes will cause errors due to how the content of the script is saved to the database.
Caution: When creating the database table name please note that it must be all lowercase.
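A minimal sketch of a script for this task, assuming the repository document exposes a getter such as rd.getFileName() (the method name is an assumption; check the repository document API for the exact methods available):
// Hypothetical getter; the exact repository document method names may differ
var name = rd.getFileName();
// Only documents whose names do not contain 'draft' continue processing
result = name.indexOf('draft') < 0;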
Pause
This task can be useful when outputting to a repository with rate limits, such as Box or SharePoint.
- Pause in seconds: How long to pause for each document.
Remove Mappings
Removes mappings from a repository document if the field has no values.
- Source Fields: Comma delimited list of the mappings to remove if it has no values.
Two Way Sync
The Two Way Sync Job Task filters out unnecessary documents when doing an incremental sync between two systems. Documents can be filtered by matching MD5 hash or if the document was just seen in the last run by the other sync Job.
Configuration
For Two Way Sync you will need to set up two Incremental Migration jobs and chain one Job after the other. You will want to put the Job that should win on a collision as the first Job in the chain.
On each Job add the Two Way Sync Job Task and, if using the MD5 Hash option, add the MD5 Hash Job Task before the Two Way Sync Job Task.
- Check Last Run: If this document was just moved to this repository by the other Job in the sync, skip it. Set this to false if you want to sync it back anyway (for example, after a rule/formula change).
- Compare MD5 Checksums: Check MD5 Hash of this Document and the Last Document moved by the synced Job. If set to true and MD5 Hashes match, then skip this Document.
- Job ID of Other Job: The Job ID of the other chained Job in this Two Way Sync
Metadata
Join Mongo Document Metadata
The purpose of this task is to retrieve metadata from an outside mongo database during a migration, adding it to the repository document before mappings are performed.
This task is called a “join” because it checks for a value on both the Repository Document and in the Mongo Document.
- The checked field on the Repository Document can be either a standard field mapping, or it can be calculated using the Expression Language.
- If either checked field is missing, the process will be skipped.
- If the values of the checked fields match, the comma delimited list of fields will be added from the Mongo Document, in lowercase.
- They can then be added as normal field mappings in a job mapping, as Field mappings take place after Tasks.
- This task supports multiple fields to check on the Mongo document and repository document
- Users can sort the order in which the fields are checked
Configuration
Fill in the following fields to configure this task. Some fields come with default configurations. Update them as necessary.
- Task Name
- The fields you’re checking for on the mongo document (comma separated)
- Default: doc_id,doc_name
- The fields you’re checking for on the repository document
- ’#{rd.id}’
- ’#{rd.filename}’ Each line contains one field. The number of fields must match the ones you are checking on in the Mongo document.
- Comma delimited list of fields to add to the repository documents, should the fields match
- Mongo field to be used for sorting followed by a comma and sort order
- docCreated:1 The Sort order can be +1 for ascending or -1 for descending.
- The MongoDB Url
- localhost
- The Mongo Port
- 27017
- The mongo database to query
- admin
- The mongo collection to query
- collection
- The mongo user, leave blank if no authentication is set
- The mongo password, leave blank if no authentication is set
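As a sketch of how the pieces fit together (the added field names are hypothetical): with the configuration below, whenever a mongo document's doc_id equals the repository document's id, the mongo document's department and region values are copied onto the repository document (in lowercase) and can then be mapped like any other fields.
The fields you're checking for on the mongo document: doc_id
The fields you're checking for on the repository document: '#{rd.id}'
Fields to add to the repository document on a match: department,region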
Metadata Extraction - EML
An EML file is an email message saved by an email application, such as Microsoft Outlook or Apple Mail. It contains the content of the message, along with the subject, sender, recipient(s), and date of the message. EML files may also store one or more email attachments, which are files sent with the message.
This task extracts metadata from emails and adds it as fields to the repository document. These extracted fields appear as:
- Email.Subject: The subject line of the email.
- Email.From: The ‘from’ line of the email.
- Email.To: The ‘to’ line of the email.
- Email.Body: The body of the email.
- Email.CC: The CC line of the email.
- Email.BCC: The BCC line of the email.
- Email.EmailList: Space separated list of To, From and CC addresses.
- Email.InternetMessageId: The message ID of the email.
- Email.Size: The size of the email.
- Email.HasAttachments: Whether the email has attachments or not.
- Email.Attachments: Comma delimited list of email attachments by name. Blank if none.
- Email.AttachmentCount: The number of attachments in this email.
- Email.DateTimeSent: The date time the email was sent.
- Email.DateTimeCreated: The date time the email was created.
- Email.LastModifiedTime: The modified date time of the email.
- Email.DateTimeReceived: The date and time the email was received.
Metadata Extraction - Path
This task will extract the metadata from the file and add it as fields to the repository document.
Configuration
- Path Field: Which field to use to extract metadata. Options are:
- Repository ID (for repos that use the entire path as an ID, such as Filesystem, or Amazon S3)
- Parent Path, for repos that don’t use paths as ID.
- Path Rules: Comma delimited list of rules to extract. The format is
[New Field Name]=#{Location}
:- Location takes the form of an integer, starting at 0 for the root folder.
- Additionally, 'filename', 'parent', and 'grandparent' (with quotes included) can be used.
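For example, for a document at /Clients/Acme/2021/contract.pdf (a hypothetical path, and assuming index 0 refers to the first folder in the path, Clients), the following rules:
client=#{1},year=#{'parent'},originalname=#{'filename'}
would add client=Acme, year=2021, and originalname=contract.pdf to the repository document.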
Output Metadata As JSON
This task fully converts each document to a json object and exports it to the configured path on the local filesystem where 3Sixty is running.
- Output Path: Folder where the json files will be written. Path will be created if it does not exist.
File name pattern = [filename].document.metadata.json
Folder name pattern = [foldername].folder.metadata.json
Replace Metadata
This task will perform a regex search on the configured metadata field and replace matches with the supplied text.
- Source Metadata Field: The field to check.
- Regex to Match: A regular expression to use as the first argument of a Java String.replaceAll() method.
- Replacement Text: The text to use as the second argument of the replaceAll() method.
Metadata from Properties File - Filesystem
If there is a separate properties file, 3Sixty will get the metadata from that file. This only works for the File System connector. Configure this task by completing the following fields.
- Properties filename
- Skip files with no property file found in hierarchy
File
Attach Content Binaries
This task is designed to grab a binary from the filesystem given a path set in the form field. The task will look for a file of the same name.
- Expressions should be used to set the path dynamically, i.e. /path/to/ + #{rd.FileName}
- If there is no file at the specified path, the documents will be skipped.
- If there is an error reading the file, an error will be noted in the tasks/job run.
- If the task does not detect the use of an expression through the characters '${', it will search the document metadata for a field which contains the absolute path to the file.
Attach Content - External Repo
This task is for a scenario in which the source contains document metadata and also contains the file path to its content.
This allows the user to read metadata and attach the file to it that matches the metadata field given.
Configuration
- Enter the field that references the Object Id for the file content.
- If you include metadata, 3Sixty will combine the properties between the two files.
Note: The text input does not currently process expressions.
Attach Content - File System
This task is designed to grab a binary from the filesystem given a path set in the form field. The task will look for a file of the same name.
Configuration
- Expressions should be used to set the path dynamically, i.e. /path/to/ + #{rd.FileName}
- If there is no file at the specified path, the documents will be skipped.
- If there is an error reading the file, an error will be noted in the tasks/job run.
- If the task does not detect the use of an expression through the characters '${', it will search the document metadata for a field which contains the absolute path to the file.
Attach Content - FTP
Sets the binary of a document to a binary found on an FTP server at the location found using the Path Expression.
- Username: The username needed to log in to your FTP server
- Password: The password needed to log in to your FTP server
- Host URL: The URL of the FTP server
- Encrypt Data: Sets whether to encrypt the data connection to the FTP server with TLS. Note that Use FTPS must be checked if you use this.
- Use FTPS: Sets whether to use FTP with SSL protocol.
- Is Implicit: Sets whether to use FTP implicit SSL or FTP explicit SSL
- EPSV With IPV4: Sets whether to use EPSV with IPV4.
- Server Port: The port your FTP server is listening on
- Thread Count: Number of FTP poster threads
Attach Content - S3
Attaches a binary to the repository document from an S3 bucket.
- S3 Access Key: The access key for your S3 connection
- Client Secret: The client secret for your S3 Connection
- Base Folder Path: A base path to be prepended to your key
- S3 Bucket: The bucket to search for the binary.
- Key Field: The field on the repo document that contains the key for S3 bucket lookup.
- S3 Region: The region of your S3 bucket
Filter Expression
The filter expression task allows you to remove files based on the expression used. The task has two fields, the task name and the filter expression. The task name is arbitrary but should identify the type of filter you’re creating.
Filter Expression
The filter expression can use any of the 3Sixty Expression Language components to form simple or more complex expressions for evaluation by 3Sixty.
Note: If the expression evaluates to
true
, the document will continue to be processed.
For example: If you want to exclude the OS level file Thumbs.db in your job, you can filter those files out using the following statement: !equalsIgnoreCase('#{rd.filename}', 'Thumbs.db')
The above example will find all the files that have the name Thumbs.db
and set the expression to false
for the documents. Therefore, all files with the name Thumbs.db
will be skipped.
Generate Thumbnail
Generates a thumbnail for a repository document and adds it as a rendition.
Note: This task currently only supports jpg, png, bmp, wbmp, and gif.
- Resize Width: The width of the thumbnail image in pixels
- Action: The integration action to take following thumbnail generation
- Types of Thumb Nails: List of comma delimited file extensions to support
Hash Value Generator
Creates a hash of the document content and sets it on the repository document. Can be used in tandem with the Duplication Check task to find duplicates.
- Staging: Where to stage the data locally, while creating the MD5 Hash.
- Algorithm: The type of hash to use. MD5 is the default hash, but SHA hashes are available
Federation Services can generate a hash for a file of any size, but users have the option to set a file size limit, specified in megabytes, to exclude files from hash generation.
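The hashing itself is standard; here is a minimal sketch of producing an MD5 or SHA hash of staged content with java.security.MessageDigest, including the optional size-limit check described above (illustrative only):

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public class HashValueSketch {
    // algorithm: "MD5" (default), "SHA-1", "SHA-256", etc.; maxSizeMb <= 0 means no limit
    public static String hash(Path staged, String algorithm, long maxSizeMb) throws Exception {
        if (maxSizeMb > 0 && Files.size(staged) > maxSizeMb * 1024L * 1024L) {
            return null; // over the configured size limit: no hash generated
        }
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(staged)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString(); // the value set on the repository document
    }
}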
HTML to PDF
This task takes a single argument, which is a file path to an XHTML stylesheet (*.xsl).
The binary stream taken from each repository document is converted to a PDF using the template.
Property XML Parser Job
This task will take an XML file and use the content to create metadata fields.
- Field to Parse: If the content of the file is the XML to parse, use BINARY. If the XML appears in a field, use the field name. This is a required field.
The xml is expected in the following form:
<rootElement>
<comment>Comment Text Here</comment>
<entry key="fieldName1">fieldValue1</entry>
<entry key="fieldName2">fieldValue2</entry>
</rootElement>
- The root element is named as such for the example. The task searches for “entry” children.
- Comment is an optional element and will be added as a comment to the document properties.
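A minimal sketch of how such an XML payload can be turned into metadata fields, using the standard javax.xml DOM API (a conceptual illustration, not the task's internal code):

import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PropertyXmlParserSketch {
    public static Map<String, String> parseEntries(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, String> fields = new HashMap<>();
        NodeList entries = doc.getElementsByTagName("entry");   // the task searches for "entry" children
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            fields.put(entry.getAttribute("key"), entry.getTextContent());
        }
        return fields; // e.g. {fieldName1=fieldValue1, fieldName2=fieldValue2}
    }
}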
Remove Renditions Matching Binary MimeType
This task takes no arguments. It will remove all renditions that have the same MimeType as the Original Document Binary.
Skip Blank Filename
Skips a file during migration if its file name is blank.
No additional Task Configurations needed.
Skip on Empty Field
This task will skip a document if the supplied fields are all blank.
Text to PDF
This task takes no arguments, and simply converts text binaries from a Repository Document into a PDF file on output. If the mimetype of the document is not ‘text/plain’, it will be skipped.
Text/HTML to EML
This task takes the content of a document and converts it into an email.
Unzip
This task will unzip any compressed files during the migration process and send the documents directly to the output connector. Documents that are queued this way will not increment Read, but will increment processed.
Configuration Fields
- File staging location: The location unzipped files are stored temporarily while processing
- Skip the zip file after unzip: Check this box to skip the zip file during processing. If the box is unchecked the zip file will be included in the document transfer
Caution: When unzipping files to an ECM output location, if the zip file has the same file name as another document in the folder, the file will not unzip to that location as ECM doesn’t allow folders to be created with the same name as a file. It will result in a duplication error.
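Conceptually, the task expands each archive into the staging location and queues the extracted entries for the output connector, much like this java.util.zip sketch (illustrative only):

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class UnzipSketch {
    public static List<Path> unzipToStaging(InputStream zipBinary, Path stagingLocation) throws Exception {
        List<Path> extracted = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(zipBinary)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) continue;
                Path target = stagingLocation.resolve(entry.getName());
                Files.createDirectories(target.getParent());
                Files.copy(zis, target, StandardCopyOption.REPLACE_EXISTING);
                extracted.add(target); // each extracted file is queued for the output connector
                zis.closeEntry();
            }
        }
        return extracted;
    }
}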
Text Extraction
AWS Textract
This task will extract text from PNGs, JPGs, and PDFs and store the text on the repository document in the simflofy_ai_text field.
Configuration
- Authentication Connection: An Amazon authentication connection with your Amazon AWS credentials
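For a sense of what happens behind the configured authentication connection, here is a minimal sketch using the AWS SDK for Java v2 synchronous Textract API for a single image (PDFs generally go through Textract's asynchronous operations; this is an illustration, not the task's implementation):

import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.textract.TextractClient;
import software.amazon.awssdk.services.textract.model.Block;
import software.amazon.awssdk.services.textract.model.BlockType;
import software.amazon.awssdk.services.textract.model.DetectDocumentTextRequest;
import software.amazon.awssdk.services.textract.model.Document;

public class TextractSketch {
    public static String extractText(TextractClient textract, byte[] imageBytes) {
        DetectDocumentTextRequest request = DetectDocumentTextRequest.builder()
                .document(Document.builder().bytes(SdkBytes.fromByteArray(imageBytes)).build())
                .build();
        StringBuilder text = new StringBuilder();
        for (Block block : textract.detectDocumentText(request).blocks()) {
            if (block.blockType() == BlockType.LINE) {
                text.append(block.text()).append('\n');
            }
        }
        return text.toString(); // stored on the repository document (simflofy_ai_text)
    }
}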
Google Vision Text Extraction
Extracts text from .tiff, .pdf and .gif files and stores it on the repository document in the simflofy_ai_texts field.
- Authentication Connector: Your authentication connection for Google. You can find it in the URL of the connection's edit or view page.
Tesseract Text Extraction
PREREQUISITE This task requires Tesseract to be installed on the system that Federation Services is running on.
This task uses Tesseract OCR to scan for text in images and PDF files, saving that text to a field on the repository document called simflofy_ai_texts. Supported formats are .png, .jpg, .pdf, .tiff, .gif, and .bmp. PDFs are saved on a per-page basis to simflofy_ai_texts.
Note: Tesseract OCR will be an optional dependency of Federation Services.
Configuration
- Tessdata Directory: The path to your Tessdata folder. This folder should have the trained data of the language you plan to OCR.
- Tesseract Library: The path to your Tesseract library folder containing the proper library files for your OS.
- Engine Mode: Select which engine Tesseract should use, legacy or LSTM. Ensure the selected engine is installed before choosing it, or leave it on the default configuration to let Tesseract detect your engine.
- Page Segmentation Mode: By default Tesseract expects a page of text. You can change the way it segments a page if your images differ from this.
- Tesseract Language Code: The language code for the installed trained data in your Tessdata directory. This is in ISO 639-2/T format and is the prefix of the trained data file, before the .traineddata extension.
- Use HOCR: Whether to use HOCR. When enabled, text will be output in HTML format rather than as raw text.
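A minimal sketch of how these settings typically map onto an OCR call, assuming the Tess4J wrapper for Tesseract (Federation Services may use a different binding; the names here are illustrative):

import java.io.File;
import net.sourceforge.tess4j.Tesseract;

public class TesseractOcrSketch {
    public static String ocr(File image, String tessdataDir, String languageCode,
                             int engineMode, int pageSegMode) throws Exception {
        Tesseract tesseract = new Tesseract();
        tesseract.setDatapath(tessdataDir);       // Tessdata Directory (trained data location)
        tesseract.setLanguage(languageCode);      // e.g. "eng" -> eng.traineddata
        tesseract.setOcrEngineMode(engineMode);   // Engine Mode (legacy / LSTM / default)
        tesseract.setPageSegMode(pageSegMode);    // Page Segmentation Mode
        return tesseract.doOCR(image);            // text saved to simflofy_ai_texts
    }
}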
Image Analysis
AWS Image Recognition
Detects real world objects in images and adds these labels to the repository document on the field simflofy_ai_labels using the AWS Rekognition system.
For your reference, the settings below correspond to the 'MinConfidence' and 'MaxLabels' parameters of the AWS Rekognition DetectLabels operation, which this task uses.
Configuration
- Minimum Threshold: The minimum confidence threshold for labels to return following label detection. Labels with a confidence level lower than this will not be returned.
- Max Number of labels: The maximum number of recognised labels to be returned, by highest confidence.
- Auth Conn: The authentication connection with your Amazon AWS credentials
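Behind the scenes this corresponds to a Rekognition DetectLabels call; here is a minimal sketch with the AWS SDK for Java v2 (illustrative only, not the task's implementation):

import java.util.ArrayList;
import java.util.List;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.rekognition.RekognitionClient;
import software.amazon.awssdk.services.rekognition.model.DetectLabelsRequest;
import software.amazon.awssdk.services.rekognition.model.Image;
import software.amazon.awssdk.services.rekognition.model.Label;

public class RekognitionLabelsSketch {
    public static List<String> detectLabels(RekognitionClient rekognition, byte[] imageBytes,
                                            float minimumThreshold, int maxNumberOfLabels) {
        DetectLabelsRequest request = DetectLabelsRequest.builder()
                .image(Image.builder().bytes(SdkBytes.fromByteArray(imageBytes)).build())
                .minConfidence(minimumThreshold)   // Minimum Threshold
                .maxLabels(maxNumberOfLabels)      // Max Number of Labels
                .build();
        List<String> labels = new ArrayList<>();
        for (Label label : rekognition.detectLabels(request).labels()) {
            labels.add(label.name());              // values written to simflofy_ai_labels
        }
        return labels;
    }
}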
Google Vision Image Labels
Detects real world objects in images and adds these labels to the repository document on the field simflofy_ai_labels using Google Vision.
- Max Number of labels: The maximum number of recognised labels to be returned
- Authentication connection: Your authentication connection for Google. You can find it in the URL of the connection's edit or view page.
Watson Image Analysis
Uses IBM Watson Image Analysis to analyse an image, adding its response to a specified field.
- The IAM API Key for IBM Cloud: See instructions here on how to get it.
- Collection ID: If left blank, 3Sixty will create one
- Collection Name: Name of the collection to create. If left blank, will be new-training-collection with the date appended.
- Training Mode: Images will be used in conjunction with the training JSON to train your collection. In this mode, no analysis will be performed.
- The training JSON: Required for training mode. The format is {"dog":[25,35,105,215]}. Array order is top, left, width, height, with the object type as the key.
- Field that contains the image urls: If left blank the file content will be used.
Alfresco
Alfresco Job Run History Nodes
The purpose of this task is to get the Alfresco Node Reference from the Current Job Run History, in order to update an existing document from a previous Job run in Alfresco, rather than creating a new one.
It’s used in Incremental or Sync jobs where the Alfresco connector is being used. This task uses the Source Repository ID from the Repository the document is pulled from, in order to find the document in the Job Run History.
Configuration
Set up: There are no configuration options for this task. It uses the Document’s Source Repository ID in order to find the node Reference in Alfresco. Exception/Requirements: Since the Source Repository ID is used, any source repositories that use a Path Based Source ID will not work if the path of the document changes.
Alfresco Property Mapping Nodes
This task is for getting existing node references in an Alfresco instance, in order to update them, rather than create a new one if the existing file has moved from its original ingestion location. It’s used in jobs where the Alfresco Output connector is used. Typically, in incremental or sync jobs, where changes to a file from a source system need to make it to the node it previously integrated to.
Configuration
How to set up: Provide the query JSON object, which is sent to Alfresco. Alfresco returns the matching Node Reference, which is then used to update the document.
Exceptions / requirements: If the query is too vague, and matches more than one node/document, then none of the documents will be updated. The query has to uniquely find a single node/document in Alfresco.
ACL
CMIS ACL modification
This task is meant to be used with a CMIS Repository connection. It will use the Repository Document id, and gather the current ACL for the document. It will then generate a new ACL based on the parameters. This task establishes a session upon initialisation, and keeps it open until the job run is complete.
The principal lists are pipe (|) delimited, to account for LDAP-style principals. You will need to know the exact principal ids of the ACEs (Access Control Entries). The task works by cycling through the current ACEs and applying a series of if-then logic to construct a new ACL.
Configuration
- User name: CMIS server user name
- Password: CMIS server password
- Connection URL: CMIS server url
- Repository ID: CMIS repository ID
- Pipe (|) delimited list of principals to ignore from each document: A list of principals to ignore and not add modifications to. The ‘ignore’ list will bypass any modification to a matching ACE and add it directly back to the new ACL.
- Pipe (|) delimited list of principals to remove from each document: A list of principals to be removed from the ACL. The remove list will skip over a matching ACE, leaving it out.
- Comma delimited list of principals to add to each document: A list of principals to be added to the ACL. The add list is checked separately and will generate a new ACE with the selected permissions.
- Permission to add to the principals: ACEs that are not ignored, plus newly added ACEs, will have this permission added to them in the new ACL.
- Read
- Write
- All
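The ignore / remove / add logic described above can be pictured with the following simplified sketch, which works on plain principal-to-permission pairs rather than the actual CMIS object model (illustrative only):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CmisAclModificationSketch {
    // currentAcl maps principal id -> permissions; the lists mirror the delimited form fields above.
    public static Map<String, List<String>> buildNewAcl(Map<String, List<String>> currentAcl,
                                                        List<String> ignoreList,
                                                        List<String> removeList,
                                                        List<String> addList,
                                                        String permissionToAdd) {
        Map<String, List<String>> newAcl = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> ace : currentAcl.entrySet()) {
            String principal = ace.getKey();
            if (removeList.contains(principal)) {
                continue;                                   // remove list: leave the ACE out
            }
            List<String> permissions = new ArrayList<>(ace.getValue());
            if (!ignoreList.contains(principal) && !permissions.contains(permissionToAdd)) {
                permissions.add(permissionToAdd);           // ignored ACEs are carried over unmodified
            }
            newAcl.put(principal, permissions);
        }
        for (String principal : addList) {
            newAcl.putIfAbsent(principal, new ArrayList<>(Arrays.asList(permissionToAdd)));
        }
        return newAcl;                                      // applied back to the document via CMIS
    }
}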
FileNet ACL Modification
The purpose of this task is to change the permission lists of integrated documents in the IBM FileNet repository.
- Authentication Connector ID: The ID of your P8 Authentication connection, which is available under Authentication Connections in the Integration menu. You can find it in the URL of the connection's edit or view page.
- Change List: Pipe (|) delimited list of principals to change on each document.
- Permissions to Change: Access level for the changed permissions.
- Add List: Pipe (|) delimited list of principals to add to each document.
- Permissions to Add: The access level to add to the new permissions.
- Removal List: Pipe (|) delimited list of principals to remove from each document.
Note that any permission not added to these lists will be ignored.
File System ACL Extraction
Extracts ACLs from the Windows or Linux filesystem document and adds them to the repository document.
- File System Operating System: The operating system that these files are being read from.
- Append _DENY to Deny ACL Type permissions: Appends _DENY to the end of an extracted permission if it’s of the type DENY, in case you want to track this later on.
This task will have some different behaviour depending on your operating system. In a POSIX environment (macOS or Linux), permissions may be added as the field document.permissions, with the permissions in a semicolon (;) delimited list, if any exist. If the filesystem supplies an owner, it will be added as simflofy.owner.
Additionally, simflofy will create a permission map of the principals and their permissions. It will set it as the originalPermissions field:
Map<String, Set<String>> permissions = new HashMap<>();
//process acls
rd.setOriginalPermissions(permissions);
Finally, if any User Defined File Attributes (extended attributes) exist, they will be added as a semicolon delimited list in the field simflofy.userattributes.
Generic ACL Mapper
The generic ACL mapper job task allows you to create simple rules for matching principles and permissions from one system to another. ACLs will need to be extracted from each document. This task reads the originalPermissions field of the document and sets the transformedPermissions field.
LIMITED USAGE Only the Azure Blob, CMIS, and Alfresco Connectors can use this task. For all other acl mapping, a JavaScript task is required.
- Process Files and Process Folders tell the task what to process.
- Mapping Rules: These rules will map the role/permissions on the left with the ones you want to match on the right (see the sketch after this list).
- Permissions on the right will be from the source and those on the left will be for the target.
- Both sides of a rule can be a comma delimited list.
- Each rule must end with a semicolon (;)
- Authority Clean up: A comma delimited list of principals to remove as part of the task.
- Enable Parsing: Enables parsing so that multiple permissions can be assigned to an authority after mapping.
- For example, [Write]=WRITE_DATA with this box checked will allow ‘Write’ to be added on as an extra permission to an authority’s newly mapped permissions.
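A simplified sketch of how rules of the form [target]=source;... can be applied to the extracted permissions. This mirrors the description above rather than the task's actual code, and the helper names are illustrative:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GenericAclMapperSketch {
    // rules example: "[Write]=WRITE_DATA;[Read]=READ_DATA,READ_ACL;"
    public static Map<String, List<String>> mapPermissions(Map<String, Set<String>> originalPermissions,
                                                           String rules) {
        Map<String, List<String>> sourceToTarget = new HashMap<>();
        for (String rule : rules.split(";")) {
            if (rule.trim().isEmpty()) continue;
            String[] parts = rule.split("=", 2);
            List<String> targets = new ArrayList<>();
            for (String target : parts[0].replace("[", "").replace("]", "").split(",")) {
                targets.add(target.trim());
            }
            for (String source : parts[1].split(",")) {
                sourceToTarget.computeIfAbsent(source.trim(), k -> new ArrayList<>()).addAll(targets);
            }
        }
        Map<String, List<String>> transformedPermissions = new HashMap<>();
        for (Map.Entry<String, Set<String>> authority : originalPermissions.entrySet()) {
            List<String> mapped = new ArrayList<>();
            for (String sourcePermission : authority.getValue()) {
                mapped.addAll(sourceToTarget.getOrDefault(sourcePermission, new ArrayList<>()));
            }
            transformedPermissions.put(authority.getKey(), mapped);
        }
        return transformedPermissions; // set on the document's transformedPermissions field
    }
}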
Classifier
IBM Watson Natural Language Classification
Uses IBM Watson Natural Language classifier to analyse and label text into categories. This uses Watson’s Natural Language Classifier APIs.
- The IAM API Key for IBM Cloud: See instructions here on how to get it.
- The Classifier ID: The ID of your classifier, created in Watson.
- Field to Process: If left blank, the task will search for the field ‘content’.
- Field to store the Watson response: If left blank, the field name will be ‘nlc’.
- Comma delimited list of file extensions: Which content types to process or leave blank to process all.
Others
Index User Groups
Used for Search Security to index user and group information onto each document. This task adds two metadata fields to each document in order to restrict or allow access to documents indexed through this task.
- Select User Groups To Index: The selected user groups will be added to the field simflofyUserGroups as well as adding a mapping.
- Select Users to Index: The selected users will be added to the field simflofyUsers as well as adding a mapping.
Lookup Destination Id From Job Run History
Purpose: Getting the rd destination Id from the Current Job Run History, in order to update an existing document from a previous Job run, rather than creating a new one.
Use Case: It’s used in Incremental or Sync jobs. This task uses the Source Repository ID from the Repository the document is pulled from, in order to find the document in the Job Run History.
Set up: There are no configuration options for this task. It uses the Document’s Source Repository ID in order to find the rd destination Id.
Exception/Requirements: Since the Source Repository ID is used, any source repositories that use a Path Based Source ID will not work if the path of the document changes.
Remove Empty Fields
This task will remove any metadata fields from the document that have no set value.
Audit Reports
Federation Services Reports can provide documentation and results from each time a job has been run. In this section you will find overviews, set up instructions, and use cases of the various reports you can use to see what’s going on with your data. These reports can be found under the Auditing section in the left sidebar.
Job Run History
The Job Run History page allows you to search collections of audits based on the job, depending on what types of audits were selected in the Job Configuration. A Job Run History contains, at minimum, the Name, Size, Mimetype, and source ID of each audited document.
Downloading the audit report will include more fields such as the path of the document’s parent, as well as its new id and any audited mappings. If the audits are for errors or skips, the reasons will be included.
This list can be filtered by selecting a job from the Jobs drop-down as well as entering start and/or end dates for the time period you want to search for. You can also search for a specific job run id.
Click on the eye icon next to the job to view the full details of the job run. This page will give you more information on the particular run, including a list of records, details on the run, and general statistics on the data.
Tip: Users can access specific sets of audits for a job by clicking on the processed/read/error numbers in the Run and Monitor Jobs page.
Records
The Records tab will give you a list of all of the documents that have been processed through this job run. This list can be filtered by record type, allowing the user to view records that have been processed, read, deleted, errored, skipped, or flagged as duplicates, as well as an error summary. You can use the search field to filter the list by record content such as document name and content type.
Details
The Details tab will give you all of the details about the job run.
Job Details: Job ID, Job Name, Job Status, Status Message, Job start and end time, Total time, Query From and to time, Records processed per sec, number of records read, written, in error and skipped, and the Federation Services server port number.
Stats
Clicking on the Stats tab will provide a basic performance report.
In the performance chart, the input is tracked as the blue line and the output as the yellow line. This can help determine the “bottleneck” if job performance is not as expected. It also shows the bytes per second, allowing for the debugging of network throughput issues.
Deleting Job Run History
To delete individual job run histories, you can simply delete them while viewing them or, in 3.1.1+, use the delete button available in the Job Run History table.
Automatically Deleting Job Run History
Federation Services has a background service that will check for job run histories of a certain age, and remove them. This service can be configured through the Federation Services Global Properties.
History retention is on a per-job basis and can be configured under the Advanced Options section in the job’s Details tab.
Document Search
Search job run history for specific documents. Results will contain a link to the document’s details page, containing a full history for that document across all job runs.
Filter by Record Type
AUDITING:
Auditing of record types is set in the Auditing menu in a job’s Details tab
- Processed: Documents that completed processing
- Read: Documents added to the queue for processing. Not audited by default.
- Errors: Documents that had errors
- Skipped: Documents that were skipped due to falling outside the time range, or by a processor
- Duplicate: Detected duplicates. Requires a Duplicate Detection Task.
Filter by Job
Choose the job you want to filter by or search all jobs
Filter by Field Value
- No field: Selecting this option will perform an open search for all documents in the selected job
- Job Run ID
- Output Document URI: Similar to document ID, but for file paths, backslashes may be converted to forward slashes
- Source Document ID: The document ID, as it appeared when read.
- Document Name
- Document Type: Not the same as Content Type, this refers to a document’s type in the repository
- Batch ID: Only applicable if batching was enabled
- Version ID: Only applicable if versions were retrieved
- Document Hash: Only applicable if a hash value was generated using the Hash Value Generator Task
- Size Greater Than (B)
- Size Less Than (B)
- New ID: Only applicable for Processed records.
TABLE SEARCH
The results table also has a search field to further narrow down the results
Exporting
You have the option of exporting the current page or the entire result set as a CSV file, using the buttons at the top of the page. These buttons only activate if there are search results.
Mapping Audits
Mapping audits are a way to track the values of individual mappings on a per-job-run basis.
Enabling Mapping Audits
There are two ways to enable mapping audits.
- In the Mappings tab of your job, click the eyeball next to any mapping. A green eye means the mapping is being “watched”, and its values will be tracked.
- In the Details tab of your job, under the Auditing section, check Audit all Mappings. This will collect the audit data for every mapping on the job, including additional mappings from Job Mappings and Mapping Groups
Viewing audit data
Under the Audit Reports menu in the sidebar, click Mapping Audits. If there are no jobs with audited mappings, you should see a message saying so. Otherwise, you will be presented with a drop-down of jobs with mapping audits.
- Once you select the job, a drop-down with all job runs with audits will appear.
- Once you select your job run, a drop-down containing the audited mappings will appear.
- When you select an audited mapping, a table will appear showing you a count of the final values for that mapping.
These audits can be exported using the Export as CSV button.
Example: Content Type Breakdown
If you add a calculated field with the source ’#{rd.mimetype}’ (see the Federation Services Expression Language) and audit it, you will be able to see a breakdown of all content types (pdf, docx) for documents in the job.
Example: Auditing a PII Job
If you add a PII Detection Task to a job, the field hasPii is automatically added to each document. You can add this as the source of a field mapping and audit it.
When viewing the mapping in the Mapping Audits page, you will get a breakdown of how many documents in the job contained PII.
Search Audit
Search audits are collected by all search API endpoints. This includes searches performed through Discovery. In this page you can view the audits for the last week. For a broader view, these reports can be retrieved through the Audit Report Service REST API.
Event Audit
All API actions in Federation Services are audited. This page allows you to see those audits using search criteria.
Types of Event Audits
- BATCH_DOWNLOAD
- CREATE_DOCUMENT
- UPDATE_DOCUMENT
- UPDATE_PROPERTIES
- UPDATE_CONTENT
- CREATE_FOLDER
- CREATE_VERSION
- DELETE_VERSION
- REVERT_VERSION
- FOLDER_ITEMS
- FILE_DOWNLOAD
- FILE_HEADER
- VERSION_DOWNLOAD
- FIND_VERSION_PROPERTIES
- FIND_PROPERTIES
- STREAM_CONTENT
- EDIT_PROPERTIES
- DELETE_OBJECT
- TYPE_DEFINITION
- GET_ACL
- DELETE_ACL
- EDIT_ACL
- FILE_ITEM
- ROOT_FOLDER
- REPOSITORY_INFO
- CHECKOUT
- CHECKIN
- GET_TYPES
- ID_BY_PATH
- LIST_CONNECTIONS
- DROP_COLLECTION
- LIST_VERSIONS