Components of the Intelligence Services module
- Content Repository (ACS): This is the repository where documents and other content reside. The repository produces and consumes events that are exchanged through the message broker (such as ActiveMQ or Amazon MQ). It also reads documents from, and writes documents to, the shared file store. To work with the Intelligence Services module, AI overrides are required for the content repository (via an AMP), Digital Workspace (using a configuration file), and Share (via an AMP).
- ActiveMQ: This is the message broker (either a self-managed ActiveMQ instance or Amazon MQ) to which the repository and the Transform Router send image transform requests and responses. These JSON-based messages are then passed to the Transform Router.
- Transform Router: The Transform Router supports simple (single-step) and pipeline (multi-step) transforms, which it passes to the Transform Engines. The Transform Router and the Transform Engines run as independently scalable Docker containers. The Transform Router requires an AI override to work with the Intelligence Services module.
- Transform Engines: The Transform Engines transform files referenced by the repository and retrieved from the shared file store. Here are some example transformations for each Transform Engine (this is not an exhaustive list):
  - LibreOffice (e.g. docx to pdf)
  - ImageMagick (e.g. resize)
  - Alfresco PDF Renderer (e.g. pdf to png)
  - Tika (e.g. docx to plain text)
  - AI Transform Engine (e.g. extracts data from images, such as png, jpeg, gif, and tiff, and text from various file types, such as pdf, docx, xlsx, and pptx). Note that Comprehend can't process images directly, so the rendition is produced using multi-step transforms. For example, Textract extracts the text from an image, and Comprehend can then process that text. For a list of supported transformations, see ai-pipeline-routes.json (included in the Intelligence Services distribution zip). The data extracted by the AI Engine is saved as AI aspects in the original source file.
- Shared File Store: This is used as temporary storage for the original source file (stored by the repository), intermediate files for multi-step transforms, and the final transformed target file. The target file is retrieved by the repository after it's been processed by one or more of the Transform Engines.
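The difference between simple (single-step) and pipeline (multi-step) transforms can be illustrated with a minimal sketch. This is not the Transform Router's implementation, and it does not mirror the structure of ai-pipeline-routes.json. The type labels (`png`, `text`, `entities`), the engine labels, the `SINGLE_STEP` and `PIPELINES` tables, and the `plan_route` function are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, Tuple

Route = Tuple[str, str]      # (source type, target type)
Step = Tuple[str, str, str]  # (engine label, source type, target type)

# Hypothetical registry of single-step transforms: route -> engine label.
SINGLE_STEP: Dict[Route, str] = {
    ("docx", "pdf"): "libreoffice",
    ("pdf", "png"): "pdfrenderer",
    ("png", "text"): "textract",
    ("text", "entities"): "comprehend",
}

# Hypothetical pipelines: route -> intermediate types between source and target.
PIPELINES: Dict[Route, List[str]] = {
    ("png", "entities"): ["text"],  # png -> text -> entities
}


def plan_route(source: str, target: str) -> List[Step]:
    """Return the ordered engine steps for a transform, or raise if unsupported."""
    route = (source, target)
    if route in SINGLE_STEP:
        # Simple transform: one engine handles the whole conversion.
        return [(SINGLE_STEP[route], source, target)]
    if route in PIPELINES:
        # Pipeline transform: chain single steps through the intermediate types.
        hops = [source, *PIPELINES[route], target]
        return [(SINGLE_STEP[(a, b)], a, b) for a, b in zip(hops, hops[1:])]
    raise ValueError(f"no route from {source} to {target}")


print(plan_route("docx", "pdf"))
# [('libreoffice', 'docx', 'pdf')]
print(plan_route("png", "entities"))
# [('textract', 'png', 'text'), ('comprehend', 'text', 'entities')]
```

In this sketch, the intermediate `text` result of the second route corresponds to the kind of intermediate file that the shared file store holds between steps.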
The following diagram shows a simple representation of the Intelligence Services components:
The diagram shows an example of how you can deploy to AWS, using a number of managed services:
- Amazon EKS - Elastic Container Service for Kubernetes
- Amazon MQ - Managed message broker service for Apache ActiveMQ
- Amazon EFS - Amazon Elastic File System
You can replace the AWS services (EKS, MQ, and EFS) with a self-managed Kubernetes cluster, ActiveMQ (configured with failover), and a shared file store, such as NFS.
Integrated AWS Services
With the default configuration, Alfresco Intelligence Services requests renditions for all three services (Comprehend, Rekognition, and Textract). However, API processing calls are made only to the relevant AWS service. From version 1.1 onwards, you can configure which renditions are requested.
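As a rough sketch of the behavior described above, the following example requests renditions for all three services by default but only selects a service when it is relevant to the content. The `RELEVANT_TYPES` mapping and the `services_to_call` function are assumptions made for illustration. They are not the module's actual relevance rules or configuration keys, and the sketch ignores multi-step routes for simplicity.

```python
from typing import Dict, Iterable, List, Set

# Renditions requested by default: one per integrated AWS service.
DEFAULT_RENDITIONS = ("comprehend", "rekognition", "textract")

# Assumption for illustration only: the media types each service is treated
# as relevant for. The real module applies its own rules.
RELEVANT_TYPES: Dict[str, Set[str]] = {
    "rekognition": {"image/png", "image/jpeg"},
    "textract": {"image/png", "image/jpeg"},
    "comprehend": {"text/plain"},
}


def services_to_call(
    media_type: str, requested: Iterable[str] = DEFAULT_RENDITIONS
) -> List[str]:
    """Return the requested services that are relevant for this media type."""
    return [s for s in requested if media_type in RELEVANT_TYPES.get(s, set())]


print(services_to_call("image/png"))
# ['rekognition', 'textract']
print(services_to_call("text/plain"))
# ['comprehend']
# Narrowing the requested renditions, as a configurable setup might:
print(services_to_call("image/png", requested=["textract"]))
# ['textract']
```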
Before you can add these services to your deployment, you must complete some configuration in AWS. Follow the links at the end of this page to review the key features and detailed setup for each service.
Some of the Docker images that are used by the Intelligence Services module are uploaded to a private registry, Quay.io. Since the Intelligence Services module adds AI capabilities to Alfresco Transform Service (see Transform Service Deployment overview), you'll also need access to the following image:
Use the following links to review the key features and detailed setup for each AWS service.