Before continuing, make sure that you've already set up an IAM user. See Configuring AWS Identity and Access Management for more information.
Roles and permissions - Comprehend
To use IAM roles, you need a policy for the role to use. Policies grant permissions to the users, groups, and roles they are attached to. If there isn't a policy already in place for Amazon Comprehend access, create a new one. The credentials associated with your IAM user must have permissions to access Amazon Comprehend actions. These permissions are customized through roles associated with your IAM user.
To use Amazon Comprehend, create a new IAM role and configure a policy to access the desired services within Comprehend. You can use one of the predefined policies, ComprehendFullAccess or ComprehendReadOnly. ComprehendFullAccess grants full access to Amazon Comprehend. ComprehendReadOnly grants more limited access and doesn't allow you to use asynchronous jobs.
You must grant Amazon Comprehend access to the Amazon S3 bucket (that is, the AI S3 bucket) that contains your document collection. You can do this by creating a data access role in your account that trusts the Amazon Comprehend service principal.
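As a rough illustration of the data access role described above, the following Python sketch builds the two JSON documents such a role typically needs: a trust policy naming the Amazon Comprehend service principal, and a permissions policy scoped to one bucket. The bucket name `my-ai-bucket`, the function names, and the exact set of S3 actions are assumptions made for this example, so adjust them to your own setup.

```python
import json

# Service principal that Amazon Comprehend uses when assuming a role.
COMPREHEND_PRINCIPAL = "comprehend.amazonaws.com"


def build_trust_policy(service_principal: str) -> dict:
    """Return a trust policy that lets the given service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_bucket_access_policy(bucket_name: str) -> dict:
    """Return a permissions policy for one bucket.

    The action list here is an assumption for illustration; grant only
    what your deployment actually needs.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access applies to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
        ],
    }


# "my-ai-bucket" is a hypothetical bucket name.
trust_policy = build_trust_policy(COMPREHEND_PRINCIPAL)
access_policy = build_bucket_access_policy("my-ai-bucket")

if __name__ == "__main__":
    print(json.dumps(trust_policy, indent=2))
    print(json.dumps(access_policy, indent=2))
```

The printed JSON can be pasted into the IAM console when creating the role, or supplied through whichever tooling you use to manage IAM.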
Roles and permissions - Rekognition
The credentials associated with your IAM user must have permissions to access Amazon Rekognition actions. These permissions are customized through roles associated with your IAM user.
To use Amazon Rekognition, create a new IAM role and configure a policy to access the desired services within Rekognition. You can use one of the predefined policies, AmazonRekognitionFullAccess or AmazonRekognitionReadOnlyAccess. AmazonRekognitionFullAccess grants full access to Amazon Rekognition. AmazonRekognitionReadOnlyAccess grants more limited access and doesn't allow you to create or delete collections.
Images larger than 5 MB (and up to 15 MB) are first uploaded to an S3 bucket before they are analyzed. Make sure that you set up a bucket in the same region where you intend to deploy Alfresco Intelligence Services.
You must grant Amazon Rekognition access to the S3 bucket used above.
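The size rule above can be sketched as a small decision function. This is only an illustration: the function name, the return labels, the treatment of 1 MB as 1024 * 1024 bytes, the inclusive boundaries, and the handling of images over 15 MB as unsupported are all assumptions, not a description of how Alfresco Intelligence Services implements the check.

```python
MB = 1024 * 1024  # assumption: binary megabytes

INLINE_LIMIT_BYTES = 5 * MB   # up to this size, no S3 upload is needed
S3_LIMIT_BYTES = 15 * MB      # above this size, the image is out of range


def choose_delivery(size_bytes: int) -> str:
    """Classify an image by size into a hypothetical delivery path."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes <= INLINE_LIMIT_BYTES:
        return "inline"
    if size_bytes <= S3_LIMIT_BYTES:
        return "s3"
    return "unsupported"


if __name__ == "__main__":
    for size in (2 * MB, 8 * MB, 20 * MB):
        print(size, choose_delivery(size))
```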
Roles and permissions - Textract
The credentials associated with your IAM user must have permissions to access Amazon Textract actions. These permissions are customized through roles associated with your IAM user.
In order to use Amazon Textract, you'll need to create a new IAM role and configure a policy to access the desired services within Textract. The easiest way to do this is to attach the AWS managed policy AmazonTextractFullAccess to the IAM role.
You must grant Amazon Textract access to the S3 bucket used above.
Configuring the minimum confidence level
#################################
# Alfresco-AI Parameters #
#################################
ai.transformations.aiLabels.minConfidence=0.8
ai.transformations.aiFeatures.minConfidence=0.8
ai.transformations.aiTextract.minConfidence=0.8
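To illustrate what a minimum confidence threshold does, the sketch below parses the `minConfidence` properties and drops results that score below the threshold. The result shape (a `name` and a `confidence` between 0 and 1), the sample labels, the helper names, and the inclusive comparison are assumptions made for this example; the product's actual filtering logic may differ.

```python
from typing import Dict, List

SAMPLE_PROPERTIES = """
ai.transformations.aiLabels.minConfidence=0.8
ai.transformations.aiFeatures.minConfidence=0.8
ai.transformations.aiTextract.minConfidence=0.8
"""


def parse_min_confidence(properties_text: str) -> Dict[str, float]:
    """Collect every *.minConfidence key from properties-style text."""
    thresholds: Dict[str, float] = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip().endswith(".minConfidence"):
            thresholds[key.strip()] = float(value.strip())
    return thresholds


def filter_by_confidence(items: List[dict], threshold: float) -> List[dict]:
    """Keep only items whose confidence meets the threshold (inclusive)."""
    return [item for item in items if item["confidence"] >= threshold]


thresholds = parse_min_confidence(SAMPLE_PROPERTIES)

# Hypothetical label results, with confidence normalized to 0..1.
labels = [
    {"name": "Dog", "confidence": 0.97},
    {"name": "Cat", "confidence": 0.42},
]
kept = filter_by_confidence(
    labels, thresholds["ai.transformations.aiLabels.minConfidence"]
)

if __name__ == "__main__":
    print([item["name"] for item in kept])
```

Raising a threshold returns fewer but more reliable results; lowering it returns more results at the cost of more false positives.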
Cleaning up in S3
Whenever files are written to S3 for processing, they're removed once processing finishes or an exception is encountered. However, if the service is stopped or an uncaught exception is thrown, it's possible that files may be left in S3. It is recommended that you set up a policy on the S3 bucket so that objects that are older than a day are removed from the bucket.
To do this, you can create a lifecycle rule that expires all versions of objects after 1 day, and then permanently deletes them 1 day after that.
See the AWS site for more details on Object Lifecycle Management.
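The lifecycle rule described above can be expressed as a configuration document. The following sketch builds one in the shape accepted by the S3 lifecycle configuration API; the rule ID and function name are hypothetical, and applying the rule to the whole bucket through an empty prefix is an assumption that suits a bucket dedicated to temporary processing files.

```python
import json


def build_cleanup_lifecycle(days: int = 1) -> dict:
    """Return a lifecycle configuration that cleans up old objects."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-ai-temp-objects",  # hypothetical rule name
                "Status": "Enabled",
                # An empty prefix applies the rule to every object.
                "Filter": {"Prefix": ""},
                # Expire current object versions after `days` days.
                "Expiration": {"Days": days},
                # Permanently delete noncurrent versions `days` days later.
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


lifecycle = build_cleanup_lifecycle(1)

if __name__ == "__main__":
    print(json.dumps(lifecycle, indent=2))
```

A document like this can be applied in the S3 console, or passed as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration` call if you manage the bucket from Python.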