Automate - Workflow Design Guide

Disclaimer

The material in this document is for informational purposes only. The products it describes are subject to change without prior notice, due to the manufacturer’s continuous development program. Nuix makes no representations or warranties with respect to this document or with respect to the products described herein. Nuix shall not be liable for any damages, losses, costs or expenses, direct, indirect or incidental, consequential or special, arising out of, or related to the use of this material or the products described herein.

Introduction

This guide describes the options and options of the Workflow Design Web component of Automate. This document works like a reference - use the table of contents to look for the topic that you find out about.

The Automate software and this documentation may contain bugs, errors, or other limitations. If you encounter any issues with the Automate software or with this documentation, please contact nuix support.

Styles Used in This Guide

Note: This is icon indicates that additional clarifications are provided, for example what the valid options are.

Tip: This icon lets you know that some particularly useful tidbit is provided, perhaps a way in which to use the application to achieve a certain behavior.

Warning: This icon highlights information that may help you avoid an undesired behavior.

Emphasized: This style indicates the name of a menu, option or link.

code: This style indicates code that should be used verbatim, and can refer to file paths, parameter names or Nuix search queries.

1. Editing Workflows

Workflows are managed in Automate in the Libraries section.

To edit, delete, deactivate or activate a Workflow, select the Workflow from the Library and then click on the button at the right of the Workflow name.

To create a new Workflow, click on the Add+ Workflow button in the desired Library. A Workflow can be created in different ways:

Blank Workflow: Create a new workflow starting with a blank canvas.
Template: Build a workflow by starting from an existing template.
Workflow Wizard: Create a workflow that processes and exports data by answering a series fo questions.
Workflow File: Upload a previously created workflow file.

Parameters can be used in the Workflow along with static text in every field which accepts user input such as search queries, file paths, production sets names, etc. See the Parameters Guide for more details.

1.1. Operation Actions

The following actions can be performed on operations using the operation list buttons:

Add () an operation to the workflow.
Remove () the selected operations from the workflow.
Move Up () the selected operation in the workflow.
Move Down () the selected operation in the workflow.
Search () in the list of operations by name.

Additionally, the following actions can be performed using the Actions operations list menu:

Enable / Disable: An operation that is disabled has no effect to the workflow execution.
Make Skippable / Remove Skippable: If an operation is marked as Skippable, a user can skip the execution of the operation while the operation runs.

Skippable operations might leave the Job execution in an unexpected state. They should only be enabled if the subsequent workflow logic is not affected by the skipped operation.

Enable Soft Fail / Disable Soft Fail: An operation that is marked as Soft Fail will not halt the workflow execution in the event that the operation encounters an error.
Enable Field Ovverwrite / Disable Field Ovverwrite: An operation that is marked as Field Ovverwrite can have all of its fields overwritten with parameters starting with the operation name and followed by the field name, for example {set_purview_case_case_identifier_type}.
Insert Workflow: Insert operations from a workflow file at the selected position.
Cut: Cut the selected operations (CTRL+X).
Copy: Copy the selected operations (CTRL+C).
Paste: Paste the operations previousy cut or copied at the selected position (CTRL+V).
Delete: Delete the selected operations (Del).

2. Operations

Operations are categorized by the platform under which they execute.

When an operation has multiple platforms, for example the Metadata To SQL operation exports data from a Nuix case to a SQL server, it is documented under the platform it is most specific to, in this example SQL.

2.1. Azure Storage Operations

These operations perform actions related to Azure Storage accounts.

2.1.1. Azure Container Copy

This operation copies the contents of an Azure Container to another Azure Container using the Microsoft AzCopy command.

The following settings can be configured:

Source storage container URL: The URL of the source storage container.
Source storage account SAS token: The SAS access token of the source container. To create a token, see https://learn.microsoft.com/en-us/azure/cognitive-services/translator/document-translation/create-sas-tokens?tabs=Containers.
Destination storage container URL: The URL of the destination storage container.
Destination storage account SAS token: The SAS access token of the Destination container.
Command-line flags: Optionally, additional command-line flags for the AzCopy command.

2.1.2. Azure Container Download

This operation downloads the contents of an Azure Container to local storage using the Microsoft AzCopy command.

The following settings can be configured:

Storage container URL: The URL of the storage container.
Storage account SAS token: The SAS access token. To create a token, see https://learn.microsoft.com/en-us/azure/cognitive-services/translator/document-translation/create-sas-tokens?tabs=Containers.
Download location: The folder to download the data to.

2.1.3. Configure Azure Storage Account Connection

This operation sets the configuration used to connect to the Azure Storage account. This operation is required for all Azure Storage related operation with the exception of Azure Container Copy and Azure Container Download.

The Azure Storage Account ID must be specified as a parameter of the type Azure Storage Account.

2.1.4. Create Azure Storage Account Container

This operation create a container in the configured Azure Storage account.

The Container name will be normalized to respect the Azure requirements described at https://learn.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata#container-names

2.1.5. Delete Azure Storage Account Container

This operation deletes a container in the configured Azure Storage account.

2.1.6. Generate Azure Storage Account SAS Token

This operation generates a SAS access token in the configured Azure Storage account.

2.2. Brainspace

These operations transfer data between the Nuix case and Brainspace and allow managing various operations in Brainspace.

2.2.1. Set Brainspace Dataset

This operation connects to the Brainspace environment and retrieves the specified dataset ID, using the following settings:

Brainspace API URL: The URL to of the Brainspace environment, for example https://app.brainslace.local
Certificate fingerprint: Optional, the SHA-256 fingerprint of the Brainspace app server certificate that should be trusted even if the certificate is self-signed.
API key: The API key. This value can be obtained from the Brainspace Administration page → Connectors → API Authentication.
Dataset identifier:
- ID: The Brainspace dataset ID.
- Name: The Brainspace dataset name.
- Name (Regex): A regular expression to match the Brainspace dataset name by.
Existing dataset: The action to take if the case does not exist:
- Clone dataset if it does not already exist creates a new dataset by cloning the source dataset.
- Only use existing dataset triggers an error if the dataset does not exist.
Clone settings: The settings to use when cloning a dataset.
- Copy groups: Copy the groups of the source dataset to the newly created dataset.
- Add new dataset to group: Adds the newly created dataset to the specified group.

2.2.2. Load Items to Brainspace

This operation exports the text and metadata of items from the Nuix case and loads it to Brainspace.

The following settings can be configured:

Scope query: The Nuix query to select the items to load into Brainspace.
Export standard metadata: Export items standard metadata to Brainspace.
Export custom metadata from profile: Optional, the metadata profile to use for additional metadata to export to Brainspace. When using this option, a Custom field mapping file must be provided.
Custom fields mapping file: The JSON mapping file defining the mapping of the custom metadata profile to Brainspace.
Export DocIDs from production set: If checked, the name of the production set to export the DocID numbers from.
Trim body text at: if checked, the size in characters after which the body text of items is trimmed before loading to Brainspace.

When the body text of items is trimmed, the field Text Trimmed is set to true in Brainspace on the items in question.

The Tag failed items as option has the same behavior as in the Legal Export operation.

Sample Custom fields mapping file mapping 2 custom Nuix fields named Custom Field 1 and Custom Field 2 :

{
  "name": "Custom Mapping",
  "fields": [
    {
      "name": "Custom Field 1",
      "mapTo": "STRING"
    },
    {
      "name": "Custom Field 2",
      "mapTo": "ENUMERATION",
      "faceted": true
    }
  ]
}

2.2.3. Manage Brainspace Build

This operation manages the builds on the Brainspace dataset.

The following settings can be configured:

Wait for previous build to complete: Waits for a build that was running at the operation start to complete.
Build dataset: Triggers a build of the dataset.

The Build dataset option should be used after the Load Items to Brainspace operation to make the loaded items available for refiew.

Wait for build to complete: Waits for the build triggered by this operation to complete.

If a Wait option is selected and the build does not complete in the alloted time, the operation will fail.

The percentage progress of this operation reflects the elapsed timeout and is not an indication of the build progress.

2.2.4. Propagate Tags to Brainspace

This operation propagates tag values from Nuix items to the corresponding Brainspace documents as tag choices.

The following settings can be configured:

Scope query: The query to retrieve the Nuix items for which to propagate tags.
Nuix root tag: The name of the Nuix root tag.

When using this operation, it is expected that in Nuix, a root tag is created, for example Relevancy. Then, the Nuix items should be assigned subtag values under the root tag, for example Relevancy|Relevant and Relevancy|Not Relevant. The root Nuix tag will be mapped to a Brainspace tag (Relevancy in this example), and the Nuix subtag values will be mapped to Brainspace choices (Relevant and Not Relevant in this example.)

The Nuix items should only have one subtag value, because in Brainspace these are mapped to single-choice tags.

Nested subtags values such as Relevancy|Not Relevant|Personal are not supported.

This operation updates previous tag choices, but does not update items for which in Nuix no subtag exists. As a workaround, to indicate that a document should not have any of the previous tag choices, assign it to a new dedicated choice, for example Relevancy|Unassigned.

2.2.5. Retrieve Metadata from Brainspace

This operation reads metadata form items from Brainspace and applies is to the Nuix items.

The following settings can be configured:

Nuix scope query: The Nuix query to select the items to update.
Brainspace scope:
- All Items: Retrieve metadata from all Brainspace items in the dataset.
- Notebook: Only retrieve metadata from the Brainspace items in the specified Notebook.
Tag matching items: The tag to apply to the Nuix items that were matched to Brainspace items.
Retrieve Brainspace tags: Select whether to retrieve the tags assigned to items in Brainspace, and what prefix to use when applying the tags to the matching Nuix items.
Retrieve Brainspace classifier scores: Select whether to the values of Brainspace fields corresponding to classifiers. These fields are identified as having a numeric type and the word score in their name.
Retrieve Brainspace fields: Select whether to retrieve metadata fields from Brainspace to be assigned as custom metadata to the Nuix items, and which Brainspace fields to retrieve.

2.3. Gen AI

These operations perform enrichment on the Nuix items using Gen AI services.

2.3.1. Configure Gen AI Connection

This operation sets the configuration used to connect to the Gen AI service:

Gen AI service ID: The ID of the Gen AI service, {gen_ai_service_id}
Override model: If set, the model configured on the Gen AI service will be overridden with the value set below.
Model: If Override model is set, the model to use for subsequent Gen AI operations.

2.3.2. Gen AI Estimate Tokens

This operation will estimate the number of tokens required to run prompts on each of the documents in scope.

The following options can be configured:

Scope query: The Nuix query to select the items for which the size in tokens will be estimated.
Run on a sample of: The sample size to run the analysis on.
Content location:
- Item text: Estimate the token count of the item text body.
- Custom metadata: Estimate the token count of a custom metadata field.
- Images: Estimate the token count of the document images.
- Metadata profile field: Estimate the token count of a field from a metadata profile.

For the estimate of tokens of images, the calculation is performed by expecting that the document pages have a rectangular shape and that each page requires 6 image blocks during the tokenization.

Results metadata field: Store the prompt tokens count as custom metadata
Create metadata profile: If set, a metadata profile will be created in the case with the field reporting the tokens count.
Tag analyzed items as: The tag to apply to items analyzed.
Tag failed items as: The tag to apply to items that failed.

The responses of the Gen AI service are recorded on each document in custom metadata with the prefix GenAI. Additionally, the following system medata is recorded:

GenAI|System|Model: The model that was used for the last analysis, if applicable.
GenAI|System|Service: The hostname of the service that was used for the last analysis.
GenAI|System|Warning: The warning that was encountered during the last analysis, if any.
GenAI|System|Error: The error that was encountered during the last analysis, if any.

2.3.3. Gen AI Extract Entities

This operation will extract entities from each document in scope by running Gen AI prompts.

The following options can be configured:

Scope query: The Nuix query to select the items for analysis with Gen AI.
Run on a sample of: The sample size to run the analysis on.
Item text:
- Original item text: Prompt on the original item text body.
- Custom metadata: Prompt on the value of a custom metadata field.
- Metadata profile field: Prompt on the field from a metadata profile.
Sanitize content before prompting: If set, the content will be searched for the regex patterns listed. The matched patter will be replaced with the replacement string provided. The expression $1 can be used to reference the first matched group.
Context prompt: The prompt with which to initialize the Gen AI analysis.
Document prompt: The document prompt to send to the Gen AI service. The parameter {item_text} will be replaced with the item text, extracted using the rule selected in the Item text option. The parameter {item_properties} will be replaced with a JSON representation of all the item properties. The parameter {email_header} will be replaced with the email header of the item, if applicable, or an empty string otherwise.
Split content with separator: The separator to use when splitting the document to fit within the token limits.
Extraction prompt: The prompt with instructions on how to extract the entities.

The result from the prompt is expected to be a JSON list of entities, with each entity being a JSON object with the type and value fields.

Clean up prompt responses: If set, the prompt responses will be searched for the regex patterns listed and the matched text will be replaced.
Output JSON schema: Request the prompt response to follow the specified JSON schema.

Not all Gen AI services and models support JSON schemas.

Temperature: The temperature setting, from 0 to 1, to set to the Gen AI service.
Max response tokens: The maximum number of tokens to accept in the response for each prompt.
Results location:
- Entities: Store the results as entities on the item
- Custom metadata: Store the results as custom metadata with the supplied field prefix.
Create metadata profile: If set, a metadata profile will be created in the case with the fields corresponding to the entity names extracted.
Tag analyzed items as: The tag to apply to items analyzed.
Tag failed items as: The tag to apply to items that failed.

The responses of the Gen AI service are recorded on each document in custom metadata with the prefix GenAI. Additionally, the following system medata is recorded:

GenAI|System|Model: The model that was used for the last analysis, if applicable.
GenAI|System|Service: The hostname of the service that was used for the last analysis.
GenAI|System|Warning: The warning that was encountered during the last analysis, if any.
GenAI|System|Error: The error that was encountered during the last analysis, if any.

2.3.4. Remove Entities

This operation removes entities previously assigned to the items.

Specific entity types can be specified to be removed, or alternatively, all entities for the items in scope can be removed.

2.3.5. Gen AI Prompt On Documents

This operation will run Gen AI prompts on each document in scope and record the results in custom metadata fields of each document.

The following options can be configured:

Scope query: The Nuix query to select the items for analysis with Gen AI.
Run on a sample of: The sample size to run the analysis on.
Item text:
- Original item text: Prompt on the original item text body.
- Custom metadata: Prompt on the value of a custom metadata field.
- Metadata profile field: Prompt on the field from a metadata profile.
Sanitize content before prompting: If set, the content will be searched for the regex patterns listed. The matched patter will be replaced with the replacement string provided. The expression $1 can be used to reference the first matched group.
Context prompt: The prompt with which to initialize the Gen AI analysis.
Document prompt: The document prompt to send to the Gen AI service. The following parameters are available:
- The parameter {item_text} will be replaced with the item text, extracted using the rule selected in the Item text option.
- The parameter {item_properties} will be replaced with a JSON representation of all the item properties.
- The parameter {email_header} will be replaced with the email header of the item, if applicable, or an empty string otherwise.
- The parameter {doc_id} will be replaced with the item DocID.
- The parameter {item_name} will be replaced with the item name.
- The parameter {item_guid} will be replaced with the item guid.
Split content with separator: The separator to use when splitting the document to fit within the token limits.
Prompt in parts when content exceeds max prompt tokens: If set, a separate prompt is run for each part. If not set, a prompt is run on the first part only and a warning is recorded.
Keep responses from individual batches: If set, the responses from each part of the document will be stored as custom metadata field, with the name GenAI|name|Level x|Part y where name is the name of the prompt, x is the recursion level and y is the part number.
Context prompt for prompt responses: THe prompt with which to initialize the Gen AI analysis when prompting on the responses the previous level.
Content prompt for part responses: The prompt in which to send each response from the previous level.
Question prompts: The question prompts to send to the Gen AI service, with the following settings:
- Name: The name of the question
- Prompt: The prompt to use for the document content (i.e. level 1)
- Content prompt for part responses: The prompt for assembling the responses from the previous level. This option is only applicable when prompting in parts.

At the first prompt level, for each question, the Gen AI service will be sent the context prompt (if any), followed by the document prompt, followed by the question prompt.

When prompting in parts, at subsequent prompt levels, for each question, the Gen AI service will be sent the context prompt for part responses (if any), followed by the multiple content prompts for part responses (within the token limits), followed by the question prompt.

Clean up prompt responses: If set, the prompt responses will be searched for the regex patterns listed and the matched text will be replaced.
Parse JSON output: If set, the prompt responses will be parsed as a JSON dictionary and the resulting keys and values will be stored in nested custom metadata fields. If the data cannot be parsed as JSON, the response will be stored as a string in the custom metadata field. Dates will be parsed if they are in the ISO 8601 format, for example 2023-12-31T12:00:00Z, or in the yyyy-MM-dd format, for example 2023-12-31
Output JSON schema: Request the prompt response to follow the specified JSON schema.

Not all Gen AI services and models support JSON schemas.

Use follow-up prompts: The follow-up prompts to send to the Gen AI service following each question prompt. This option is only available when not prompting in parts.
Only run follow-up prompts if response matches regex: For each response, the follow-up prompts will only be run if the response matches the regex pattern.
Clean up follow-up prompt responses: If set, the follow-up prompt responses will be searched for the regex patterns listed and the matched text will be replaced.
Temperature: The temperature setting, from 0 to 1, to set to the Gen AI service.
Max response tokens: The maximum number of tokens to accept in the response for each prompt.
Create metadata profile: If set, a metadata profile will be created in the case with the fields corresponding to the prompt and follow-up prompt responses.
Tag analyzed items as: The tag to apply to items analyzed.
Tag failed items as: The tag to apply to items that failed.

The responses of the Gen AI service are recorded on each document in custom metadata with the prefix GenAI. Additionally, the following system medata is recorded:

GenAI|System|Model: The model that was used for the last analysis, if applicable.
GenAI|System|Service: The hostname of the service that was used for the last analysis.
GenAI|System|Warning: The warning that was encountered during the last analysis, if any.
GenAI|System|Error: The error that was encountered during the last analysis, if any.

2.3.6. Gen AI Prompt On Images

This operation will run Gen AI prompts on the images of the items in scope and record the results in custom metadata fields.

When running the Gen AI Prompt On Images operation on Nuix items which are images, it is recommended for the Nuix items to have stored binaries. When running the operation on Nuix items which are not images, it is recommended to run the Generate Printed Images operation first to generate the printed images.

The following options can be configured:

Scope query: The Nuix query to select the items for transcription with Gen AI.
Run on a sample of: The sample size to run the transcription on.
Context prompt: The prompt with which to initialize the Gen AI transcription.
Question prompts: The question prompts to send to the Gen AI service. For each question, the Gen AI service will be sent the context prompt (if any), followed by the image and the question prompt. Each question prompt will be sent to the Gen AI service independently of the other question prompts. The following parameters are available:
- The parameter {page_number} will be replaced with the number of the page on which the prompt is running, and the parameter {page_transcription} will be replaced with the value of the custom metadata GenAI|Transcription|i, where i is the page number on which the transcription is running.
- The parameter {page_number} will be replaced with a JSON representation of all the item properties.
- The parameter {email_header} will be replaced with the email header of the item, if applicable, or an empty string otherwise.
- The parameter {doc_id} will be replaced with the item DocID.
- The parameter {item_name} will be replaced with the item name.
- The parameter {item_guid} will be replaced with the item guid.
Parse JSON output: If set, the prompt responses will be parsed as a JSON dictionary and the resulting keys and values will be stored in nested custom metadata fields. If the data cannot be parsed as JSON, the response will be stored as a string in the custom metadata field. Dates will be parsed if they are in the ISO 8601 format, for example 2023-12-31T12:00:00Z, or in the yyyy-MM-dd format, for example 2023-12-31
Output JSON schema: Request the prompt response to follow the specified JSON schema.

Not all Gen AI services and models support JSON schemas.

Temperature: The temperature setting, from 0 to 1, to set to the Gen AI service.
Max response tokens: The maximum number of tokens to accept in the response for each prompt.
Printing options:
- Native images sub-scope query:: The items which should be sent as images to the Gen AI service. All other items will be printed and the printed images will be sent to the Gen AI service.
- Non-native images print megapixels:: For items that are printed, the size of each printed page, in megapixels.
Multi-page options:
- Max pages: The maximum number of pages to send to the Gen AI service for each item.
- Repeat prompt for each page:: If set, the prompts will be repeated for each page of the document.
- Text output page separator:: When prompting on multiple pages, the separator to use when concatenating the response from each page. The parameter {page_number} will be replaced with the number of the page.
- Add separator before first page: When selected, the separator will be added before the first page of the document.
- Merge multi-page JSON:: When prompting on multiple pages, if the response is provided in a JSON list format and this option is set, then the responses will be merged into a single JSON list.
Tag analyzed items as: The tag to apply to items analyzed.
Tag failed items as: The tag to apply to items that failed.

The responses of the Gen AI service are recorded on each document in custom metadata with the prefix GenAI. Additionally, the following system medata is recorded:

GenAI|System|Model: The model that was used for the last analysis, if applicable.
GenAI|System|Service: The hostname of the service that was used for the last analysis.
GenAI|System|Warning: The warning that was encountered during the last analysis, if any.
GenAI|System|Error: The error that was encountered during the last analysis, if any.

2.3.7. Gen AI Transcribe Items

This operation will run a Gen AI prompt on the images of the items in scope and record the results either in the document text or as a custom metadata field, with options similar to the Gen AI Prompt On Documents operation.

This operation can be used to transcribe images or PDF files.

The following options can be configured:

Scope query: The Nuix query to select the items for transcription with Gen AI.
Run on a sample of: The sample size to run the transcription on.
Results location:
- Item text: Store the result in the item text, either appending to the existing text, or overwriting it.
- Custom metadata: Store the results in a custom metadata field.
Context prompt: The prompt with which to initialize the Gen AI transcription.
Transcription prompt: The transcription prompt instructions.
Temperature: The temperature setting, from 0 to 1, to set to the Gen AI service.
Max response tokens: The maximum number of tokens to accept in the response for each prompt.
Printing options:
- Native images sub-scope query:: The items which should be sent as images to the Gen AI service. All other items will be printed and the printed images will be sent to the Gen AI service.
- Non-native images print megapixels:: For items that are printed, the size of each printed page, in megapixels.
Multi-page options:
- Max pages: The maximum number of pages to send to the Gen AI service for each item.
- Repeat prompt for each page:: If set, the prompts will be repeated for each page of the document.
- Text output page separator:: When prompting on multiple pages, the separator to use when concatenating the response from each page. The parameter {page_number} will be replaced with the number of the page.
- Add separator before first page: When selected, the separator will be added before the first page of the document. format, if this option is set, then the responses will be merged into a single JSON list.
Tag analyzed items as: The tag to apply to items analyzed.
Tag failed items as: The tag to apply to items that failed.

The responses of the Gen AI service are recorded on each document in custom metadata with the prefix GenAI. Additionally, the following system medata is recorded:

GenAI|System|Model: The model that was used for the last analysis, if applicable.
GenAI|System|Service: The hostname of the service that was used for the last analysis.
GenAI|System|Warning: The warning that was encountered during the last analysis, if any.
GenAI|System|Error: The error that was encountered during the last analysis, if any.

2.3.8. Gen AI Transcribe Files

This operation will run Gen AI prompts on each file in scope and create a text file with the transcription. This can be used to transcribe, describe or OCR images or PDF files.

Source PDF files are rasterized to images. Remaining items are sent as-is to the Gen AI platform.

For each source image file, a corresponding text file is written in the Output text files folder. A CSV report named summary_report.csv is produced, listing all source files, the transcription success status, the path and size of the resulting text file, as well as the output of the transcription engine.

The operation has the following settings:

Source image files folder: The folder containing the image files to be OCRed.
Scan folder recursively: If set, the source folder will be scanned recursively, and the output files will be created using the same folder structure.
Skip images with existing non-empty text files: If set, images will be skipped if a text file with the expected name and a size greater than 0 exists in the destination folder.
Assemble pages regex: The regular expression to use to detect documents with multiple pages, which were exported with one image file per page. The regex must have at least one matching group which is used to select the document base name.
Output text files folder: The folder in which the text files will be created.
Keep incomplete files: If set, empty files and incomplete text files from the OCR Engine are not deleted.
Context prompt: The prompt with which to initialize the Gen AI transcription.

The Temperature and Max Tokens options have the same behavior as in the Gen AI Prompt On Documents.

2.4. ElasticSearch

These operations transfer data between the Nuix case and ElasticSearch.

2.4.1. Configure ElasticSearch Connection

This operation sets the configuration used to connect to the ElasticSearch environment:

Host: The ElasticSearch host name, for example es.example.com, or 127.0.0.1.
Host: The port on which the ElasticSearch REST API is deployed, by default 9200.
Username: The username to authenticate with.
Password: The password for the username above.
Certificate fingerprint: Optional, the SHA-256 fingerprint of the ElasticSearch certificate that should be trusted even if the certificate is self-signed.
Bulk operations: The number of operations to submit in bulk to ElasticSearch. Using a higher value can increase throughout but requires more memory.

2.4.2. Export Items to ElasticSearch

This operation will export the metadata of items matching the scope query to ElasticSearch.

Scope query: The Nuix query to select the items to export to ElasticSearch.
Metadata profile: The Nuix metadata profile used during the export.
Index name: The ElasticSearch index name.
Export items text: If selected, the operation will export the item text in addition to the metadata. The text is exported in ElasticSearch under the item property _doc_text.
Trim item text at: The maximum number of characters to export from the item text. If the item text is trimmed, the ElasticSearch property _doc_text_trimmed is set on the item.

2.5. Knowledge Graph

These operations configure the connection to Knowledge Graph send promote data to Knowledge Graph.

2.5.1. Configure Knowledge Graph Connection

This operation is only available in Nuix Neo.

This operation sets the configuration used to connect to the Knowledge Graph service.

The Knowledge Graph service ID should be set to a parameter of type Knowledge Graph Service. During the submission of the workflow in Scheduler, the user will be prompted to select the Knowledge Graph Service and authenticate to the service if required.

The Playbook file should be set to a Nuix playbook file that will transform items to Knowledge Graph nodes and edges.

The Transaction size is used group and deduplicate similar transactions.

2.5.2. Promote Items to Knowledge Graph

This operation is only available in Nuix Neo.

This operation sends the items in scope to Knowledge Graph.

2.6. Microsoft Purview

These operations perform actions in Microsoft Purview eDiscovery (Premium).

For an overview of Microsoft Purview, see https://learn.microsoft.com/en-us/purview/ediscovery-overview.

2.6.1. Configure Purview Connection

This operation sets the configuration used to connect to Purview. This operation is required for all other operations performing actions in Purview.

The Microsoft Purview service ID must be specified as a parameter of the type Microsoft Purview Service.

2.6.2. Set Purview Case

This operation selects a Purview case, using the following settings:

Case identifier: The Name or ID of the Purview case.
Create case if it does not exist create a new case, with the following settings
- Case number: Optional, the case number to set on the case.
- Description: Optional, the description to set on the case.

2.6.3. Update Purview Case Settings

This operation updates the settings of the selected Purview case.

2.6.4. Manage Purview Case

This operation performs the following management actions on the selected Purview case:

Close: Closes the case.
Close and Delete: Closes the case and attemps to delete it.
Reopen: Opens a previously closed case.

2.6.5. Query Purview Objects

This operation queries objects from a Purview case and tracks them in parameters, for the following objects:

Custodians: All custodians from the case.
Custodial data sources: All custodial data sources from the case.
Non-custodial data sources: All non-custodial data sources from the case.

2.6.6. Add Custodial Data Sources to Purview

This operation adds custodial data sources to a Purview case using following settings:

Data sources file: A file containing the list of data sources to add.
Data sources: A table with the data sources to add.

2.6.7. Add Non-Custodial Data Sources to Purview

This operation adds non-custodial data sources to a Purview case using following settings:

Data sources file: A file containing the list of data sources to add.
Data sources: A table with the data sources to add.

2.6.8. Apply Hold to Purview Custodians

This operation applies a hold to Purview custodians, using the following settings:

All case custodians: Apply the hold to all custodians in the selected Purview case.
Custodians file: A file containing the list of custodians to apply the hold to.
Custodian IDs JSON: A JSON formatted list of Purview custodian IDs.
Wait for completion: Waits until the hold was applied.

2.6.9. Apply Hold to Purview Non-Custodial Data Sources

This operation applies a hold to Purview non-custodial data sources, using the following settings:

All case non-custodial data sources: Apply the hold to all non-custodial data sources in the selected Purview case.
Non-custodial data sources file: A file containing the list of non-custodial data sources to apply the hold to.
Non-custodial data sources IDs JSON: A JSON formatted list of Purview non-custodial data sources IDs.
Wait for completion: Waits until the hold was applied.

2.6.10. Remove Hold from Purview Custodians

This operation removes the hold from Purview custodians.

For the available settings, see Apply Hold to Purview Custodians

2.6.11. Remove Hold from Purview Non-Custodial Data Sources

This operation removes the hold from Purview non-custodial data sources.

For the available settings, see Apply Hold to Purview Non-Custodial Data Sources

2.6.12. Release Purview Custodians

This operation releases the Purview custodians from the case.

For the available settings, see Apply Hold to Purview Custodians

2.6.13. Release Purview Non-Custodial Data Sources

This operation releases the Purview non-custodial data sources form the case.

For the available settings, see Apply Hold to Purview Non-Custodial Data Sources

2.6.14. Add to Purview Search

This operation creates a Purview search and/or adds data sources to the search.

For the list of available fields to query against, see https://learn.microsoft.com/en-us/purview/ediscovery-keyword-queries-and-search-conditions

2.6.15. Estimate Purview Search Statistics

This operation estimates the items in scope of the Purview search, and is required before adding the items from the search to a review set.

2.6.16. Add to Purview Review Set

This operation creates a Purview review set and/or adds the results from a search to the review set.

2.6.17. Create Purview Review Set Query

This operation creates a query to apply to a Purview review set.

For the list of available fields to query against, see https://learn.microsoft.com/en-us/purview/ediscovery-document-metadata-fields

2.6.18. Delete Purview Review Set Query

This operation deletes a Purview review set query.

2.6.19. Delete Purview Search

This operation deletes a Purview search.

2.6.20. Export Purview Review Set

This operation exports the items from the review set or from a review set query.

To transfer the export, see the Azure Container Copy and Azure Container Download operations.

2.6.21. Convert Purview Export

This operation converts emails from a Purview Condensed directory structure (CDS) export to a Nuix Logical Image (NLI).

This operation does not support Teams and Copilot conversations. The operation is deprecated and is replaced by the Convert Purview CDS operation.

The following settings can be configured:

Purview export folder: The folder where the Purview data was downloaded to.
Resulting NLI location: The location of the resulting NLI.
Advanced options: Settings used to identify the Purview CDS loadfiles and Column names used to extract metadata from items.
Reduce noise items: This option reduces noise items from extensions provided in the field Noise extensions.

The option to reduce noise items removes children items that match the following criteria:

The Original_file_extension is in the Noise extensions list
The Native_extension is different then Original_file_extension
The Input_path ends in Original_file_extension
The Compound_path starts with Input_path and is followed by a suffix (ex: ../Presentation.pptx/slide25.xml.rels)

2.6.22. Convert Purview CDS

This operation converts emails, files and conversations from a Purview Condensed directory structure (CDS) export to a Nuix Logical Image (NLI).

The following settings can be configured:

Purview export folder: The folder where the Purview data was downloaded to.
Resulting NLI location: The location of the resulting NLI.
Export options: Settings used to identify the Purview CDS loadfiles
- Convert emails to RFC 5322 (.eml): Converts all emails to .eml format with standardized properties. Disabling this option retains emails in MAPI format but may have inconsistent parsing of attachments on older versions of the Nuix Engine.
- Detach regular attachments: Removes regular attachments from emails and replaces them with a stub attachment. This option does not impact the family items, but does have an impact on the MD5 hash calculation.
- *Stub modern attachments *: Adds a stub item for modern attachments to the email. This option does not impact the family items, but produces a different MD5 hash value than a standalone email without the modern attachment.
Item options: Column names used to extract metadata from general items.
Conversation options: Column names used to extract metadata from conversations.
Split export at: Break down the export and splits into multiple part of the maximum number of items specified.

NOTE: In some cases the split .NLI files may be larger or smaller than the specified split export at size, this is due to families needing to be kept in the same loadfile.

2.6.23. Purview Convert Advanced

This operation converts emails, files and conversations from a Purview MSG or PST export to a Nuix Logical Image (NLI).

The following settings can be configured:

Purview export folder: The folder where the Purview data was downloaded to.
Resulting NLI location: The location of the resulting NLI.
Export options: Settings used to identify the Purview export files
- Convert emails to RFC 5322 (.eml): Converts all emails to .eml format with standardized properties. Disabling this option retains emails in MAPI format but may have inconsistent parsing of attachments on older versions of the Nuix Engine.
- Detach regular attachments: Removes regular attachments from emails and replaces them with a stub attachment. This option does not impact the family items, but does have an impact on the MD5 hash calculation.
- *Stub modern attachments *: Adds a stub item for modern attachments to the email. This option does not impact the family items, but produces a different MD5 hash value than a standalone email without the modern attachment.
Item options: Column names used to extract metadata from general items.
Conversation options: Column names used to extract metadata from conversations.
Split export at: Break down the export and splits into multiple part of the maximum number of items specified.

NOTE: In some cases the split .NLI files may be larger or smaller than the specified split export at size, this is due to families needing to be kept in the same loadfile.

2.6.24. Convert Loadfile to Nuix Logical Image

This operation converts a CSV loadfile to a Nuix Logical Image (NLI).

The following settings can be configured:

Loadfile: The CSV loadfile to be converted.
Resulting NLI: The location of the resulting NLI file.
DocID column: The name of the column containing the document ID, or a unique identifier for each item.
Family ID column: Optional, the name of the column containing the family ID.
Path column: Optional, the name of the column containing the path of the document, excluding the document name.
Name column: Optional, the name of the column containing the name of the document.
Native file column: Optional, the name of the column containing the path to the native file.
Custodian column: Optional, the name of the column containing the custodian associated with the document.
MD5 column: Optional, the name of the column containing the document MD5.
Load all fields: Select this option to convert all columns from the loadfile to metadata fields in the NLI.
Fields metadata prefix: Optional, the prefix to use for metadata fields in the NLI.

2.6.25. Create Nuix Logical Image

This operation packages a local folder to a Nuix Logical Image (NLI).

The following settings can be configured:

Source location: The folder to be packaged.
Resulting NLI location: The location of the resulting NLI.

2.7. Google Vault

These operations perform actions in Google Vault.

For an overview of Google Vault, see https://support.google.com/vault/answer/2462365?hl=en

2.7.1. Configure Vault Connection

This operation sets the Google Vault Third-Party Service that will be used to connect to Google Vault. This operation is required for all other operations performing actions in Vault.

The Google Vault service ID must be specified as a parameter of the type Google Vault Service.

2.7.2. Set Vault Matter

This operation selects a Vault matter using the following settings:

Matter identifier: The ID, Name, or Name (Regex) of the Vault matter.
Matter state filter: The required state of the Vault matter.
Create matter if it does not exist: If the matter does not exist with the required state filter, creates a new matter with the following settings:
- Description: Optional, the description to set on the matter.

2.7.3. Manage Vault Matter

This operation performs the following management actions on the selected Vault matter.

Close: Closes the matter.
Delete: Deletes the matter.
Reopen: Reopens the matter.
Undelete: Undeletes the matter.

2.7.4. Create Vault Saved Queries

This operation creates Vault saved queries in the selected Vault matter using the following settings:

Query name prefix: The prefix used in the name for the saved queries.
Data scope: The scope of data for the saved queries.
Use date range: Set a date range to filter the data covered by the saved queries with the following settings:
- Time zone: The time zone of the date range.
- Start date: The start date for the date range.
- End date: The end date for the date range.
Query locations and terms:
- Read from CSV files: Read the query locations and terms from CSV files.
  - Query locations file: A file containing the list of query locations.
  - Query terms file: A file containing the list of query terms.
- Manually input: Manually input the query locations and terms.
  - Query locations: A table with the query locations to add.
  - Query terms: A table with the query terms to add.

A Location is the unit used for Vault queries and holds. It specifies the Google service, the location type and the value, for example:

MAIL,ACCOUNT,user1@example.com
GROUPS,ACCOUNT,group1@example.com

A Query term is a filter applied on the data covered by Vault queries and holds. It specifies the Google service and the service specific terms, for example:

MAIL,from:user1 subject:Hello has:attachment
GROUPS,from:group1

For a location, the available location types depend on the selected Google service. For example, for the Google Mail service, only Email, Organization Unit, and Entire Organization location types can be used. The Entire Organization location type is also only available for the Mail service.

2.7.5. Export Vault Saved Queries

This operation creates Vault exports in the selected Vault matter using the following settings:

Export name prefix: The prefix used in the name for the exports.
Region: The requested data region for the exports.
Message format: The file format for exported messages.
Mail options:
- Include Gmail confidential mode content: Export confidential mode content.
- Use new export system: Use the new export system.
- Export linked Drive files: Create a linked export for linked Drive files.
Drive options:
- Include access level info for users with indirect access to files: Include access level information for users with indirect access to files.
Saved query identifier type: The type of the identifiers.
Saved query identifiers: The identifiers used to find the saved queries.
Wait for completion: Wait for exports to complete.

Message format only applies to Gmail, Groups, Chat and Voice services.

2.7.6. Download Vault Exports

This operation downloads Vault exports from the selected Vault matter using the following settings:

Download location: The folder to download the exports to.
Explicitly include linked exports: Download linked exports.
Export identifier type: The type of the identifiers.
Export identifiers: The identifiers used to find the exports.

Linked exports are created when using the Export linked Drive files setting in the Export Vault Saved Queries operation.

2.7.7. Set Vault Exports

This operation selects Vault exports using the following settings:

Explicitly include linked exports: Include linked exports.
Wait for completion: Wait for exports to complete.
Export identifier type: The type of the identifiers.
Export identifiers: The identifiers used to find the exports.

2.7.8. Add Vault Holds

This operation adds Vault holds in the selected Vault matter using the following settings:

Hold name prefix: The prefix used in the name for the holds.
Mail/Groups options:
- Use date range: Set a date range to filter the data covered by the hold with the following settings:
  - Start date: The start date in UTC for the date range.
  - End date: The end date in UTC for the date range.
Drive/Chat options:
- Include items in share drives: Include files in shared drives.
- Include conversations in Chat spaces: Include messages in Chat spaces the user was a member of.
Hold locations and terms:
- Read from CSV files: Read the hold locations and query terms from CSV files.
  - Hold locations file: A file containing the list of hold locations.
  - Query terms file: A file containing the list of query terms.
- Manually input: Manually input the hold locations and terms.
  - Hold locations: A table with the hold locations to add.
  - Query terms: A table with the query terms to add.

See Create Vault Saved Queries for the definitions and examples of a Location and a Query term.

2.7.9. Remove Locations from Vault Holds

This operation removes locations from Vault holds in the selected Vault matter using the following settings:

All hold locations: Remove all hold locations.
Hold locations file: A file containing the list of hold location values to remove.
Hold locations: A table with the hold location values to remove.
Remove from all holds: Remove the specified locations from all holds.
Hold identifier type: The type of the identifiers.
Hold identifiers: The identifiers used to find the holds.

If all locations are removed from a hold, the hold will also be deleted.

2.8. Nuix Investigate

These operations assign permissions on items in the Nuix case for use in Nuix Investigate.

2.8.1. Add Items to Folder

This operation assigns items from the Nuix case matching the Scope query and the specific folder Query to the specified Folder.

If the option Include items in path is selected, all items in the path up to and including the root item will be included.

2.8.2. Remove Items from Folder

This operation removes items from the Nuix case matching the Scope query and the specific folder Query to the specified Folder.

2.8.3. Assign Folders to Group

This operation assigns Folders to Nuix Investigate Groups, identified by Name or ID.

2.9. Nuix Discover

These operations transfer data between the Nuix case and Nuix Discover and manage the build in Nuix Discover.

2.9.1. Configure Nuix Discover Connection

This operation sets the configuration used to connect to the Nuix Discover environment.

Optionally, the Discover Service can be used and point to a parameter of type Discover Service. During the submission of the workflow in Scheduler, the user will be prompted to select the Nuix Discover Service and authenticate to the service if required.

When not using a Nuix Discover Service, the following options are explicitly defined in the operation:

Discover hostname: The hostname of the Nuix Discover API, for example ringtail.us.nuix.com
API token: The API token to connect with. This token can be obtained from the Nuix Discover User Administration page → Users → username → API Access.

2.9.2. Set Nuix Discover Case

This operation retrieves the specified case ID, using the following settings:

Case identifier:
- ID: The Nuix Discover case ID.
- Name: The Nuix Discover case name.
- Name (Regex): A regular expression to match the Nuix Discover case name by.
File repository: The type of repository to use for uploading native files. For local Nuix Discover deployments set to the Windows File Share location corresponding to the imports folder of the Nuix Discover case. For SaaS deployments, use the Amazon S3 repository.

The File repository location can typically be derived from the name of the Nuix Discover case, for example using a path similar to \\DISCOVER.local\Repository\Import\{discover_case_name}. However in certain situations, the name of the import folder can be different than the name of the Nuix Discover case, for example if the case name has spaces or non-alphanumeric characters such as punctuation, or if 2 cases with the same name exist. In this scenarios, a script can be used to normalize the Nuix Discover case name and derive the expected import folder.

Existing case: The action to take if the case does not exist:
- Clone case if it does not already exist creates a new case by cloning the source case.
- Only use existing case triggers an error if the case does not exist.
Wait for case to be active: Waits for the specified time for the case to become active.

Use the Wait for case to be active option in a dedicated operation before promoting documents to Nuix Discover, to ensure that the documents can be uploaded.

Clone settings: The settings to use when cloning a case.

2.9.3. Promote to Nuix Discover

This operation exports a production set from the Nuix case and uploads the items to Nuix Discover.

The following settings can be configured:

Production set name: The name of the production set to promote to Nuix Discover.
Export standard metadata: Export items standard metadata to Nuix Discover. If checked, a copy of the metadata profile will be saved in the export folder.
Export custom metadata from profile: Optional, the metadata profile to use for additional metadata to export to Nuix Discover. To use this option, ensure that the Nuix Discover case is configured with the fields that are defined in the custom metadata profile.
Run indexing in Nuix Discover: Triggers an indexing in Nuix Discover after the documents are uploaded.

Enable the Run indexing in Nuix Discover option to have the content parsed and available for searching in Nuix Discover.

Run deduplication in Nuix Discover: Triggers a deduplication in Nuix Discover after the documents are uploaded.
Document ID strategy: Assign new Sequential document numbers from the Nuix Discover case, or use the Nuix Production set numbering.
Level: The Nuix Discover level to import documents to.
Documents per level: The maximum number of documents per level.
Filetypes: Select the components to uploadto the Nuix Discover case:
- Native files
- Text extraction from the Nuix case
- PDF image of the document
Temporary export folder: The folder where which the temporary legal export is created. After the upload is complete, the native and text files are deleted from the temporary folder.
Split export at: Break down the export and uploads into multiple part of the maximum number of items specified.
Wait for Nuix Discover job to finish: Waits until the items have been loaded into Nuix Discover before moving to the next upload part or before finishing the operation.

The Convert mail, contacts, calendars to, Export scheme, and Tag failed items as options have the same behavior as in the Legal Export operation.

2.9.4. Retrieve Metadata from Nuix Discover

This operation reads metadata from items in Nuix Discover and applies custom metadata or tags Nuix items.

The following settings can be configured:

Nuix scope query: The Nuix query to select the items to update.

Action: The action to perform on the Nuix items

Tag Matching Items: Tags items that exist in Nuix Discover and in the Nuix case
Retrieve Fields: Retrieves fields from Nuix Discover and applies it to a matching item in the Nuix case as custom metadata
Tag Matching Items and Retrieve Fields: Performs both of the actions above

Nuix Discover item source: Where in Nuix Discover to query items from

All Documents: All documents within the Nuix Discover case
Saved Search: Items that match a saved search
Production: Items in a production
Binder: Items in a binder

Match Nuix items on: The GUID or Document ID of the item to match items from Nuix Discover

Match Nuix Discover items on: The Document ID or Named Field to match items from a Nuix case

When using Name Field scope, the user must provide a field to use when matching items in Nuix Discover to items in a Nuix case

Nuix tag name: The tag name to use when an item from Nuix discover matches an items in the Nuix case

Export CSV: Export field values for Nuix Discover documents to a CSV file

Nuix Discover fields: The fields to retrieve from Nuix Discover

In addition to providing the values for the Nuix Discover fields manually, the user can also load from a CSV or TSV file, for example:

Field Name
[Meta] GUID
Document Type
Created By

2.10. Nuix ECC

These operations perform actions with Nuix ECC.

2.10.1. Configure Nuix ECC Connection

This operation sets the configuration used to connect to the Nuix ECC environment.

Optionally, the Nuix ECC Service can be used and point to a parameter of type Nuix ECC Service. During the submission of the workflow in Scheduler, the user will be prompted to select the Nuix ECC Service.

Hostname: The hostname of the Nuix ECC instance
Endpoint type: The Nuix ECC Endpoint Type, for example HTTPS.
Username: The username used to connect to the Nuix ECC instance.
Password: The password for the username above.

The value entered in this field will be stored in clear text in the workflow file - a password SHOULD NOT be entered in this field. Instead, set this field to a protected parameter name, for example {nuix_ecc_password} and see section Protected Parameters for instructions on how to set protected parameter values.

2.10.2. Set Nuix ECC Case

This operation sets the case to use for Nuix ECC Collections, using the following settings:

Case identifier: The Name, ID or Name (Regex) of the Nuix ECC case
Create case if it does not exist: Optionally, create a new Nuix ECC case if the one specified does not exist.

2.10.3. Set Nuix ECC Collection Configuration

This operation sets the configuration to use for Nuix ECC Collections, using the following settings:

Configuration identifier: The Name, ID or Name (Regex) of the Nuix ECC Collection configuration

2.10.4. Add Collection Sources to Nuix ECC Collection

This operation adds sources to collect from for the Nuix ECC Collection, using the following settings:

Collection sources: The sources to collect from:
- Identifier: The identifier of the source, for example LAPTOP-4KYG769
- Identifier type: The Name, ID or Name (Regex) used to determine how a source is identified
- Source type: The type of source the user is using, for example Computer
- Collection strategy: The strategy used when collecting from the source, either Use Configuration or Use Custom Paths
- Collection custom paths: The custom paths to collect from, for example C:\Data\Files

When using the collection strategy Use Configuration, sources must have a predefined location to collect from. This setting is defined from within the ECC Admin Console application.

In addition to providing the values for the collection sources manually, the user can also load from a CSV or TSV file, for example:

Identifier  IdentifierType  SourceType  CollectionStrategy  CollectionCustomPaths
LAPTOP-4KYG769  NAME    COMPUTER    PREDEFINED  ""
Server\s\d  NAME_REGEX  COMPUTER    PREDEFINED  ""
119 ID  COMPUTER    CUSTOM_PATH "C:\Data\Files,C:\Users\Admin\Documents,D:\Temp"

When specifying collection sources from a CSV or TSV file, if the user is using custom paths the paths must use a comma delimiter ,, for example C:\Data\Files,C:\Users\Admin\Documents

2.10.5. Deploy Nuix ECC Agents

This operation is used to deploy Nuix ECC agents on computers, using the following settings:

Service account username: The username of the service account used to run commands on computers
Service account password: The password of the service account used to run commands on computers
Computer names: The names of computers to deploy ECC agents on, for example DESKTOP-AZH1K4

In addition to providing the values for the computer names manually, the user can also load from a CSV or TSV file, for example:

ComputerName
LAPTOP-4KYG769
DESKTOP-AZH1K4
Server2

Install command: the command used to install the ECC agent on a computer

Example using WinRS to deploy agents:

winrs /r:{computer_name} /u:{username} /p:{password} "msiexec.exe /i PATH_TO_INSTALLER /q /norestart"

Example using PsExec to deploy agents:

PATH_TO_PSEXEC \\{computer_name} -u {username} -p {password} -nobanner -s msiexec.exe /i PATH_TO_INSTALLER /q /norestart

The PATH_TO_INSTALLER is the path to the ECC Client Installer, for example \\Storage\Installers\ECC_Client_Installer.msi. The PATH_TO_PSEXEC is the path to the PsExec executable, for example C:\SysInternals\psexec.exe

The install command uses custom parameters and exposes {computer_name}, {username} and {password}. The username and password parameters will always be the service account username and the service account password. The computer name parameter will change to the name of the computer which the agent is being installed on.

Retry command on failure: Retries the install command if it failed to run the first time, the user also has the option to set how many times they would like to retry to command
Timeout: The amount of time given for the agent to deploy and be visible on Nuix ECC Admin console.

The timeout is for the command and applies for each run of the command, for example if the user set a timeout of 2 minutes and allowed up to 5 command retries, then each time the command ran the command would time out every two minutes. If the command failed 5 times the total time for the command would be 10 minutes.

2.10.6. Submit Nuix ECC Collection

This operation submits a Nuix ECC Collection to ECC, using the following settings:

Collection name: The name of the Nuix ECC Collection

The name of the Nuix ECC Collection may change if more than one collection source is in the collection, the format for collections with more than one collection source is collection_name (1 of 4). Where collection name is the name of the collection 1 would be the index of the collection source and 4 is the total amount of collection sources in the collection.

Wait for collection to finish: Optionally, waits for the Nuix ECC Collection to finish before moving onto the next operation.
Collection location: The location where collected files will be stored. This location must be available to all computers.

2.10.7. Remove Nuix ECC Agents

This operation is used to remove Nuix ECC agents on computers, using the following settings:

Service account username: The username of the service account used to run commands on computers
Service account password: The password of the service account used to run commands on computers
Computer names: The names of computers to remove ECC agents on, for example DESKTOP-AZH1K4

In addition to providing the values for the computer names manually, the user can also load from a CSV or TSV file, for example:

ComputerName
LAPTOP-4KYG769
DESKTOP-AZH1K4
Server2

Uninstall command: the command used to uninstall the ECC agent on a computer

Example using WinRS to uninstall agents:

winrs /r:{computer_name} /u:{username} /p:{password} "msiexec.exe /x PATH_TO_INSTALLER /q /norestart"

Example using PsExec to uninstall agents:

PATH_TO_PSEXEC \\{computer_name} -u {username} -p {password} -nobanner -s msiexec.exe /x PATH_TO_INSTALLER /q /norestart

The uninstall command uses custom parameters and exposes {computer_name}, {username} and {password}. The username and password parameters will always be the service account username and the service account password. The computer name parameter will change to the name of the computer which the agent is being installed on.

Retry command on failure: Retries the uninstall command if it failed to run the first time, the user also has the option to set how many times they would like to retry to command
Timeout: The amount of time given for the agent to be removed

2.11. Nuix Engine

These operations perform actions with the Nuix Engine.

2.11.1. Configure Nuix

This operation is used to define the settings of the Nuix processing engine, from Nuix Configuration profile and/or a Nuix Processing profile. The use of Processing profiles is recommended over Configuration profiles.

By default, Nuix stores configuration profiles in the user-specific folder %appdata%\Nuix\Profiles. To make a configuration profile available to all users, copy the corresponding .npf file to %programdata%\Nuix\Profiles.

Only a subset of settings from the Configuration profiles are supported in Automate Workflow, including Evidence Processing Settings (Date Processing, MIME Type, Parallel Processing), Legal Export (Export Type - partial, Load File - partial, Parallel Processing).

Configure Workers

The worker settings can either be extracted from the Nuix settings (see above) or can be explicitly provided in the workflow.

For local workers, these settings can be used to specify the number of local workers, the memory per worker and the worker temporary directory.

Nuix does not support running the OCR Operation and Legal Export operation with no local workers. If a value of 0 is specified in the local workers for these operations, Automate Workflow will start the operation with 1 local worker and as many remote workers as requested.

When the option Use remote workers is enabled, Automate will attempt to add as many workers as the engine allows, for example if the users engine allows 5 workers - the operation will attempt to add 5 workers.

Parallel processing settings can also be set using the following parameters:

{local_worker_count} - The number of local workers to run;
{local_worker_memory} - The memory (in MB) of each local worker;

Password settings

Passwords are used during the loading and re-loading of the data in Nuix. This section allows for specifying the use of a password list of passwords file.

Keystore settings

Keystores are used during the loading and re-loading of the data in Nuix. This section allows for specifying a CSV or TSV file containing the keystore information.

The keystore configuration file expects the following columns:

Path: The file path to the keystore
Password: The password of the keystore
Alias: The alias to use from the keystore
AliasPassword: The password for the alias
Target: The notes storage format file (NSF)

Sample Lotus Notes ID:

Path	Password	Alias	AliasPassword	Target
C:\Stores\Lotus\user.id	password			example.nsf
C:\Stores\Lotus\automate.id	password123			automate.nsf

When configuring a Lotus Notes ID store, the target can be the full path or the filename of the notes storage format file (NSF). Additionally the target can be set to * for the ID file to be applied to any (NSF) file.

Sample showing PGP, PKCS12 and Lotus Notes ID:

Path	Password	Alias	AliasPassword	Target
C:\Stores\PGP\0xA8B31F11-sec.asc		test@example.com	test_password
C:\Stores\PKCS12\template.keystore	password	ssl_cert
C:\Stores\Lotus\user.id	password			example.nsf
C:\Stores\PKCS12\example.keystore	password123	example-sample
C:\Stores\PGP\0x9386E293-sec.asc		user@example.com	abcd1234

When configuring the keystore file not all columns will have values, before adding this file to the workflow verify the values are in the correct columns.

A single keystore can be set using the following parameters:

{keystore_file_path} - The path to the keystore.
{keystore_file_password} - The password of the keystore.
{keystore_file_alias} - The alias to use from the keystore.
{keystore_file_alias_password} - The password for the alias.
{keystore_file_target} - The notes storage format file (NSF).

When using a single keystore the {keystore_file_path} parameter must contain a valid file path for the keystore to be added.

The keystore file can also be set using the parameter:

{keystore_tsv} - The file path to the keystore CSV or TSV file;

Require Nuix Profiles in Execution Profile

When using the workflow in Automate, selecting the option Require all Nuix profiles to be supplied in the Execution Profile option will require that all Nuix profiles used in the Workflow are explicitly supplied in the Execution Profile. If profiles are missing, the Job will not start.

2.11.2. Use Case

This operation opens an existing Nuix case or creates one, depending on the Method option specified.

The case timezone can be overwritten by setting parameter {case_timezone_id}. See Joda Time Zones for a list of valid timezone IDs.

2.11.3. Add to Compound Case

This operation adds existing cases to the currently opened Nuix case.

The current Nuix case must be a compound case, otherwise this operation will fail during execution.

By default the compound case will be closed and reopened after all child cases are added. The option Skip reloading compound case changes this behavior and does not reload the compound case. Some operations might not perform correctly when using this option due to the compound case not being refreshed.

2.11.4. Add Evidence

This operation adds evidence to the Nuix case.

The type of data that is added to the Nuix case is defined using the Scope setting:

Path item: Adds the file or folder as an evidence container.
Path contents: Adds the contents of the folder as an evidence container.
Path contents as separate evidence: Create separate evidence containers for each file or subfolder in the base folder.
Loadfile: Adds the contents of a Concordance, CSV or EDRM XML 1.2 loadfile as an evidence container.
Evidence listing: Creates an evidence container for each line in the listing. See Add Evidence from Evidence listing.
Google Vault Export: Creates an evidence container for each Google Vault Export and adds Drive Link Export files as child items. See Add Evidence from Google Vault Exports.
Data set: Adds the contents of the Data set as an evidence container. See Add Evidence from Data Set.
Microsoft Graph: Adds Teams, Calendar, Contacts, Sharepoint data to an evidence container using Microsoft Graph. See Add Evidence from Microsoft Graph.
SharePoint: Adds SharePoint data to an evidence container. See Add Evidence from SharePoint.
Exchange: Adds Exchange data to an evidence container. See Add Evidence from Exchange.
Enterprise Vault: Adds Enterprise Vault data to an evidence container. See Add Evidence from Enterprise Vault.
S3: Adds S3 data to an evidence container. See Add Evidence from S3.
SQL Server: Adds SQL Server data to an evidence container. See Add Evidence from SQL Server.
Oracle: Adds Oracle database data to an evidence container. See Add Evidence from Oracle.
Documentum: Adds Documentum data to an evidence container. See Add Evidence from Documentum.
Dropbox: Adds Dropbox data to an evidence container. See Add Evidence from Dropbox.
SSH: Adds SSH data to an evidence container. See Add Evidence from SSH.
Historical Twitter: Adds Historical Twitter data to an evidence container. See Add Evidence from Historical Twitter.

The source data timezone specified in the settings, and can be overwritten by setting parameter {data_timezone_id}. See Joda Time Zones for a list of valid timezone IDs.

The source encoding and zip encoding can be specified in the settings.

2.11.5. Deduplication

If this option is selected, data will be deduplicated at ingestion. Unless data will be added to the case in a single batch, the option Track and deduplicate against multiple batchloads needs to be selected.

The mechanism for deduplication at ingestion is designed to be used for the specific scenarios where a large amount of data is loaded and which is expected to have a high level of duplication. Due to the live synchronization required between the Nuix workers during the ingestion, only one ingestion with deduplication can run at a time on a server, and no remote workers can be added.

Handling of duplicate items:

Metadata-only processing: Deduplication status is tracked using the metadata field Load original. Top-level original items will have the value true in this field and will have all typical metadata and descendants processed - the descendants will not have this metadata field populated. Top-level duplicate items will have value false in this field and no other properties except for the metadata field Load duplicate of GUID which will indicate the GUID of the original document with the same deduplication key as the duplicate document.

To query all items that were not flagged as duplicates, use query !boolean-properties:"Load original":false.

Skip processing entirely will completely skip items identified as duplicates and no reference of these items will exist in the case.

Deduplication method:

Top-level MD5: Uses the MD5 hash of the top-level item.
Email Message-ID: Uses the email Message-ID property from the first non-blank field: Message-ID, Message-Id, Mapi-Smtp-Message-Id, X-Message-ID, X-Mapi-Smtp-Message-Id, Mapi-X-Message-Id, Mapi-X-Smtp-Message-Id.
Email MAPI Search Key: Uses the email MAPI Search Key property from the first non-blank field: Mapi-Search-Key, X-Mapi-Search-Key.

For a deduplication result similar to the post-ingestion Nuix ItemSet deduplication, check option Top-level MD5 only. For the most comprehensive deduplication result, check all three options.

Emails in the Recoverable Items folder are not considered for deduplication based on Message-ID and MAPI Search Key, due to the fact that data in this folder is typically unreliable.

2.11.6. Date filter

All modes other than No filter specify the period for which data will be loaded. All items that fall outside of the date filter will be skipped entirely and no reference of these items will exist in the case.

2.11.7. Mime type filter

Allows to set a filter to restrict data of certain mime-types to specific names.

For example, the filter mode Matches, with mime-type application/vnd.ms-outlook-folder and item name Mailbox - John Smith will have the following effect:

Items which are in a PST or EDB file, must have the first Outlook Folder in their path named Mailbox - John Smith.
Items which are not in a PST or EDB file are not affected.

The Mime type filter can be used to select specific folders for loading from an Exchange Database (EDB) file.

2.11.8. Add Evidence from Evidence listing

When selecting the Scope option Evidence listing, the Source path is expected to point to a CSV or TSV file with the following columns:

Name: The name of the evidence container
Path: The path to the file or folder to load
Custodian: Optional, the custodian value to assign
Timezone: Optional, the timezone ID to load the data under. See Joda Time Zones for a list of valid timezone IDs.
Encoding: Optional, the encoding to load the data under.
ZipEncoding: Optional, the encoding to load the zip files under.

If additional columns are specified, these will be set as custom evidence metadata.

If optional settings are not provided, the default settings from the Add Evidence operation will be used.

When selecting the option Omit evidence folder names, the last folder name from the path to each evidence included in the listing will not be included in the path in the Nuix case. Instead, all items from the folder will appear directly under the evidence container.

Sample evidence listing:

Name	Path	Custodian	Encoding	Timezone	Sample Custom Field	Another Sample Field
Evidence1	C:\Data\Folder1	Morrison, Jane	UTF-8	Europe/London	Value A	Value B
Evidence2	C:\Data\Folder2	Schmitt, Paul	Windows-1252	Europe/Berlin	Value C	Value D

2.11.9. Add Evidence from Data Set

When selecting the Scope option Data set, the Data set ID field should point to a data set parameter defined in the Configuration operation.

The Data set scope is only compatible with jobs submitted in Automate Scheduler and for Matters that have Data sets associated with them.

2.11.10. Add Evidence from Google Vault Exports

When selecting the Scope option Google Vault Export, the Source path is expected to point to a folder containing all the Google Vault Exports and Drive Link Exports. This is the same folder structure obtained when downloading exports from the Download Vault Exports operation.

There are three separate ways to add Drive Link Exports:

As family items: Skip creating an evidence container for the Drive Link Export and add each drive link file as a family item.
- Linked export file add as family item limit: Limit how many times a drive link file can be added as a family item. After the limit has been reached, a placeholder will be used instead.
- Replace linked export file with placeholder in duplicate families: Whether to use a placeholder for the drive link file when encountering duplicate families.
As full standalone items + placeholder family items: Create an evidence container for the Drive Link Export and add placeholder files as family items in-place of the drive link file.
As standalone items: Create an evidence container for the Drive Link Export without any link to the parent Export files.

When using the As family items option, the operation can potentially take a long period of time if there are a lot of drive link files to be added as family items. Using another method, or the item limit, can help prevent this issue.

Placeholder files are used to lessen the strain on adding drive link files as family items.

All placeholder files track a Content Item GUID custom metadata which points to the full item represented by the placeholder.

After adding all Google Vault Exports and Drive Link Exports, the option Associate Google Vault metadata will parse and assign custom metadata from the metadata xml and csv files found in the Export folders.

2.11.11. Add Evidence from Microsoft Graph

When adding data using the Microsoft Graph, the following configuration parameters must be defined prior to the Add Evidence operation.

{ms_graph_tenant_id}: The tenant ID for Azure AD.
{ms_graph_client_id}: The client/application ID for the app that has been registered with Azure AD which and granted the necessary privileges.
{ms_graph_client_secret_protected}: The client secret that has been configured for the client ID provided, for authentication.
{ms_graph_certificate_store_path}: The path to a PKCS#12 certificate store, to use instead of the client secret, for authentication.
{ms_graph_certificate_store_password}: The password for the PKCS#12 certificate store, if present.
{ms_graph_username}: Optionally, the username for a user that is a member of the Teams to be processed, only needed for ingesting Team Calendars.
{ms_graph_password}: The password for the username, if it is present.

For authentication, one of the {ms_graph_client_secret_protected} or {ms_graph_certificate_store_path} parameters must be set.

{ms_graph_start_datetime}: The beginning of the collection date range.
{ms_graph_end_datetime}: The end of the collection date range.

For collection of calendars (Users or Teams), the date range cannot exceed 5 years.

{ms_graph_retrievals}: A list of the content types to be retrieved, containing one or more of the following values: TEAMS_CHANNELS, TEAMS_CALENDARS, USERS_CHATS, USERS_CONTACTS, USERS_CALENDARS, USERS_EMAILS, ORG_CONTACTS, SHAREPOINT.
{ms_graph_mail_folder_retrievals}: Optionally, a list of mail folders to retrieve from, containing one or more of the following values: ARCHIVE, CLUTTER, CONVERSATION_HISTORY, DELETED_ITEMS, DRAFTS, INBOX, JUNK, OUTBOX, SENT_ITEMS, SYNC_ISSUES, OTHER, RECOVERABLE_ITEMS_DELETIONS, RECOVERABLE_ITEMS_PURGES, RECOVERABLE_ITEMS_DISCOVERY_HOLDS, RECOVERABLE_ITEMS_SUBSTRATE_HOLDS, RECOVERABLE_ITEMS_OTHER.

In addition to the retrieval options above, the values ALL, MAILBOX_ALL, and RECOVERABLE_ITEMS_ALL may be used to include all retrieval options, all retrievals in the user’s mailbox, and all retrievals of recoverable items, respectively.

{ms_graph_team_names}: Optionally, a list of team names to filter on.
{ms_graph_user_principal_names}: Optionally, a list of user principal names to filter on.
{ms_graph_version_retrieval}: Optionally, a boolean indicating of all versions should be retrieved. Defaults to false
{ms_graph_version_limit}: Optionally, an integer limiting the number of versions retrieved if version retrievel is enabled. Defaults to -1 which retrieves all versions available.

Sample Microsoft Graph collection parameters:

{ms_graph_tenant_id} : example.com
{ms_graph_client_id} : 6161a8bb-416c-3015-6ba5-01b8ca9819f6
{ms_graph_client_secret_protected} : AvjAvbb9akNF<pbpaFvz,mAGjgdsl>vk
{ms_graph_start_datetime} : 20180101T000000
{ms_graph_end_datetime} : 20201231T235959
{ms_graph_user_principal_names} : john.smith@example.com, eve.rosella@example.com
{ms_graph_retrievals} : TEAMS_CHANNELS, USERS_CHATS, USERS_EMAILS, SHAREPOINT
{ms_graph_mailbox_retrievals} : MAILBOX, ARCHIVE, RECOVERABLE_ITEMS, ARCHIVE_RECOVERABLE_ITEMS

For details on how to configure the Microsoft Graph authentication, see the Nuix documentation on the Microsoft Graph connector at https://download.nuix.com/system/files/Nuix%20Connector%20for%20Microsoft%20Office%20365%20Guide%20v9.0.0.pdf

2.11.12. Add Evidence from SharePoint

When adding data from SharePoint, the following configuration parameters must be defined prior to the Add Evidence operation.

{sharepoint_uri}: A URI specifying the site address.
{sharepoint_domain}: This optional parameter defines the Windows networking domain of the server account.
{sharepoint_username}: The username needed to access the account.
{sharepoint_password}: The password needed to access the account.

2.11.13. Add Evidence from Exchange

When adding data from Exchange, the following configuration parameters must be defined prior to the Add Evidence operation.

{exchange_uri}: The path to the Exchange Web Service (e.g. https://ex2010/ews/exchange.asmx).
{exchange_domain}: This optional parameter defines the Windows networking domain of the server account.
{exchange_username}: The username needed to access the account.
{exchange_password}: The password needed to access the account.
{exchange_mailbox}: The mailbox to ingest if it differs from the username.
{exchange_impersonating}: A boolean, defaults to false. This optional setting instructs Exchange to impersonate the mailbox user instead of delegating when the mailbox and username are different.
{exchange_mailbox_retrieval}: A list containing one or more of the following values: mailbox, archive, purges, deletions, recoverable_items, archive_purges, archive_deletions, archive_recoverable_items, public_folders.
{exchange_from_datetime}: This optional parameter limits the evidence to a date range beginning from the specified date/time. It must be accompanied by the {exchange_to_datetime} parameter.
{exchange_to_datetime}: This optional parameter limits the evidence to a date range ending at the specified date/time. It must be accompanied by the {exchange_from_datetime} parameter.

2.11.14. Add Evidence from Enterprise Vault

When adding data from Enterprise Vault, the following configuration parameters must be defined prior to the Add Evidence operation.

{ev_computer}: The hostname or IP address of Enterprise Vault.
{ev_vault}: A vault store ID. This optional parameter limits the evidence to the specified Enterprise Vault vault.
{ev_archive}: An archive ID. This optional parameter limits the evidence to the specified Enterprise Vault archive.
{ev_custodian}: A name. This optional parameter limits the evidence to the specified custodian or author.
{ev_from_datetime}: This optional parameter limits the evidence to a date range beginning from the specified date/time. It must be accompanied by the {ev_to_datetime} parameter.
{ev_to_datetime}: This optional parameter limits the evidence to a date range ending at the specified date/time. It must be accompanied by the {ev_from_datetime} parameter.
{ev_keywords}: This optional parameter limits the evidence to results matching Enterprise Vault’s query using the words in this string. Subject and message/document content are searched by Enterprise Vault and it will match any word in the string unless specified differently in the {ev_flag} parameter.
{ev_flag}: An optional value from any, all, allnear, phrase, begins, beginany, exact, exactany, ends, endsany.

The {ev_flag} parameter specifies how keywords are combined and treated for keyword-based queries. It must be accompanied by the {ev_keywords} parameter but will default to any if it is omitted.

2.11.15. Add Evidence from S3

When adding data from S3, the following configuration parameters must be defined prior to the Add Evidence operation.

{s3_access}: This parameter specifies the access key ID for an Amazon Web Service account.
{s3_secret_protected}: This parameter specifies the secret access key for an Amazon Web Service account.
{s3_credential_discovery_boolean}: This optional parameter is only valid when access and secret are not specified. A true value allows credential discovery by system property. A false or omitted value will attempt anonymous access to the specified bucket.
{s3_bucket}: This optional parameter specifies a bucket and optionally a path to a folder within the bucket that contains the evidence to ingest. For example, mybucketname/top folder/sub folder. Omitting this parameter will cause all buckets to be added to evidence.
{s3_endpoint}: This optional parameter specifies a particular Amazon Web Service server endpoint. This can be used to connect to a particular regional server, e.g. https://s3.amazonaws.com.

2.11.16. Add Evidence from Documentum

When adding data from Documentum, the following configuration parameters must be defined prior to the Add Evidence operation.

{documentum_domain}: This optional parameter defines the Windows networking domain of the server account.
{documentum_username}: The username needed to access the account.
{documentum_password}: The password needed to access the account.
{documentum_port_number}: The port number to connect on.
{documentum_query}: A DQL query. This optional parameter specifies a query used to filter the content.
{documentum_server}: This parameter specifies the Documentum server address.
{documentum_doc_base}: This parameter specifies the Documentum docbase repository.
{documentum_property_file}: This optional parameter specifies the Documentum property file.

2.11.17. Add Evidence from SQL Server

When adding data from SQL Server, the following configuration parameters must be defined prior to the Add Evidence operation.

{sql_server_domain}: This optional parameter defines the Windows networking domain of the server account.
{sql_server_username}: The username needed to access the account.
{sql_server_password}: The password needed to access the account.
{sql_server_computer}: The hostname or IP address of the SQL Server.
{sql_server_max_rows_per_table_number}: The maximum number of rows to return from each table or query. This parameter is optional. It can save time when processing tables or query results with very many rows. The selection of which rows will be returned should be considered arbitrary.
{sql_server_instance}: A SQL Server instance name.
{sql_server_query}: A SQL query. This optional parameter specifies a query used to filter the content.

2.11.18. Add Evidence from Oracle

When adding data from Oracle, the following configuration parameters must be defined prior to the Add Evidence operation.

{oracle_username}: The username needed to access the account.
{oracle_password}: The password needed to access the account.
{oracle_max_rows_per_table}: The maximum number of rows to return from each table or query. This parameter is optional. It can save time when processing tables or query results with very many rows. The selection of which rows will be returned should be considered arbitrary.
{oracle_driver_type}: The driver type used to connect. Can be thin, oci, or kprb.
{oracle_database}: A string representation of the connection params. The possible formats are documented at https://www.oracle.com/database/technologies/faq-jdbc.html#05_04
{oracle_role}: The role to login as, such as SYSDBA or SYSOPER. For normal logins, this should be blank.
{oracle_query}: A SQL query. This parameter specifies a query used to filter the content.

2.11.19. Add Evidence from Dropbox

When adding data from Dropbox, the following configuration parameters must be defined prior to the Add Evidence operation.

{dropbox_auth_code_protected}: A string retrieved via a webpage on Dropbox that enables access to an account.
{dropbox_team_boolean}: A boolean that indicates that a Dropbox team will be added to evidence. This optional parameter should be present and set to true for all invocations when adding a Dropbox team to evidence. It can be omitted to add an individual Dropbox account.
{dropbox_access_token_protected}: A string retrieved using the authCode that enables access to an account. If the access token to an account is already known, provide it directly using this parameter instead of {dropbox_auth_code_protected}. This code doesn’t expire unless the account owner revokes access.

2.11.20. Add Evidence from Slack

When adding data from Slack, the following configuration parameters must be defined prior to the Add Evidence operation.

{slack_auth_code_protected}: The temporary authentication code. Initiate a manual collection through Nuix Workstation to retrieve this code.
{slack_user_ids}: Optionally, the internal Slack IDs of the users to which the collection should be limited.
{slack_start_datetime}: Optionally, the beginning of the collection date range.
{slack_end_datetime}: Optionally, the end of the collection date range.

2.11.21. Add Evidence from SSH

When adding data from SSH, the following configuration parameters must be defined prior to the Add Evidence operation.

{ssh_username}: The username needed to access the account.
{ssh_password}: The password needed to access the account.
{ssh_sudo_password}: The password needed to access protected files when using SSH key based authentication.
{ssh_key_folder}: Points to a folder on the local system which holds the SSH authentication key pairs.
{ssh_computer}: The hostname or IP address of Enterprise Vault.
{ssh_port_number}: The port number to connect on.
{ssh_host_fingerprint}: The expected host fingerprint for the host being connected to. If this value is not set then any host fingerpint will be allowed, leaving the possibility of a man in the middle attack on the connection.
{ssh_remote_folder}: A folder on the SSH host to start traversing from. This optional parameter limits the evidence to items underneath this starting folder.
{ssh_accessing_remote_disks_boolean}: A boolean. When set to true, remote disks (e.g. /dev/sda1 will be exposes as evidence instead of the remote systems file system structure.

2.11.22. Add Evidence from Historical Twitter

When adding data from Twitter, the following configuration parameters must be defined prior to the Add Evidence operation.

{twitter_access_token}: A string retrieved using the authCode that enables access to an account. A new app can be created at https://apps.twitter.com to generate this token.
{twitter_consumer_key}: The consumer key (API key) of the Twitter app.
{twitter_consumer_secret_protected}: The consumer secret (API secret) of the Twitter app.
{twitter_access_token_secret_protected}: The access token secret of the Twitter app.

2.11.23. Add Evidence Repository

This operation adds an evidence repository to the case. The typical Nuix options can be used to customize the evidence repository settings.

This operation does not load data into the case. The Rescan Evidence Repositories operation must be used to add data.

2.11.24. Rescan Evidence Repositories

This operation rescans all evidence repositories and adds new data to the case.

The option No new evidence behavior can be used to show a warning, trigger an error, or finish the execution of the workflow if no new evidence is discovered.

2.11.25. Detect and Assign Custodians

This operation detects custodian names using one of the following options:

Set custodians from folder names sets the custodian to the same name as the folder at the specified path depth.
Set custodians from folder names with typical custodian names attempts to extract custodian names from the folder names, where the folder names contain popular first names, up to the specified maximum path depth.
Set custodians from PST files sent emails sender name attempts to extract custodian names from the name of the sender of emails in the Sent folder.
Set custodians from data set metadata sets the custodian names defined in the Custodian field in the data set metadata.

When using Set custodians from folder names option, ensure that the scope query contains all of the folders from the Nuix case root up to the folder depth defined. For example, the query path-guid:{evidence_guid} is not valid because it only contains the items below the evidence container but not the evidence container itself. On the other hand, the query batch-load-guid:{last_batch_load_guid} is valid because it contains all of the items loaded in that specific batch, including the evidence container and all of the folders on which custodian values will be assigned.

The settings of this operation can also be controlled using the following parameters:

{set_custodian_from_folder_name} - Enable or disable the Set custodians from folder names option;
{custodian_folder_level} - The folder depth corresponding to the Set custodians from folder names option;
{set_custodian_from_typical_folder_name} - Enable or disable the Set custodians from folder names with typical custodian names option;
{max_custodian_typical_folder_level} - The max folder depth corresponding to the Set custodians from folder names with typical custodian names option.
{set_custodian_from_pst} - Enable or disable the Set custodians from PST files sent emails sender name option;

The parameters for enabling or disabling options can be set to true, yes, or Y to enable the option, and to anything else to disable the option.

2.11.26. Exclude Items

This operation excludes items from the case that match specific search criteria.

Entries can be added to the exclusions list using the + and - buttons, or loaded from exclusions list from a CSV or TSV file.

The exclusions can also be loaded from a file during the workflow execution, using the Exclusions file option.

Parameters can be used in the Exclusions file path, to select an exclusion file dynamically based on the requirements of the workflow.

2.11.27. Include Items

This operation includes items previously excluded.

Excluded items which are outside of the scope query will not be included.

Items belonging to all exclusion categories can be included, or alternatively, exclusion names can be specified using the + and - buttons, or loaded from a text file.

2.11.28. Add to Item Set

This operation adds items to an existing Item Set or creates a new Item Set if one with the specified name does not exist.

If the list of items to add to an Item Set is empty, the first Root Item is temporarily added as a filler item to help create the Item Set Batch.

In addition to the standard Nuix deduplication options, Automate Workflow offers two additional deduplication methods:

Message ID: Uses the email Message-ID property from the first non-blank field: Message-ID, Message-Id, Mapi-Smtp-Message-Id, X-Message-ID, X-Mapi-Smtp-Message-Id, Mapi-X-Message-Id, Mapi-X-Smtp-Message-Id, PR_INTERNET_MESSAGE_ID.
Message ID / MD5: Uses the email Message-ID property if available, or alternatively the MD5.
Mapi Search Key: Uses the email MAPI Search Key property from the first non-blank field: Mapi-Search-Key, X-Mapi-Search-Key.

When performing a deduplication by family based on Message-ID or MAPI Search Key, two batches will be created: one for top-level items (with suffix TL) and another one for non-top-level items (with suffix NonTL). To query for original items in both of these batches, use syntax:
item-set-batch:("{last_item_set_originals_batch} TL" OR "{last_item_set_originals_batch} NonTL")

2.11.29. Remove from Item Set

This operation removes items, if present, from the specified Item Set.

2.11.30. Delete Item Set

This operation deletes the specified Item Set.

2.11.31. Add Items to Digest List

This operation adds items to a digest list with the option to create the digest list if it doesn’t exist.

A digest list can be created in one of the three digest list locations:

Case: Case location, equivalent to the following subfolder from the case folder Stores\User Data\Digest Lists
User: User profile location, equivalent to %appdata%\Nuix\Digest Lists
Local Computer: Computer profile location, equivalent to %programdata%\Nuix\Digest Lists

2.11.32. Remove Items from Digest List

This operation removes items, if present, from the specified digest list.

2.11.33. Manage Digest Lists

This operation performs an operation on the two specified digest lists and then saves the resulting digest list in the specified digest list location.

List of operations:

Add: Produces hashes which are present in either digest list A or digest list B;
Subtract: Produces hashes present in digest list A but not in digest list B;
Intersect: Produces hashes which are present in both digest list A and digest list B.

2.11.34. Delete Digest List

This operation deletes the specified digest list, if it exists, from any of the specified digest list locations.

2.11.35. Digest List Import

This operation imports a text or Nuix hash file into the specified digest list location.

Accepted file formats:

Text file (.txt, .csv, .tsv). If the file contains a single column, hashes are expected to be provided one per line. If the file contains multiple columns, a column with the header name MD5 is expected
Nuix hash (.hash) file

2.11.36. Digest List Export

This operation exports a Nuix digest list to the specified location as a text file. The resulting text file contains one column with no header and one MD5 hash per line.

2.11.37. Search and Tag

This operation tags items from the case that match specific search criteria.

Options:

Identify families: If selected, the operation will search for Family items and Top-Level items of items with hits for each keyword.
Identify descendants If selected, the operation will search for descendants of items with hits for each keyword.
Identify exclusive hits ("Unique" hits) oIf selected, the operation will search for Exclusive hits (items which only hit on one keyword), Exclusive family items (items for which the entire family only hit on one keyword) and Exclusive top-level items (also items for which the entire family only hit on one keyword).
Compute size If selected, the operation will compute the audited size for Hits and Family items.
Compute totals If selected, the operation will compute the total counts and size for all keywords.
Breakdown by custodian If selected, the searches and reporting will be performed for each individual custodian, as well as for items with no custodians assigned.
Log results If selected, the search counts will be printed in the Execution Log.

2.11.38. Tags

If the Assign tags option is selected, items will be tagged under the following tag structure:

Tag prefix
- Hits
  - Keyword tag: Items that matched the search query.
- Families
  - Keyword tag: Families of items that matched the search query.
- TopLevel
  - Keyword tag: Top-level items of items that matched the search query.
- Descendants
  - Keyword tag: Descendants of items that matched the search query.
- ExclusiveHits
  - Keyword tag: Items that hit exclusively on the keyword.
- ExclusiveFamilies
  - Keyword tag: Families that hit exclusively on the keyword.
- ExclusiveTopLevel
  - Keyword tag: Top-level items of families that hit exclusively on the keyword.

If the Remove previous tags with this prefix option is selected, all previous tags starting with the Tag prefix will be removed, regardless of the search scope, according to the Remove previous tags method.

This operation can be used with an empty list of keywords and with the Remove previous tags with this prefix enabled, in order to remove remove tags that have been previously applied either by this operation or in another way.

The Remove previous tags with this prefix method renames Tag prefix to Automate|SearchAndTagOld|Tag prefix_{datetime}. Although this method is the fasted, after running the Search and Tag operation multiple times it can create a large number of tags which might slow down manual activities in Nuix Workbench.

2.11.39. Reporting

This option generates a search report in an Excel format, based on a template file.

See Processing Report for information on using a custom template.

2.11.40. Keywords

The keywords can either be specified manually in the workflow editor interface, or loaded from a file.

The following file formats are supported:

.csv: Comma-separated file, with the fist column containing the keyword name or tag and the second column containing the keyword query. If the first row is a header with the exact values tag and query, the line will be read as a header. Otherwise, it will be read as a regular line with a keyword and tag name.
.tsv, .txt : Tab-separated file, with the fist column containing the keyword name or tag and the second column containing the keyword query.
.json: JSON file, either exported from the Nuix Search and Tag window, or containing a list of searches, with each search containing a tag and a query.

Sample JSON file:

{
  "searches": [
    {
      "tag": "KW 01",
      "query": "Plan*"
    },
    {
      "tag": "KW 02",
      "query": "\"Confidential Data\" OR Privilege"
    }
  ]
}

Alternatively, the path to a keywords file can be supplied which will be loaded when the workflow executes.

2.11.41. Search and Assign Custodians

This operation assigns custodians to items from the case that match specific search criteria.

Entries can be added to the custodian/query list using the + and - buttons, or loaded from a CSV or TSV file.

2.11.42. Tag Items

This operation searches for items in the scope query.

Then it matches the items to process either as the items in scope, or duplicates of the items in scope, as individuals or by family.

The tag name is applied to either the items matched (Matches), their families (All families), their descendants (All Descendants), items matched and their descendants (Matches and Descendants) or their families top-level items (Top-level).

2.11.43. Untag Items

This operation removes tags for the items in the scope query.

Optionally, if the tags are empty after the items in scope are untagged, the remove method can be set to delete the tags.

When the option to remove tags starting with a prefix is specified, tags with the name of the prefix and their subtags are removed. For example, if the prefix is set to Report, tags Report and Report|DataA will be removed but not Reports.

2.11.44. Match Items

This operation reads a list of MD5 and/or GUID values from the specified text file. Items in scope with a matching MD5 and/or GUID values of the items in scope are tagged with the value supplied in the Tag field.

2.11.45. Date Range Filter

This operation filters items in the scope query to items within the specified date range using either the item date, top-level item date or a list of date-properties.

Then it applies a tag or exclusion similar to Tag Items.

Use \* as a date property to specify all date properties.

The dates for this range can be specified using the parameters {filter_before_date} and {filter_after_date}.

2.11.46. Find Items with Words

This operation analyzes the text of the items in scope and determines if the item is responsive if the number of words respects the minimum and maximum count criteria.

The words are extracted by splitting the text of each item using the supplied regex.

Sample regex to extract words containing only letters and numbers:

[^a-zA-Z0-9]+

Sample regex to extract words containing only letters:

[^a-zA-Z]+

Sample regex to extract words containing any character, separated by a whitespace character (i.e. a space, a tab, a line break, or a form feed)

\s+

2.11.47. Filter Emails

This operation performs advanced searches for emails, based on recipient names, email addresses and domain names.

The Wizard feature prepopulates the filtering logic based on one of the following scenarios:

Tag internal-only emails
Tag communications between two individuals only
Tag communications within a group

2.11.48. Add Items to Cluster Run

This operation adds items to an existing Cluster Run or creates a new Cluster Run if one with the specified name does not exist.

When running this operation, the progress will only show 0.01%, and will be updated when the operation finishes.

2.11.49. Detect Attachment-Implied Emails

This operation must be used in conjunction with the Cluster Run operation from Nuix. First, generate a Cluster Run using Nuix Workstation and then run the Detect Attachment-Implied Emails operation to complement the identification of inclusive and non-inclusive emails.

If no cluster run name is specified, the operation will process all existing cluster runs.

Items will be tagged according to the following tag structure:

Threading
- Cluster run name
  - Items
    
    Inclusive
    
    Attachment-Inferred
    
    Singular
    
    Ignored
    
    Endpoint
    
    Non Inclusive
  - All Families
    
    Inclusive
    
    Attachment-Inferred
    
    Singular
    
    Ignored
    
    Endpoint
    
    Non Inclusive

To select all data except for the non-inclusive emails, use query
tag:"Threading|Cluster run name|All Families|Inclusive|*"

This operation should be used on cluster runs that contain top-level emails only, clustered using email threads. Otherwise, the operation will produce inconsistent results.

2.11.50. Reload Items

This operation reloads from source the items matching the scope query.

This operation can be used to decrypt password protected files when preceded by a Configuration operation which defines passwords and if the Delete encrypted inaccessible option is used.

If the scope query results in 0 items, the Nuix case database does not get closed which causes issues when attempting to add more data in the future. As a workaround, use a preceding getO to skip the Reload Items operation if the scope query results in 0 items. See example Python script below:

# Set scope_query to the scope query of the Reload Items operation
items_count = current_case.count(scope_query);
print("Reload Items operation scope count: %s" %items_count)

if items_count == 0:
    # Skip next operation
    current_operation_id = workflow_execution.getCurrentOperationId()
    workflow_execution.goToOperation(current_operation_id + 2)

When decrypting a document, the Nuix Engine maintains the originally encrypted item in place and creates a descendant with the decrypted content. In this situation, when using the Exclude encrypted documents decrypted successfully option, the originally encrypted item is excluded and only the decrypted version remains. Note that this only impacts encrypted documents (such as Word or PDF) and does not impact encrypted zip archives.

2.11.51. Replace Items

This operation replaces case items with files which are named with the MD5 or GUID values of the source items.

2.11.52. Delete Items

This operation deletes items in the scope query and their descendants.

This is irreversible. Deleted items are removed from the case and will no longer appear in searches. All associated annotations will also be removed.

2.11.53. Replace Text

This operation replaces the text stored for items matching the scope query, if an alternative text is provided in a file which is named based on the items MD5 or GUID values.

This operation can be used after an interrupted Nuix OCR operation, to apply the partial results of the OCR operation, by copying all of the text files from the OCR cache to a specific folder and pointing the Replace Text operation at that folder.

This operation searches for files at the root of the specified folder only and ignores files from subfolders.

2.11.54. Remove Text

This operation removes the text stored for items matching the scope query.

This operation can be used to remove text from items for which Nuix stripped the text during loading but where no meaningful text was extracted.

2.11.55. Redact Text

This operation runs regex searches against the text of the items in scope, and redacts all matches.

The Redaction definition file can be a text file with a list of regular expressions, or a tab-separated file columns Name and Regex.

2.11.56. OCR Items

This operation runs OCR using the Nuix OCR on the items identified by the scope query, using standard Nuix options

Starting with Nuix version 8, the OCR settings cannot be supplied manually and instead an OCR profile must be used.

The option Differentiate profile will apply when using an OCR profile with a custom cache directory. In this case, short job ID will be added as a sub-directory to the custom cache directory to avoid conflicts when running multiple jobs at the same time.

2.11.57. Generate Duplicate Custodians Field

This operation will generate a CSV file with the list of duplicate custodians in the case. See Generate Duplicate Fields for a description of the available options.

Running without the DocIDs selected in the Original fields will significantly improve execution time.

This operation is less memory-intensive than the Generate Duplicate Fields operation.

2.11.58. Generate Domain Fields

This operation will extract email domains from items in the Scope.

The resulting extracted domain fields can be saved to a CSV file and/or can be assigned as custom metadata to the items in Scope.

2.11.59. Generate Duplicate Fields

This operation will identify all items that match the Update items scope query and that have duplicates in the larger Search scope query.

The operation supports two evaluation methods:

Memory-Intensive: This method uses a large amount of memory on large cases but requires reduced computation.
Compute-Intensive: This operation performs a large number of computations on large cases but requires a reduced amount of memory.

The duplicate items are identified based on the following levels of duplication:

As individuals: Items that are duplicates at the item level.
By family: Items that are duplicates at the family level.
By top-level item: Only the top-level items of items in scope that are duplicates are identified.

When using the deduplication option By top-level item, ensure that the families provided are complete in the search and update scope..

When an item in the Update item scope with duplicates is identified, this operation will generate duplicate fields capturing the properties of the duplicate items. The following duplicate fields are supported:

Custodians
Item Names
Item Dates
Paths
Tags
Sub Tags
GUIDs
Parent GUIDs
Top-Level Parent GUIDs
DocIDs
Lowest Family DocID
Metadata Profile

When selecting the Metadata Profile option, all of the fields found in the specified Metadata Profile will be computed.

The Results inclusiveness option determines whether the value from the current original item should be added to the duplicate fields. For example, if the original document has custodian Smith and there are two duplicate items with custodians Jones and Taylor, the Alternate Custodians field will contain values Jones; Taylor whereas the All Custodians field will contain values Jones; Taylor; Smith.

The resulting duplicate fields can be saved to a CSV file and/or can be assigned as custom metadata to the items in the Update items scope.

For help with date formats, see Joda Pattern-based Formatting for a guide to pattern-based date formatting.

2.11.60. Generate Printed Images

This operation generate images for the items in scope using the specified Imaging profile.

The Tag failed items as options have the same behavior as in the Legal Export operation.

2.11.61. Populate Binary Store

This operation populates the binary store with the binaries of the items in scope.

2.11.62. Assign Custom Metadata

This operation adds custom metadata to the items in scope. A CSV or TSV file is required.

The file header must start with either GUID, ItemName, DocID, or Key, followed by the names of the metadata fields to be assigned.

When using ItemName, the metadata will be assigned to all items in the Nuix case which have that Item Name. This might involve assigning the same medata information to multiple items, if they have the same name.

When using Key, the matching of the items will be attempted with either the GUID, ItemName, or DocID, in this order.

Each subsequent line corresponds to an item that needs to be updated, with the fist column containing the GUID, ItemName, or DocID of the item and the the remaining columns containing the custom metadata.

Example simple CSV metadata file:

DocID,HasSpecialTerms,NumberOfSpecialTerms
DOC00001,Yes,5
DOC00002,Yes,1
DOC00003,No,0
DOC00004,Yes,7

To assign custom metadata of a specific type, add a second header line with the following format:

The first column: Type, indicating that this line is a header specifying field types
For each subsequent column, the type of the data, from the following options:
- Text
- Date
- Boolean
- Integer
- Float

Example CSV metadata file with types:

ItemName,DateRecorded,SampleThreshold
Type,Date,Float
file1.txt,2020-01-01,0.5
file2.txt,2021-01-01,1.5
Email.eml,2022-01-01,-7

2.11.63. Assign Data Set Metadata

This operation assigns the fields defined in the data set, either as custom metadata or as tags.

2.11.64. Associate Google Vault Metadata

This operation parses the XML files and CSV files exported from Google Vault, extracts the metadata records available (see https://support.google.com/vault/answer/6099459?hl=en#mailxml) and associate these as custom metadata to the matching items in the Nuix case.

The matching between Google Vault metadata records and the items in the Nuix case is performed in the following way:

Google Mail
- When parsing XML metadata files, matching is performed using the metadata field MBOX From Line
- When parsing CSV metadata files, matching is performed using the metadata fields Mapi-Smtp-Message-Id and Message-ID.
Google Documents
- When parsing XML metadata files, matching is performed using File name

2.11.65. Remove Custom Metadata

This operation removes the custom metadata specified from the items in scope.

2.11.66. Add Items to Production Set

This operation adds items matching the scope query to a production set.

When adding items to a production set, the following sort orders can be applied:

No sorting: Items are not sorted.
Top-level item date (ascending): Items are sorted according to the date of the top-level item in each family, in ascending order.
Top-level item date (descending): Items are sorted according to the date of the top-level item in each family, in descending order.
Evidence order (ascending): Items are sorted by effective path name (similar to the Windows Explorer sorting), in ascending order.
Keyword Fields: Items are sorted by a combination of fields in ascending or descending order.
Metadata Profile: Items are sorted by the fields within a Metadata profile in ascending order.

To achieve a sort order equivalent to the Nuix Engine Default sort order, select the Automate Custom sort method with the field Position in Ascending order.

The item numbering can be performed either at the Document ID level, or at the Family Document ID level. In the latter case, the top-level item in each family will be assigned a Document ID according to the defined prefix and numbering digits. All descendants from the family will be assigned a Document ID which is the same as the one of the top-level item, and a suffix indicating the position of the descendant in the family.

The document ID start number, the number of digits and the number of family digits can be specified using custom parameters:

{docid_start_numbering_at} - Select the option Start numbering at in the configuration of the Add Items to Production Set operation for this parameter to have an effect;
{docid_digits}
{docid_family_digits} - Select the numbering scheme Family Document ID in the configuration of the Add Items to Production Set operation for this parameter to have an effect;

When using a page-level numbering scheme, the parameter {group_family_items} can be used to control the grouping of documents from the same family, and the parameter {group_document_pages} can be used to control the grouping of pages from the same document. these parameter can be set to true or false.

2.11.67. Delete Production Set

This operation deletes ALl or Specific production sets.

2.11.68. Legal Export

This operation performs a legal export, using standard Nuix options.

Use the Imaging profile and Production profile options to control the parameters of images exported during a legal export.

The Split export at option will split the entire export (including loadfile and export components) into multiple parts of the maximum size specified, and will include family items.

The Convert mail, contacts, calendars to option will export the native emails to the selected format.

The Export scheme option can be used to control if attachments are separated from emails or not.

2.11.69. Export to Relativity

This product module may only be used by parties with valid licenses for Relativity or Relativity One, products of Relativity ODA LLC. Relativity ODA LLC does not test, evaluate, endorse or certify this product.

When selecting the Export type Relativity, the loadfile will be uploaded to Relativity during the legal export operation. If the export is split into multiple parts, each part will be uploaded as soon as it is available and previous parts finished uploading.

The following settings are required:

Fields mapping file: Path to JSON file mapping the Nuix Metadata profile to the Relativity workspace fields. If a mapping file is not provided, the fields in the loadfile will be mapped to fields with the same names in the Relativity workspace.

See more information on how to create a mapping file in the Relativity Loadfile Upload operation.

This operation only loads native files, text and metadata to Relativity. To load images, in addition to this operation, use the Relativity Images Overlay operation.

2.11.70. Case Subset Export

This operation will export the items in scope in a case subset under the specified parameters.

2.11.71. Export Items

This operation exports items to the specified Export folder.

The Path options option will export items into a Single directory or Recreate the directory structure of the original data.

The Convert emails to option will export the native emails to the selected format.

By default, only the items that are exported are tracked in the utilization database. When selecting the option Track material descendants of exported items in utilization data, in addition to tracking items that are exported, the material descendants of these items are also tracked.

2.11.72. Logical Image Export

This operation will export the items in scope in a Nuix Logical Image (NLI) container.

2.11.73. Metadata Export

This operation will export the metadata of items matching the scope query, using the selected metadata profile.

The following sort orders can be applied:

No sorting: Items are not sorted.
Top-level item date (ascending): Items are sorted according to the date of the top-level item in each family, in ascending order.
Top-level item date (descending): Items are sorted according to the date of the top-level item in each family, in descending order.
Evidence order (ascending): Items are sorted in the same way in which they appear in the evidence tree, in ascending order.

The Max Path Depth option does not offer any performance advantages - all items matching the scope query are processed and items exceeding the max path depth are not outputted to the resulting file.

2.11.74. Word-List Export

This operation exports a list of words from the items matching the scope query.

The words are extracted by running by splitting the text of each item using the supplied regex.

Sample regex to extract words containing only letters and numbers:

[^a-zA-Z0-9]+

Sample regex to extract words containing any character, separated by a whitespace character (i.e. a space, a tab, a line break, or a form feed)

\s+

Words which are shorter than the min or longer than the max length supplied are ignored.

2.11.75. Processing Report

This operation generates a processing report in an Excel format, based on a template file.

If a custom template is not specified, the operation will use the default Automate template. To create a custom template, first run the Processing Report operation with default settings. Then, make a copy of the latest template file. When running under a service account, the template is located at %userprofile%\.nuix\Workflow\Templates, and when running under the Local System account, the template is located at C:\Windows\System32\config\systemprofile\.nuix\Workflow\Templates. Then, modify the workflow to point to the newly created custom template file.

Processing Stages

A processing stage consists in a subset of items from the case, identified by a Nuix query and with an associated method to compute the size. The following size methods are available:

Audited size: The Nuix audited size.
File size The Nuix file size.
Text size: The size of the text.
Audited + Text size: The audited size plus the size of the text.
Audited (attachments 2x): The audited size, with the attachments size included twice. This can be an estimate of the size of a legal export with the option to leave attachments on emails.
Audited (attachments 2x) + Text size The audited size, with the attachments size included twice, plus the size of the text.
Digest size: The digest size. If the item does not have a digest, fallback to the file size. If the item is not a file, fallback to the audited size.

The default options from this operation generate a report with a predefined number of stages:

Source data
Extracted
Material
Post exclusions
Post deduplication
Export

Views

Views are used to define how the data is displayed in a report sheet, including the vertical and horizontal columns, the processing stage for which the view applies, the option to calculate the count and/or size of items, and the size unit.

The default options include several predefined views, with each view corresponding to a sheet in the Excel report:

Processing overview
Material items by custodian
Export items by custodian
Material items by year
Export items by year
Material items by type
Export items by type
Material items by extension
Export items by extension
Material images by dimensions
Export images by dimensions
Irregular items
Exclusions by type

By default, sizes are reported in Gibibytes (GiB). 1 GiB = 1024 x 1024 x 1024 bytes = 1,073,741,824 bytes. The size unit can be changed in the view options pane.

Each stage and view can be customized, removed and new stages and views can be added.

If the parameter {report_password} is set, the resulting Excel file will be encrypted with the password provided.

2.11.76. Generate Processing Report from multiple cases

The Additional cases option can be used to generate a single report from multiple cases, by specifying the location of the additional cases that need to be considered. Items are evaluated from the main workflow case first, and then from the additional cases, in the order provided. If an item exists in multiple cases with the same GUID, only the first instance of the item is reported on.

When using the Additional cases option to report on a case subset as well as the original case, run the report from the case subset and add the original case in the Additional cases list. This will have the effect of reporting on the case subset items first, and ignoring the identical copies of these items from tzhe original case.

2.11.77. Scan Case Statistics

This operation scans the case for evidence containers, custodians, languages, tags, and date-ranges (by month), item sets, production sets and exclusions, and for each of these tracks the count of all items, count and size of audited items, and count and size of physical items.

The resulting JSON file is stored in the case folder Stores\Statistics and sent to Automate Scheduler for centralized reporting.

The following additional options can be configured:

Case History: Enables the scanning of the case history to extract sessions, operations and volumes.
Compute Size: The methods used to compute the size of items.
Max scan duration (seconds): Stop scanning further case details after this time is reached.
Native Export: Include non-exported material children: If selected, when a Native Export event is detected in the case history, the material childre of the exported items are also included in the export scope.
Force scan previously scanned case: Re-scan a case even if it was previously scanned and no new events were detected.
Don’t skip Automate Engine sessions: By default, sessions ran by the Automate Engine are skipped during the case history scan. If enabled, this option will scan sessions ran by the Automate Engine as well. Use this option when rebuilding the Scheduler Utilization database.

2.11.78. Tree Size Count Report

This operation will generate a tree report including the size and count of items in the scope.

If the first elements from the path of items should not be included in the report, such as the Evidence Container name and Logical Evidence File name, increase the value of the Omit path prefixes option.

The Max path depth option limits the number of nested items for which the report will be generated.

See Processing Report for information on using a custom template and size units.

2.11.79. Switch License

This operation releases the license used by the Nuix Engine when running a job in Automate Scheduler, and optionally acquires a different license depending on the license source option:

None: Does not acquire a Nuix license and runs the remaining operations in the workflow without access to the Nuix case.
NMS: Acquires a Nuix license from the NMS server specified.
CLS: Acquires a Nuix license from the Nuix Cloud License server.
Dongle: Acquires a Nuix license from a USB Dongle connected to the Engine Server.
Engine Default: Acquires a Nuix license from the default source from which the Engine acquired the original Nuix license when the job was started.

When specifying a Filter, the text provided will be compared against the available Nuix license name and description.

When specifying a Workers count of -1, the default number of workers that the Engine originally used will be selected.

This operation is not supported for workflows executed in Automate Workflow.

2.11.80. Close Case

This operation closes the currently open Nuix case.

If the Close Execution Log option is selected, the Execution Log stored in the case folder Stores\Workflow will be closed and no further updates will be made to the log file unless the case is re-opened.

2.12. Nuix Enrich

These operations configure the connection to Nuix Enrich and analyse items from the Nuix case with Nuix Enrich.

2.12.1. Configure Nuix Enrich Connection

This operation sets the configuration used to connect to the Nuix Enrich service.

The Nuix Enrich service ID should be set to a parameter of type Nuix Enrich Service. During the submission of the workflow in Scheduler, the user will be prompted to select the Nuix Enrich Service and authenticate to the service if required.

2.12.2. Enrich Items

This operation sends the items in scope to Nuix Enrich for enrichment, and applies the results to the items in the Nuix Engine case.

2.13. Automate

These operations are native to Automate and are used to configure the workflow as well as interact with arbitrary third-parties using APIs, scripts, external commands.

2.13.1. Log

This records a user-defined log, and optionally prints the log to the execution log when running.

2.13.2. Placeholder

This operation can be used to separate sections of the workflow, or as an anchor to jump when jumping to a specific section in the workflow execution.

2.13.3. Configure Parameters

This operation lets users define custom parameters which will exist during the scope of the execution of the workflow. Custom parameters can be manually defined or loaded from a CSV or TSV file, along with a value, description and validation regex.

There are two types of parameters that can be defined in this operation: Static parameters and User parameters. Static parameters are parameters that have a fixed value which is defined in the operation configuration. For User parameters, a prompt is presented when queueing the workflow to provide the values.

Display Conditions

Display conditions can be used to determine if a user is prompted to provide a value for a certain parameter, depending on the values of previously filled-out parameters.

For example, if there are two parameters {perform_add_evidence} and {source_data_location}, a display condition could be set up to only display {source_data_location} parameter if the value of the {perform_add_evidence} parameter is True.

If a parameter does not match the display condition, it will have a blank value.

Display conditions can only reference parameters defined in the same Configure Parameters operation above the current parameter.

Parameter Value Filters

The following parameter value files can be applied, depending on the parameter type:

Text parameter values can be filtered using regular expressions (regex).
Number parameter values can be filtered using a minimum and maximum allowed value.
Relativity parameter values can be filtered based on other previous Relativity parameters, such as the Relativity client or workspace. These filters require the use of a Relativity Service.

2.13.4. Notification

This operation sends an email notification with a customized message.

If the Email Notification option is selected, an email will be sent to the specified email address. To obtain information about the SMTP email server and port used in the environment, contact the network administrator.

The value entered Password field will be stored in clear text in the workflow file - a password SHOULD NOT be entered in this field. Instead, set this field to a protected parameter name, for example {smtp_password} and see the section Protected Parameters for instructions on how to set protected parameter values.

The following additional options can be configured:

Attach workflow execution log as text: Select this option to attach a file named WorkflowLog.txt to the email, containing the current Execution Log.
Attach last generated report, if available: Select this option to attach the last generated report file.
Additional attachments: Specify additional files that should be attached to the email.

To attach multiple reports to a notification email, define and store the paths to those files using parameters, and then use those parameters in the Additional attachments section.

2.13.5. Script

This operation will run either the Script code supplied or the code from a Script file in the context of the Nuix case.

This operation can be used to integrate existing in-house scripts in a workflow.

Access Static Parameters

All case parameters are evaluated before the script is started, and can be accessed as attributes in the script execution context without the curly brackets. For example, to print the contents of the case folder, following python script can be used:

import os

print "Contents of case folder: "+case_folder
for f in os.listdir(case_folder):
	print f

Manage Dynamic Parameter

The parameters helper object can be used to get and set the value of dynamic parameters:

get(String name) - Get the value of the parameter with the name supplied as a String. If the parameter is not defined, return the parameter name.
get(String name, Object defaultValue) - Get the value of the parameter with the name supplied as String. If the parameter is not defined, the default value is returned.
put(String name, String value) - Set the value of the parameter with the name supplied. If the name supplied is not a valid parameter name, it will be normalized.
getAllParameterNames() - Returns a list with the names of all of the parameter names, including system parameters, user-defined parameters, and parameters supplied in the Execution Profile

Example of setting and retrieving parameters:

# Setting parameter {param1}
parameters.put("{param1}","Test Value from Script1")
print "Parameter {param1} has value: "+parameters.get("{param1}")

# Attempting to get undefined parameter {param2}
parameterValue = parameters.get("{param2}",None)
print "Parameter {param2} has value: "+str(parameterValue)

Output:

Parameter {param1} has value: Test Value from Script1
Parameter {param2} has value: None

Additionally, to get the values of parameters converted to specific types, use the methods below:

getLong(String name) - Get the value of the parameter with the name supplied as a Long number. If the parameter is not defined or can’t be converted, an exception is thrown.
getLong(String name, long defaultValue) - Get the value of the parameter with the name supplied as a Long number. If the parameter is not defined or can’t be converted, the default value is returned.
putLong(String name, long value) - Convert the Long number value and store in the parameter.
getBoolean(String name) - Get the value of the parameter with the name supplied as a Boolean. If the parameter is not defined or can’t be converted, an exception is thrown.
getBoolean(String name, boolean defaultValue) - Get the value of the parameter with the name supplied as a Boolean. If the parameter is not defined or can’t be converted, the default value is returned.
putBoolean(String name, boolean value) - Convert the Boolean value and store in the parameter.
getDouble(String name) - Get the value of the parameter with the name supplied as a Double number. If the parameter is not defined or can’t be converted, an exception is thrown.
getDouble(String name, double defaultValue) - Get the value of the parameter with the name supplied as a Double number. If the parameter is not defined or can’t be converted, the default value is returned.
putDouble(String name, double value) - Convert the Double number value and store in the parameter.
getJsonObject(String name) - Get the value of the parameter with the name supplied as a deserialized JSON object. If the parameter is not defined or can’t be deserialized as a JSON object, an exception is thrown.
getJsonObject(String name, Object defaultValue) - Get the value of the parameter with the name supplied as a deserialized JSON object. If the parameter is not defined or can’t be deserialized as a JSON object, the default value is returned.
putJsonObject(String name, Object value) - Serialize the value as a JSON string and store in the parameter.

When converting the parameter values to JSON object, the resulting object type is infered during deserialization and might be different than the original type.

Example of getting and setting typed parameters:

# Defining a Python dictionary
dictionary={}
dictionary["number"]=5
dictionary["color"]="Orange"
print "Original dictionary:"
print type(dictionary)
print dictionary

# Storing the dictionary as a parameter
parameters.putJsonObject("{sample_dictionary}",dictionary)

# Getting the parameter as an object
retrievedDictionary = parameters.getJsonObject("{sample_dictionary}")
print "Deserialized dictionary:"
print type(retrievedDictionary)
print retrievedDictionary

Output:

Original dictionary:
<type 'dict'>
{'color': 'Orange', 'number': 5}

Deserialized dictionary:
<type 'com.google.gson.internal.LinkedTreeMap'>
{u'color': u'Orange', u'number': 5.0}

See section Parameters for a list of built-in parameters.

For assistance with creating custom scripts or for integrating existing scripts into Automate Workflow, please contact Nuix support.

Manage Workflow Execution

The workflow execution can be manipulated live from the Script operation using the following methods from the workflowExecution helper object:

stop() - Stops the workflow execution
pause() - Pauses the workflow execution
log(String message) - Adds the message to the workflow execution log
logInfo(String message) - Adds the message to the workflow info list
logWarning(String message) - Adds the message to the workflow warnings
addLink(String linkUrl) - Adds the link to the workflow links list
addLink(String linkName, String linkUrl) - Adds the link to the workflow links list
addLink(String prefix, String linkName, String linkUrl) - Adds the link to the workflow links list
addLink(String prefix, String linkName, String linkUrl, String suffix) - Adds the link to the workflow links list
triggerError(String message) - Triggers an error with the specified message
appendWorkflow(String pathToWorkflowFile) - Appends the operations from workflow from file pathToWorkflowFile to the end of the current workflow.
appendWorkflowXml(String workflowXml) - Appends the operations from workflow XML workflowXml to the end of the current workflow. The workflowXml should contain the entire content of the workflow file.
insertWorkflow(String pathToWorkflowFile) - Inserts the operations from workflow from file pathToWorkflowFile after the current Script operation.
insertWorkflowXml(String workflowXml) - Inserts the operations from workflow XML workflowXml after the current Script operation. The workflowXml should contain the entire content of the workflow file.
goToOperation(int id) - Jumps to operation with specified id after the Script operation completes. To jump to the first operation, specify an id value of 1.
goToNthOperationOfType(int n, String type) - Jumps to nth operation of the specified type from the workflow after the Script operation completes.
goToOperationWithNoteExact(String text) - Jumps to the first operation in the workflow for which the note equals the specified text.
goToOperationWithNoteContaining(String text) - Jumps to the first operation in the workflow for which the note contains the specified text.
goToOperationWithNoteStartingWith(String text) - Jumps to the first operation in the workflow for which the note starts with the specified text.
getOperations() - Returns all operations.
getOperationsWithWarnings() - Returns all operations with warnings.
getOperationsWithErrors() - Returns all operations with errors.
getOperationsWithExecutionState(ExecutionState executionState) - Returns all operations for which the execution state equals the specified execution state.
getOperation(int id) - Returns the operation with the specified id.
getOperationWithNoteExact(String text) - Returns the first operation in the workflow for which the note equals the specified text.
getOperationWithNoteContaining(String text) - Returns first operation in the workflow for which the note contains the specified text.
getOperationWithNoteStartingWith(String text) - Returns the first operation in the workflow for which the note starts with the specified text.
getCurrentOperationId() - Returns the id of the current Script operation.
getOperationsCount() - Returns the id of the last operation in the workflow.
clearStickyParameters() - Remove all sticky parameters set in the user profile.
setProgress(double percentageComplete) - Set the operation progress. This is displayed in the user interface and used for the ETA calculation. Specify values between 0.0 and 1.0.
setTaskName(String taskName) - Sets the name of the tasks that the script is working on. This is displayed in the user interface.

Example of script that restarts execution twice and then jump to the last operation in the workflow:

count = parameters.getLong("{execution_count}",0)
count=count+1
parameters.putLong("{execution_count}",count)

if (count<3):
        workflowExecution.goToOperation(1)
else:
        workflowExecution.goToOperation(workflowExecution.getOperationsCount())

Manage Operations

Information about an operation can be obtained from the Script operation using the following methods from the operation helper object:

getId() - Returns the operation id.
getExecutionState() - Returns the operation execution state.
getName() - Returns the operation name.
getNotes() - Returns the operation notes.
getErrorMessage() - Returns the operation error message. If the operation does not have an error this value will be null or blank.
getWarningMessages() - Returns the list of warnings for the operation. If the operation does not have any warnings this will be an empty list.
clearWarningMessages() - Clears the operation warning messages.
getStartDateTime() - Returns the start date of the operation as a Joda DateTime.
getFinishedDateTime() - Returns the finished date of the operation as a Joda DateTime.
getSkippable(Boolean skippable) - Returns true if is skippable.
getDisabled() - Returns true` if the operation is disabled.
setDisabled(Boolean disabled) - Sets the disabled state of the operation.
getSoftFail() - Returns true if the operation is set to soft fail on error.
setSoftFail(Boolean softFail) - Set the soft fail state of the operation.
getEta() - Returns the operation ETA as a Joda DateTime.
getPercentageComplete() - Returns the operation progress as a percentage.

Example script that prints the details of the last operation with an error:

operations_with_errors = workflowExecution.getOperationsWithErrors()

if operations_with_errors.size() >= 1:
	last_error_operation = operations_with_errors[-1]
	print "Last operation with error #{0} {1}: {2}".format(last_error_operation.getId(), last_error_operation.getName(), last_error_operation.getErrorMessage())
else:
	print "No operations encountered errors"

Synchronize Execution

To synchronize the execution of certain parts of the workflow between multiple jobs, for example, to ensure that only one job at a time runs one part of the workflow, use locks using the following methods from the workflowExecution helper object:

acquireLock(String lockName) - Attempts to acquire the lock with the specified name. If the lock is held by another Job, the script execution is blocked at this step, until the lock becomes available.
boolean releaseLock(String lockName) - Releases the lock with the specified name. Returns true if the lock was previously held by this job.
boolean tryAcquireLock(String lockName) - Attempts to acquire the lock with the specified name, if available. Returns true if the lock was acquired. If the lock cannot be acquired, returns false and continues the execution.

Example script to acquire a lock with the matter ID:

workflowExecution.acquireLock(matter_id)

Example script to release the lock with the matter ID:

workflowExecution.releaseLock(matter_id)

Manage Data Sets Metadata

Information about the data sets selected when submitting the Job is stored in the dataSetsMetadata helper object. This object is a dictionary, with the key being the Data Set ID, and the values being a dictionary with the properties of the Data Set.

Call APIs

The Script operation exposes several helper objects that can be used to make calls to Automate and third-party APIs. These helper objects are:

restAutomate - Make calls to the Automate API.
restDiscover - Make calls to the Nuix Discover API.
restRelativity - Make calls to the Relativity REST API.
rest - Make calls to generic REST APIs.
genAi - Make calls to a third-party Gen AI service.

The response from REST API calls has the following methods and fields:

status_code - An integer representing the status code
text - The text response
raw - The binary response
json() - An object after parsing the response as JSON
raise_for_status() - Raise an exception if the status code is 4xx or 5xx
headers - A dictionary with the response headers

When making calls to an REST API over HTTPS, the call will fail if the HTTPS certificate is not trusted by the Java keystore. To explicitly allow connections to servers with a specific SHA-256 fingerprint certificate fingerprint, use the following method:

setFingerprint (String fingerprint)

Call Automate API

To make calls to the Automate API from the Script operation use the restAutomate helper object.

The base URL of the Automate instance and the authentication API key are set automatically from the Job under which the Script operation is running. However, these settings can be overwritten with the following methods:

setBaseUrl(String baseUrl)
setBearerToken(String bearerToken)

The following methods can be used to call an API endpoint:

get(String endpoint)
delete(String endpoint)
post(String endpoint, Object data)
put(String endpoint, Object data)

Example Python script that creates a new client:

body = {
  "name": "Sample Client Name",
  "description": "This client was created from the API",
  "enabled": False
}

response = restAutomate.post("/api/v1/scheduler/client", body);

print response.json();

Call Nuix Discover API

To make calls to the Nuix Discover API from the Script operation use restDiscover helper object.

The base URL of the Nuix Discover API and the authentication API key are set automatically from the Use Nuix Discover Case operation. However, these settings can be overwritten with the following methods:

setBaseUrl(String baseUrl)
setBearerToken(String bearerToken)

The following methods can be used to call an API endpoint:

call(String query)
call(String query, Map<String,Object> variables)

Example Python script that runs a GraphQL query for the users with the first name John:

body = '''
query MyQuery ($fn: String){
  users(firstName: $fn) {
    id,
    fullName
  }
}
'''
variables = {"fn":"John"}

response = restDiscover.call(body,variables);
print response.json();

Call Relativity API

To make calls to the Relativity Rest API from the Script operation use the restRelativity helper object.

The URL of the Relativity server and the authentication headers are set automatically from the Configure Relativity Connection operation. However, these settings can be overwritten with the following methods:

setBaseUrl(String baseUrl)
setBearerToken(String bearerToken)
setBasicAuth(String username, String password)

The following methods can be used to call an API endpoint:

get(String endpoint)
delete(String endpoint)
post(String endpoint, Object data)
put(String endpoint, Object data)
queryObjectManager(String objectTypeName, Long workspaceArtifactId, String condition, int start, int length)
queryObjectManager(String objectTypeName, Long workspaceArtifactId, String condition, String[] fieldNames, int start, int length)
queryObjectManagerSlim(String objectTypeName, Long workspaceArtifactId, String condition, int start, int length)
queryObjectManagerSlim(String objectTypeName, Long workspaceArtifactId, String condition, String[] fieldNames, int start, int length)

Example Python script that queries the Relativity Object Manager for workspaces with a specific name and prints the Artifact ID:

workspaceName = "Relativity Starter Template"

body = {
    "request": {
        "Condition": "'Name' == '"+workspaceName+"'",
        "ObjectType": {
            "ArtifactTypeID": 8
        },
        "Fields": [{
                "Name": "Name"
            }
        ]
    },
    "start":0,
    "length":1000
}

response = restRelativity.post("/Relativity.Rest/api/Relativity.ObjectManager/v1/workspace/-1/object/query",body)
response.raise_for_status()

print("Response count: "+str(int(response.json()["TotalCount"])))
for responseObject in response.json()["Objects"]:
    print "ArtifactID: "+str(int(responseObject["ArtifactID"]))
    for fieldValue in responseObject["FieldValues"]:
        print(fieldValue["Field"]["Name"]+": "+fieldValue["Value"])

Example Python script that uses the queryObjectManager helper to query names and client names of all the matters:

fields = ["Name", "Client Name"]

response = restRelativity.queryObjectManager("Matter", -1, None, fields, 1, 10000)
response.raise_for_status()

print("Response count: "+str(int(response.json()["TotalCount"])))
for responseObject in response.json()["Objects"]:
    print "ArtifactID: "+str(int(responseObject["ArtifactID"]))
    for fieldValue in responseObject["FieldValues"]:
        print(fieldValue["Field"]["Name"]+": "+fieldValue["Value"])

Call Generic API

To make calls to a generic API from the Script operation use rest helper object.

The base URL can be optionally set using the following method:

setBaseUrl(String baseUrl)

The authentication can be optionally set using the following methods:

setBearerToken(String bearerToken)
setBasicAuth(String username, String password)

Custom headers can be optionally set using the following method:

setHeader(String name, String value)

The following methods can be used to call an API endpoint:

get(String endpoint)
delete(String endpoint)
post(String endpoint, Object data)
put(String endpoint, Object data)

Example Python script queries a REST API:

response = rest.get("https://dummy.restapiexample.com/api/v1/employees");
print response.json();

The rest client support MultiPart requests, to send MultiPart requests the user need to get the builder object using the following method:

getMultiPartBuilder()

Once the user has the MultiPart builder object, the user can use the following methods:

addMultiPart(String content, String contentType): Adds a body part to the MultiPart request
addMultiPart(String content, String contentType, String contentDisposition): Adds a body part to the MultiPart request with a Content-Disposition header
reset(): Resets the MultiPart object
build(): Returns the MultiPart object

Example Python script that posts a MultiPart object

builder = rest.getMultiPartBuilder()
builder.addMultiPart("aaaaaaaaa","application/javascript")
builder.addMultiPart("bbbbbbbbb","image/gif")
body = builder.build()

response = rest.post("https://multipart.requestcatcher.com/test",body);
print response.json()

Call Gen AI

To make calls to a third-party Gen AI service genAi helper object.

The following methods can be used:

getModel(): Returns the name of the model being used
getApiUrl(): Returns the URL to the API being used
getServiceRoot(): Returns the domain name of the API, with the api. value removed
getCompletionMessage(List<Map<String,String>> genAiChatRequestMessages): Get the text response from Gen AI
getCompletions(List<Map<String,String>> genAiChatRequestMessages): Get the completion object from Gen AI

Example Python script with a simple call to get a response:

messages = [
    { 'role': 'system', 'content': 'You always respond with 10 words in English followed by a word in French.' },
    { 'role': 'user', 'content': 'What''s in this image?' },
    { 'role': 'user', 'imageMimeType': 'image/png', 'imageBase64': 'iVBORw0KGgoAAAANSUhEUgAAAQAAAAEACAIAAADTED8xAAADMElEQVR4nOzVwQnAIBQFQYXff81RUkQCOyDj1YOPnbXWPmeTRef+/3O/OyBjzh3CD95BfqICMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMK0CMO0TAAD//2Anhf4QtqobAAAAAElFTkSuQmCC'}
]

response = genAi.getResponseMessage(messages);

print(response)

Example Python script with a call to get the details of a response:

print("Setup:")
print("\tModel: "+genAi.getModel())
print("\tService Name: "+genAi.getServiceName())


messages = [
    { 'role': 'system', 'content': 'You always respond with 10 words in English followed by a word in French.' },
    { 'role': 'user', 'content': 'Hi, who are you' },
    { 'role': 'assistant', 'content': 'I am a warrior' },
    { 'role': 'user', 'content': 'Why did you say that?' }
]

response = genAi.getResponse(messages);

print("\tMessage: "+response.getMessage().getContent())
print("\tRole: "+response.getMessage().getRole())

usage = response.getUsage()
print("Usage: ")
print("\tPrompt tokens: "+str(usage.getPromptTokens()))
print("\tCompletion tokens: "+str(usage.getCompletionTokens()))

Example Python script to list available models and select a custom model for the scope of the script

print "Available models: "
for availableModel in genAi.getAvailableModels():
    print "\t"+availableModel.getId()+" ("+availableModel.getName()+")";

genAi.setModel("llava:34b")

print("Setup:")
print("\tModel: "+genAi.getModel())

2.13.6. PowerShell

This operation will run the specified PowerShell script.

Getting Parameters Values

When running a PowerShell script from the specified code, the Automate parameters used in the code will be evaluated before running the code. The evaluation of Automate parameters is not performed when running a PowerShell script file.

For example, the following PowerShell script code:

Write-Host "The time is: {date_time}"

will produce the following output:

Running PowerShell code
The time is: 20221006-132923
PowerShell exited with code 0

Setting Parameters Values

To set Automate parameter values from a PowerShell script, the value of the parameter has to be written to a file in a specific location. This mechanism is required because the PowerShell script does not run in the same context as the Automate workflow.

To set a parameter with the name {sample_parameter_name}, the PowerShell script should write the value of the parameter to a file named sample_parameter_name with no extension, in the folder {powershell_parameters}, for example:

Set-Content -NoNewline -Path {powershell_parameters}\sample_parameter_name -Value $SampleValue

The parameter {powershell_parameters} will be automatically assigned to a temporary path when running the PowerShell operation, and does not need to be defined elsewhere. To use this mechanism in a PowerShell script, pass the value of this parameter as an argument to the script.

For example, to get the current date and time in PowerShell and set to a Automate parameter, use the following PowerShell code:

$CurrentDate = Get-Date
Set-Content -NoNewline -Path {powershell_parameters}\date_from_powershell -Value $CurrentDate

2.13.7. Run External Application

This operation will run an executable file with specified arguments and wait for it to finish.

Example for copying a folder using robocopy:

Application location: C:\Windows\System32\Robocopy.exe
Arguments: "C:\Program Files\Automate" "C:\Temp\Automate" /E

Example for listing a folder using cmd.exe, and redirecting the output to a text file in the C:\Temp folder:

Application location: C:\Windows\System32\cmd.exe
Arguments /c dir "C:\Program Files" > "listing_{date_time}.txt"
Working Directory: C:\Temp

2.13.8. Call API

This operation will make an API call.

The following options can be configured:

Verb: The HTTP verb, such as GET or POST.
URL: The URL.
Certificate fingerprint: Optional, the certificate SHA-256 fingerprint that should be trusted even if the certificate is self-signed.
Authentication type: The type of authentication that the API requires.
- No Auth: No authentication.
- API Key: Provide the API key name and API key value that will be set as headers.
- Bearer Token: Provide the Token value.
- Basic Auth: Provide the Username and Password.
Parameters: Optional, URL parameters.
Headers: Optional, custom HTTP headers.
Body type: The type of body data to submit.
- None: No data to submit.
- Form Data: Provide the form field Names and Values.
- Raw: Provide the Body type and data.
- Binary: Provide the File location containing the binary data.

Once the API call has completed, the following parameters will be populated:

{call_api_response_code}: The HTTP response code.
{call_api_response_headers}: The response headers, JSON encocoded.
{call_api_response_body}: The response body.

2.13.9. Configure Native OCR

This operation sets the configuration of the Automate OCR.

The Automate OCR uses the Tesseract/Leptonica binaries built by Mannheim University Library. Prior to running a Native OCR operation, the Automate OCR or another distribution of the Tesseract OCR must be installed.

The operation has the following settings:

Workers allocation:
- Predetermined: Use the specified number of workers
- Per CPU core: Use a number of workers as a ratio of the number of CPU cores. For example, on a server with 16 cores, a ratio of 0.8 corresponds to 12 cores (i.e. 80% of 16 cores).
OCR engine binaries folder: Optional, the folder where the Automate OCR or the Tesseract OCR is installed.
User words file: Optional, the path to the Tesseract words file.
User patterns file: Optional, the path to the Tesseract patterns file.
Image resolution: Optional, the resolution of the source image in DPI of the images, if known.
Rasterize PDF resolution: Optional, the resolution to use when rasterizing PDF files before OCR.
OCR engine log level: Optional, the logging level for the Tesseract OCR Engine.
Languages: Optional, the language(s) in which the text is written, if known. When configuring multiple languages, separate them by the plus sign, for example eng+deu+fra.

For a list of languages supported by Tesseract, see https://github.com/tesseract-ocr/tessdoc/blob/main/Data-Files-in-different-versions.md.

Page segmentation mode: Optional, the method used to segment the page.

For a list of page segmentation modes supported by Tesseract, see https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html#page-segmentation-method

Deskew: If set, the preprocessor will attempt to deskewed the image before running the OCR.

The Deskew option is only available for common image formats and PDF files. It is not available for source text files containing a listing of images.

The Deskew option will only correct small angle rotations and will not rotate the image by 90, 180, or 270 degrees.

Rotate: If set, the preprocessor will rotate the image before running the OCR. When using the Auto Detect option, the OCR engine will first run in 0 - Orientation and Script Detection (OSD) only mode to detect the orientation, and then will run a second time on the rotated images in the user-configured mode.

When using the Auto Detect rotation mode, it’s generally optimal in most cases to either not select a specific Page segmentation mode, or to select a mode without OSD because the image will already be correctly orientated.

OCR engine mode: Optional, the mode in which the OCR engine should run. This option should only be used when using a custom Tesseract build.
OCR engine config file: Optional, the Tesseract configuration file to use with configuration variables.
Timeout per file: Optional, the maximum duration of time that the OCR engine is allowed to run on a single file, possibly containing multiple pages.
OCR temp folder: Optional, the folder in which the temporary files used during files the OCR operations are created. If not set, a temp folder will be created in the destination folder where the OCR text is exported, or inside the Nuix case folder.
Don’t clear OCR temp folder on completion: If set, the OCR temp folder is not deleted on OCR completion. This option can be used to troubleshoot the OCR process by inspecting the intermediary temporary files.

2.13.10. Native OCR Items

This operation runs OCR using the Automate OCR Engine on Nuix case items. The operation is designed to perform best when the Nuix items have binary data stored.

When running the Native OCR Items operation on Nuix items which do not have binary data stored, the OCR will take significantly longer. Before running this operation, either store item binaries during the Add Evidence operation, or use the Populate Binary Store operation to populate binaries of the items that need to be OCRed.

Items in PDF or image formats supported by the Automate OCR Engine are extracted as native files from the Nuix items and OCRed. For all other items, printed images are generated inside Nuix which are then OCRed.

The settings for the OCR Engine are defined in the Configure Native OCR operation.

A CSV summary report is produced, listing all source items, the OCR success status and the other details of the OCR process.

The operation has the following settings:

Scope query: The Nuix query to select the items to OCR.
Text modifications
- Append: Append the extracted text at the end of the existing document text.
- Overwrite: Replace the document text with the extracted text.
Create searchable PDF: If set, generate PDF files with the extracted text overlaid and set as the printed images for the items.

The Tag failed items as options have the same behavior as in the Legal Export operation.

2.13.11. Native OCR Images Files

This operation runs OCR using the Automate OCR Engine on image files.

For a list of supported image file formats, please see https://github.com/tesseract-ocr/tessdoc/blob/main/InputFormats.md. In addition to this file formats, Automate supports source PDF files (these are rasterized to images) and text files containing a list of image files.

The settings for the OCR Engine are defined in the Configure Native OCR operation.

For each source image file, a corresponding text file is written in the Output text files folder. A CSV report named summary_report.csv is produced, listing all source files, the OCR success status, the path and size of the resulting text file, as well as the output of the OCR engine.

The operation has the following settings:

Source image files folder: The folder containing the image files to be OCRed.
Scan folder recursively: If set, the source folder will be scanned recursively, and the output files will be created using the same folder structure.
Skip images with existing non-empty text files: If set, images will be skipped if a text file with the expected name and a size greater than 0 exists in the destination folder.
Assemble pages regex: The regular expression to use to detect documents with multiple pages, which were exported with one image file per page. The regex must have at least one matching group which is used to select the document base name.
Output text files folder: The folder in which the text files will be created.
Keep incomplete files: If set, empty files and incomplete text files from the OCR Engine are not deleted.
Create searchable PDF: If set, the source images are converted to PDF files in the Output text files folder with the extracted text overlaid.
Output PDF files folder: The folder in which the PDF files will be created. If this field is blank, it will default to the output text files folder.

2.14. Relativity

These operations transfer data between the Nuix case and Relativity and allow managing various operations in Relativity.

2.14.1. Configure Relativity Connection

This operation sets the configuration used to connect to the Relativity environment.

Optionally, the Relativity Service can be used and point to a parameter of type Relativity Service. During the submission of the workflow in Scheduler, the user will be prompted to select the Relativity Service and authenticate to the service if required.

When not using a Relativity Service, the following options are explicitly defined in the operation:

Host name: The Relativity host name, for example relativity.example.com.
Service endpoint: The Relativity Service Endpoint, for example /relativitywebapi.
Endpoint type: The Relativity Endpoint Type, for example HTTPS.
User name: The user name used to perform the import into Relativity.
Password: The password for the username above.

The value entered in this field will be stored in clear text in the workflow file - a password SHOULD NOT be entered in this field. Instead, set this field to a protected parameter name, for example {relativity_password} and see section Protected Parameters for instructions on how to set protected parameter values.

Import threads: The number of parallel threads to use for Relativity uploads, such as Legal Export, Relativity Loadfile Upload, Relativity Images Overlay, Relativity Metadata Overlay, Relativity CSV Overlay.
Import thread timeout: The number of seconds to allow a Relativity upload thread to be idle. If no progress is reported for longer than the allowed timeout, the import thread will be aborted.
Import thread retries: The number of times to retry running an import thread, in situations where import encountered a fatal error or timed out.
Metadata threads: The number of parallel threads to use for Relativity metadata operations, such as Create Relativity Folders.
Patch invalid entries: If selected, this option will automatically patch entries that fail uploading due to the following issues:
- Field value too long - the uploaded field value is trimmed to the maximum allowed length in Relativity;
- Field value invalid, for example due to incorrectly formatted date - the field value is removed from the item uploaded to Relativity;
- Missing native of text file - the native or text component is removed from the item uploaded to Relativity;
Client version: When unchecked, Automate will use the Relativity client version which is the closest match to the Relativity server version. When checked, Automate will use the specified Relativity client version, if available.
REST version: The version of the REST services to use when querying Relativity objects, such as workspaces and folders. For Relativity One, use REST (v1 Latest).

The REST (Server 2021) version requires the Relativity Server Patch (Q3 2021) or later.

The Import threads value is independent of the number of Nuix workers. When using more than 1 import thread, the loadfile or the overlay file will be split and data will be uploaded to Relativity in parallel. Because multiple threads load the data in parallel, this method will impact the order in which documents appear in Relativity when no sort order is specified.

2.14.2. Set Relativity Client

This operation selects a client in the Relativity environment, using the following settings:

Client identifier: The Name or Artifact ID of the Relativity client.
Existing client: The action to take if the client does not exist:
- Create client if it does not exist creates a new client.
- only use existing client triggers an error if the client does not exist.

The following settings are applicable when creating a new client:

Client number: The client number to set on the client.
Status identifier: Optional, the Name or Artifact ID of the status to set on the client.
Keywords: Optional, the keywords to set on the client.
Notes: Optional, the notes to set on the client.

2.14.3. Set Relativity Matter

This operation selects a matter in the Relativity environment, using the following settings:

Matter identifier: The Name or Artifact ID of the Relativity matter.

The matter is selected in Relativity irrespective of the client that it belongs to, even if the Set Relativity Client operation was previously used.

Existing matter: The action to take if the matter does not exist:
- Create matter if it does not exist creates a new matter.
- only use existing matter triggers an error if the matter does not exist.

The following settings are applicable when creating a new matter:

Matter number: The matter number to set on the matter.
Status identifier: Optional, The Name or Artifact ID of the status to set on the matter.
Keywords: Optional, the keywords to set on the matter.
Notes: Optional, the notes to set on the matter.

When a new matter is created, it is created under the client selected using the previous Set Relativity Client operation.

2.14.4. Set Relativity Workspace

This operation selects a workspace in the Relativity environment using the following settings:

Workspace identifier: The Name or Artifact ID of the Relativity workspace.

The workspace is selected in Relativity irrespective of the client and matter that it belongs to, even if the Set Relativity Client or Set Relativity Matter operations were previously used.

Folder path: The path inside the workspace. If blank, this will retrieve the folder corresponding to the root of the workspace.
Create folder path if it does not exist:: If checked, the specified folder path will be created in the workspace if it does not exist.
Existing workspace: The action to take if the Workspace does not exist:
- Clone workspace if it does not already exist creates a new Workspace by cloning the source Workspace.
- Only use existing workspace triggers an error if the Workspace does not exist.
Clone settings: The settings to use when cloning a Workspace.
- Workspace name: The name to give the newly created Workspace.
- Matter: The Matter to use when cloning the Workspace.
- Workspace template: The Workspace template to use when cloning the Workspace.
- Resource pool: The Resource pool to use when cloning the Workspace, if this setting is not defined the first available Resource pool from the Relativity environment will be selected.
- Database location: The Database location to use when cloning the Workspace, if this setting is not defined the first available Database location from the Relativity environment will be selected.
- Default file repository: The Default file repository to use when cloning the Workspace, if this setting is not defined the first available Default file repository from the Relativity environment will be selected.
- Default cache location: The Default cache location to use when cloning the Workspace, if this setting is not defined the first available Default cache location from the Relativity environment will be selected.
- Status: The Status to use when cloning the Workspace, if this setting is not defined the first available Status from the Relativity environment will be selected.

When a workspace is cloned, it is created under the matter selected using the previous Set Relativity Matter operation.

2.14.5. Delete Relativity Workspace

This operation deletes the specified workspace, if it exists.

2.14.6. Create Relativity Group

This operation creates one or more groups in Relativity under the client selected using the previous Set Relativity Client operation, using the following settings:

Group Name: The name of the group to be created.
Keywords: Optional, the keywords to assign to the group created.
Notes: Optional, the notes to assign to the group created.

If a group with the specified name exists under the client, the group will not be created and instead the group name and artifact ID will be logged.

In addition to providing the values for the group settings manually, the user can also load from a CSV or TSV file, for example:

Group Name   Keywords    Notes
Reviewer    reviewer    Simple group for reviewer
Admin   admin   Group for admins

2.14.7. Manage Relativity Workspace Groups

This operation adds or removes groups in Relativity under the workspace selected using the previous Set Relativity Workspace operation, using the following settings:

Group identifier type: The identifier type used for the workspace groups, Name or Artifact ID.
Group action: The action to be performed on the groups, Add or Remove.
Group Settings Table
- Group identifier: The Name or Artifact ID of the group, defined by the Group identifier type field.

In addition to providing the values for the workspace groups manually, the user can also load from a CSV or TSV file, for example:

Group Identifier
Domain Users
Level 1
Level 2

The workspace groups can also be loaded from a file during the workflow execution, using the Workspace groups file option.

2.14.8. Create Relativity Users

This operation create one or more users in Relativity under the client selected using the previous Set Relativity Client operation, using the following settings:

User template identifier: The Name, Artifact ID or Email Address of the user to copy properties from.

When choosing the identifier type Name, the full Relativity name must be provided.

If the template user is enabled, all the users created will also be enabled and have access to Relativity, if the template user is disabled then the users created will not have access to Relativity.

Send email invitation: Sends an email invitation to each user created.
User Settings:
- Email: The email of the user to be created.
- First Name: The first name of the user to be created.
- Last Name: The last name of the user to be created.
- Keywords: Optional, the keywords to assign to the group created.
- Notes: Optional, the notes to assign to the group created.
- Login Method User Identifier: Optional, the subject or account name for the login methods copied from the template user.

In addition to providing the values for the user settings manually, the user can also load from a CSV or TSV file, for example:

Email   First Name    Last Name    Keywords    Notes    Login Method User Identifier
jon.doe@hotmail.com Jon Doe Reviewer    User    created by Automate  j.doe
el.mills@gmail.com  Elisa   Mills   Support User    created by Automate  e.mills

The user settings can also be loaded from a file during the workflow execution, using the User settings file option.

2.14.9. Manage Relativity Users

This operation deletes one or more users from Relativity, using the following settings:

User identifier type: The identifier type used to retrieve users: Name, Artifact ID or Email Address.

When choosing the identifier type Name for the user identifier, the full name must be provided.

User action: The action to be performed on the users, Delete.
Users:
- User identifier: The Name, Artifact ID or Email Address of the user.

In addition to providing the values for the users manually, the user can also load from a CSV or TSV file, for example:

User Identifier
jon.doe@hotmail.com
el.mills@gmail.com

The users can also be loaded from a file during the workflow execution, using the Users file option.

2.14.10. Manage Relativity Group Users

This operation adds or removes one or more users from a group, using the following settings:

Group identifier: The Name, Artifact ID or Name (Like) of the group to add or remove users.
User identifier type: The identifier type used to retrieve users: Name, Artifact ID or Email Address.

When choosing the identifier type Name, the full name must be provided.

User group action: The action to be performed on the users of the group, Add or Remove.
Group users:
- User identifier: The Name, Artifact ID or Email Address of the user.

In addition to providing the values for the group users manually, the user can also load from a CSV or TSV file, for example:

User Identifier
jon.doe@hotmail.com
el.mills@gmail.com

The group users can also be loaded from a file during the workflow execution, using the Group users file option.

2.14.11. Query Relativity Workspace Group Permissions

This operation exports the permissions of a Relativity group to the specified location as a JSON file.

2.14.12. Apply Relativity Workspace Group Permissions

This operation applies permissions to a Relativity Group, using the following settings:

Group Identifier: The Name, Artifact ID or Name (Like) of the group to apply permissions to.
Permissions JSON: Optionally, the content of a permissions file.
Permissions file: A permissions file created by the Query Relativity Workspace Group Permissions operation.

2.14.13. Copy Relativity Workspace Group Permissions

This operation copies the permissions assigned to a group in a Relativity workspace to another group or workspace, using the following settings:

Copy permissions from:

Source Workspace Identifier: The Name, Artifact ID or Name (Like) of the source workspace.
Source Group Identifier: The Name, Artifact ID or Name (Like) of the source group.

To:

Destination Workspace Identifier: The Name, Artifact ID or Name (Like) of the source workspace.
Destination Group Identifier: The Name, Artifact ID or Name (Like) of the source group.

2.14.14. Create Relativity Folders

This operation creates folders in the Relativity workspace from the listing CSV file. The listing file must have a single column and the name of the column must contain the word Folder or Path or Location.

When uploading documents to Relativity with a complex folder structure, it is recommended to use the Create Relativity Folders before the upload to prepare the folder structure.

2.14.15. Relativity Loadfile Upload

This operation loads a Concordance or CSV loadfile to Relativity.

The following settings are required:

Loadfile location: Path to the loadfile.
Fields mapping file: Path to JSON file mapping the Nuix Metadata profile to the Relativity workspace fields. If a mapping file is not provided, the fields in the loadfile will be mapped to fields with the same names in the Relativity workspace.
Detect export in parts: Detects the existence of loadfiles in subfolders in the specified location, and uploads all detected loadfiles sequentially.

This operation sets the Relativity OverwriteMode property to Append when loading the documents into Relativity.

The Legal Export operation can be used to export the loadfile and upload to Relativity, with the added benefit of uploading export parts as soon as they become available.

Sample minimal mapping.json:

{
    "FieldList": [
        {
            "identifier": true,
            "loadfileColumn": "DOCID",
            "workspaceColumn": "Control Number"
        },
        {
            "loadfileColumn": "TEXTPATH",
            "workspaceColumn": "Extracted Text"
        },
        {
            "loadfileColumn": "ITEMPATH",
            "workspaceColumn": "File"
        },
        {
            "loadfileColumn": "BEGINGROUP",
            "workspaceColumn": "Group Identifier"
        }
    ]
}

2.14.16. Relativity Metadata Overlay

This operation exports metadata from the Nuix items in the scope query and overlays it to Relativity.

The following settings are required:

Fields mapping file: Path to JSON file mapping the Nuix Metadata profile to the Relativity workspace fields. If a mapping file is not provided, the fields in the loadfile will be mapped to fields with the same names in the Relativity workspace.

See more information on how to create a mapping file in the Relativity Loadfile Upload operation, or use the sample mapping file below.

This operation sets the Relativity OverwriteMode property to Overlay when loading the metadata into Relativity.

To overlay data to Relativity using a non-indexed field, set the identifier property to true in the mapping file and provide the Artifact ID of that field in the fieldId property.

Sample mapping.json for overlaying data based the GUID in a workspace which contains the field NuixGuid with the Artifact ID 1040313:

{
    "FieldList": [
        {
            "loadfileColumn": "TEXTPATH",
            "workspaceColumn": "Extracted Text"
        },
        {
            "loadfileColumn": "GUID",
            "identifier": true,
            "fieldId": 1040313,
            "workspaceColumn": "NuixGuid"
        }
    ]
}

2.14.17. Relativity Images Overlay

This operation overlays images from an Opticon loadfile to Relativity.

The following settings are required:

Identifier field: The Artifact ID of the identifier field, such as Control Number or Document ID.

To get the Artifact ID of the identifier field, open the workspace in Relativity, navigate to the Workspace Admin → Fields, and click on the identifier field, for example Control Number. Then, to obtain the Artifact ID of this field, extract the value from the URL. For example, the Artifact ID of the with the following URL is 1003667: https://relativity.automate.lab/Relativity/RelativityInternal.aspx?AppID=1018179&ArtifactTypeID=14&ArtifactID=1003667&Mode=Forms&FormMode=view&LayoutID=null&SelectedTab=null

Strip suffix from first page: Strips the suffix from the first page to infer the document ID from the Opticon loadfile, for example _0001.
Detect export in parts: Detects the existence of loadfiles in subfolders in the specified location, and uploads all detected loadfiles sequentially.

This operation sets the Relativity OverwriteMode property to Overlay when loading the images into Relativity.

2.14.18. Relativity CSV Overlay

This operation overlays the metadata from the specified overlay file to Relativity.

The following settings are required:

Fields mapping file: Path to JSON file mapping the Nuix Metadata profile to the Relativity workspace fields. If a mapping file is not provided, the columns in the CSV file will be mapped to fields with the same names in the Relativity workspace.

See more information on how to create a mapping file in the Relativity Loadfile Upload operation.

2.14.19. Relativity Property Query

This operation queries properties of a Relativity workspace and assigns them as parameters in Workflow.

2.14.20. Load Relativity Dynamic Objects

This operation loads dynamic objects (RDO) to Relativity , using the following settings:

Object type identifier: The Name, Artifact ID or Name (Like) of the object type.
Load objects in workspace: Determines if the objects will be loaded into a workspace.

If the options Load objects in workspace is selected, then the Set Relativity Workspace Operation is required.

Objects: Tab separated list of the objects to be loaded.

Sample objects data:

Article Title	Article Type	Article Date	Is Available
Star Wars	Wikipedia Article	2022-11-10T00:00:01	Yes
Globex	Review Article	2022-11-10T00:00:01	No

The field Name is required and the operation will fail if the field is not present.

In addition to providing the values for the Objects manually, the user can also load from a TSV file using the same format as the example above.

When loading objects the first row represents the fields of the object type and the rows after are the objects that will be evaluated and loaded into Relativity.

When the user is using the field type object or choice, use the name of the object or choice in the column, for example given the field Department with the type Single Object and the field Department Group with the type Single Choice:

Name    Department  Department Group
John Doe    IT  Sales
Jane Doe    Marketing   Sales

When the user is using the field type multiple object or multiple choice, use the name of the object or choice and separate each item by comma ,, for example given the field Hobbies with the type Multiple Object and the field Groups with the type Multiple Choice:

Name    Hobbies  Groups
John Doe    Hockey,Golfing  Rotary Club,Robotics
Jane Doe    Golfing,Skiing,Reading   Book Club,Crossfit

2.14.21. Create ARM Archive

This operation creates a Relativity ARM archive job, using the following settings:

Archive Directory: The path where the archive will be stored, for example \\INSTANCE007\Arhives\TestWorkspaceArchive
Use Default Archive Directory: Uses the default path to store your archive

When selecting an archive directory, a valid UNC path must be provided, for example: \\INSTANCE001\Arhives\NewArchive.

Priority: The priority of execution for the archive job: Low, Medium, High
Wait for Archive to Complete: Waits until the archive job completes.
Lock UI Job Actions: Determines if job actions normally available on UI should be visible for the user.
Notify Job Creator: Determines if email notifications will be sent to the job creator.
Notify Job Executor: Determines if email notifications will be sent to the job executor.
Include Database Backup: Include database backup in the archive.
Include dtSearch: Include dtSearch indices in the archive.
Include Conceptual Analytics: Include conceptual analytics indices in the archive.
Include Structured Analytics: Include structured analytics indices in the archive.
Include Data Grid: Include data grid application data in the archive.
Include Repository Files: Include all files included in workspace repository, including files from file fields in the archive.
Include Linked Files: Include all linked files that do not exist in the workspace file repository in the archive.
Missing File Behavior: Indicates weather to Skip File or Stop Job when missing files are detected during the archiving process.

Setting the Missing File Behavior to Stop Job will cause the archive job to stop / fail when there is a file missing.

Include Processing: Include processing application data in the archive.
Include Processing Files: Include all files and containers that have been discovered by processing in the archive.

When the options Include Processing Files is selected the files will be located in the archive directory under the folder Invariant.

Missing Processing File Behavior: Indicates whether to Skip File or Stop Job when missing processing files are detected during the archiving process.
Include Extended Workspace Data: Include extended workspace information in the archive.

Extended workspace data includes installed applications, linked relativity scripts and non-application event handlers.

Application Error Export Behavior: Indicates whether to Skip Application or Stop Job on applications that encountered errors during export.

This operation requires the Relativity instance to have the ARM application installed.

2.14.22. Create Relativity ARM Restore

This operation creates an ARM restore job, using the following settings:

Archive Path: Path of the ARM archive to be restored, for example \\INSTANCE007\Arhives\TestWorkspaceRestore

The Archive Path provided must not be in use by another ARM job.

Priority: The priority of execution for the restore job: Low, Medium, High.
Lock UI Job Actions: Determines if job actions normally available on UI should be visible for the user.
Notify Job Creator: Determines if email notifications will be sent to the job creator.
Notify Job Executor: Determines if email notifications will be sent to the job executor.
Matter Identifier: The Name, Artifact ID or Name (Like) of the matter to restore to.

If a preceding Set Relativity Matter operation exists in the workflow then the matter from the Set Relativity Matter operation, if there is a value in the Matter Identifier field then the matter set in the Matter Identifier field will be used.

Resource Pool Identifier: The resource pool to restore the workspace to. If this setting is not defined, the first available resource pool from the Relativity environment will be selected.
Database Server Identifier: The database server to restore the workspace to. If this setting is not defined, the first available database server from the Relativity environment will be selected.
Cache Location Identifier: The cache location to restore the workspace to. If this setting is not defined, the first available cache location from the Relativity environment will be selected.
File Repository Identifier: The file repository to restore the workspace to. If this setting is not defined, the first available file repository from the Relativity environment will be selected.
- Reference Files as Archive Links: Determines if files should remain in the archive directory and should be referenced from the workspace database as opposed to copying them to the workspace repository.
- Update Repository File Paths: Determines if repository file locations should be updated to reflect their new location.
- Update Linked File Paths: Determines if non-repository file locations should be updated to reflect their new location
- Auto Map Users: Determines if archive users should be auto mapped by email address.
- Auto Map Groups: Determines if archive groups should be auto mapped by name.
- Structured Analytics Server: The Name, Artifact ID or Name (Like) of the structured analytics server. This field is only required when the archive the user is restoring contains structured analytics data.
- Conceptual Analytics Server: The Name, Artifact ID or Name (Like) of the conceptual analytics server This field is only required when the archive the user is restoring contains conceptual analytics data.
- dtSearch Location Identifier: The Name, Artifact ID or Name (Like) of the dtSearch location. This field is only required when the archive the user is restoring contains dtSearch indexes.
- Existing Target Database: Target database in case the archive does not have a database backup file.

This operation requires the Relativity instance to have the ARM application installed.

2.14.23. List Relativity Documents

This operation lists all documents present in the Relativity Workspace.

The following settings are available:

Scope query: Cross references the DocIDs from the Relativity workspaces against the documents in the Nuix case in this scope.
Tag matched items as: The tag to assign to documents in scope in the Nuix case which have the same DocIDs as documents from the Relativity workspace.
Export DocIDs under: The path and name of the file to which to write the list of DocIDs from the Relativity workspaces. Each line will contain a single DocID.

2.14.24. Add Relativity Script

This operation adds the specified script to the Workspace, using the following settings:

Script identifier: The script to add to the Relativity Workspace
Application identifier: The application the script will run under, this setting is optional.

In order to add a script to the Relativity Workspace, first define it in the Relativity Script Library. The Relativity Script Library is located on the home page of Relativity under Applications & Scripts → Relativity Script Library.

2.14.25. Relativity Run Script

This operation runs a script in a Relativity workspace, or in the admin workspace.

Optionally, input values can be provided to the script. To determine the required input IDs and the allowed values, run the script without any inputs and inspect the Execution Log.

Once the script has completed, eventual errors will be stored in the parameter name {last_relativity_script_error}.

The output of the script can be exported to a file of the following type:

CSV: Use the extension .csv
PDF: Use the extension .pdf
XLSX: Use the extension .xlsx, the export defaults to this option if no other format is matched, this option will be used.

2.14.26. Delete Relativity Script

This operation deletes the specified script, if it exists.

2.14.27. Manage Relativity dtSearch Index

This operation runs an index build on the dtSearch index, using the following settings:

dtSearch Index identifier: The dtSearch index to perform the actions on.
Index action: The index build operation to be performed on the index, the build operation is one of the following:
- Full build
- Incremental build
- Compress index
- Activate index
- Deactivate index
Wait for action completion: Waits for the build operation to finish before moving to the next operation.

2.14.28. Run Relativity Search Term Report

This operation runs a search term report on the Relativity instance, using the following settings:

Search Term Report Identifier: The search term report to run
Report Run Type: The report run type to be performed, the report run type is one of the following:
- Run All Terms
- Run Pending Terms
Report Results Location: Optional, the location to export the csv results of the report

Once this operation has completed, the results will be stored as a json object in the parameter {relativity_search_term_results_json}. The results will be in the following format:

{
    "results": [
        {
            "Name": "apples",
            "Documents with hits": "16",
            "Documents with hits, including group": "0",
            "Unique hits": "",
            "Last run time": "2/10/2023 4:08 AM"
        },
        {
            "Name": "automate",
            "Documents with hits": "72",
            "Documents with hits, including group": "0",
            "Unique hits": "",
            "Last run time": "2/10/2023 4:08 AM"
        },
        {
            "Name": "sensitive",
            "Documents with hits": "2",
            "Documents with hits, including group": "0",
            "Unique hits": "",
            "Last run time": "2/10/2023 4:08 AM"
        }
    ]
}

The results of the search term report are stored in the results array, and the properties inside the objects are the fields corresponding to the view of the search term report results.

The parameter {relativity_search_term_results_json} can be used in a script to add logic to the results of the search term report, for example the following script only prints results that were seen at least once :

# Example script only showing terms with hits
results_object = parameters.getJsonObject("{relativity_search_term_results_json}")
results_array = results_object["results"]

# Header which indicates how many times it was seen
hits_header = "Documents with hits"

# Only print a result if it was seen at least one time
for result in results_array:
	if int(result[hits_header]) > 0:
		for key in result.keySet():
			print(key + ": " + result[key])

		# Separate results
		print("\n")

Reporting

This option generates a search terms report in an Excel format, based on a template file. The report uses the _REL_RUN_SEARCH_TERMS_ worksheet from the template.

See Processing Report for information on using a custom template.

2.14.29. Export Relativity Saved Searches

This operation converts saved searches to Automate Relativity Query Language format and exports the saved searches to a csv file, using the following settings:

Saved search export location: The location to export the csv results

Once this operation has completed, the csv file location will be stored in the parameter {relativity_saved_searches_file}.

Reporting

This option generates a saved search report in an Excel format, based on a template file. The report uses the _REL_EXPORT_SAVED_SEARCH_ worksheet from the template.

See Processing Report for information on using a custom template.

2.14.30. Create Relativity Saved Searches

This operation creates saved searches using Automate Relativity Query Language, using the following settings:

Saved Searches:
- Folder: The folder path, if the path does not exist then it will be created
- Name: The name of the query
- Query: The relativity query language string that will be converted into the saved search
- Scope: The scope of the saved search
- Fields: The fields of the saved search, fields are seperated by , commas
- Sorting: The sorting fields of the saved search, sorting fields are seperated by , commas and contain a sorting direction in square [] brackets. For example, if the user wanted to sort by the artifact id ascending then the user would provide Artifact ID [Ascending] for the sorting column. A user can only provide two possible values for the sorting direction Ascending or Descending

In addition to providing the values for the saved searches manually, the user can also load from a CSV or TSV file, for example:

Folder,Name,Query,Scope,Scope Folders,Fields,Sorting
Admin Searches,Produced Documents,[Bates Beg] is_set,WORKSPACE,,"Edit,File Icon,Control Number,Bates Beg,Bates End",Bates Beg [Ascending]
Admin Searches,Extracted Text Only,[Extracted Text] is_set,FOLDERS,Temp\\Tes,Extracted Text,

The saved searches can also be loaded from a file during the workflow execution, using the Saved searches file option.

2.14.31. Automate Relativity Query Language

Automate Relativity Query Language is a custom language used to create Relativity saved searches, this language takes the saved search creation form from Relativity and converts it to text based query language to allow workflows to automate the creation of saved searches.

This language is made up of a bunch of expressions, each expression contains a document field name, operator and a value. Each expression is then joined by an and or an or which acts as a logical operator between the two expressions.

Expressions can also be grouped together to form logic groups which contain one or more expressions inside of parentheses. Expressions inside logic groups will be evaluated together and the result of a logic group is the evaluated expressions inside. There is no limit to how many times an expression can be nested.

Document Field Name

The document field name corresponds to Document Fields in Relativity, to declare the document field name in an expression, enclose the field name within square brackets. For example if the user wanted to use the field name Control Number, then in the expression it would be declared as [Control Number].

When using a Saved Search or an Index Search as a document field in the expression, they are declared as follows: [Saved Search] for saved searches and [Index Search] for index searches.

Operator

The operator of an expression defines how the value is evaluated, there are two different kinds of operators Binary Operators which expect a value and Unary Operators which don’t require a value. To declare the operator in an expression the user must first declare the document field name and then provide one of the following operators listed in the table below.

Operator Example

is

[Control Number] is "Example"

is_not

[Control Number] is_not "Example"

is_set

[Artifact ID] is_set

is_not_set

[Artifact ID] is_not_set

is_logged_in

[Created By] is_logged_in

is_not_logged_in

[Created By] is_not_logged_in

is_like

[Folder Name] is_like "FolderA"

is_not_like

[Control Folder Name] is_not_like "FolderA"

is_less_than

[Attachment Count] is_less_than "3"

is_less_than_or_equal_to

[Attachment Count] is_less_than_or_equal_to "12"

is_greater_than

[Attachment Count] is_greater_than "9"

is_greater_than_or_equal_to

[Attachment Count] is_greater_than_or_equal_to "5"

starts_with

[Email Subject] starts_with "Confidential"

does_not_start_with

[Email Subject] does_not_start_with "Redacted"

ends_with

[Title] ends_with "signing"

does_not_end_with

[Title] does_not_end_with "signing"

contains

[Email From] contains "@example.com"

does_not_contain

[Email From] does_not_contain "@example.com"

is_any_of_these

[Custodian] is_any_of_these [1023221, 2254568]

is_none_of_these

[Custodian] is_none_of_these [1023221, 2254568]

is_all_of_these

[Submitted By] is_all_of_these [1024881]

is_not_all_of_these

[Submitted By] is_not_all_of_these [1024881, 102568]

is_in

[Folder] is_in [1025681, 1024881, 1032568]

is_not_in

[Folder] is_not_in [1025681, 1024881, 1032568]

is_before

[Sort Date] is_before 2022-04-05

is_before_or_on

[Sort Date] is_before_or_on 2022-04-05

is_after

[Date Last Modified] is_after 2021-12-09T15:36:00

is_after_or_on

[Date Last Modified] is_after_or_on 2021-12-09T15:36:00

is_between

[Date Added] is_between 2019-01-01 - 2023-01-01

is_not_between

[Date Added] is_not_between 2019-01-01 - 2023-01-01

Value

The value of the expression defines what value the user expects from the document field. To declare a value in an expression the document field name and operator must be declared and then the user can provide a value by putting text or a number inside double quotes, for example "Value 8" or the user can provide a list of integers inside square brackets, for example [102889, 1025568]. The integers inside the square brackets correspond to artifact IDs of objects on Relativity.

Values can also be declared as dates, date values do not need to be inside double quotes or square brackets. When declaring a date value only specific operators can be used, the operators that support date values are is, is_not, is_in, is_not_in, is_before, is_before_or_on, is_after, is_after_or_on, is_between, is_not_between.

The operators is_between and is_not_between can only take two date or date time values separated by a -. For example: 2019-01-01 - 2023-01-01 or 2019-01-01T00:00:00 - 2022-12-31T23:59:59

A date value can be one of the following formats:

Date: The format of a date is Year (4 digits) - Month (2 digits) - Day (2 digits). For example 2023-04-13
Date Time:The format of a date time is Year (4 digits) - Month (2 digits) - Day (2 digits) T Hour (2 digits) : Minute (2 digits) : Seconds (2 digits) optionally the user can also declare the milliseconds by adding . followed by 1 to 9 digits. For example: 2019-05-10T05:00:13 or 2019-05-10T05:00:13.8754
Month: The format of month is the name of a month capitalized. For example: March or July
This week: The format of this week is the lower case word, for example this week
This month: The format of this month is the lower case word, for example this month
Next week: The format of next week is the lower case word, for example next week
Last week: The format of last week is the lower case word, for example last week
Last 7 days: The format of last 7 days is the lower case word, for example last 7 days
Last 30 days: The format of last 30 days is the lower case word, for example last 30 days

Example Saved Search Queries

Emails with attachments between two dates:

[Email Subject] is_set and ([Number of Attachments] is_not "0" and [Date Sent] is_between 2021-08-04 - 2023-02-28T23:59:59.997)

Produced documents with production errors, sorted by file size:

[Bates Beg] is_set and [Production Errors] is "true"

Documents without extracted text:

[Extracted Text] is_not_set or [Extracted Text Size] is "0"

2.14.32. Query Relativity Workspace Overwritten Permissions

This operation exports overridden inherited permissions, using the following setting:

Permissions output file: The location to export the permissions JSON file
Object scope:
- Object type: The type of the object, for example Folder
- Object name: Optionally, the name of the object, for example Staging. To query all objects with a specific type, leave the name field blank.

In addition to providing the values for the object types manually, the user can also load from a CSV or TSV file, for example:

Folder  Admin
Folder  Staging
View

The object scope can also be loaded from a file during the workflow execution, using the Object scope file option.

2.14.33. Apply Relativity Workspace Overwritten Permissions

Before running this operation on a production workspace, run this operation on a test workspace created from the same template, or perform a backup of the production workspace, to ensure the desired outcome is achieved.

This operation applies overridden inherited permissions using the following settings:

Match objects by:
- Artifact ID & Name: Objects from the target workspace must have the same name and Artifact ID as the objects from the permissions file to be a match.
- Name: Objects from the target workspace must have the same name as the objects from the permissions file.
New object behavior: The action to take when an object is identified in the target workspace, but the object does not exist in the permissions file:
- Do not change permissions for objects not present in the permissions file
- Reset permissions for objects not present in the permissions file
Skip objects: (Optional) Skip applying permissions on objects defined in the table
- Object type: The object type name, for example View
- Object name: The name of the object
Overwritten permissions file: A permissions file created by the Query Relativity Workspace Overwritten Permissions operation.
Overwritten permissions JSON: Optionally, the content of a permissions file.

Reporting

This option generates an overwritten permissions report in an Excel format, based on a template file. The report uses the _REL_OVERWRITTEN_PERMISSIONS_ worksheet from the template.

See Processing Report for information on using a custom template.

2.14.34. Call Relativity API

This operation will make an API call to Relativity using the current the config from Configure Relativity Connection, using the following settings:

Verb: The HTTP verb, such as GET or POST.
Endpoint: The endpoint on the Relativity API.
Parameters: Optional, URL parameters.
Body: The JSON request.

Once the API call has completed, the following parameters will be populated:

{relativity_call_api_response_code}: The HTTP response code.
{relativity_call_api_response_headers}: The response headers, JSON encocoded.
{relativity_call_api_response_body}: The response body.

2.14.35. Delete Relativity Saved Search

This operation deletes a specified saved search or all saved searches from a workspace.

2.14.36. Run Relativity Imaging Set

This operation runs the specified imaging set, using the following settings:

Imaging set identifier: The Name, Artifact ID or Name (Like) of the imaging set.
Hide images for QC review: When enabled, it prevents users from viewing images until the QC review process is complete.
Wait for completion: Waits until the imaging set has finished running.

2.14.37. Delete Relativity Index

This operation deletes the specified index, if it exists.

2.14.38. Create Relativity Analytic Index

This operation creates an analytic index, using the following settings:

Name: The name of the analytic index
Index type: The type of index, Conceptual or Classification
Saved search identifier: The Name, Artifact ID or Name (Like) of the saved search
Analytics server identifier: The Name, Artifact ID or Name (Like) of the analytics server
Order: The order of the index seen in dropdown inside of Relativity, for example setting the value to 1 would cause the index to be seen first in all dropdown
Email notification recipients file: (Optional) The list of email recipients notified during index population and build, for example:

Email Notification Recipient
usera@example.com
userb@example.com
userc@example.com

In addition to the settings above, conceptual analytic indexes have the following advanced options

Advanced Options
- Concept stop words file: (Optional) A file containing words to suppress from the index
- Continue index steps to completion: (Optional) Indicated whether to automatically complete all steps necessary to activate an analytics index after starting a step
- Dimensions: (Optional) The number of dimensions of the concept space in which documents are mapped when the index is built
- Enable email header filter: (Optional) removes common header fields (such as To, From, and Date) and reply-indicator lines
- Optimize training set: (Optional) Indicates whether to select only conceptually relevant documents from the training set saved search
- Remove documents that errored during population: (Optional) Removes documents from being populated when they have errored in a previous population
- Remove English signatures and footers: (Optional) Indicates whether to remove signatures and footers in English language emails
- Repeated content filters file: (Optional) A file containing repeated content filters associated with the index
- Training set: (Optional) The Name, Artifact ID or Name (Like) of the saved search for training

Example Concept stop words file:

Stop Words
and
a
can

Example Repeated content filters file, filters are identifier by name:

Content Filters
Credit Card Regex Filter
Email Address Filter

2.14.39. Run Relativity Saved Searches

This operation runs saved searches on the Relativity instance and returns the item count, using the following settings:

Run options: How the user will retrieve the saved searches to run:
- All saved searches in workspace: Runs all saved searches in the workspace
- All saved searches under search container: Runs all saved searches under the specified search container
- A single saved search: Runs the specified saved search
Saved search identifier: The Name, Artifact ID or Name (Like) of the saved search
Search container identifier: The Name, Artifact ID or Name (Like) of the search container

Once this operation has completed, the results will be stored as a json object in the parameter {relativity_run_saved_search_results_json}. The results will be in the following format:

{
    "results": [
        {
            "Name": "All Documents",
            "Query": "[Artifact ID] is_set",
            "Hits": 163,
            "Folder": "Admin Searches\\Tests"
        },
        {
            "Name": "Extracted Text Only",
            "Query": "[Extracted Text] is_set",
            "Hits": 113,
            "Folder": ""
        },
        {
            "Name": "Produced Documents",
            "Query": "[Control Number] is_set and [Document] is \"true\"",
            "Hits": 65,
            "Folder": "Admin Searches"
        }
    ]
}

The results of the saved searches that ran are stored in the results array, the properties inside the objects are:

Name: The name of the saved search
Artifact ID: The artifact ID of the saved search
Hits: The amount of documents returned when running the saved search

The parameter {relativity_run_saved_search_results_json} can be used in a script to add logic to the results of the saved search results, for example the following script will only print results that have at least one hit:

# Example script only showing saved searches with atleast one document
results_object = parameters.getJsonObject("{relativity_run_saved_search_results_json}")
results_array = results_object["results"]

# Only print a result if it has atleast one document
for result in results_array:
	if int(result["Hits"]) > 0:
		print("Folder: " + result["Folder"])
		print("Name: " + result["Name"])
		print("Query: " + result["Query"])
		print("Hits: " + str(result["Hits"]))

	# Separate results
	print("\n")

Reporting

This option generates a saved search report in an Excel format, based on a template file. The report uses the _REL_RUN_SAVED_SEARCH_ worksheet from the template.

See Processing Report for information on using a custom template.

2.14.40. Manage Relativity Analytic Index

This operation runs an index action on the specified analytic index, using the following settings:

Analytic index identifier: The Name, Artifact ID or Name (Regex) of the analytic index
Analytic index type: The analytic index type Conceptual or Classification
Existing analytic job action: The behavior when an existing analytic index job is found
- Skip running an analytic index job if another one is in progress for the same index
- Stop the currently running analytics job action, and start a new job
Index action: The action to perform on the analytics index
- Full population: Runs full index population
- Incremental population: Runs an incremental population
- Build index: Runs a full index build
- Retry errors: Retries errors that occurred during population
- Remove documents in error: Removes documents that errored
- Activate: Activates the index for querying
- Deactivate: Disables queries on the index
Wait for completion: Waits for the index job to complete

When using the index action Build index on an analytic index, the analytic index must be deactivated.

2.14.41. Run Relativity OCR Set

This operation runs the specified OCR set, using the following settings:

OCR set identifier: The Name, Artifact ID or Name (Like) of the OCR set.
Existing OCR set job action: Action to take when an existing OCR set job is currently running
- Stop: Stop the currently running OCR set job, and start a new job
- Skip: Skip running an OCR set job if another one is in progress for the same set
Wait for completion: Waits until the OCR set has finished running.

2.14.42. Export Relativity Metadata

This operation exports the specified metadata type, using the following settings:

Metadata type: The type of metadata to export, either View or Saved Search
Metadata identifier: The Name, Artifact ID or Name (Like) of the metadata.
Metadata type location: The location to export the metadata from, either from Current Workspace or the Admin Workspace
Custom condition: Allows the user to define a more granular query when exporting metadata, see https://platform.relativity.com/RelativityOne/Content/REST_API/Resource_Tasks/Querying_for_resources.htm#_Syntax_for_query_conditions for more information.

The output of the view can be exported to a file of the following type:

CSV: Use the extension .csv
PDF: Use the extension .pdf
XLSX: Use the extension .xlsx, the export defaults to this option if no other format is matched, this option will be used.

2.14.43. Create Relativity Production Set

This operation create a Production set, using the following settings:

Name: The name of the production set
Production data sources: The data sources for the production
- Data source name: The name of the data source
- Data source type: The type of data to produce, one of the following Images, Natives or Images and Natives
- Saved search identifier: The Name, Artifact ID or Name (Like) of the saved search
- Image placeholder: The action to perform when using image placeholders, either Never use image placeholder, Always use image placeholder, or When no image exists
- Placeholder identifier: The Name, Artifact ID or Name (Like) of the placeholder
- Markup set identifier: The Name, Artifact ID or Name (Like) of the markup set
- Burn redactions: Weather to burn redactions when producing image type productions
Create production set from template: Create a new production set using the settings of an existing production set
- Production set template identifier: The Name, Artifact ID or Name (Like) of the production set template
- Production set exists in another workspace: Enabling this option lets the user copy production set template settings from any workspace
- Workspace identifier: The Name, Artifact ID or Name (Like) of the template production set workspace
Create production set from settings Create a new production using the settings in the operation
- Numbering type: The document numbering type
- Prefix: The string shown before the bates number
- Suffix: (Optional) The string shown after the bates number
- Start Number: The initial starting bates number
- Number of numbering digits: The number representing the number of digits used for document-level numbering, range 1-7
- Branding font: The type of font used for branding
- Branding font size: The size of font to use for branding
- Scale branding font: Causes the branding font to scale
- Wrap branding font: Causes branding text to wrap when it overlaps with adjacent headers or footers

2.14.44. Run Relativity Production Set

This operation runs a Production set, using the following settings:

Production set identifier: The Name, Artifact ID or Name (Like) of the production set
Production set action: The action to perform on the production set
- Stage: Stages the production set to prepare for producing documents
- Run: Starting a job on the production set and produces staged documents
- Stage and Run: Stages the production set to prepare for producing documents and then immediately starts a job on the production set
Wait for completion: Waits for the production set to finish before moving onto the next operation

2.15. Semantic Search

These operations configure the Semantic engine and run similarity searches.

2.15.1. Configure Semantic Search Engine

This operation sets the configuration of the Semantic Search Engine. This operation must used in a workflow before data is loaded, or reloaded, for it to have an effect.

The following settings can be configured:

Use Semantic service: Select this option to use a Semantic service defined in the Third-Party Services. Alternatively, the embedded Semantic engine will be used.
Semantic service ID: The ID of the Semantic service to use.
Text embeddings model: The model to use for building the text semantic index, for example intfloat/multilingual-e5-small. If no value is specified, the item text will not be used to build the semantic index.
Image embeddings model: The model to use for building the image semantic index, for example openai/clip-vit-large-patch14. If no value is specified, the item image will not be used to build the semantic index.

2.15.2. Find Semantically Similar Items

This operation searches for items that are semantically similar to the scope, and tags them.

2.16. SQL

These operations transfer data between the Nuix case and a SQL server and allow for running arbitrary SQL commands.

2.16.1. SQL Command

This operation connects to a SQL database and runs SQL commands using the following options:

SQL platform: The SQL platform that commands will run on, either Microsoft SQL (using the JTDS or Native driver) and PostgreSQL.
SQL server name: The SQL host name, for example localhost.
Port: The SQL host port, for example 1433 for Microsoft SQL, 5432 for PostgreSQL.
Encryption: The requirement for encrypted JTDS connections:
- Disabled: Does not use encryption.
- Requested: Attempts to use an encrypted connection if supported by the server
- Required: Requires the use of an encrypted connection.
- Signed: Requires the use of an encrypted connection, signed with a certificate in the Java Trust Store.
Instance: The Microsoft SQL instance, for example SQLEXPRESS, or blank for the default instance.
Domain: The Windows domain for the Microsoft SQL authentication, or blank for Integrated Authentication.
Username: The username used to connect to the database, or blank for Integrated Authentication.
Password: The password used to connect to the database, or blank for Integrated Authentication.
Database: The SQL database to run SQL commands on.

When no database is specified using the SQL platform PostgreSQL, the operation will try to connect to the postgres database. Additionally, when creating a database with PostgreSQL, the database cannot be altered with the same query. To alter the database created, another SQL Command operation is required.

SQL query: The SQL query to run.

This operation can be used to create the database required to run other SQL operations.

Sample SQL query to create a database:

CREATE DATABASE automate;

2.16.2. Metadata To SQL

This operation exports the metadata of items matching the scope query to Microsoft SQL (using the JTDS or Native driver) or PostgreSQL.

When the table specified does not exist, this operation will try to determine each column type from the metadata fields in the selected metadata profile and create a SQL table with the detected column types.

When creating a SQL table, the type NVARCHAR(MAX) will be used in Microsoft SQL and the type TEXT will be used PostgreSQL when unable to determine the metadata field type.

2.16.3. Query From SQL

This operation queries data from a SQL database and adds custom metadata to the items in the scope, as well as exports the queried data to a CSV file.

The first table column name needs to be either GUID or DocID. The subsequent columns correspond to the metadata fields to be assigned.

Column aliases can be used in lieu of columns with the names GUID or DocID.

Sample Microsoft SQL query with column aliases:

SELECT [Header One] as 'GUID'
      ,[Header Two] as 'File Type'
      ,[Header Two] as 'File Path'
  FROM [TEST TABLE]

Sample PostgreSQL query with column aliases:

SELECT "Header One" as "GUID"
      ,"Header Two" as "File Type"
      ,"Header Two" as "File Path"
  FROM test_table

2.17. Veritone

These operations perform actions in Veritone.

2.17.1. Configure Veritone Connection

This operation sets the Veritone Third-Party Service that will be used to connect to Veritone. This operation is required for all other operations performing actions in Veritone.

The Veritone service ID must be specified as a parameter of the type Veritone Service.

2.17.2. Veritone Translate Items

This operation runs a Veritone translation job for the text of each item matching the scope query.

Item text is extracted from the Nuix items and sent to Veritone to be translated. The results are then saved to the Nuix item.

The operation has the following settings:

Scope query: The Nuix query to select the items to translate.
Source language provider: The method of determining an item’s language.
- Explicit: Explicitly provide the item language.
- Nuix Detected Language: Use the language detected by Nuix.
Fallback to explicit source language if item language couldn’t be determined: If Nuix could not detect the item’s language, fallback to the explicitly provided source language.
Explicit source language: The explicitly provided source language.
Target language: The translation target language.
Trim body text at: if checked, the size in characters after which the body text of items is trimmed before sending to Veritone for translation.
Save translation result as: How to save the translated text.
- As Item Text: Save the translated text as item text.
- As Custom Metadata: Save the translated text as custom metadata.
- As Child Item: Save the translated text as a child item.
Custom metadata name: The custom metadata name to use when saving the translated text as custom metadata.
Output folder: The folder location where the translated text files should be saved when adding them as a child item.
Text modifications
- Append: Append the translated text at the end of the existing document text.
- Overwrite: Replace the document text with the translated text.

The Tag failed items as options have the same behavior as in the Legal Export operation.

The Tag translated items as options have the same behavior as the Tag failed items as option, but for successfully translated items.

The Tag unsupported language items as options have the same behavior as the Tag failed items as option, but for items whose language did not have a supporting Veritone translation engine.

2.17.3. Veritone Transcribe Items

This operation runs a Veritone transcription job for each item matching the scope query.

The operation is designed to perform best when the Nuix items have binary data stored.

When running the Veritone Transcribe Items operation on Nuix items which do not have binary data stored, the process will take significantly longer. Before running this operation, either store item binaries during the Add Evidence operation, or use the Populate Binary Store operation to populate binaries of the items that need to be transcribed.

Items are extracted as native files from the Nuix items and sent to Veritone to be transcribed. The results are then saved to the Nuix item.

The operation has the following settings:

Scope query: The Nuix query to select the items to transcribe.
Target language: The transcription target language.
Save transcription result as: How to save the transcribed text.
- As Item Text: Save the transcribed text as item text.
- As Custom Metadata: Save the transcribed text as custom metadata.
Custom metadata name: The custom metadata name to use when saving the transcribed text as custom metadata.
Text modifications
- Append: Append the transcribed text at the end of the existing document text.
- Overwrite: Replace the document text with the transcribed text.

The Tag failed items as options have the same behavior as in the Legal Export operation.