Email threading administration

A typical email conversation includes messages, replies, forwards, and attachments, which are all added to the application as separate documents. The email threading feature analyzes the contents of these documents and organizes them together into threads.

Email threading allows you to reduce the number of documents to review by separating unique content from content that is duplicated across multiple email messages. Depending on the review strategy, reviewers can omit duplicate email messages and review only the unique messages in a thread.

An administrator must grant users access to email threading features before they can use them. These features allow users to submit documents for thread processing, search for threaded documents, and review threaded documents on the Documents page. An administrator or a group leader with the appropriate permissions can also configure assignments that contain threaded documents to open in threaded view.

To identify more duplicate documents, you can run a populate hashes job for the attachments that are to be included in thread analysis. Identifying more duplicates results in fewer pivots, which can reduce review time. For information about how to identify duplicate documents by running a populate hashes job, see Populate hashes.

For information about how reviewers can work with email threading, including an explanation of email threading concepts, see About email threading.

Enable email threading features

To enable email threading features, administrators must grant users access to the following threading analysis and threading search options:

Processing - Thread Analysis

Search - Threads

For information about how to enable email threading features, and a description of the places where threading features appear, see Work with security for features.

Enable email threading system fields

Administrators can make thread-related system fields available to users. The fields appear as search criteria that the user can select on the Search page and as columns to display in the List pane. The following system fields are available:

Thread - Document Type

Thread Analysis Status

Thread ID

Thread Order ID

For information about how to enable email threading system fields, see Work with system fields. For more information about these system fields, see Email threading system fields.

Enable threaded view for assignments

By default, when users open an assignment that contains threaded documents, the List pane on the Documents page opens in standard view. By breaking an assignment on threads, administrators and group leaders with permissions can set these assignments to open in threaded view instead. Threaded view speeds up review by organizing the documents into threads.

To open an assignment in threaded view, users must also select the Threading option in their search preferences. For information about how to set search preferences, see Set search preferences.

Breaking an assignment on threads also keeps all documents in a thread together so that a thread does not split across multiple assignments.
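The effect of keeping threads together can be pictured with a minimal sketch. It is not the application's algorithm: the function name break_on_threads, the target_size parameter, and the rule that an oversized thread still stays in one assignment are all assumptions made for illustration.

```python
from collections import defaultdict


def break_on_threads(docs: list[tuple[str, str]], target_size: int) -> list[list[str]]:
    """Pack (document ID, thread ID) pairs into assignments without splitting a thread.

    Hypothetical helper: target_size is a soft limit, so a thread larger
    than target_size still lands in a single assignment.
    """
    # Group document IDs by thread ID, preserving first-seen thread order.
    threads: dict[str, list[str]] = defaultdict(list)
    for doc_id, thread_id in docs:
        threads[thread_id].append(doc_id)

    assignments: list[list[str]] = []
    current: list[str] = []
    for members in threads.values():
        # Start a new assignment if adding this whole thread would exceed the target.
        if current and len(current) + len(members) > target_size:
            assignments.append(current)
            current = []
        current.extend(members)
    if current:
        assignments.append(current)
    return assignments


docs = [("d1", "t1"), ("d2", "t1"), ("d3", "t2"), ("d4", "t2"), ("d5", "t3")]
print(break_on_threads(docs, target_size=3))
```

With a target of three documents, thread t2 does not fit alongside t1, so it moves whole into the next assignment rather than being split.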

Note: You must apply the threaded view setting for each phase.

To enable threaded view for an assignment:

Open or create a workflow and phase. For information about how to create a workflow and phase, see Create workflows and phases.

On the Phases page of the workflow, click a phase name to open its properties.

Under Break assignments on, select Threads.

For more information about phase properties, see Configure phase properties.

Click Save.

Thread analysis processing overview

This section discusses how the application forms threads during thread analysis.

About thread analysis

Thread analysis identifies email messages that stem from the same conversation. Email messages that contain the entire bodies of other email messages are grouped together into threads.

An email thread can include the following types of documents:

Pivot: A document that contains any unique content not contained in any other document in the thread. Examples of unique content include the body text, attachments, and recipients. Documents that cannot be thread analyzed are also marked as pivots.

A thread can contain more than one pivot.

Duplicate: Any document in a thread whose content, including attachments, is contained in other documents in the thread.

Duplicate previous pivot: A document that used to be a pivot, but is now a duplicate. This can happen when a new document is submitted for analysis after the thread that it belongs to has been built. If the new document is identified as a pivot, any previously identified pivots that are wholly contained within the new pivot become duplicate previous pivots.

Note: Email messages that are attached to other email messages are treated as attachments and are not evaluated for threading.

Depending on the review strategy, reviewers can omit duplicate email messages and review only the unique messages in a thread, resulting in cost savings and greater review efficiency. Because duplicate previous pivot documents may already have been reviewed or produced by the time that their thread document type changes from pivot to duplicate previous pivot, you may also be able to omit duplicate previous pivot documents from further review.

Thread analysis processing

Thread analysis finds email messages that are part of the same conversation thread. Thread analysis also identifies thread duplicates by finding email messages that are between the same parties and that contain the same attachments.

For an email message to be identified as a duplicate, all of the following conditions must be true:

The email message has the same normalized subject line as other email messages submitted for thread analysis. Normalization standardizes subject lines by ignoring prefixes such as FW and RE. These prefixes are usually added by the email application, rather than by the person who sent the email message. For a list of prefixes that are ignored, see Prefixes ignored during thread analysis.

All of the sender and recipient values in the email message are contained in a pivot in the thread.

The application does not consider whether a person is the sender in one email message and the recipient in another email message.

The sender and recipient values are parsed into a display name and an email address, where possible, so that a match on either is sufficient. This accounts for formatting differences between email messages that would otherwise prevent the values from matching.

The content in the normalized body of one email message is wholly contained at the bottom of another email message. Normalization of an email body ignores line feeds, extra spaces, and punctuation, which are frequently used by email applications to differentiate text from prior messages.

All files attached to the email message are attached to a pivot in the thread.

If all of the requirements are met, the thread analysis process designates the email message as a duplicate and the corresponding inclusive email message as a pivot.
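The four conditions above can be sketched in code. This is an illustrative approximation, not the application's implementation: the Email class, the helper names, and the choice to represent attachments as a set of identifiers are assumptions, and the subject is expected to be normalized already.

```python
import string
from dataclasses import dataclass, field
from email.utils import parseaddr

_PUNCTUATION = str.maketrans("", "", string.punctuation)


def normalize_body(body: str) -> str:
    """Drop punctuation and collapse whitespace, including line feeds."""
    return " ".join(body.translate(_PUNCTUATION).split())


def participant_keys(raw: str) -> frozenset[str]:
    """Split a sender or recipient value into display name and address keys."""
    name, address = parseaddr(raw)
    return frozenset(part.casefold() for part in (name, address) if part)


@dataclass
class Email:  # hypothetical record type for this sketch
    normalized_subject: str
    participants: list[str]  # sender and recipients; roles are ignored
    body: str
    attachments: set[str] = field(default_factory=set)  # e.g. attachment hash values


def participants_covered(candidate: Email, pivot: Email) -> bool:
    """Every candidate participant matches a pivot participant by name or address."""
    pivot_keys = [participant_keys(p) for p in pivot.participants]
    return all(
        any(participant_keys(c) & keys for keys in pivot_keys)
        for c in candidate.participants
    )


def is_thread_duplicate(candidate: Email, pivot: Email) -> bool:
    return (
        candidate.normalized_subject.casefold() == pivot.normalized_subject.casefold()
        and participants_covered(candidate, pivot)
        # The candidate body must sit wholly at the bottom of the pivot body.
        and normalize_body(pivot.body).endswith(normalize_body(candidate.body))
        and candidate.attachments <= pivot.attachments
    )


original = Email("Budget review", ["Ann Lee <ann@example.com>", "bob@example.com"],
                 "Can we meet on Friday?", {"h1"})
reply = Email("Budget review", ["ann@example.com", "Bob <bob@example.com>"],
              "Yes, Friday works.\n\n> Can we meet\n> on Friday?", {"h1"})
print(is_thread_duplicate(original, reply))
```

In this example the original message is a duplicate of the reply, because the reply quotes it in full at the bottom and covers the same participants and attachments. The reverse check fails, so the reply would remain a pivot.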

If a document is a duplicate based on its hash value, the document also meets the criteria to be considered a threading duplicate. The application designates one copy of the hash duplicate document as the pivot, and all other copies as duplicates. For more information about how exact duplicate documents are identified using hash values, see Populate hashes.
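As a rough sketch of the hash-based rule, the following groups documents by a precomputed hash value and labels each group of copies. The function name is hypothetical, and picking the first copy encountered as the pivot is an assumption; the text above does not say which copy the application chooses.

```python
from collections import defaultdict


def label_hash_duplicates(docs: list[tuple[str, str]]) -> dict[str, str]:
    """Label (document ID, hash value) pairs that share a hash with another document.

    Documents with a unique hash are left out, because their thread
    document type depends on the other threading criteria.
    """
    by_hash: dict[str, list[str]] = defaultdict(list)
    for doc_id, hash_value in docs:
        by_hash[hash_value].append(doc_id)

    labels: dict[str, str] = {}
    for copies in by_hash.values():
        if len(copies) < 2:
            continue
        # Assumption: the first copy seen becomes the pivot.
        labels[copies[0]] = "Pivot"
        for doc_id in copies[1:]:
            labels[doc_id] = "Duplicate"
    return labels


print(label_hash_duplicates([("d1", "abc"), ("d2", "abc"), ("d3", "xyz"), ("d4", "abc")]))
```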

Ongoing thread analysis

Thread analysis is an ongoing process, because new email messages can be added to a case at various times throughout the review process. When new email messages are processed for thread analysis, the structure of a thread can change.

For example, when new email messages are loaded into a case and processed for thread analysis, the new documents are compared to all previously analyzed documents in the case. If existing documents are in the same thread as the newly loaded documents, the thread document type of the existing documents may change.

The following example illustrates this scenario:

Email 1 is a pivot in a thread.

Email 2 is subsequently loaded into the case and submitted for thread analysis. Email 2 has the same normalized subject, senders and recipients, and attachments as email 1, and it contains the entire body of email 1, plus new unique content.

After thread analysis processing, the thread document types are updated as follows:

Email 2 is identified as a pivot and its thread document type becomes pivot.

Email 1 is no longer a pivot and its thread document type changes to duplicate previous pivot.
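The update in this example can be modeled with a small sketch. The function add_pivot, the contains callback, and the label strings are illustrative assumptions; the actual values stored in the Thread - Document Type field may differ.

```python
from typing import Callable

PIVOT = "Pivot"
DUPLICATE_PREVIOUS_PIVOT = "Duplicate previous pivot"


def add_pivot(
    thread: dict[str, str],
    new_id: str,
    contains: Callable[[str, str], bool],
) -> None:
    """Add a new pivot to a thread and demote pivots that it wholly contains.

    thread maps document IDs to thread document types. contains(new, old)
    is a caller-supplied check that the new document wholly contains the old one.
    """
    for doc_id, doc_type in thread.items():
        if doc_type == PIVOT and contains(new_id, doc_id):
            thread[doc_id] = DUPLICATE_PREVIOUS_PIVOT
    thread[new_id] = PIVOT


thread = {"email-1": PIVOT}
# Email 2 contains all of email 1 plus new content.
add_pivot(thread, "email-2", contains=lambda new, old: True)
print(thread)
```

Passing a callback keeps the sketch independent of how containment is decided; any check based on the duplicate conditions described earlier could be supplied.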

Prefixes ignored during thread analysis

The application ignores the following prefixes in email subject lines during thread analysis:

RE

FW

FWD

Accepted

Action Requested

Canceled

COMPLETE

Declined

NOTICE TO

Out of Office AutoReply

Recall

REMINDER

Task Accepted

Task Declined

Task Request
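Subject normalization with this prefix list might look like the following minimal sketch. Several details are assumptions rather than documented behavior: matching is case-insensitive, a prefix must be followed by a colon, whitespace, or the end of the subject, and stacked prefixes are stripped repeatedly.

```python
import re

# Prefixes from the list above.
IGNORED_PREFIXES = [
    "RE", "FW", "FWD", "Accepted", "Action Requested", "Canceled",
    "COMPLETE", "Declined", "NOTICE TO", "Out of Office AutoReply",
    "Recall", "REMINDER", "Task Accepted", "Task Declined", "Task Request",
]

# Try longer prefixes first so that "FWD" is preferred over "FW".
_alternatives = "|".join(
    re.escape(prefix) for prefix in sorted(IGNORED_PREFIXES, key=len, reverse=True)
)
_PREFIX_RE = re.compile(rf"^\s*(?:{_alternatives})(?:\s*:\s*|\s+|$)", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip ignored prefixes from the start of a subject until none remain."""
    previous = None
    current = subject.strip()
    while current != previous:
        previous = current
        current = _PREFIX_RE.sub("", current, count=1).strip()
    return current


for raw in ["RE: FW: Budget review", "Fwd: re: Budget review", "Recallable items"]:
    print(repr(raw), "->", repr(normalize_subject(raw)))
```

The first two subjects normalize to the same value, so the messages could be compared as part of one conversation. The third is left unchanged because Recall is not followed by a separator.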