Guide to Extracting Named Entities

This introduction covers:

What are named entities?

What is the benefit of using named entities?

What are the default categories of named entities?

How do you access the Custom Named Entities feature?

What are Named Entities?

Nuix Workstation extracts intelligence from processed data in the form of named entities. Named Entities are values that match a specific data pattern. Nuix Workstation recognizes these values by running pre-defined regular expression pattern searches, then extracts and indexes them for analysis.

Named Entities are useful for analysis because they bring information such as companies, personal IDs, names, or phone numbers to the surface of the data set. Named Entity identification and extraction take place during processing, and entities are extracted from item content (text), properties, or both.
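The following is a minimal sketch of pattern-based extraction from item text and properties, written in Python for illustration. The patterns, the extract_entities function, and the sample item are assumptions made for this example; they are not the regular expressions or the API that Nuix Workstation uses.

```python
import re

# Illustrative patterns only. These are NOT the default expressions
# shipped with Nuix Workstation.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ip-address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def extract_entities(text, properties=None, patterns=PATTERNS):
    """Return {category: sorted unique matches} from text and property values."""
    # Search the item content and, if supplied, every property value.
    sources = [text]
    if properties:
        sources.extend(str(value) for value in properties.values())

    found = {}
    for category, pattern in patterns.items():
        matches = set()
        for source in sources:
            matches.update(pattern.findall(source))
        if matches:
            found[category] = sorted(matches)
    return found


# Hypothetical item: content text plus one metadata property.
item_text = "Contact jane.doe@example.com from host 10.0.0.5."
item_properties = {"From": "bob@example.org"}
result = extract_entities(item_text, item_properties)
print(result)
```

Passing only the text, only the properties, or both mirrors the three extraction sources described above.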

Nuix Workstation has an enhanced entity model with additional flexibility and the ability to extract entity information from a set of available default regular expressions.

What is the Benefit of Using Named Entities?

Nuix Workstation provides a set of standard default named entities. You access these through the Default Named Entities Profile in the Global Options window. The entities are grouped under the most common categories of information you can search for. See What are the Default Categories of Named Entities? for the list of categories.

Nuix Workstation also allows you to do the following:

Use the default Named Entity Profile or an existing Custom Named Entity Profile as the basis for a new Custom Named Entity Profile. In the new profile you can add, amend, or delete named entities in any standard named entity category to build a new or enhanced set of named entity lists. See Use custom named entities and Use Named Entity Profiles for details.

Use switches to disable all standard default Named Entities, or to enable only certain standard default Named Entities, when you ingest data so that you extract only the named entities you want in a result set. See Disable or enable certain default Named Entities in Use Named Entity Profiles for details.

Search on standard and custom named entities to examine those entities more closely in one or more selected items in a result set. See the Guide to Searching in Nuix Products for more details.

Using named entities in this way allows you to target the data you extract for further examination more efficiently and more cost-effectively.
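The idea of disabling all default entities and then enabling only the ones you need can be sketched as follows. This is a conceptual Python model only: the EntityProfile class, its methods, and the placeholder pattern strings are hypothetical and do not correspond to any Nuix Workstation API or file format. Only the category names come from this guide.

```python
from dataclasses import dataclass, field

# Category names taken from the default list in this guide.
DEFAULT_CATEGORIES = (
    "Company", "Credit Card", "Personal ID", "Email", "Money",
    "Person", "Phone Number", "Country", "IP Address", "URL",
)


@dataclass
class EntityProfile:
    """Hypothetical model of a named entity profile (not a Nuix API)."""
    name: str
    enabled: set = field(default_factory=lambda: set(DEFAULT_CATEGORIES))

    def disable_all(self):
        # Equivalent in spirit to switching off every default entity.
        self.enabled.clear()
        return self

    def enable(self, *categories):
        # Reject names that are not in the default category list.
        unknown = set(categories) - set(DEFAULT_CATEGORIES)
        if unknown:
            raise ValueError(f"Unknown categories: {sorted(unknown)}")
        self.enabled.update(categories)
        return self

    def select(self, all_patterns):
        # Keep only the patterns whose category is enabled, so that
        # disabled categories are never run at ingestion time.
        return {c: p for c, p in all_patterns.items() if c in self.enabled}


# Placeholder pattern strings; real expressions are out of scope here.
all_patterns = {"Email": "<email regex>", "URL": "<url regex>",
                "Phone Number": "<phone regex>"}

profile = EntityProfile("contact-details").disable_all().enable("Email", "Phone Number")
active = profile.select(all_patterns)
print(sorted(active))
```

Selecting the patterns before extraction, rather than filtering results afterwards, reflects the point above that only the wanted entities are extracted at ingestion.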

Note: The Nuix NLP functionality in Nuix Workstation allows fast, sophisticated searches for Named Entities in documents based on particular Compound Lexemes, for example to find only Social Security Numbers or phone numbers. See the Nuix Workstation Guide to Analyzing with Nuix NLP for further information.

What are the Default Categories of Named Entities?

Accessible from the Global Options window, the Default Named Entities Profile contains standard named entities under the following categories:

Company

Credit Card*

Personal ID

Email

Money

Person

Phone Number

Country

IP Address

URL

* See Appendix: Supported Credit Cards for a list of these cards and their number formats.

How Do You Access the Custom Named Entities Feature?

You can access the Custom Named Entities feature in any one of the following three ways:

File menu > Global Options > Custom Named Entities option

File menu > Global Options > Named Entity Profiles option

Data Processing Settings > Use custom named entity profile option

The Custom Named Entities feature is available to users with a Case_Creation license who are running Nuix Workstation v8.6 or later. For earlier versions, contact Nuix support at https://nuix.service-now.com/support for license recommendations.

Note: The Nuix Data Finder (NDF) plugin requires that you enable the SENSITIVE_DATA_FINDER plus feature.