Appendix A: Load file formats

This appendix details load file formats for the following:

  • Concordance

  • Summation

Concordance load files

Nuix Workstation creates Concordance load files as well as the following items:

The load file: loadfile.dat.

An Opticon load file: loadfile.opt, which is always included in the export. It is always an empty (zero size) file unless you select to export PDFs or TIFFs.

The summary report with information about the production and export run: summary-report.txt and summary-report.xml.

The XML report can be combined with a custom cascading style sheet to produce a more user-friendly view.

A text file containing the top level MD5 digests: top-level-MD5-digests.txt.

A custom folder for each type of exported data: Native, TIFF, PDF, and Text.

You define folders under File Naming on the Numbering and Files tab on the Legal Export dialog.

The Concordance load file is a delimited file. You can use this format to facilitate the transfer of information from Nuix Workstation to other systems.

By default, the Concordance load file is created using ASCII encoding (Concordance now supports UTF-8 encoding). To create a Concordance load file with UTF-8 encoding, you need to start Nuix Workstation using a command-line switch.

Concordance load file format - version 2.16

The following information describes the default Concordance load file format created by Nuix 2.16:

| Column Name | Source | Concordance DB | Represents |
|---|---|---|---|
| DOCID | Export Metadata | Text, 50, Image, Key | The DOCID auto-generated during the export process, whose format is controlled in the Legal Export dialog |
| PARENT_DOCID | Export Specific Metadata | Text, 50 | The DOCID to track and maintain the parent-child relationship of documents |
| BEGINBATES | Export Specific Metadata | Text, 50 | The beginning DOCID for a multi-page document; relevant when creating TIFFs or PDFs |
| ENDBATES | Export Specific Metadata | Text, 50 | The ending DOCID for a multi-page document; relevant when creating TIFFs or PDFs |
| BEGINGROUP | Export Specific Metadata | Text, 50 | The beginning DOCID for a family of documents |
| ENDGROUP | Export Specific Metadata | Text, 50 | The ending DOCID for a family of documents |
| PAGECOUNT | Export Specific Metadata | Numeric, 5 | The number of pages in an imaged document |
| ATTACHMENTLIST | Item Metadata | Paragraph, Indexed | The list of attachment names |
| FILENAME | Item Metadata | Paragraph, Indexed | Specific Filename; maps to the Nuix Name field |
| FILEEXTENSION | Item Metadata | Paragraph, Indexed | The file extension for the specific item |
| CREATIONDATE | Item Metadata | Date YYYYMMDD | The Creation Date |
| MODIFIEDDATE | Item Metadata | Date YYYYMMDD | The Last Modified Date |
| FILESIZE | Item Metadata | Numeric, 20 | The file size in bytes |
| SENTONDATE | Item Metadata | Date YYYYMMDD | The date an email message is sent |
| SENTONTIME | Item Metadata | Text, 20 | The time an email message is sent |
| RECEIVEDDATE | Item Metadata | Date YYYYMMDD | The date an email message is received |
| RECEIVEDTIME | Item Metadata | Text, 20 | The time an email message is received |
| AUTHORNAME | Item Metadata | Paragraph, Indexed | The name of the author of a document |
| AUTHOREMAIL | Item Metadata | Paragraph, Indexed | The Nuix Communications FROM field |
| TO | Item Metadata | Paragraph, Indexed | The Nuix Communications TO field |
| CC | Item Metadata | Paragraph, Indexed | The Nuix Communications CC field |
| BCC | Item Metadata | Paragraph, Indexed | The Nuix Communications BCC field |
| SUBJECT | Item Metadata | Paragraph, Indexed | The subject of an email message |
| TITLE | Item Metadata | Paragraph, Indexed | The subject of an email or the name of a file |
| ORIGINALPATH | Item Metadata | Paragraph, Indexed | The full path to source evidence |
| MD5HASH | Item Metadata | Paragraph, Indexed | The MD5 Hash generated by Nuix |
| ENTRYID | Item Metadata | Paragraph, Indexed | The Nuix GUID |
| DOCUMENTTYPE | Item Metadata | Paragraph, Indexed | The document type |
| ITEMPATH | Item Metadata | Paragraph, Indexed | The location path of the item |
| TIMEZONE | Item Metadata | Paragraph, Indexed | The time zone of the item |
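If you process the load file in another system, the date and numeric columns described above usually need converting from text. The following Python sketch shows one way to do that. It assumes each record has already been read into a dictionary of strings keyed by column name, and that an unpopulated field arrives as an empty string; the function names parse_yyyymmdd and convert_record are illustrative only and are not part of Nuix Workstation.

```python
from datetime import date, datetime

# Columns described in the table as "Date YYYYMMDD" and "Numeric".
DATE_COLUMNS = ("CREATIONDATE", "MODIFIEDDATE", "SENTONDATE", "RECEIVEDDATE")
NUMERIC_COLUMNS = ("PAGECOUNT", "FILESIZE")


def parse_yyyymmdd(value):
    """Return a date for a YYYYMMDD string, or None when the field is empty."""
    value = value.strip()
    if not value:
        return None
    return datetime.strptime(value, "%Y%m%d").date()


def convert_record(record):
    """Return a copy of the record with date and numeric columns converted.

    Assumption: `record` maps column names to strings, and an empty
    string means the field was not populated.
    """
    converted = dict(record)
    for name in DATE_COLUMNS:
        if name in converted:
            converted[name] = parse_yyyymmdd(converted[name])
    for name in NUMERIC_COLUMNS:
        if name in converted:
            text = converted[name].strip()
            converted[name] = int(text) if text else None
    return converted
```

Columns that are not listed as dates or numbers, such as SENTONTIME, are left as text.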

 

User-defined metadata fields (which, by default, are included in the Concordance load file):

(If you add other user-selected metadata fields, you must make the corresponding changes to the Concordance DB.)

| Column Name | Source | Concordance DB |
|---|---|---|
| PATHNAME | User-selected | Paragraph, Indexed |
| GUID | User-selected | Paragraph, Indexed |
| FILETYPE | User-selected | Paragraph, Indexed |

 

Concordance Load File (loadfile.dat) Delimiters (Concordance Default):

| Type | Delimiter |
|---|---|
| Comma | (020) - ASCII (decimal) |
| Quote | (254) - ASCII (decimal) / (00FE) - Unicode (Hex) |
| Newline | (174) - ASCII (decimal) / (00AE) - Unicode (Hex) |

Concordance Import Wizard - Format
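To illustrate how the default delimiters fit together, the following Python sketch splits load file lines into records. It is an illustration under stated assumptions rather than a description of how Nuix Workstation or Concordance reads the file: it assumes the text has already been decoded, that the first line holds the column names, and that every field is wrapped in the Quote character. The function names parse_dat_line and read_records are hypothetical.

```python
# Default Concordance delimiters, given as decimal character codes.
COMMA = chr(20)     # separates fields
QUOTE = chr(254)    # wraps each field value
NEWLINE = chr(174)  # stands in for a line break inside a field


def parse_dat_line(line):
    """Split one load file line into a list of field values."""
    line = line.rstrip("\r\n")
    # Drop the opening and closing Quote characters of the line.
    if line.startswith(QUOTE):
        line = line[1:]
    if line.endswith(QUOTE):
        line = line[:-1]
    # Adjacent fields are separated by Quote + Comma + Quote.
    fields = line.split(QUOTE + COMMA + QUOTE)
    # Restore real line breaks inside field values.
    return [field.replace(NEWLINE, "\n") for field in fields]


def read_records(lines):
    """Return one dict per data line, keyed by the header line's column names."""
    rows = [parse_dat_line(line) for line in lines if line.strip()]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]
```

You can pass read_records any iterable of text lines, for example a file object opened with the encoding that matches your export.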

 

Concordance load file format

The default Concordance load file always contains the non-configurable fields listed in the following table.

Using a Metadata Profile, you can add custom metadata, which appears as additional columns after the ITEMPATH, TEXTPATH, TIFFPATH and PDFPATH fields.

| Column Name | Source | Concordance DB | Represents |
|---|---|---|---|
| DOCID | Export Metadata | Text, 50, Image, Key | The DOCID auto-generated during the export process, whose format is controlled in the Legal Export dialog |
| PARENT_DOCID | Export Specific Metadata | Text, 50 | The DOCID to track and maintain the parent-child relationship of documents |
| BEGINBATES | Export Specific Metadata | Text, 50 | The beginning DOCID for a multi-page document. Relevant when creating TIFFs or PDFs |
| ENDBATES | Export Specific Metadata | Text, 50 | The ending DOCID for a multi-page document. Relevant when creating TIFFs or PDFs |
| BEGINGROUP | Export Specific Metadata | Text, 50 | The beginning DOCID for a family of documents |
| ENDGROUP | Export Specific Metadata | Text, 50 | The ending DOCID for a family of documents |
| PAGECOUNT | Export Specific Metadata | Numeric, 5 | The number of pages in an imaged document |
| ITEMPATH | Item Metadata | Paragraph, Indexed | The relative path to the native file |
| TEXTPATH | Item Metadata | Paragraph, Indexed | The relative path to the text file |
| PDFPATH | Item Metadata | Paragraph, Indexed | The relative path to the PDF file |
| TIFFPATH | Item Metadata | Paragraph, Indexed | The relative path to the first TIFF page |
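The BEGINGROUP and ENDGROUP fields make it possible to reassemble document families after export. The following Python sketch groups records by those two fields. It assumes that every member of a family carries the same BEGINGROUP and ENDGROUP values and that records arrive in load file order; neither point is stated in this appendix, so check it against your own export. The function name group_families is illustrative.

```python
from collections import defaultdict


def group_families(records):
    """Map each (BEGINGROUP, ENDGROUP) pair to the DOCIDs that share it.

    Assumption: `records` is an iterable of dicts keyed by column name,
    and all documents in one family have identical group values.
    """
    families = defaultdict(list)
    for record in records:
        key = (record["BEGINGROUP"], record["ENDGROUP"])
        families[key].append(record["DOCID"])
    return dict(families)
```

A document with no family members would then appear as a group whose only entry is its own DOCID.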

 

Concordance Load File (loadfile.dat) Delimiters (Concordance Default):

| Type | Delimiter |
|---|---|
| Comma | (020) - ASCII (decimal) |
| Quote | (254) - ASCII (decimal) / (00FE) - Unicode (Hex) |
| Newline | (174) - ASCII (decimal) / (00AE) - Unicode (Hex) |

Concordance Import Wizard - Format

 

 

Note: To have the contents of any of the preceding columns appear under the Name column in the Results pane, rename the column to 'NAME', 'SUBJECT' or 'FILENAME'.

Summation load files

Nuix Workstation creates Summation load files as well as the following items:

A Class I DII load file

Summary reports detailing information about the production/export run in two formats: summary-report.txt and summary-report.xml.

The XML report can be combined with a custom cascading style sheet to produce a more user-friendly view.

A top-level-MD5-digests.txt text file containing the top-level MD5 digests.

A folder for each type of exported data: Native, TIFF, PDF, and Text.

You define folders under File Naming on the Numbering and Files tab on the Legal Export dialog.

Summation load file format

The Summation Legal Export provides a single DII, containing metadata, file and full-text references. The following table describes information in the Summation load file:

| DII Token | Source | Represents |
|---|---|---|
| @DOCID | Export Specific Metadata | The DOCID auto-generated during the export process, whose format is controlled as part of the Legal Export dialog |
| @PARENTID | Export Specific Metadata | The DOCID used to track and maintain the parent-child relationship of documents |
| @FULLTEXT DOC | Standard DII Token | One full-text file exists for each database record |
| @O | Standard DII Token | |
| @T | Standard DII Token | |
| @I | Standard DII Token | The image location |
| @L | Standard DII Token | The long name for the item; includes Nuix specific item metadata: GUID, PathName, Name |
| @FROM | Nuix Defined Metadata | The Nuix Communications FROM field |
| @TO | Nuix Defined Metadata | The Nuix Communications TO field |
| @CC | Nuix Defined Metadata | The Nuix Communications CC field |
| @BCC | Nuix Defined Metadata | The Nuix Communications BCC field |
| @SUBJECT | Nuix Defined Metadata | The email subject or Nuix Name |
| @DATESENT | Nuix Defined Metadata | The Sent Date for email - Nuix Communications Date |
| @TIMESENT | Nuix Defined Metadata | The Sent Time for email - Nuix Communications Date |
| @HEADER / @HEADER-END | Item Properties | The email header content including all extracted metadata |
| @EMAIL-BODY / @EMAIL-END | Item Content | The email body content |
| @MULTILINE | Additional Metadata | All additional metadata referenced from the Metadata Profile used for the export |
| @ATTACH | Standard DII Token | Any email attachments |
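To show how the tokens in the table might be read back, the following Python sketch collects DII lines into one dictionary per record. It rests on several assumptions that this appendix does not confirm: that each record begins with an @DOCID line, that a token and its value are separated by a space, and that the @HEADER and @EMAIL-BODY tokens open a block that runs until the matching end token. The function name parse_dii and its record_start parameter are hypothetical, and the sketch does not interpret @MULTILINE content.

```python
# Tokens that open a multi-line block, mapped to the token that closes it.
BLOCK_TOKENS = {"@HEADER": "@HEADER-END", "@EMAIL-BODY": "@EMAIL-END"}


def parse_dii(lines, record_start="@DOCID"):
    """Return a list of dicts, one per record, keyed by DII token.

    Assumption: a new record starts at each `record_start` line.
    """
    records = []
    current = None       # record being filled
    block_key = None     # token of the block being collected, if any
    block_end = None     # token that closes the current block
    block_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if block_end is not None:
            # Inside a block: keep lines verbatim until the end token.
            if line.strip() == block_end:
                current[block_key] = "\n".join(block_lines)
                block_key, block_end, block_lines = None, None, []
            else:
                block_lines.append(line)
            continue
        if not line.startswith("@"):
            continue  # lines without a token are ignored in this sketch
        token, _, value = line.partition(" ")
        if token == record_start:
            current = {}
            records.append(current)
        if current is None:
            continue  # token seen before the first record started
        if token in BLOCK_TOKENS:
            block_key, block_end = token, BLOCK_TOKENS[token]
        else:
            current[token] = value.strip()
    return records
```

If your export starts records with a different token, pass it as record_start.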