Appendix A: Load file formats
This appendix details load file formats for the following:
- Concordance
- Summation
Concordance load files
Nuix Workstation creates Concordance load files as well as the following items:
- The load file: loadfile.dat.
- An Opticon load file: loadfile.opt, which is always included in the export. It is an empty (zero-size) file unless you choose to export PDFs or TIFFs.
- The summary report with information about the production and export run: summary-report.txt and summary-report.xml. Combined with a custom cascading style sheet, the XML version provides a more user-friendly report.
- A text file containing the top-level MD5 digests: top-level-MD5-digests.txt.
- A custom folder for each type of exported data: Native, TIFF, PDF, and Text. You define folders under File Naming on the Numbering and Files tab of the Legal Export dialog.
The Concordance load file is a delimited file. You can use this format to facilitate the transfer of information from Nuix Workstation to other systems.
By default, the Concordance load file is created using ASCII encoding (Concordance now supports UTF-8 encoding). To create a Concordance load file with UTF-8 encoding, you need to start Nuix Workstation using a command-line switch.
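Because a load file can arrive in either encoding, a downstream script may need to work out which one it has. The following is a minimal sketch, not part of Nuix Workstation. It assumes the non-UTF-8 output is a single-byte encoding in which the quote delimiter is the byte 254 (read here as Latin-1), and the function name decode_load_file is illustrative.

```python
from typing import Tuple


def decode_load_file(raw: bytes) -> Tuple[str, str]:
    """Decode load file bytes, returning (text, encoding_used).

    Assumption: a file that is not valid UTF-8 is a single-byte file,
    read here as Latin-1 so that byte 254 maps to U+00FE.
    """
    try:
        # "utf-8-sig" also drops a leading byte order mark if one is present.
        return raw.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # A lone byte 0xFE is never valid UTF-8, so single-byte files land here.
        return raw.decode("latin-1"), "latin-1"
```

Either way, the decoded text contains the quote delimiter as the single character U+00FE, so later parsing does not need to know which encoding was used.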
Concordance load file format - version 2.16
The following information describes the default Concordance load file format created by Nuix 2.16:
Column Name | Source | Concordance DB | Represents
DOCID | Export Metadata | Text, 50, Image, Key | The DOCID auto-generated during the export process, whose format is controlled in the Legal Export dialog
PARENT_DOCID | Export Specific Metadata | Text, 50 | The DOCID to track and maintain the parent-child relationship of documents
BEGINBATES | Export Specific Metadata | Text, 50 | The beginning DOCID for a multi-page document; relevant when creating TIFFs or PDFs
ENDBATES | Export Specific Metadata | Text, 50 | The ending DOCID for a multi-page document; relevant when creating TIFFs or PDFs
BEGINGROUP | Export Specific Metadata | Text, 50 | The beginning DOCID for a family of documents
ENDGROUP | Export Specific Metadata | Text, 50 | The ending DOCID for a family of documents
PAGECOUNT | Export Specific Metadata | Numeric, 5 | The number of pages in an imaged document
ATTACHMENTLIST | Item Metadata | Paragraph, Indexed | The list of attachment names
FILENAME | Item Metadata | Paragraph, Indexed | The specific filename; maps to the Nuix Name field
FILEEXTENSION | Item Metadata | Paragraph, Indexed | The file extension for the specific item
CREATIONDATE | Item Metadata | Date YYYYMMDD | The Creation Date
MODIFIEDDATE | Item Metadata | Date YYYYMMDD | The Last Modified Date
FILESIZE | Item Metadata | Numeric, 20 | The file size in bytes
SENTONDATE | Item Metadata | Date YYYYMMDD | The date an email message is sent
SENTONTIME | Item Metadata | Text, 20 | The time an email message is sent
RECEIVEDDATE | Item Metadata | Date YYYYMMDD | The date an email message is received
RECEIVEDTIME | Item Metadata | Text, 20 | The time an email message is received
AUTHORNAME | Item Metadata | Paragraph, Indexed | The name of the author of a document
AUTHOREMAIL | Item Metadata | Paragraph, Indexed | The Nuix Communications FROM field
TO | Item Metadata | Paragraph, Indexed | The Nuix Communications TO field
CC | Item Metadata | Paragraph, Indexed | The Nuix Communications CC field
BCC | Item Metadata | Paragraph, Indexed | The Nuix Communications BCC field
SUBJECT | Item Metadata | Paragraph, Indexed | The subject of an email message
TITLE | Item Metadata | Paragraph, Indexed | The subject of an email or the name of a file
ORIGINALPATH | Item Metadata | Paragraph, Indexed | The full path to source evidence
MD5HASH | Item Metadata | Paragraph, Indexed | The MD5 Hash generated by Nuix
ENTRYID | Item Metadata | Paragraph, Indexed | The Nuix GUID
DOCUMENTTYPE | Item Metadata | Paragraph, Indexed | The document type
ITEMPATH | Item Metadata | Paragraph, Indexed | The location path of the item
TIMEZONE | Item Metadata | Paragraph, Indexed | The time zone of the item
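Two of the conventions in the table above lend themselves to a short illustration: the YYYYMMDD date fields and the BEGINGROUP/ENDGROUP pair that identifies a document family. The following is a minimal sketch, assuming each record has already been read into a dictionary keyed by column name, that an empty date field means no value, and that all members of a family carry the same BEGINGROUP and ENDGROUP values. The function names are illustrative.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional, Tuple


def parse_load_file_date(value: str) -> Optional[date]:
    """Convert a YYYYMMDD field such as CREATIONDATE to a date object."""
    value = value.strip()
    if not value:
        return None  # assumption: a blank field means the date is absent
    return datetime.strptime(value, "%Y%m%d").date()


def group_by_family(rows: Iterable[Dict[str, str]]) -> Dict[Tuple[str, str], List[str]]:
    """Map each (BEGINGROUP, ENDGROUP) pair to the DOCIDs that share it."""
    families: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for row in rows:
        families[(row["BEGINGROUP"], row["ENDGROUP"])].append(row["DOCID"])
    return dict(families)
```

Grouping on the pair rather than on PARENT_DOCID keeps a whole family together even when attachments are nested several levels deep.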
User-defined metadata fields (included in the Concordance load file by default):
(If you add other user-selected metadata fields, you must make the corresponding changes to the Concordance DB.)
Column Name | Source | Concordance DB
PATHNAME | User-selected | Paragraph, Indexed
GUID | User-selected | Paragraph, Indexed
FILETYPE | User-selected | Paragraph, Indexed
Concordance Load File (loadfile.dat) Delimiters (Concordance Default):
Type | Delimiter
Comma | (020) - ASCII (decimal)
Quote | (254) - ASCII (decimal) / (00FE) - Unicode (Hex)
Newline | (174) - ASCII (decimal) / (00AE) - Unicode (Hex)
Figure: Concordance Import Wizard - Format
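The default delimiters listed above can also be applied when reading loadfile.dat outside Concordance. The following is a minimal sketch, assuming the file begins with a header row of column names, each record occupies one physical line, and field values never contain the field separator itself. The constant and function names are illustrative.

```python
from typing import Dict, List

FIELD_SEP = "\x14"    # "Comma" delimiter, decimal 20
QUOTE = "\xfe"        # "Quote" delimiter, decimal 254 (U+00FE)
NEWLINE_SUB = "\xae"  # "Newline" delimiter, decimal 174 (U+00AE)


def _unquote(field: str) -> str:
    """Remove one pair of surrounding quote delimiters and restore newlines."""
    if len(field) >= 2 and field[0] == QUOTE and field[-1] == QUOTE:
        field = field[1:-1]
    return field.replace(NEWLINE_SUB, "\n")


def parse_dat(text: str) -> List[Dict[str, str]]:
    """Parse decoded load file text into one dictionary per record."""
    # Split on "\n" only; str.splitlines() would also split on other
    # control characters that may legitimately occur inside field values.
    lines = [ln.rstrip("\r") for ln in text.split("\n") if ln.strip()]
    if not lines:
        return []
    header = [_unquote(f) for f in lines[0].split(FIELD_SEP)]
    return [
        dict(zip(header, (_unquote(f) for f in line.split(FIELD_SEP))))
        for line in lines[1:]
    ]
```

Each returned dictionary is keyed by the column names from the header row, so a multi-line SUBJECT value comes back with real line breaks in place of the newline delimiter.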
Concordance load file format
Using a Metadata Profile, you can add custom metadata, which appears as additional columns after the ITEMPATH, TEXTPATH, TIFFPATH, and PDFPATH fields.
The default Concordance load file contains the following non-configurable fields, which are always present:
Column Name | Source | Concordance DB | Represents
DOCID | Export Metadata | Text, 50, Image, Key | The DOCID auto-generated during the export process, whose format is controlled in the Legal Export dialog
PARENT_DOCID | Export Specific Metadata | Text, 50 | The DOCID to track and maintain the parent-child relationship of documents
BEGINBATES | Export Specific Metadata | Text, 50 | The beginning DOCID for a multi-page document; relevant when creating TIFFs or PDFs
ENDBATES | Export Specific Metadata | Text, 50 | The ending DOCID for a multi-page document; relevant when creating TIFFs or PDFs
BEGINGROUP | Export Specific Metadata | Text, 50 | The beginning DOCID for a family of documents
ENDGROUP | Export Specific Metadata | Text, 50 | The ending DOCID for a family of documents
PAGECOUNT | Export Specific Metadata | Numeric, 5 | The number of pages in an imaged document
ITEMPATH | Item Metadata | Paragraph, Indexed | The relative path to the native file
TEXTPATH | Item Metadata | Paragraph, Indexed | The relative path to the text file
PDFPATH | Item Metadata | Paragraph, Indexed | The relative path to the PDF file
TIFFPATH | Item Metadata | Paragraph, Indexed | The relative path to the first TIFF page
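Because the four path fields in the table above are relative, a consumer of the load file has to resolve them against the export folder. The following is a minimal sketch, assuming a record is available as a dictionary keyed by column name, that an empty field means no file of that type was exported, and that paths may use either backslashes or forward slashes. The function name and the example folder are illustrative.

```python
from pathlib import Path, PureWindowsPath
from typing import Dict

PATH_FIELDS = ("ITEMPATH", "TEXTPATH", "PDFPATH", "TIFFPATH")


def resolve_export_paths(row: Dict[str, str], export_root: str) -> Dict[str, Path]:
    """Return the path for each populated path field, joined to export_root."""
    resolved: Dict[str, Path] = {}
    for name in PATH_FIELDS:
        value = row.get(name, "").strip()
        if not value:
            continue  # assumption: blank means this file type was not exported
        # PureWindowsPath splits on both "\" and "/", whatever the host OS.
        resolved[name] = Path(export_root).joinpath(*PureWindowsPath(value).parts)
    return resolved
```

A caller can then test each returned path with Path.exists() to confirm that every file referenced by the load file is actually present in the export folder.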
Concordance Load File (loadfile.dat) Delimiters (Concordance Default):
Type | Delimiter
Comma | (020) - ASCII (decimal)
Quote | (254) - ASCII (decimal) / (00FE) - Unicode (Hex)
Newline | (174) - ASCII (decimal) / (00AE) - Unicode (Hex)
Figure: Concordance Import Wizard - Format
Note: To have the contents of any of the preceding columns appear under the Name column in the Results pane, rename the column to 'NAME', 'SUBJECT', or 'FILENAME'.
Summation load files
Nuix Workstation creates Summation load files as well as the following items:
- A Class I DII load file.
- Summary reports detailing information about the production and export run, in two formats: summary-report.txt and summary-report.xml. Combined with a custom cascading style sheet, the XML version provides a more user-friendly report.
- A text file containing the top-level MD5 digests: top-level-MD5-digests.txt.
- A folder for each type of exported data: Native, TIFF, PDF, and Text. You define folders under File Naming on the Numbering and Files tab of the Legal Export dialog.
Summation load file format
The Summation Legal Export provides a single DII file that contains metadata, file references, and full-text references. The following table describes the information in the Summation load file:
DII Token | Source | Represents
@DOCID | Export Specific Metadata | The DOCID auto-generated during the export process, whose format is controlled as part of the Legal Export dialog
@PARENTID | Export Specific Metadata | The DOCID used to track and maintain the parent-child relationship of documents
@FULLTEXT DOC | Standard DII token | One full-text file exists for each database record
@O | Standard DII token |
@T | Standard DII token |
@I | Standard DII token | The image location
@L | Standard DII token | The long name for the item; includes Nuix specific item metadata: GUID, PathName, Name
@FROM | Nuix Defined Metadata | The Nuix Communications FROM field
@TO | Nuix Defined Metadata | The Nuix Communications TO field
@CC | Nuix Defined Metadata | The Nuix Communications CC field
@BCC | Nuix Defined Metadata | The Nuix Communications BCC field
@SUBJECT | Nuix Defined Metadata | The email subject or Nuix Name
@DATESENT | Nuix Defined Metadata | The Sent Date for email - Nuix Communications Date
@TIMESENT | Nuix Defined Metadata | The Sent Time for email - Nuix Communications Date
@HEADER / @HEADER-END | Item Properties | The email header content including all extracted metadata
@EMAIL-BODY / @EMAIL-END | Item Content | The email body content
@MULTILINE | Additional Metadata | All additional metadata referenced from the Metadata Profile used for the export
@ATTACH | Standard DII token | Any email attachments
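The tokens in the table above can be read back into records with a small line-oriented parser. The following is a rough sketch only, and it rests on several assumptions that are not stated in this appendix: each token starts a line and is separated from its value by a space, the paired tokens (@HEADER / @HEADER-END and @EMAIL-BODY / @EMAIL-END) enclose multi-line content, lines that do not start with a token are ignored, and a new record begins at the token named by the hypothetical record_start parameter. Adjust record_start to match the token that actually opens each record in your export.

```python
from typing import Dict, List, Optional

# Opening token -> closing token for multi-line blocks.
BLOCK_TOKENS = {"@HEADER": "@HEADER-END", "@EMAIL-BODY": "@EMAIL-END"}


def parse_dii(text: str, record_start: str = "@T") -> List[Dict[str, str]]:
    """Parse DII text into one dictionary per record (assumptions as noted)."""
    records: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    block_name: Optional[str] = None
    block_lines: List[str] = []

    for raw in text.splitlines():
        line = raw.rstrip()
        if block_name is not None and current is not None:
            # Inside a multi-line block: collect lines until the end token.
            if line.strip() == BLOCK_TOKENS[block_name]:
                current[block_name] = "\n".join(block_lines)
                block_name, block_lines = None, []
            else:
                block_lines.append(line)
            continue
        if not line.startswith("@"):
            continue  # assumption: non-token lines carry nothing we need
        token, _, value = line.partition(" ")
        if token == record_start or current is None:
            current = {}
            records.append(current)
        if token in BLOCK_TOKENS:
            block_name, block_lines = token, []
        else:
            current[token] = value.strip()
    return records
```

With this approach, a line such as "@FULLTEXT DOC" is stored under the key "@FULLTEXT" with the value "DOC", and the email body is stored under "@EMAIL-BODY" with its line breaks preserved.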