Nexadata supports four Data Formats for file-based Datasets: Tabular, Spreadsheet, JSON, and Parquet (JSON and Parquet are currently marked as coming soon). You choose the format on the Connect Data step when creating or editing a Dataset. This article explains each format, the file types it covers, and the conventions Nexadata expects so your data ingests cleanly. All four formats work with Nexadata's file-based connections, including Nexadata Hub, Amazon S3, and SFTP.
Tabular
Tabular data refers to delimited text files that organize records into rows and columns using a consistent character to separate fields. The most common tabular format is CSV (Comma-Separated Values), but Nexadata also supports tab-delimited and semicolon-delimited files under the same Tabular setting.
A proper tabular file includes:
Consistent Delimiters: Each field in a row is separated by the same delimiter (comma, tab, or semicolon in Nexadata).
Quoted Strings: If a field contains the delimiter character itself, that field should be enclosed in double quotes.
No Extraneous Characters: Records contain no stray characters, and any line break inside a field is enclosed in double quotes along with the rest of that field.
Supported Tabular Delimiters
Nexadata supports the following delimiters:
Comma ( , )
Tab
Semicolon ( ; )
Note: Additional delimiter options are planned for future updates.
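The conventions above can be checked before a file is uploaded. The following is a minimal sketch using Python's standard csv module. It is not part of Nexadata, and the helper names write_rows and check_consistent_width are hypothetical. It writes the same records with each supported delimiter, quotes only the fields that contain the delimiter, and confirms that every record has the same number of fields.

```python
import csv
import io

# The three delimiters listed in this article.
DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}


def write_rows(rows, delimiter=","):
    """Serialize rows to delimited text, quoting only fields that need it."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,  # quote a field only if it contains the delimiter, a quote, or a line break
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


def check_consistent_width(text, delimiter=","):
    """Return the field count if every record has the same width, else raise ValueError."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    widths = {len(record) for record in reader if record}
    if len(widths) != 1:
        raise ValueError(f"inconsistent field counts: {sorted(widths)}")
    return widths.pop()


rows = [
    ["Name", "Age", "City"],
    ["Smith, Jane", 34, "Los Angeles"],  # this field contains a comma
]

for label, delimiter in DELIMITERS.items():
    text = write_rows(rows, delimiter)
    print(label, check_consistent_width(text, delimiter))
    print(text)
```

With the comma delimiter, the field "Smith, Jane" is wrapped in double quotes. With the tab or semicolon delimiter it is written unquoted, because it no longer collides with the separator.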
Example of Proper Tabular Format
With a comma delimiter:
Name,Age,City
"John Doe",29,"New York"
"Jane Smith",34,"Los Angeles"
Spreadsheet
Spreadsheet format refers to Excel-style workbook files (.xlsx, .xls, .xlsm). Unlike a flat tabular file, a workbook can contain multiple sheets, formatting, formulas, banner rows, title blocks, and other non-tabular content. Nexadata's Spreadsheet ingestion is built to handle that complexity by letting you specify which sheet to read, where the data region begins on that sheet, and whether to auto-detect or hard-code the size of the data block.
Common use cases include:
Monthly or quarterly reports exported from BI tools, ERPs, or planning platforms that include a logo and title rows above the data
Multi-sheet workbooks where only one tab contains the data you want to load
Files with subheaders, merged cells, or notes between the header row and the first data row
For step-by-step instructions on configuring the Spreadsheet-specific settings (Sheet Name, Anchor Cell, Header Offset, Dynamic Range, and fixed Rows and Columns), see Setting Up a Spreadsheet Dataset.
Example Spreadsheet Layout
A typical workbook sheet looks like this when opened in Excel:
  | A | B | C |
1 | (company logo) | | |
2 | Monthly Sales Report | | |
3 | November 2024 | | |
4 | | | |
5 | Name | Age | City |
6 | John Doe | 29 | New York |
7 | Jane Smith | 34 | Los Angeles |
In this example, you would set the Anchor Cell to A5 so Nexadata begins reading from the header row, ignoring the logo and title rows above.
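To make the Anchor Cell idea concrete, here is a small sketch in plain Python. It only illustrates the concept and is not Nexadata's implementation. The functions parse_anchor and read_from_anchor are hypothetical, the sheet is modeled as a list of rows, and settings such as Header Offset and Dynamic Range are not modeled.

```python
import re


def parse_anchor(cell):
    """Convert an A1-style reference such as 'A5' to zero-based (row, column)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", cell)
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {cell!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters.upper():
        # Column letters are a base-26 number where A=1 ... Z=26.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return int(digits) - 1, column - 1


def read_from_anchor(grid, anchor):
    """Treat the anchor row as the header and every row below it as a record."""
    row0, col0 = parse_anchor(anchor)
    header = grid[row0][col0:]
    return [dict(zip(header, row[col0:])) for row in grid[row0 + 1:]]


# The example sheet from this article, with empty cells shown as None.
sheet = [
    ["(company logo)", None, None],
    ["Monthly Sales Report", None, None],
    ["November 2024", None, None],
    [None, None, None],
    ["Name", "Age", "City"],
    ["John Doe", 29, "New York"],
    ["Jane Smith", 34, "Los Angeles"],
]

records = read_from_anchor(sheet, "A5")
for record in records:
    print(record)
```

Anchoring at A5 skips the four rows above the header, so only the two data records are returned.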
JSON (coming soon)
JSON (JavaScript Object Notation) is a flexible, lightweight format for representing structured data. JSON organizes data into key-value pairs and supports nested structures, making it well suited for data sourced from APIs and modern applications.
Benefits of JSON:
Human-readable and easy to interpret
Supports nested structures for complex data
Native format for most modern web APIs and SaaS exports
Example JSON Format
[
  {
    "Name": "John Doe",
    "Age": 29,
    "City": "New York"
  },
  {
    "Name": "Jane Smith",
    "Age": 34,
    "City": "Los Angeles"
  }
]
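An array of flat objects like the one above maps naturally onto rows and columns. This article does not describe how Nexadata maps JSON to columns, so the sketch below is only an assumption-laden illustration in Python's standard json module. The function records_to_table is hypothetical, and nested objects are not handled.

```python
import json


def records_to_table(text):
    """Parse a JSON array of flat objects into (columns, rows)."""
    data = json.loads(text)
    if not isinstance(data, list) or not all(isinstance(item, dict) for item in data):
        raise ValueError("expected a JSON array of objects")
    # Collect column names in first-seen order across all records.
    columns = []
    for item in data:
        for key in item:
            if key not in columns:
                columns.append(key)
    # A key missing from a record becomes None in that row.
    rows = [[item.get(column) for column in columns] for item in data]
    return columns, rows


sample = """
[
  {"Name": "John Doe", "Age": 29, "City": "New York"},
  {"Name": "Jane Smith", "Age": 34, "City": "Los Angeles"}
]
"""

columns, rows = records_to_table(sample)
print(columns)
for row in rows:
    print(row)
```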
Parquet (coming soon)
Parquet is a columnar storage format optimized for large-scale data processing. It is widely used in data lakes and big data platforms because of its efficient storage and strong handling of complex data types.
Benefits of Parquet:
Highly compressed, leading to lower storage costs
Optimized for analytical workloads and large datasets
Schema is embedded in the file, ensuring type consistency between source and target
Example Parquet Format
Parquet is a binary format and is not typically human-readable. Its logical structure looks like this:
Name | Age | City |
John Doe | 29 | New York |
Jane Smith | 34 | Los Angeles |
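The word "columnar" means that values are grouped by column rather than by record. The plain-Python sketch below illustrates that layout idea and the notion of a schema stored with the data. It does not read or write real Parquet files, which requires a dedicated library, and the helpers to_columns and infer_schema are hypothetical.

```python
def to_columns(records):
    """Pivot row-oriented records into one list of values per column.

    Assumes every record has the same keys.
    """
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def infer_schema(columns):
    """Return one type name per column, raising if a column mixes types."""
    schema = {}
    for name, values in columns.items():
        type_names = {type(value).__name__ for value in values}
        if len(type_names) != 1:
            raise ValueError(f"column {name!r} mixes types: {sorted(type_names)}")
        schema[name] = type_names.pop()
    return schema


records = [
    {"Name": "John Doe", "Age": 29, "City": "New York"},
    {"Name": "Jane Smith", "Age": 34, "City": "Los Angeles"},
]

columns = to_columns(records)
print(columns["Age"])         # all ages sit together, which helps compression and column scans
print(infer_schema(columns))  # the schema travels with the data
```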
Allow Any File Type
By default, Nexadata uses the file extension (such as .csv, .xlsx, .json, .parquet) to validate that a file matches the selected Data Format. The Allow Any File Type toggle on the Connect Data step lets you bypass this check so you can load files that have no extension or an unexpected one. Turn this on only when your source system produces files without recognizable extensions, since it disables the safety check that prevents format mismatches.
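The behavior described above can be pictured as a simple extension check with a bypass switch. The sketch below is an illustration only, not Nexadata's code. The function file_is_accepted and the EXPECTED_EXTENSIONS table are hypothetical, and the Tabular entry lists only .csv because that is the only Tabular extension this article names.

```python
from pathlib import PurePosixPath

# Extensions named in this article, grouped by Data Format.
# Assumption: the real list of accepted Tabular extensions may be longer.
EXPECTED_EXTENSIONS = {
    "Tabular": {".csv"},
    "Spreadsheet": {".xlsx", ".xls", ".xlsm"},
    "JSON": {".json"},
    "Parquet": {".parquet"},
}


def file_is_accepted(filename, data_format, allow_any_file_type=False):
    """Return True if the file passes the extension check for the chosen format."""
    if allow_any_file_type:
        # The toggle skips the check entirely, so mismatches are no longer caught.
        return True
    extension = PurePosixPath(filename).suffix.lower()
    return extension in EXPECTED_EXTENSIONS[data_format]


print(file_is_accepted("sales.csv", "Tabular"))
print(file_is_accepted("export_20241130", "Tabular"))
print(file_is_accepted("export_20241130", "Tabular", allow_any_file_type=True))
```

A file with no extension fails the default check and is accepted only when the toggle is on, which is why the toggle is best reserved for sources that cannot produce recognizable extensions.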
