Supported Data Formats in Nexadata

Overview of data formats in Nexadata: current support for properly formatted CSV files with comma, tab, or semicolon delimiters, and upcoming JSON and Parquet support.

Written by Quin Eddy
Updated over a week ago

Nexadata currently supports tabular data in CSV format and plans to support JSON and Parquet formats soon. This article explains each of these formats, details Nexadata's current capabilities with CSV data, including delimiter options, and provides format samples.


Supported Format: Tabular (CSV)

CSV (Comma-Separated Values) is a tabular format that organizes data into rows and columns, using a consistent delimiter to separate fields within each row. Nexadata specifically supports proper CSV files, meaning files that follow standard formatting rules so that every row parses into the same columns. A proper CSV file includes:

  1. Consistent Delimiters: Each field in a row is separated by a consistent delimiter (comma, tab, or semicolon in Nexadata).

  2. Quoted Strings: If a field contains the delimiter character itself, that field should be enclosed in double quotes.

  3. No Extraneous Characters: Proper CSV files do not include stray characters or line breaks within a record. A line break that belongs to a field's value is acceptable only when the whole field is enclosed in double quotes.
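
The quoting rules above can be sketched with Python's standard csv module. This is a minimal illustration, not Nexadata code: the helper name write_csv and the sample rows are assumptions made for the example.

```python
import csv
import io


def write_csv(rows, delimiter=","):
    """Serialize rows to CSV text, quoting only the fields that need it."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,  # quote fields containing the delimiter, a quote, or a line break
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data chosen to exercise rules 2 and 3.
rows = [
    ["Name", "Age", "City"],
    ["Doe, John", 29, "New York"],       # field contains the delimiter
    ["Jane Smith", 34, "Los\nAngeles"],  # field contains a line break
]

text = write_csv(rows)

# Reading the text back yields three fields per record, with the
# embedded comma and line break preserved inside their fields.
parsed = list(csv.reader(io.StringIO(text)))
```

Because the two unusual fields are wrapped in double quotes on output, a parser that follows the same rules recovers exactly three fields for every record.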

Nexadata's CSV Delimiter Options

Nexadata supports the following delimiters in CSV files:

  • Comma ( , )

  • Tab

  • Semicolon ( ; )

Note: Additional delimiter options are planned for future updates.
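
Before uploading a file, it can help to confirm that the chosen delimiter splits every row into the same number of fields. The sketch below is a local pre-flight check written with Python's standard csv module; the function name column_count and the sample text are assumptions for illustration, and it does not describe how Nexadata itself validates files.

```python
import csv
import io

# The three delimiters listed in this article; the dictionary keys are illustrative.
DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}


def column_count(text, delimiter):
    """Return the number of fields per row, or raise ValueError if rows disagree."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    counts = {len(row) for row in reader if row}  # blank lines are skipped
    if len(counts) != 1:
        raise ValueError(f"inconsistent field counts: {sorted(counts)}")
    return counts.pop()


sample = 'Name;Age;City\n"John Doe";29;"New York"\n"Jane Smith";34;"Los Angeles"\n'

with_semicolon = column_count(sample, DELIMITERS["semicolon"])  # 3 columns
with_comma = column_count(sample, DELIMITERS["comma"])          # 1 column
```

A result of 1 for a file that should have several columns usually means the wrong delimiter was selected, while a ValueError points to rows with missing or extra fields.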

Example of Proper CSV Format

With a comma delimiter:

Name,Age,City
"John Doe",29,"New York"
"Jane Smith",34,"Los Angeles"

Coming Soon: JSON

JSON (JavaScript Object Notation) is a flexible, lightweight format for representing structured data. JSON data is organized into key-value pairs and can handle nested structures, making it ideal for complex data configurations.

Benefits of JSON:

  • Human-readable and easy to interpret

  • Supports nested structures for complex data

Example JSON Format

[
  {
    "Name": "John Doe",
    "Age": 29,
    "City": "New York"
  },
  {
    "Name": "Jane Smith",
    "Age": 34,
    "City": "Los Angeles"
  }
]

JSON format support in Nexadata is currently under development and will be available soon.
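
Until JSON is supported, one possible workaround is to convert a JSON array of records to CSV before uploading. The following is a minimal sketch using only Python's standard library, not a Nexadata feature. The function names flatten and json_to_csv, and the dotted column names used for nested objects, are assumptions made for this example.

```python
import csv
import io
import json


def flatten(record, prefix=""):
    """Flatten nested objects into dotted column names (a convention assumed here)."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def json_to_csv(json_text, delimiter=","):
    """Convert a JSON array of objects to CSV text with a header row."""
    records = [flatten(r) for r in json.loads(json_text)]
    # Keep columns in first-seen order across all records.
    columns = list(dict.fromkeys(key for r in records for key in r))
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty fields
    return buffer.getvalue()


json_text = """
[
  {"Name": "John Doe", "Age": 29, "City": "New York"},
  {"Name": "Jane Smith", "Age": 34, "City": "Los Angeles"}
]
"""
csv_text = json_to_csv(json_text)
```

For the sample above, csv_text contains the header Name,Age,City followed by one line per record. Arrays nested inside a record are not expanded by this sketch and would need their own handling.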


Coming Soon: Parquet

Parquet is a columnar storage format optimized for large-scale data processing. It is widely used in data lakes and big data platforms due to its efficient storage, particularly for complex data types.

Benefits of Parquet:

  • Highly compressed, leading to lower storage costs

  • Optimized for analytical workloads and large datasets

Example Parquet Format

Parquet is a binary format and is not human-readable, so the sample below shows the logical table that a Parquet file would store rather than its on-disk contents:

Name | Age | City
John Doe | 29 | New York
Jane Smith | 34 | Los Angeles

Parquet format support in Nexadata is also in progress and will be available in a future release.
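
To illustrate what "columnar" means, the sketch below pivots row-oriented records into one list of values per column. It is a conceptual illustration in plain Python only: real Parquet files add row groups, encodings, and compression on top of this idea, and the function name to_columns is an assumption for the example.

```python
# Row-oriented layout: one dictionary per record, as in CSV or JSON.
rows = [
    {"Name": "John Doe", "Age": 29, "City": "New York"},
    {"Name": "Jane Smith", "Age": 34, "City": "Los Angeles"},
]


def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An analytical query such as an average touches only the "Age" column
# and never reads the names or cities.
average_age = sum(columns["Age"]) / len(columns["Age"])
```

Keeping each column's values together is what lets columnar formats compress similar values well and lets analytical queries skip the columns they do not need.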
