Nexadata currently supports tabular data in CSV format, with support for JSON and Parquet planned. This article describes all three formats, covers Nexadata's current CSV capabilities (including delimiter options), and provides format samples.

Supported Format: Tabular (CSV)

CSV (Comma-Separated Values) is a tabular format that organizes data into rows and columns, using a consistent delimiter to separate fields within each row. Nexadata supports proper CSV files, meaning files that follow the formatting rules below, which ensure compatibility and data integrity. A proper CSV file includes:

Consistent Delimiters: Each field in a row is separated by a consistent delimiter (comma, tab, or semicolon in Nexadata).
Quoted Strings: If a field contains the delimiter character itself, that field should be enclosed in double-quotes.
No Extraneous Characters: Records contain no stray characters, and a line break may appear inside a field only when that field is enclosed in double quotes.
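
The quoting rules above can be sketched with Python's standard csv module. This is a minimal illustration, not Nexadata's own tooling: the helper name to_csv and the sample rows are assumptions made for the example. The writer quotes only the fields that need it, such as a field containing the delimiter or a line break.

```python
import csv
import io

def to_csv(rows, delimiter=","):
    """Serialize rows to CSV text, quoting only the fields that require it.

    to_csv is a hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,  # quote a field only when it would otherwise be ambiguous
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical sample rows, chosen to trigger quoting.
rows = [
    ["Name", "City"],
    ["Doe, John", "New York"],          # field contains the delimiter
    ["Jane Smith", "Los Angeles\nCA"],  # field contains a line break
]

text = to_csv(rows)
print(text)

# Reading the text back yields the original rows, which shows the quoting is lossless.
round_trip = list(csv.reader(io.StringIO(text)))
```

Producing files through a CSV library rather than joining strings by hand is one way to keep delimiters and quoting consistent.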

Nexadata's CSV Delimiter Options

Nexadata supports the following delimiters in CSV files:

Comma ( , )
Tab
Semicolon ( ; )

Note: Additional delimiter options are planned for future updates.
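
As a rough sketch of how a file could be checked against these three delimiters before upload, the following uses Python's standard csv module. The names SUPPORTED_DELIMITERS and parse_delimited are assumptions for this example and are not part of Nexadata.

```python
import csv
import io

# The three delimiters listed above; the dictionary itself is a hypothetical helper.
SUPPORTED_DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}

def parse_delimited(text, delimiter_name):
    """Parse delimited text and check that every row has the same number of fields."""
    if delimiter_name not in SUPPORTED_DELIMITERS:
        raise ValueError(f"unsupported delimiter: {delimiter_name!r}")
    reader = csv.reader(io.StringIO(text), delimiter=SUPPORTED_DELIMITERS[delimiter_name])
    rows = [row for row in reader if row]  # skip completely blank lines
    width = len(rows[0]) if rows else 0
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(f"row {number} has {len(row)} fields, expected {width}")
    return rows

semicolon_rows = parse_delimited("Name;Age\nJohn Doe;29\n", "semicolon")
tab_rows = parse_delimited("Name\tAge\nJohn Doe\t29\n", "tab")
```

A row whose field count differs from the header is a common sign that the wrong delimiter was selected or that a field is missing its quotes.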

Example of Proper CSV Format

With a comma delimiter:

Name,Age,City
"John Doe",29,"New York"
"Jane Smith",34,"Los Angeles"

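To illustrate how a consumer reads the sample above, here is a short sketch using Python's standard csv module. It is not Nexadata code. Note that CSV carries no type information, so every value arrives as a string and numeric columns such as Age must be converted explicitly.

```python
import csv
import io

SAMPLE = '''Name,Age,City
"John Doe",29,"New York"
"Jane Smith",34,"Los Angeles"
'''

# DictReader uses the first row as column names and strips the surrounding quotes.
records = list(csv.DictReader(io.StringIO(SAMPLE)))

# Every parsed value is a string; convert the Age column to integers explicitly.
ages = [int(record["Age"]) for record in records]

for record in records:
    print(record["Name"], record["Age"], record["City"])
```
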
Coming Soon: JSON

JSON (JavaScript Object Notation) is a flexible, lightweight format for representing structured data. JSON data is organized into key-value pairs and can hold nested objects and arrays, which makes it well suited to hierarchical data.

Benefits of JSON:

Human-readable and easy to interpret
Supports nested structures for complex data

Example JSON Format

[
    {
        "Name": "John Doe",
        "Age": 29,
        "City": "New York"
    },
    {
        "Name": "Jane Smith",
        "Age": 34,
        "City": "Los Angeles"
    }
]
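
The following sketch, using Python's standard json module, shows two properties of the format: values keep their types when parsed (Age is a number, not a string), and nested objects can be flattened into tabular columns. The flatten helper, the dotted column names, and the nested Address record are assumptions made for illustration; they do not describe how Nexadata will handle JSON.

```python
import json

SAMPLE = '''[
    {"Name": "John Doe", "Age": 29, "City": "New York"},
    {"Name": "Jane Smith", "Age": 34, "City": "Los Angeles"}
]'''

# Parsing preserves JSON types: strings stay strings, numbers become int or float.
records = json.loads(SAMPLE)

def flatten(record, prefix=""):
    """Flatten nested objects into dotted column names (one possible convention)."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

# Hypothetical nested record, not taken from the sample above.
nested = {"Name": "John Doe", "Address": {"City": "New York", "Zip": "10001"}}
flat = flatten(nested)
print(flat)
```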

JSON format support in Nexadata is currently under development and will be available soon.

Coming Soon: Parquet

Parquet is a columnar storage format optimized for large-scale data processing. It is widely used in data lakes and big data platforms due to its efficient storage, particularly for complex data types.
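
To make "columnar" concrete, the sketch below regroups row-oriented records by column in plain Python. It only illustrates the idea: it is not Parquet's actual on-disk layout, it assumes every record has the same keys, and the to_columns helper is hypothetical.

```python
# Row-oriented: each record holds all of its fields together (as in CSV or JSON).
rows = [
    {"Name": "John Doe", "Age": 29, "City": "New York"},
    {"Name": "Jane Smith", "Age": 34, "City": "Los Angeles"},
]

def to_columns(records):
    """Regroup records so that all values of one column are stored together."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns

# Column-oriented: one list per column.
columns = to_columns(rows)

# A query that needs a single column touches only that column's values.
average_age = sum(columns["Age"]) / len(columns["Age"])
print(columns["Age"], average_age)
```

Because values in a column share one type and are stored side by side, columnar formats can compress them well and skip columns a query does not need.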

Benefits of Parquet: