Nexadata Pipelines are end-to-end data transformation processes that take one or more datasets through a series of operations. Each pipeline processes record sets, applying transformations step by step and passing each step's output to the next operation. You input one or more datasets at the start of a pipeline, and at the end the pipeline produces a fully transformed CSV.
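Conceptually, a pipeline is just an ordered list of steps where each step receives the records the previous step produced. The sketch below illustrates that idea in plain Python; the names and record layout are hypothetical and not Nexadata's internal API.

```python
# Illustrative sketch only: a pipeline as an ordered list of steps,
# where each step receives the records produced by the previous step.
from typing import Callable

Records = list[dict]
Step = Callable[[Records], Records]

def run_pipeline(records: Records, steps: list[Step]) -> Records:
    """Apply each transformation in order, feeding output to the next step."""
    for step in steps:
        records = step(records)
    return records

data = [{"id": 1, "value": 50}, {"id": 2, "value": 150}]
steps = [
    lambda rs: [r for r in rs if r["value"] >= 100],  # filter step
    lambda rs: sorted(rs, key=lambda r: r["id"]),     # sort step
]
print(run_pipeline(data, steps))  # [{'id': 2, 'value': 150}]
```

The key property this models is the one the guide relies on throughout: steps run in sequence, and reordering them can change the result.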
Follow this guide to set up your Nexadata Pipeline:
1. Prepare Your Datasets
A pipeline begins with one or more CSV datasets. These files should be properly formatted to ensure a smooth transformation process. If your pipeline involves multiple datasets, the first operation should be to join them before applying other transformations.
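A quick pre-flight check can catch formatting problems before they surface mid-pipeline. This is a hypothetical helper, not part of Nexadata: it verifies that a CSV has a header row and that every record has the same number of fields.

```python
# Hypothetical pre-flight check (not a Nexadata feature): confirm that
# every data row has the same number of fields as the header row.
import csv
import io

def validate_csv(text: str) -> list[str]:
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    problems = []
    # Data rows start on line 2, after the header.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
    return problems

sample = "id,value\n1,50\n2,150\n"
print(validate_csv(sample))  # []
```

An empty list means the file is structurally consistent; any entries pinpoint the rows to fix before uploading.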
2. Define Transformations
You can define transformations in Nexadata Pipelines in two ways:
Natural Language Mode (⚡ Transform)
In this mode, you describe the operation you want to perform in plain, natural language. For example, type "Filter out records where the value is less than 100," and the system automatically configures the transformation based on your input.
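For comparison, the prompt above corresponds to a filter like the following. This is purely illustrative code, not what Nexadata generates internally; note that "filter out records where the value is less than 100" means keeping records whose value is 100 or more.

```python
# The prompt "Filter out records where the value is less than 100"
# keeps only records with value >= 100 (illustrative sketch only).
records = [
    {"id": 1, "value": 50},
    {"id": 2, "value": 150},
    {"id": 3, "value": 99},
]
filtered = [r for r in records if r["value"] >= 100]
print(filtered)  # [{'id': 2, 'value': 150}]
```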
Advanced Mode (💡 Advanced)
For more control, you can select and configure transformations through the UI, precisely defining each step by setting up the transformation manually.
Regardless of which mode you used to create a step, you can open it in Advanced Mode to refine and edit it, giving you full flexibility to adjust your pipeline as needed.
Supported Transformations
Nexadata Pipelines support a range of data transformations, including:
Filter: Select records based on specified criteria.
Aggregate: Summarize data through operations like sum, count, or average.
Join: Combine multiple datasets into one based on a common key.
Sort: Arrange records in ascending or descending order.
Group By: Group records based on a specified column and apply aggregate functions.
Column Rename: Rename columns to align with your schema requirements.
Calculate: Add calculated fields based on existing data.
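To make these operations concrete, the sketch below shows several of them applied to a small record set in plain Python. The dataset and field names are invented for illustration; Nexadata configures the equivalent steps for you.

```python
# Illustrative sketches of several supported transformations, using
# plain Python lists of dicts in place of Nexadata's record sets.
from itertools import groupby

orders = [
    {"region": "east", "amount": 40},
    {"region": "west", "amount": 10},
    {"region": "east", "amount": 60},
]

# Filter: keep records matching a condition
big = [r for r in orders if r["amount"] > 30]

# Sort: arrange records in descending order of amount
by_amount = sorted(orders, key=lambda r: r["amount"], reverse=True)

# Group By + Aggregate: sum amounts per region
# (groupby requires the input to be sorted by the grouping key)
keyed = sorted(orders, key=lambda r: r["region"])
totals = {
    region: sum(r["amount"] for r in rows)
    for region, rows in groupby(keyed, key=lambda r: r["region"])
}

# Column Rename: amount -> sales
renamed = [{"region": r["region"], "sales": r["amount"]} for r in orders]

# Calculate: add a derived field from existing data
with_tax = [{**r, "amount_with_tax": round(r["amount"] * 1.1, 2)} for r in orders]

print(totals)  # {'east': 100, 'west': 10}
```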
3. Build Your Pipeline
To build a pipeline:
Start with Your Datasets: Input one or more properly formatted CSV files as the starting point.
Join Multiple Datasets: If more than one dataset is involved, begin by joining them using a common key. This ensures all datasets are properly combined before applying other transformations.
Apply Transformations: Add the necessary transformations in sequence. Each step takes the output from the previous one and passes it along to the next.
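The recommended build order (join first, then transform) can be sketched as follows. The two datasets, their key, and the steps are hypothetical examples, not output from Nexadata.

```python
# Hypothetical sketch of the build order: join two datasets on a
# common key first, then transform the combined records in sequence.
customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Bo"}]
orders = [
    {"cust_id": 1, "amount": 120},
    {"cust_id": 1, "amount": 80},
    {"cust_id": 2, "amount": 200},
]

# Step 1 - Join: combine the datasets on cust_id
by_id = {c["cust_id"]: c for c in customers}
joined = [{**by_id[o["cust_id"]], **o} for o in orders]

# Step 2 - Filter: keep orders of at least 100
filtered = [r for r in joined if r["amount"] >= 100]

# Step 3 - Sort: largest orders first
result = sorted(filtered, key=lambda r: r["amount"], reverse=True)
print([(r["name"], r["amount"]) for r in result])  # [('Bo', 200), ('Ada', 120)]
```

Each step consumes the previous step's output, which is why the join must come first: the later steps operate on the combined records.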
Natural Language vs. Advanced Mode
You can build each step of your pipeline in Natural Language Mode or in Advanced Mode, and mix the two freely within a single pipeline.
As you build your pipeline, visual cues help you identify whether each step was configured using natural language or advanced mode:
Natural Language Mode is denoted by a lightning bolt icon ⚡.
Hover over the lightning bolt to view the precise natural language prompt that was used for the transformation.
Advanced Mode is indicated by a lightbulb icon 💡.
You can seamlessly switch between these modes at any step to edit and refine your pipeline as needed.
4. Pipeline Execution
Once your pipeline is configured, execute it to process the dataset(s). The pipeline will run each transformation step in sequence, ensuring that the output from one step is passed to the next.
5. Final Output
Upon completion, the pipeline generates a final CSV containing the fully transformed dataset. This output file can be downloaded or used as input for other processes, depending on your data workflow needs.
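The final write-out step is equivalent to serializing the transformed records back to CSV, as in this stdlib sketch (the records and column names are invented for illustration):

```python
# Illustrative sketch of the final step: writing transformed records
# out as a CSV that downstream processes can consume.
import csv
import io

records = [{"id": 2, "value": 150}, {"id": 4, "value": 300}]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "value"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())
# id,value
# 2,150
# 4,300
```

In practice you would write to a file rather than an in-memory buffer; the header row comes from the field names, so column renames applied earlier in the pipeline carry through to the output.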
Summary of Steps
Start with one or more properly formatted CSV datasets.
If multiple datasets are used, apply a join operation to combine them.
Define the necessary transformations using Natural Language or Advanced Mode.
Build your pipeline step-by-step, leveraging visual cues to track whether each step was configured using natural language (⚡) or advanced mode (💡).
Execute the pipeline to transform your data.
The final output will be a transformed CSV ready for use.