Transformation: Split

Use the Split transformation in Nexadata Pipelines to divide a column’s contents into multiple columns based on a specified delimiter.

Written by Quin Eddy
Updated this week

The Split transformation in Nexadata Pipelines allows you to divide the contents of a selected column into multiple new columns based on a specified delimiter. This transformation is useful for extracting structured data from a single column, such as separating first and last names or breaking down address information. Depending on the level of control required, you can configure the transformation using Natural Language Mode or Advanced Mode.


Inputs for the Split Transformation

  1. Name of the Transformation: In Natural Language Mode, the transformation name is automatically generated, but you can update it later in Advanced Mode. For example, you might rename it to "Split Full Name into First and Last Names" or "Separate Address Details".

  2. Column to Split: Select the column you want to split. For example, you may choose a column named "Full_Name" if you want to separate it into first and last names.

  3. Delimiter: Define the character(s) used to split the column, such as a comma, space, or hyphen. The delimiter is matched as literal text; regular expressions are not supported.

  4. New Column Names: Specify names for each of the new columns created from the split. For example, if splitting "Full_Name", you might name the new columns "First_Name" and "Last_Name".

  5. Drop Original Column: Enable this toggle if you want to remove the original column after splitting it into new columns.
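
To make the inputs above concrete, here is a minimal Python sketch of what a split step does with them. It is an illustration only, not Nexadata's implementation: the function name `split_column` and the row format (a list of dictionaries) are hypothetical. The handling of rows that yield fewer or more pieces than there are new column names is an assumption, since this article does not describe that case.

```python
def split_column(
    rows: list[dict],
    column: str,
    delimiter: str,
    new_columns: list[str],
    drop_original: bool = False,
) -> list[dict]:
    """Illustrative stand-in for a Split step; not Nexadata's implementation."""
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    result = []
    for row in rows:
        # str.split with an explicit separator treats it as literal text, never as a regex.
        parts = str(row[column]).split(delimiter)
        new_row = dict(row)
        for index, name in enumerate(new_columns):
            # Assumption: missing pieces become None and surplus pieces are ignored.
            new_row[name] = parts[index] if index < len(parts) else None
        if drop_original:
            del new_row[column]
        result.append(new_row)
    return result


rows = [{"Order": 1, "Product_Details": "Widget-A100"}]
print(split_column(rows, "Product_Details", "-", ["Product_Type", "Product_ID"], drop_original=True))
# [{'Order': 1, 'Product_Type': 'Widget', 'Product_ID': 'A100'}]
```

The five parameters correspond to inputs 2 through 5; the transformation name is only a label and has no effect on the data, so the sketch omits it.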


Using Natural Language Mode

In Natural Language Mode, describe the split operation you want, and Nexadata will automatically configure the transformation. The transformation name is auto-generated but can be modified later in Advanced Mode.

Example Instructions in Natural Language Mode

  • Split Full_Name by space into First_Name and Last_Name.

  • Divide Address column by comma into Street, City, State, and Zip.

  • Separate the Product_Details column by hyphen into Product_Type and Product_ID.

  • Split Email_Address by "@" into User and Domain.
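
Each instruction above names the same three things: the column, the delimiter, and the new column names. The Python sketch below writes them out as structured values to show what each sentence specifies. `SplitConfig` is a hypothetical container used for illustration, not a Nexadata type or export format. The instructions do not mention the transformation name or the Drop Original Column toggle, so the sketch leaves them out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitConfig:
    """Hypothetical container for the Split inputs; not a Nexadata type."""
    column: str
    delimiter: str
    new_columns: tuple[str, ...]


# One entry per example instruction above, in the same order.
EXAMPLES = [
    SplitConfig("Full_Name", " ", ("First_Name", "Last_Name")),
    SplitConfig("Address", ",", ("Street", "City", "State", "Zip")),
    SplitConfig("Product_Details", "-", ("Product_Type", "Product_ID")),
    SplitConfig("Email_Address", "@", ("User", "Domain")),
]

for config in EXAMPLES:
    print(f"{config.column!r} split on {config.delimiter!r} -> {', '.join(config.new_columns)}")
```

An instruction that states all three parts explicitly leaves less for the automatic configuration to infer.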

Note: If Natural Language Mode doesn’t fully capture the transformation you need, you can switch to Advanced Mode to make adjustments.


Using Advanced Mode

In Advanced Mode, you have complete control over the configuration of the Split transformation. You manually specify the column, the delimiter, the new column names, and whether to drop the original column, so the result matches exactly what you need.

Steps in Advanced Mode

  1. Name of the Transformation: Provide or update a custom name, such as "Separate Full Name" or "Split Address Components".

  2. Column to Split: Select the column to be split, such as "Full_Name" or "Address".

  3. Delimiter: Enter the character(s) that will separate the values in the selected column, such as a space, comma, or hyphen.

  4. New Column Names: Specify the names for the new columns resulting from the split. After typing each column name, press Enter to create a "pill" that visually confirms the name has been added.

  5. Drop Original Column: Toggle this option if you want the original column to be removed from the dataset after splitting.


Example Use Case

The Split transformation is ideal for dividing compound data fields into individual components. For instance, suppose you have a Full_Name column and need to split it into First_Name and Last_Name. You would configure the transformation as follows:

  • Transformation Name: Split Full Name

  • Column to Split: Full_Name

  • Delimiter: Space

  • New Column Names: First_Name, Last_Name

  • Drop Original Column: Enabled

The transformation creates two new columns, First_Name and Last_Name. Because Drop Original Column is enabled, the original Full_Name column is removed from the dataset.
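
The effect of the configuration above can be sketched in a few lines of Python. This is an illustration of the expected before-and-after shape, not Nexadata's implementation, and the two names are made-up sample data.

```python
# Made-up sample rows standing in for a dataset with a Full_Name column.
rows = [
    {"Full_Name": "Ada Lovelace"},
    {"Full_Name": "Grace Hopper"},
]

result = []
for row in rows:
    # Delimiter: a single space, matched literally.
    first_name, last_name = row["Full_Name"].split(" ")
    # Drop Original Column is enabled, so Full_Name is not carried over.
    result.append({"First_Name": first_name, "Last_Name": last_name})

print(result)
# [{'First_Name': 'Ada', 'Last_Name': 'Lovelace'}, {'First_Name': 'Grace', 'Last_Name': 'Hopper'}]
```

A value containing more than one space, such as a name with a middle name, splits into more than two pieces. In this sketch that raises a `ValueError` on unpacking. This article does not state how Nexadata handles such rows, so it is worth checking the data before relying on a two-column split.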


Summary

The Split transformation in Nexadata Pipelines enables efficient extraction of data from a single column into multiple columns by specifying a delimiter. Use Natural Language Mode for a quick setup or Advanced Mode for detailed control. This transformation is ideal for separating data like names, addresses, or other compound fields into structured components.
