Transformation: Generate Unique ID

Add a globally unique identifier to each record in your dataset using the Generate Unique ID transformation

The Generate Unique ID transformation in Nexadata Pipelines adds a new column containing a globally unique identifier for each record in your dataset. Unlike the Insert Row Numbers transformation, which assigns sequential integers scoped to a single run, Generate Unique ID produces collision-resistant UUID v4 identifiers that are unique across runs, environments, and source systems. This transformation is useful for creating primary keys for downstream databases, giving records from multiple source systems a single shared key space, and generating record references that will not collide with IDs from other runs. Because UUID v4 values are generated randomly rather than derived from record content, store the generated column if the same record must keep the same ID across runs. You can configure this transformation using Natural Language Mode or Advanced Mode.
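The difference between run-scoped row numbers and UUID v4 identifiers can be illustrated with a short Python sketch. This is not Nexadata's implementation; the helper names `add_row_numbers` and `add_unique_ids` and the sample records are hypothetical, and the sketch only assumes that the generated IDs behave like standard UUID v4 values.

```python
import uuid

# Hypothetical sample records; the column names are illustrative only.
records = [{"Name": "Ada"}, {"Name": "Grace"}]

def add_row_numbers(rows):
    """Sequential integers: numbering restarts at 1 on every run."""
    return [{"Row_Number": i, **row} for i, row in enumerate(rows, start=1)]

def add_unique_ids(rows):
    """Random UUID v4 strings: no coordination between runs is needed."""
    return [{"Unique_ID": str(uuid.uuid4()), **row} for row in rows]

# Simulate two separate runs over the same input.
first_run = add_unique_ids(records)
second_run = add_unique_ids(records)

# Row numbers from separate runs overlap because each run starts at 1.
print([r["Row_Number"] for r in add_row_numbers(records)])

# UUID v4 values from separate runs are all distinct.
all_ids = [r["Unique_ID"] for r in first_run + second_run]
print(len(set(all_ids)) == len(all_ids))
```

In this sketch the same input record receives a different ID on each run, which is the expected behavior of random UUIDs.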


Inputs for Generate Unique ID Transformation

  1. Name of the Transformation: In Natural Language Mode, the transformation name is automatically generated, but it can be updated later using Advanced Mode. For example, you might rename it to something more descriptive, like "Customer Record ID Generator".

  2. New Column Name: Enter the name of the new column that will store the generated IDs (e.g., "Unique_ID", "Record_Key", or "Global_ID").

  3. Position of New Column: Specify where the new column should be placed in the dataset:

    • First: Insert the new column at the beginning.

    • Last: Add the new column at the end of the dataset.

    • Before or After: Insert the new column before or after a specific existing column.

  4. Reference Column: If you choose to position the column Before or After another column, select a reference column that will serve as the insertion point.
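As a rough illustration of how the four inputs interact, the following Python sketch models the transformation as a function over a list of records. The function name `generate_unique_id`, its parameters, and the sample data are assumptions made for this example and do not reflect Nexadata's internal API.

```python
import uuid

def generate_unique_id(rows, new_column, position="Last", reference=None):
    """Return copies of rows with a UUID v4 column placed as requested.

    position is one of "First", "Last", "Before", or "After";
    reference is only required for "Before" and "After".
    """
    result = []
    for row in rows:
        if new_column in row:
            raise ValueError(f"Column {new_column!r} already exists")
        columns = list(row)  # existing column order
        if position == "First":
            index = 0
        elif position == "Last":
            index = len(columns)
        elif position in ("Before", "After"):
            if reference not in row:
                raise ValueError(f"Reference column {reference!r} not found")
            index = columns.index(reference) + (1 if position == "After" else 0)
        else:
            raise ValueError(f"Unknown position {position!r}")
        columns.insert(index, new_column)
        values = {**row, new_column: str(uuid.uuid4())}
        # Rebuild the record so the columns appear in the requested order.
        result.append({name: values[name] for name in columns})
    return result

# Hypothetical order data used only to show column placement.
orders = [{"Order_No": 1001, "Order_Date": "2024-01-15", "Amount": 250.0}]
keyed = generate_unique_id(orders, "Global_ID", position="Before", reference="Order_Date")
print(list(keyed[0]))  # ['Order_No', 'Global_ID', 'Order_Date', 'Amount']
```

The sketch rejects a Before or After position without a valid reference column, which mirrors why the Reference Column input is needed in those cases.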


Using Natural Language Mode

In Natural Language Mode, you define the transformation with a single instruction. Nexadata interprets the instruction, creates the transformation, and assigns it a default name.

Example Instructions in Natural Language Mode

  • Add a new column called Record_Key at the beginning of the dataset and populate it with a unique ID for each row.

  • Generate a unique ID for each record and insert it as the first column.

  • Create a unique identifier column named Global_ID and place it before the Order_Date column.

  • Add a globally unique identifier column called Global_ID at the end of the dataset.

  • Insert a unique ID column named Primary_Key as the first column in the dataset.

Note: While Natural Language Mode provides a quick and intuitive setup, the transformation name and other settings can be adjusted using Advanced Mode if necessary.


Using Advanced Mode

In Advanced Mode, you have full control over the transformation setup through a detailed UI. This mode allows for more precise configuration, including the ability to rename the transformation and specify column placement.

Steps in Advanced Mode

  1. Name of the Transformation: Enter or update a custom name, such as "Customer Record ID Generator", especially if the name generated in Natural Language Mode needs refinement.

  2. New Column Name: Define the name for the new column where the generated IDs will be stored (e.g., "Unique_ID", "Record_Key", "Primary_Key").

  3. Position of New Column: Choose where to place the new column:

    • First: Insert the new column as the first column in the dataset.

    • Last: Add the column at the end of the dataset.

    • Before or After: Select a reference column to specify placement relative to existing columns.

  4. Reference Column: Provide a reference column if selecting Before or After.

Example in Advanced Mode

  1. Name of the Transformation: Order Key Generator

  2. New Column Name: Order_Key

  3. Position of New Column: Before

  4. Reference Column: Order_Date


Example Use Case

The Generate Unique ID transformation is particularly useful when merging records from multiple source systems that each use their own internal identifiers. For example, imagine you are combining customer records from a CRM and an ERP system, and you need a single stable key that can serve as a primary key in your data warehouse regardless of which source the record originated from:

  • Transformation Name: Customer Key Generator

  • New Column Name: Customer_Key

  • Position of New Column: First

In this case, every record across both source systems receives a new, collision-resistant UUID v4 identifier in the Customer_Key column. This key can be used as a primary key in the target table or as a join key when relating records across downstream datasets. Because UUID v4 values are random, the key can serve as a deduplication reference on later runs only if the generated values are stored and reused rather than regenerated.
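The use case above can be sketched in Python as follows. The CRM and ERP records, the `Source` and `Source_ID` columns, and the variable names are hypothetical; the sketch only shows why a generated key avoids clashes between source-specific identifiers.

```python
import uuid

# Hypothetical extracts: both systems happen to use 17 as an internal ID.
crm_customers = [{"Source": "CRM", "Source_ID": 17, "Name": "Acme Corp"}]
erp_customers = [{"Source": "ERP", "Source_ID": 17, "Name": "Acme Corporation"}]

combined = crm_customers + erp_customers

# Position "First": the new Customer_Key column precedes all existing columns.
keyed = [{"Customer_Key": str(uuid.uuid4()), **row} for row in combined]

source_ids = [row["Source_ID"] for row in keyed]
customer_keys = [row["Customer_Key"] for row in keyed]

# The source IDs collide, but the generated keys do not.
print(len(set(source_ids)) < len(source_ids))
print(len(set(customer_keys)) == len(customer_keys))
```

Note that the two records here describe the same customer and still receive different keys. A random key identifies rows, not real-world entities, so matching duplicates across systems needs separate logic.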
