# Build Multi-Step Pipelines in Workbooks

This article walks through building a complete multi-step data pipeline using Workbooks. Each step lives in its own table, and the tables pass data to each other using cross-table column references.

> **Coming soon:** Workbooks are not yet available in all accounts. This article covers how they will work when the feature rolls out.

---

## What a multi-step pipeline looks like

A pipeline breaks a complex workflow into stages:

| Step | Table | What it does |
|------|-------|-------------|
| 1 | Scrape | Pulls raw prospects from LinkedIn or another data source |
| 2 | Enrich | Adds email, phone, company data |
| 3 | Validate | Checks email deliverability |
| 4 | Output | Pushes clean records to your CRM |

Each table takes input from the previous table via cross-table column references. Running the pipeline means running each table in sequence.
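
As a rough mental model (not TexAu's actual implementation), each table can be pictured as a function that receives the previous table's rows and returns new rows. The sketch below is plain Python; the stage functions, column names, and sample data are all hypothetical and exist only to show how data flows from one step to the next.

```python
from typing import Callable

Row = dict[str, str]
Stage = Callable[[list[Row]], list[Row]]


def scrape(_: list[Row]) -> list[Row]:
    # Stand-in for the Scrape table: returns raw prospects (made-up sample data).
    return [
        {"linkedin_url": "https://linkedin.com/in/example-a"},
        {"linkedin_url": "https://linkedin.com/in/example-b"},
    ]


def enrich(rows: list[Row]) -> list[Row]:
    # Stand-in for the Enrich table: adds a placeholder email to each row.
    return [{**r, "email": f"user{i}@example.com"} for i, r in enumerate(rows)]


def validate(rows: list[Row]) -> list[Row]:
    # Stand-in for the Validate table: attaches a placeholder status.
    return [{**r, "email_status": "valid"} for r in rows]


def run_pipeline(stages: list[Stage]) -> list[Row]:
    # Run each stage in order, feeding its output into the next stage.
    rows: list[Row] = []
    for stage in stages:
        rows = stage(rows)
    return rows


result = run_pipeline([scrape, enrich, validate])
```

The point of the sketch is the ordering: `validate` only has emails to check because `enrich` ran first, which is why the tables in a Workbook need to run in sequence.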

---

## Build a pipeline

### Step 1: Create the Workbook and first table

1. Click **Workbooks** in the left sidebar.
2. Click **New Workbook** and name it, e.g. `Weekly Outreach Pipeline`.
3. In the first table tab, configure your data source.
   - For LinkedIn scraping: add a **Data Source Column** using an Apify search action.
   - For a CSV import: import your file as the starting data.
4. Run the first table to populate it with data.

### Step 2: Add the enrichment table

1. Click **+** to add a new table tab. Name it `Enrich`.
2. Click **Add Column** and select **Cross-Table Reference**.
3. Choose the first table as the source and select the LinkedIn URL column.
4. Set LinkedIn URL as the match key so that each row in `Enrich` lines up with its source row in the first table.
5. Add Action Columns for email and phone enrichment:
   - **Find Work Email by LinkedIn URL (BetterEnrich)**
   - **Find Mobile by LinkedIn URL (BetterEnrich)**
6. Map the LinkedIn URL cross-table column as input for both action columns.
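
The key match in this step can be pictured as a keyed lookup between two tables. The following Python sketch is only a conceptual illustration under that assumption; `cross_table_reference`, the `full_name` column, and the sample rows are hypothetical and do not describe how TexAu resolves references internally.

```python
from typing import Optional

Row = dict[str, Optional[str]]


def cross_table_reference(source: list[Row], dest: list[Row],
                          key: str, column: str) -> list[Row]:
    # Index the source table by its key column, skipping rows with no key.
    index = {row[key]: row for row in source if row.get(key)}
    result = []
    for row in dest:
        match = index.get(row.get(key))
        # A key with no match in the source leaves the referenced cell blank.
        value = match.get(column) if match else None
        result.append({**row, column: value})
    return result


source_rows = [
    {"linkedin_url": "https://linkedin.com/in/example-a", "full_name": "Ada"},
    {"linkedin_url": "https://linkedin.com/in/example-b", "full_name": "Grace"},
]
dest_rows = [
    {"linkedin_url": "https://linkedin.com/in/example-b"},
    {"linkedin_url": "https://linkedin.com/in/example-c"},  # not in source
]
joined = cross_table_reference(source_rows, dest_rows, "linkedin_url", "full_name")
```

In this model, the second destination row ends up with a blank `full_name` because its key has no counterpart in the source, which is one reason to use the same key column consistently in every table.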

### Step 3: Add the validation table

1. Add another table tab. Name it `Validate`.
2. Add a cross-table reference pulling the Email column from the `Enrich` table.
3. Add an Action Column: **Verify Email (BetterEnrich)** or **Identify Email Type (TexAu Utility)**.
4. Map the email cross-table reference as input.

### Step 4: Add the output table

1. Add a final table tab. Name it `Push to CRM`.
2. Add cross-table references pulling from both the `Enrich` and `Validate` tables:
   - Email (from Enrich)
   - Phone (from Enrich)
   - Email verification status (from Validate)
3. Add an Action Column:
   - **Create Contact (HubSpot)** or **Create Person (Pipedrive)**
4. Add a formula column to filter out invalid emails before pushing: `IF(email_status = "invalid", "skip", "send")`.
5. Configure the CRM action to only run on rows where the formula column equals `"send"`.
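
To make the filter logic concrete, here is a small Python sketch that mirrors the formula from step 4 and the run condition from step 5. The function names and sample rows are hypothetical, and the status strings are assumed to match the ones used in the formula.

```python
def send_or_skip(email_status: str) -> str:
    # Mirrors IF(email_status = "invalid", "skip", "send").
    return "skip" if email_status == "invalid" else "send"


def rows_to_push(rows: list[dict]) -> list[dict]:
    # Keep only the rows that the formula column marks as "send".
    return [r for r in rows if send_or_skip(r.get("email_status", "")) == "send"]


rows = [
    {"email": "a@example.com", "email_status": "valid"},
    {"email": "b@example.com", "email_status": "invalid"},
    {"email": "c@example.com", "email_status": ""},  # validation produced no status
]
pushed = rows_to_push(rows)
```

In this sketch the row with a blank status is still marked `send`, because the formula only excludes the exact value `invalid`. If rows with a blank or unknown status should also be held back, the condition would need to test for the valid value instead. Check the status values your verification action actually returns before relying on either version.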

---

## Run the pipeline manually

1. Open the Workbook.
2. Start with the first table tab. Click **Run All Rows**.
3. Wait for it to complete.
4. Click the second table tab. Click **Run All Rows**.
5. Continue through each table in order.

---

## Run the pipeline on a schedule

1. Open the Workbook settings.
2. Click **Schedule**.
3. Set the frequency (daily, weekly, etc.) and the start time.
4. Select whether to run all tables in sequence or a specific table.
5. Click **Save Schedule**.

When the Scheduled Job triggers and is set to run all tables in sequence, TexAu runs the tables in the order their tabs appear.

---

## Pipeline design tips

- Keep each table focused on one task. Do not combine scraping and enrichment in the same table.
- Use a key column consistently across all tables. LinkedIn URL or Email works well as the shared identifier.
- Add a formula column at the end of each table to flag incomplete or problem rows before they move to the next step.
- Name your tables clearly: names like `Scrape`, `Enrich`, `Validate`, `Push` are easier to navigate than `Table 1`, `Table 2`.
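
The flag-column tip above can be sketched in Python as a check for missing or blank required fields. This is an illustration only; the list of required columns, the flag text, and the sample rows are assumptions made for the example, not TexAu formula syntax.

```python
REQUIRED = ("linkedin_url", "email")  # assumed required columns for this example


def row_flag(row: dict, required: tuple = REQUIRED) -> str:
    # Collect required columns that are missing or contain only whitespace.
    missing = [c for c in required if not str(row.get(c) or "").strip()]
    return ("incomplete: " + ", ".join(missing)) if missing else "ok"


rows = [
    {"linkedin_url": "https://linkedin.com/in/example-a", "email": "a@example.com"},
    {"linkedin_url": "https://linkedin.com/in/example-b", "email": " "},
    {"email": "c@example.com"},
]
flags = [row_flag(r) for r in rows]
```

A flag that names the missing column, rather than a bare pass/fail value, makes it easier to see which earlier step needs to be re-run for a given row.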

---

## Troubleshooting

**A later table has blank cross-table columns even though the earlier table has data.**
Verify that the source table ran successfully and has data in the column you are referencing. Refresh the cross-table column by re-running the destination table.

**The pipeline runs out of order.**
In a manual run, tables execute in whatever order you start them. On a schedule, they run in tab order. If you need a specific execution order, drag the tabs into that order.
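
One way to reason about ordering problems is to list which tables each table references and confirm that every source sits in an earlier tab. The Python sketch below does this check on paper, so to speak; `out_of_order` and the `references` map are hypothetical and are not a TexAu feature. The table names follow the example pipeline in this article.

```python
def out_of_order(tab_order: list[str],
                 references: dict[str, list[str]]) -> list[tuple[str, str]]:
    # Position of each table in the tab bar.
    position = {name: i for i, name in enumerate(tab_order)}
    problems = []
    for table, sources in references.items():
        for source in sources:
            # A source must sit in an earlier tab than the table that reads from it.
            if position[source] >= position[table]:
                problems.append((table, source))
    return problems


references = {
    "Enrich": ["Scrape"],
    "Validate": ["Enrich"],
    "Push to CRM": ["Enrich", "Validate"],
}
good = out_of_order(["Scrape", "Enrich", "Validate", "Push to CRM"], references)
bad = out_of_order(["Scrape", "Validate", "Enrich", "Push to CRM"], references)
```

Here `good` is empty, while `bad` reports that `Validate` sits before `Enrich` even though it reads from it, so a scheduled run in that tab order would validate emails that have not been found yet.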

**One table fails midway and corrupts downstream tables.**
Add a formula column at the end of each table to filter bad rows before the next table runs. This prevents incomplete data from propagating downstream.