Built-in Processing Actions: Your Data Transformation Toolkit
TexAu includes 14 built-in processing actions that automatically clean, transform, and enrich your data. Unlike formulas (which you write), processing actions are pre-built functions ready to use. Think of them as professional-grade data cleanup tools.
This guide covers every action available, with real examples you can adapt to your workflow.
Clean Domain: Extract and Normalize Domain from URLs and Emails
What it does: Extracts the domain from any URL or email address and normalizes it (removes www., standardizes format).
When to use it:
- You have a mix of email addresses and URLs and want just the domain
- URLs have inconsistent formats (some with www., some without, some with https://)
- You want to group people by their company domain for analysis
Examples:
| Input | Output |
|---|---|
| [email protected] | acme.com |
| https://www.acme.com/about | acme.com |
| www.acme.com | acme.com |
| [email protected] | techstartup.io |
| https://enterprise-solutions.org | enterprise-solutions.org |
Tips:
- Removes "www." prefix automatically for cleaner results
- Handles both URLs (with https://) and email addresses
- Great for deduplicating by company when you have messy source data
- Normalizes everything to lowercase for consistent matching
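To make the behavior concrete, here is a minimal Python sketch of the same idea. It is not TexAu's implementation; the function name `clean_domain` and the sample address `jane@acme.com` are hypothetical, and the logic only assumes the rules described above (strip the scheme, path, and "www.", then lowercase).

```python
from urllib.parse import urlsplit


def clean_domain(value: str) -> str:
    """Return the lowercase domain of a URL or email address, without 'www.'."""
    text = value.strip().lower()
    if "@" in text and "://" not in text:
        # Email address: the domain is everything after the last '@'.
        host = text.rsplit("@", 1)[1]
    else:
        # urlsplit only recognizes the host when a scheme or '//' is present.
        if "://" not in text:
            text = "//" + text
        host = urlsplit(text).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(clean_domain("https://www.acme.com/about"))  # acme.com
print(clean_domain("www.acme.com"))                # acme.com
print(clean_domain("jane@acme.com"))               # acme.com (hypothetical address)
```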
Predict Gender: Predict Gender from First Name
What it does: Analyzes a first name and predicts the most likely gender (Male, Female, or Neutral).
When to use it:
- You want personalized outreach email templates (Dear Mr. vs. Dear Ms.)
- You're analyzing team diversity across your targets
- You need to segment messaging based on predicted demographics
- You're filling in missing gender data for reporting
Examples:
| Input | Output | Use Case |
|---|---|---|
| John | Male | Use "Mr." in email greeting |
| Sarah | Female | Use "Ms." in email greeting |
| Alex | Neutral | Use full name in greeting |
| Michael | Male | Marketing segment analysis |
| Jennifer | Female | Team composition reporting |
Tips:
- Predictions are based on statistical analysis of name databases
- Always review predicted data, especially for uncommon or international names
- The "Neutral" category includes unisex names (Alex, Casey, Morgan)
- Use for personalization, not for legal or compliance purposes
- This is a prediction, not a statement of fact. Always use respectfully
Encode URI Components: URL-Encode Text for Use in Links
What it does: Converts special characters in text to URL-safe format so you can safely include text in URLs and links.
When to use it:
- You're building dynamic URLs or links with user data
- You have text with special characters that need to be web-safe
- You're creating tracking links with parameters
- You need to include search terms or queries in URLs
Examples:
| Input | Output | Why |
|---|---|---|
| John Smith | John%20Smith | Spaces become %20 |
| acme & co | acme%20%26%20co | & becomes %26 |
| $100/month | %24100%2Fmonth | $ and / are encoded |
| hello@acme.com | hello%40acme.com | @ becomes %40 |
Tips:
- Use this before inserting user data into URLs
- Essential for building safe dynamic links
- Common use: building personalized tracking URLs for emails
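For readers who want to reproduce the encoding outside TexAu, the sketch below uses Python's standard `urllib.parse.quote`. The helper name `encode_component` and the `example.com` tracking URL are illustrative assumptions, and edge cases may differ slightly from what TexAu itself produces.

```python
from urllib.parse import quote


def encode_component(text: str) -> str:
    # safe="" also encodes '/', which is what you want for a single
    # query-string value rather than a whole path.
    return quote(text, safe="")


print(encode_component("John Smith"))   # John%20Smith
print(encode_component("acme & co"))    # acme%20%26%20co
print(encode_component("$100/month"))   # %24100%2Fmonth

# Hypothetical tracking link built from a lead's name.
link = "https://example.com/track?name=" + encode_component("John Smith")
print(link)  # https://example.com/track?name=John%20Smith
```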
Count Occurrences: Count How Many Times a Pattern Appears in Text
What it does: Counts how many times a specific word or phrase appears in a text field.
When to use it:
- You want to measure keyword density in descriptions
- You're counting mentions of a product or competitor in text
- You want to measure "enthusiasm" in testimonials by counting exclamation marks
- You're analyzing how many times an action is mentioned
Examples:
| Text | Search | Output | Insight |
|---|---|---|---|
| "We use it daily, daily!" | "daily" | 2 | Frequency of use |
| "Big company, enterprise scale" | "enterprise" | 1 | References to enterprise |
| "URGENT! URGENT! Act now!!!" | "!" | 5 | Excitement level |
| Product description | "integration" | 3 | How much it emphasizes integrations |
Tips:
- Case-sensitive ("Daily" and "daily" count separately)
- Useful for finding emphasis in text (count "!!" or "urgent")
- Great for analyzing keyword density in company descriptions
- Use for data quality: count "null" or "NA" to find incomplete records
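The counting rule can be sketched in a few lines of Python. This is an illustration only: `count_occurrences` is a hypothetical name, and it assumes non-overlapping, case-sensitive matching, which is how `str.count` behaves.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count non-overlapping, case-sensitive occurrences of pattern in text."""
    if not pattern:
        return 0  # str.count("") would otherwise return len(text) + 1
    return text.count(pattern)


print(count_occurrences("We use it daily, daily!", "daily"))   # 2
print(count_occurrences("URGENT! URGENT! Act now!!!", "!"))    # 5
print(count_occurrences("Daily standup, daily report", "daily"))  # 1
```

To count regardless of case, lowercase both the text and the pattern before counting.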
Find Sitemap URLs: Find Sitemap URLs for a Domain
What it does: Searches a domain and finds its XML sitemaps (if they exist). Sitemaps are often hidden but contain valuable info about what pages a company considers important.
When to use it:
- You want to explore what pages a company prioritizes
- You're building a list of key company resources for reference
- You're analyzing company web structure for market intelligence
- You want to find subdomains or secondary sites
Examples:
| Input Domain | Output | Information Revealed |
|---|---|---|
| acme.com | https://acme.com/sitemap.xml | Main website pages |
| enterprise.com | https://enterprise.com/sitemap.xml, https://blog.enterprise.com/sitemap.xml | Main site + blog |
| startup.io | No sitemap found | Company doesn't have public sitemap |
Tips:
- Doesn't work for all domains. Many sites don't have public sitemaps
- Sitemaps show pages the company wants indexed and visible
- Great for understanding a company's web presence structure
- Combine with other data points for complete intelligence
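How TexAu locates sitemaps is not described here. One widely used convention is that a site's robots.txt file lists its sitemaps in `Sitemap:` lines, so a rough sketch of sitemap discovery might parse that file. The function name and the sample robots.txt content below are assumptions for illustration; fetching the file over the network is left out.

```python
def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs declared in 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first ':' only, so the URL's own 'https:' stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content.
sample = """User-agent: *
Disallow: /admin
Sitemap: https://enterprise.com/sitemap.xml
Sitemap: https://blog.enterprise.com/sitemap.xml
"""
print(sitemaps_from_robots(sample))
```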
Normalize Deduplicate List: Clean Up and Remove Duplicates from a List
What it does: Takes a list of items (usually comma or semicolon-separated) and removes duplicates while cleaning up extra spaces.
When to use it:
- You have a field with multiple values that might repeat
- Contact names or emails got merged together with duplicates
- You want a clean, unique list of technologies or products
- You're consolidating data from multiple sources
Examples:
| Input | Output | Result |
|---|---|---|
| python, javascript, python | python, javascript | Duplicates removed |
| Docker; kubernetes; Docker | Docker, kubernetes | Cleaned and deduplicated |
| Java , Java, Python | Java, Python | Spaces removed, duplicates gone |
| AWS, aws, AWS | AWS, aws | Exact duplicates removed; different casing is kept |
Tips:
- Automatically detects common separators (comma, semicolon, space)
- Removes extra whitespace around items
- Case-sensitive, so "AWS" and "aws" are treated separately (watch this!)
- Great for cleaning up merged data or import errors
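As an illustration of the clean-then-deduplicate idea, here is a small Python sketch. It is an assumption-laden approximation rather than TexAu's code: `normalize_dedupe` is a hypothetical name, it splits only on commas and semicolons (not spaces, so multi-word items survive), and it compares items case-sensitively, following the tip above.

```python
import re


def normalize_dedupe(raw: str) -> str:
    """Split on ',' or ';', trim each item, and drop exact repeats in order."""
    seen = set()
    unique = []
    for part in re.split(r"[;,]", raw):
        item = part.strip()
        if item and item not in seen:
            seen.add(item)
            unique.append(item)
    return ", ".join(unique)


print(normalize_dedupe("python, javascript, python"))  # python, javascript
print(normalize_dedupe("Docker; kubernetes; Docker"))  # Docker, kubernetes
print(normalize_dedupe("Java , Java, Python"))         # Java, Python
```

If you need "AWS" and "aws" to merge, lowercase the field first and then deduplicate.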
Identify Email Type: Classify Emails as Personal, Work, or Generic
What it does: Analyzes an email address and categorizes it as personal (Gmail, Yahoo, Outlook personal), work (company domain), or generic (info@, support@, noreply@).
When to use it:
- You want to prioritize personal emails (more likely to reach the right person)
- You need to filter out generic mailboxes (they often don't convert)
- You're analyzing data quality of your contact database
- You want to know if an email is tied to a specific person or generic inbox
Examples:
| Email | Type | Why |
|---|---|---|
| [email protected] | Personal | Well-known personal email provider |
| [email protected] | Work | Company domain email |
| [email protected] | Generic | Generic support inbox |
| [email protected] | Generic | Generic info inbox |
| [email protected] | Work | Company domain email |
| [email protected] | Personal | Well-known personal email provider |
Tips:
- Personal emails are usually higher quality leads (direct to person)
- Generic emails are less likely to convert (shared inbox)
- Work emails show company affiliation
- Use to deprioritize generic support/info addresses in outreach
- Common pattern: filter out "Generic" for higher conversion rates
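The classification rule can be approximated with two lookups. The sketch below is hypothetical: the provider and mailbox lists contain only the examples named above, the "Unknown" result for malformed input is an assumption, and the sample addresses are made up. TexAu's real lists are likely much longer.

```python
# Assumed lists, taken from the examples in this guide only.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}
GENERIC_LOCAL_PARTS = {"info", "support", "noreply"}


def identify_email_type(email: str) -> str:
    """Classify an address as Generic, Personal, or Work."""
    local, _, domain = email.strip().lower().rpartition("@")
    if not local or not domain:
        return "Unknown"  # assumption: not a usable address
    if local in GENERIC_LOCAL_PARTS:
        return "Generic"  # shared inbox, whatever the domain
    if domain in PERSONAL_DOMAINS:
        return "Personal"
    return "Work"


print(identify_email_type("jane@gmail.com"))    # Personal
print(identify_email_type("support@acme.com"))  # Generic
print(identify_email_type("jane@acme.com"))     # Work
```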
Extract URLs Emails: Pull Out All URLs and Email Addresses from Text
What it does: Scans a block of text and extracts every URL and email address it finds, returning them as a list.
When to use it:
- You have company descriptions or about pages with embedded links
- You want to find all email addresses mentioned in a text field
- You're extracting resources from scraped website content
- You need to identify social media URLs or website links from descriptions
Examples:
| Input Text | Output |
|---|---|
| "Visit us at https://acme.com or email [email protected]" | https://acme.com, [email protected] |
| "Contact: [email protected] or visit https://linkedin.com/company/acme" | [email protected], https://linkedin.com/company/acme |
| "Website: www.startup.io, Twitter: @startup" | www.startup.io (the Twitter handle @startup is not a URL, so it is not extracted) |
Tips:
- Recognizes URLs that start with http://, https://, or www.
- Great for extracting contact info from messy descriptions
- Perfect for finding social media links in company data
- Returns an empty result if no URLs or emails are found
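A regular-expression sketch shows roughly how such extraction can work. The patterns below are simplified assumptions, not TexAu's actual rules, and `hello@acme.com` is a made-up address; real-world URL and email matching has many more edge cases.

```python
import re

# Simplified patterns: URLs start with http://, https://, or www.
URL = r"(?:https?://|www\.)[^\s,;\"'<>]+"
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
PATTERN = re.compile(f"{URL}|{EMAIL}")


def extract_urls_emails(text: str) -> list[str]:
    """Return every URL and email address in text, in order of appearance."""
    # Trailing sentence punctuation is trimmed from each match.
    return [m.group().rstrip(".,;:!?)") for m in PATTERN.finditer(text)]


print(extract_urls_emails("Visit us at https://acme.com or email hello@acme.com"))
# ['https://acme.com', 'hello@acme.com']
print(extract_urls_emails("Website: www.startup.io, Twitter: @startup"))
# ['www.startup.io']
```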
Normalize Phone Number: Standardize Phone Numbers to Consistent Format
What it does: Takes phone numbers in any format and converts them to a standard format for consistency and matching.
When to use it:
- Your phone data comes from multiple sources with different formats
- You want to standardize before storing or matching
- You're preparing data for SMS campaigns (need consistent format)
- You're deduplicating phone numbers
Examples:
| Input | Output | Format |
|---|---|---|
| (555) 123-4567 | 5551234567 | Standard US format |
| 555.123.4567 | 5551234567 | Cleaned |
| +1 555 123 4567 | 5551234567 | International simplified |
| 555-123-4567 | 5551234567 | Standard |
| 5551234567 | 5551234567 | Already clean |
Tips:
- Removes formatting characters (parentheses, dashes, spaces, dots)
- Works best for US/North America phone numbers
- International numbers may be simplified to core digits
- Use before deduplication to catch duplicates in different formats
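The examples above are consistent with a simple "keep the digits, drop a leading US country code" rule. The following Python sketch implements that assumption; `normalize_phone` is a hypothetical name, and TexAu's handling of non-US numbers may differ.

```python
import re


def normalize_phone(raw: str) -> str:
    """Keep digits only; drop a leading '1' from 11-digit US-style numbers."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


print(normalize_phone("(555) 123-4567"))   # 5551234567
print(normalize_phone("555.123.4567"))     # 5551234567
print(normalize_phone("+1 555 123 4567"))  # 5551234567
```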
Normalize Company Name: Clean Up Company Names (Remove Inc., LLC, etc.)
What it does: Standardizes company names by removing common suffixes (Inc., LLC, Corp., Ltd., etc.) and extra whitespace for cleaner matching and comparison.
When to use it:
- You're matching companies across datasets
- Company names have inconsistent formatting
- You want cleaner names for display
- You're deduplicating company lists
Examples:
| Input | Output | Removed |
|---|---|---|
| Acme Inc. | Acme | Inc. |
| Tech Solutions LLC | Tech Solutions | LLC |
| Enterprise Corp | Enterprise | Corp |
| Global Enterprises, Inc. | Global Enterprises | Inc. |
| Digital Innovations Ltd. | Digital Innovations | Ltd. |
Tips:
- Removes common legal suffixes automatically
- Also cleans up extra whitespace
- Great for deduplication (matches "Acme Inc" and "Acme Corp" better)
- Use before company name comparisons
- Results in cleaner display names for reports and lists
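Suffix stripping of this kind is commonly done with a single regular expression. The sketch below is an illustration under assumptions: it handles only the four suffixes named above (Inc., LLC, Corp., Ltd.), while TexAu's "etc." presumably covers more, and `normalize_company_name` is a hypothetical name.

```python
import re

# Assumed suffix list, limited to the ones named in this guide.
SUFFIX = re.compile(r"[,\s]+(?:inc|llc|corp|ltd)\.?$", re.IGNORECASE)


def normalize_company_name(name: str) -> str:
    """Collapse whitespace, then remove one trailing legal suffix."""
    cleaned = " ".join(name.split())
    return SUFFIX.sub("", cleaned).strip()


print(normalize_company_name("Acme Inc."))                 # Acme
print(normalize_company_name("Global Enterprises, Inc."))  # Global Enterprises
print(normalize_company_name("Tech  Solutions   LLC"))     # Tech Solutions
```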
Remove Extra Whitespace: Clean Up Messy Text with Extra Spaces
What it does: Removes extra whitespace from text. Leading and trailing spaces are stripped, and multiple spaces between words are collapsed to a single space.
When to use it:
- Data from exports or imports has inconsistent spacing
- You're cleaning up copy-pasted text
- You want consistent spacing for quality checks
- You're combining multiple text sources
Examples:
| Input | Output |
|---|---|
| "  Hello   World  " | "Hello World" |
| "John    Smith" | "John Smith" |
| "Software   Engineer" | "Software Engineer" |
| "Multiple     Spaces" | "Multiple Spaces" |
Tips:
- Removes leading and trailing spaces
- Converts multiple consecutive spaces to single space
- Simple and effective for data cleanup
- Often used with other text functions
- Great first step for any dirty data
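In Python the same cleanup is a one-liner, shown here as a sketch with a hypothetical function name. Note one assumption: `str.split()` with no arguments also treats tabs and newlines as whitespace, which may be broader than what TexAu does.

```python
def remove_extra_whitespace(text: str) -> str:
    """Trim the ends and collapse every run of whitespace to one space."""
    return " ".join(text.split())


print(repr(remove_extra_whitespace("  Hello   World  ")))  # 'Hello World'
print(repr(remove_extra_whitespace("John    Smith")))      # 'John Smith'
```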
Format Date Time: Convert Dates to Your Preferred Format
What it does: Takes a date value and reformats it to a standard, readable format of your choice.
When to use it:
- Dates are in different formats (US vs. international, with/without time)
- You want consistent date formatting for reports
- You're preparing data for export or display
- You need dates in a specific format for integrations
Examples:
| Input | Output Format | Result |
|---|---|---|
| 2024-01-15 | MM/DD/YYYY | 01/15/2024 |
| 1/15/2024 | YYYY-MM-DD | 2024-01-15 |
| 01/15/2024 14:30 | MMMM DD, YYYY | January 15, 2024 |
| 2024-01-15 | DD/MM/YYYY | 15/01/2024 |
Tips:
- Choose from common date formats or specify custom
- Handles both date-only and date-with-time formats
- Essential for creating clean reports and exports
- Useful for display and readability
- Different regions prefer different formats. Match your audience
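To show what the reformatting involves, here is a Python sketch using the standard `datetime` module. Everything in it is an assumption for illustration: the accepted input formats, the mapping from tokens such as `MM/DD/YYYY` to `strftime` codes, and the choice to read slash-separated input as month-first (US style).

```python
from datetime import datetime

# Assumed input formats; slash dates are read month-first (US style).
INPUT_FORMATS = ["%Y-%m-%d", "%m/%d/%Y %H:%M", "%m/%d/%Y"]

# Assumed mapping from this guide's format tokens to strftime codes.
OUTPUT_FORMATS = {
    "MM/DD/YYYY": "%m/%d/%Y",
    "YYYY-MM-DD": "%Y-%m-%d",
    "DD/MM/YYYY": "%d/%m/%Y",
    "MMMM DD, YYYY": "%B %d, %Y",
}


def format_date(value: str, output_format: str) -> str:
    """Parse value with the first matching input format, then reformat it."""
    for fmt in INPUT_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next candidate format
        return parsed.strftime(OUTPUT_FORMATS[output_format])
    raise ValueError(f"Unrecognized date: {value!r}")


print(format_date("2024-01-15", "MM/DD/YYYY"))  # 01/15/2024
print(format_date("1/15/2024", "YYYY-MM-DD"))   # 2024-01-15
```

The month-first assumption matters: a value like 03/04/2024 means March 4 in the US and 3 April in much of Europe, so know your source before converting.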
Find Redirect Page: Follow URL Redirects to Find the Final Destination
What it does: Takes a URL and follows any redirects (301, 302, etc.) to find the final destination page. Some URLs redirect to different pages, and this finds where they actually lead.
When to use it:
- You have shortened URLs or landing page URLs that might redirect
- You want to find the final destination of a company's main website
- You're validating URLs or checking if they're still active
- You need to resolve URL chains to their final target
Examples:
| Input URL | Redirects Via | Final Destination |
|---|---|---|
| bit.ly/acme123 | redirect.page | https://acme-solutions.com/special-offer |
| company.com | company.io (301) | https://company.io/home |
| old-site.com | new-site.com (302) | https://new-site.com |
Tips:
- Follows redirect chains to the final page
- Helpful for validating active URLs
- Some URLs may not redirect (they're already final)
- If a URL is broken, no final destination is returned
- Useful for finding current company websites
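The chain-following logic can be sketched without any network code by passing in a lookup function. In the Python sketch below, `find_final_url`, `get_redirect`, and the hop limit of 10 are all assumptions for illustration; in a real tool `get_redirect` would issue an HTTP request and return the redirect target, or `None` when the response is not a redirect.

```python
from typing import Callable, Optional


def find_final_url(
    url: str,
    get_redirect: Callable[[str], Optional[str]],
    max_hops: int = 10,
) -> Optional[str]:
    """Follow redirects until a URL stops redirecting.

    Returns None on a redirect loop or when the chain exceeds max_hops.
    """
    seen = {url}
    current = url
    hops = 0
    while True:
        target = get_redirect(current)
        if target is None:
            return current  # no further redirect: this is the final page
        if target in seen or hops >= max_hops:
            return None  # loop detected or chain too long
        seen.add(target)
        current = target
        hops += 1


# Hypothetical redirect table standing in for real HTTP responses.
redirects = {
    "http://company.com": "https://company.io",
    "https://company.io": "https://company.io/home",
}
print(find_final_url("http://company.com", redirects.get))  # https://company.io/home
```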
Distribute Leads Round Robin: Evenly Distribute Leads Among Team Members
What it does: Takes a list of leads and distributes them evenly across your team members in round-robin fashion (person 1 gets lead 1, person 2 gets lead 2, etc., then cycles back).
When to use it:
- You want fair lead distribution to sales team
- You're assigning new prospects to account managers
- You need to balance workload across team members
- You're automating lead assignment in TexAu
Examples:
| Lead | Team | Assignment |
|---|---|---|
| Lead 1 - Acme Corp | [John, Sarah, Michael] | John |
| Lead 2 - Tech Inc | [John, Sarah, Michael] | Sarah |
| Lead 3 - Enterprise LLC | [John, Sarah, Michael] | Michael |
| Lead 4 - Startup.io | [John, Sarah, Michael] | John (cycles back) |
Tips:
- Automatically rotates through team members for fairness
- Useful in workflow execution for fair assignment
- Each person gets roughly equal number of leads
- If your team size changes, distribution adjusts automatically
- Works especially well when combined with scheduling
- Ensures no one person gets overloaded
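Round-robin assignment comes down to taking each lead's position modulo the team size. Here is a minimal Python sketch using the leads and team from the table above; the function name `assign_round_robin` is hypothetical.

```python
def assign_round_robin(leads: list[str], team: list[str]) -> list[tuple[str, str]]:
    """Pair each lead with a team member, cycling through the team in order."""
    if not team:
        raise ValueError("team must contain at least one member")
    # Lead i goes to member i mod team size, so the cycle restarts automatically.
    return [(lead, team[i % len(team)]) for i, lead in enumerate(leads)]


team = ["John", "Sarah", "Michael"]
leads = ["Acme Corp", "Tech Inc", "Enterprise LLC", "Startup.io"]
for lead, owner in assign_round_robin(leads, team):
    print(f"{lead} -> {owner}")
# Acme Corp -> John
# Tech Inc -> Sarah
# Enterprise LLC -> Michael
# Startup.io -> John
```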
Quick Action Finder
Need to extract something from a URL or email? → Clean Domain
Need to clean up names or text? → Normalize Company Name, Remove Extra Whitespace
Need to identify and categorize? → Identify Email Type, Predict Gender
Need to find duplicates or extract lists? → Normalize Deduplicate List, Extract URLs Emails
Need to standardize format? → Normalize Phone Number, Format Date Time
Need to analyze text? → Count Occurrences
Need to validate or explore? → Find Redirect Page, Find Sitemap URLs
Need to distribute work? → Distribute Leads Round Robin
Combining Actions with Formulas
The real power comes when you combine processing actions with formulas:
- Clean then calculate: Use Normalize Phone Number, then use LEN() in a formula to check the number of digits
- Categorize then assign: Use Identify Email Type, then use IF() to determine priority
- Extract then combine: Use Extract URLs Emails, then use CONCAT() with the results
Processing actions are the preparation step; formulas are the analysis step. Use both!
What's Next?
- Back to Formula Basics for calculated columns
- Ready to automate? Setting Up Scheduled Jobs
- Formula Functions Reference for more calculation power
TexAu's processing actions can handle data cleanup automatically so you can focus on analysis.