Web Scraping and Utility Actions

Last updated on Apr 06, 2026

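TexAu includes a set of built-in actions for extracting data from websites, discovering technology stacks, finding emails, and pulling structured metadata. The web actions do not require third-party API keys. Get Slack Channel Members is the exception: it needs your own Slack authentication cookie and API token.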

Before you begin

  • You need an active TexAu account with available credits.
  • All actions in this reference except Get Slack Channel Members accept a website URL as the primary input.

Full Page Scrape

Scrapes a URL and returns its page title, its content as HTML and as Markdown, and a list of extracted links.

Credit cost: 1 credit per execution

Use case: Pull the full content of a landing page or blog post for analysis, content repurposing, or competitive research.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Page URL | Yes | The target URL to scrape. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| title | Page Title | text |
| markdown | Markdown Content | text |
| html | HTML Content | text |
| links | Extracted Links | text |
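
If you want to apply your own link filtering to the html output, a small post-processing step is enough. The sketch below is an illustration using only the Python standard library, not TexAu's implementation. The names `LinkCollector` and `extract_links` are hypothetical, and it assumes you have the html field as a string plus the page URL it came from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects unique, absolute link targets from <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip in-page anchors and non-navigational schemes.
        if not href or href.startswith(("#", "mailto:", "javascript:")):
            return
        absolute = urljoin(self.base_url, href)
        if absolute not in self.links:
            self.links.append(absolute)


def extract_links(html: str, base_url: str) -> list[str]:
    """Return absolute links in document order, without duplicates."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

Relative links are resolved against the page URL, so the result can be fed directly into another URL-based action.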

Extract Web Emails

Crawls a website and extracts all email addresses found across multiple pages.

Credit cost: 2 credits per execution

Use case: Find contact emails from a company website for outreach. The crawler visits up to 10 pages to collect all emails it finds.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Website URL | Yes | The target URL to crawl for emails. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| email | Email Address | email |
| email_source | Source Page | text |
| email_type | Email Type | text |
| email_confidence | Confidence Level | text |

This action returns multiple results per execution.
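
Because the crawler can visit several pages, the same address may plausibly appear in more than one result. The sketch below shows one way to deduplicate results before outreach. It assumes each result is available as a dictionary keyed by the field names above (for example after a CSV or JSON export); `dedupe_emails` is a hypothetical helper, not part of TexAu.

```python
def dedupe_emails(rows: list[dict]) -> list[dict]:
    """Keep the first result for each address, comparing case-insensitively."""
    first_seen: dict[str, dict] = {}
    for row in rows:
        key = row.get("email", "").strip().lower()
        if key and key not in first_seen:
            first_seen[key] = row
    return list(first_seen.values())
```

The first occurrence wins, so the email_source value that survives is the first page on which the address was found.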


Extract Social Links

Extracts all social media profile links from a website URL, grouped by platform.

Credit cost: 1 credit per execution

Use case: Quickly find a company's LinkedIn, Twitter, Facebook, Instagram, YouTube, and GitHub profiles from their website.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Website URL | Yes | The target URL to extract social links from. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| linkedin_links | LinkedIn Links | text |
| twitter_links | Twitter / X Links | text |
| facebook_links | Facebook Links | text |
| instagram_links | Instagram Links | text |
| youtube_links | YouTube Links | text |
| github_links | GitHub Links | text |
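
To make "grouped by platform" concrete, the sketch below sorts a list of URLs into the six output fields by hostname. It is an illustration only: the host lists in `PLATFORM_HOSTS` are an assumption, and TexAu's own matching rules may differ.

```python
from urllib.parse import urlparse

# Assumed hostnames per output field; not taken from TexAu.
PLATFORM_HOSTS = {
    "linkedin_links": ("linkedin.com",),
    "twitter_links": ("twitter.com", "x.com"),
    "facebook_links": ("facebook.com",),
    "instagram_links": ("instagram.com",),
    "youtube_links": ("youtube.com", "youtu.be"),
    "github_links": ("github.com",),
}


def group_social_links(urls: list[str]) -> dict[str, list[str]]:
    """Sort URLs into per-platform lists; URLs on other hosts are ignored."""
    grouped: dict[str, list[str]] = {field: [] for field in PLATFORM_HOSTS}
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        for field, domains in PLATFORM_HOSTS.items():
            # Match the bare domain or any subdomain such as www.
            if any(host == d or host.endswith("." + d) for d in domains):
                grouped[field].append(url)
                break
    return grouped
```

Matching on the full hostname rather than a substring avoids false positives such as a domain that merely ends in the letters "x.com".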

Get Web Meta Tags

Fetches SEO, Open Graph, Twitter, and other meta tags for a given web page URL.

Credit cost: 1 credit per execution

Use case: Audit a page's SEO setup, check Open Graph tags before sharing on social media, or compare meta tag strategies across competitor pages.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Page URL | Yes | The URL of the web page to extract meta tags from. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| favicon | Favicon Path | url |
| hreflang_lang | Hreflang lang | text |
| hreflang_url | Hreflang Url | text |
| og_title | Open Graph Title | text |
| og_description | Open Graph Description | text |
| og_image | Open Graph Image URL/Path | url |
| og_url | Open Graph URL | url |
| og_type | Open Graph Type | text |
| og_site_name | Open Graph Site Name | text |
| og_locale | Open Graph Locale | text |
| seo_title | SEO Title | text |
| seo_description | SEO Description | text |
| seo_canonical | Canonical URL | url |
| seo_charset | Charset | text |
| seo_language | Language | text |
| seo_robots | Robots Directive | text |
| seo_author | SEO Author | text |
| seo_keywords | SEO Keywords | text |
| seo_viewport | Viewport | text |
| twitter_card | Twitter Card Type | text |
| twitter_title | Twitter Title | text |
| twitter_description | Twitter Description | text |
| twitter_image | Twitter Image URL/Path | url |
| twitter_site | Twitter Site Handle | text |
| twitter_creator | Twitter Creator Handle | text |
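
The og_ and twitter_ field names mirror the `og:` and `twitter:` meta tags in a page's head. As a rough illustration of that mapping (a standard-library sketch, not TexAu's implementation), the hypothetical `collect_meta_tags` helper below reads meta tags from an HTML string and replaces the colon with an underscore.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta> tags keyed by their property or name attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name")
        content = attr.get("content")
        if key and content is not None:
            # "og:title" becomes "og_title"; the first occurrence wins.
            self.tags.setdefault(key.lower().replace(":", "_"), content)


def collect_meta_tags(html: str) -> dict[str, str]:
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.tags
```

Plain tags such as `description` keep their own name in this sketch; the seo_ prefix used in the output fields above is not reproduced here.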

Get Web JSON-LD

Fetches and parses JSON-LD structured data (Schema.org) from a given web page URL.

Credit cost: 1 credit per execution

Use case: Extract product pricing, ratings, FAQ content, or organization details from a page's structured data for competitive intelligence or data enrichment.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Page URL | Yes | The URL of the web page to extract JSON-LD data from. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| item_context | Schema Context | url |
| item_type | Schema Type (@type) | text |
| item_name | Item Name | text |
| item_description | Description | text |
| item_url | URL | url |
| item_image | Image URL | url |
| item_logo | Logo URL | url |
| item_brand | Brand | text |
| item_founding_date | Founding Date | text |
| item_keywords | Keywords | text |
| item_aggregate_rating_value | Rating Value | number |
| item_aggregate_rating_count | Rating Count | number |
| item_offers_price | Offer Price | number |
| item_offers_currency | Offer Currency | text |
| item_offers_availability | Offer Availability | text |
| item_offers_url | Offer URL | url |
| item_same_as | Same As (Social Links) | text |
| item_significant_link | Significant Links | text |
| item_list_element | Breadcrumb List Elements | text |
| item_main_entity | Main Entity (FAQ Items) | text |
| item_raw | Raw JSON-LD Item Object | text |
| total_count | Total Items Found | number |

This action returns multiple results per execution.
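
JSON-LD lives in `<script type="application/ld+json">` blocks, and each block can hold one item or a list of items, which is why the action can return several results for one page. The sketch below shows the general technique with the Python standard library. It is not TexAu's implementation, and `extract_json_ld` is a hypothetical name.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects parsed JSON-LD items from ld+json script blocks."""

    def __init__(self) -> None:
        super().__init__()
        self._in_json_ld = False
        self._buffer: list[str] = []
        self.items: list = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._in_json_ld:
            return
        self._in_json_ld = False
        try:
            parsed = json.loads("".join(self._buffer))
        except json.JSONDecodeError:
            return  # Skip malformed blocks instead of failing the whole page.
        # A block may contain a single object or a list of objects.
        self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_json_ld(html: str) -> list:
    parser = JsonLdCollector()
    parser.feed(html)
    return parser.items
```

In a parsed item, the `@type` key corresponds to the item_type field above, and the full object is roughly what item_raw carries as text.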


Website Intelligence

Generates a comprehensive website report covering performance, SEO, tech stack, security headers, DNS, SSL, social links, emails, and JSON-LD data in a single call.

Credit cost: 3 credits per execution

Use case: Run a full audit of a prospect's website before a sales call. Get their tech stack, estimated tech spend, SEO score, security posture, and social media links in one action.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Website URL | Yes | The target URL to run the comprehensive intelligence report on. |

Output fields (key fields)

| Field | Display name | Type |
| --- | --- | --- |
| url | Original URL | url |
| final_url | Final Resolved URL | url |
| http_status | HTTP Status Code | number |
| page_load_time_ms | Page Load Time (ms) | number |
| seo_score | Overall SEO Score | number |
| seo_issues | SEO Issues | text |
| og_title | OG Title | text |
| og_description | OG Description | text |
| seo_title | SEO Title | text |
| seo_description | SEO Description | text |
| tech_stack_technologies | Detected Technologies | text |
| tech_stack_categories | Tech Categories | text |
| gtm_company_stage | Company Stage | text |
| gtm_estimated_spend | Est. Monthly Tech Spend | number |
| gtm_buying_signals | Buying Signals | text |
| pixels | Tracking Pixels | text |
| total_pixels_found | Total Pixels Found | number |
| social_facebook | Facebook Links | text |
| social_twitter | Twitter Links | text |
| social_linkedin | LinkedIn Links | text |
| emails | Emails Found | text |
| json_ld_items | JSON-LD Schema Items | text |
| headers_security_score | Header Security Score | number |
| ssl_issuer | SSL Issuer | text |
| ssl_grade | SSL Grade | text |
| ssl_days_until_expiry | SSL Days Until Expiry | number |
| dns_email_provider | Email Provider | text |
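
When deciding between this action and several single-purpose actions, it can help to compare credit costs. The sketch below uses the per-execution costs listed in this reference. The choice of which actions overlap with the report is a judgment made here, and the combined report may return less detail per area than the dedicated actions.

```python
# Credits per execution, as listed in this reference.
CREDIT_COST = {
    "Full Page Scrape": 1,
    "Extract Web Emails": 2,
    "Extract Social Links": 1,
    "Get Web Meta Tags": 1,
    "Get Web JSON-LD": 1,
    "Website Intelligence": 3,
    "Get Website Tech Stack": 1,
    "Find URL Redirect Destination": 1,
    "Extract Sitemap URLs": 1,
    "Find Website Tech Stack (Utility)": 2,
    "Get Slack Channel Members": 2,
}


def total_credits(actions: list[str]) -> int:
    """Sum the credit cost of running each listed action once."""
    return sum(CREDIT_COST[name] for name in actions)


# Actions whose output overlaps with the Website Intelligence report.
separate_actions = [
    "Get Web Meta Tags",
    "Get Website Tech Stack",
    "Extract Social Links",
    "Extract Web Emails",
    "Get Web JSON-LD",
]

print(total_credits(separate_actions))         # 6
print(total_credits(["Website Intelligence"]))  # 3
```

Under these assumptions, one Website Intelligence run costs half as much as the five separate actions per website.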

Get Website Tech Stack

Identifies the technologies, tools, and GTM signals used by a website.

Credit cost: 1 credit per execution

Use case: Check what CRM, analytics, or marketing tools a prospect uses. Use this data to personalize outreach or qualify leads based on their technology choices.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Website URL | Yes | The target URL to detect technologies on. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| tech_name | Technology Name | text |
| tech_category | Technology Category | text |
| tech_confidence | Detection Confidence | number |
| tech_summary | Tech Stack Summary | text |

This action returns multiple results per execution.

Note: Get Website Tech Stack returns one result per detected technology, with its category and a detection confidence value. Find Website Tech Stack (Utility) returns a single breakdown grouped into fixed category fields such as programming languages, CDN, analytics, and CRM.
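
Since Get Website Tech Stack returns one result per technology, a common follow-up step is to group the results by category before using them for lead qualification. The sketch below assumes each result is a dictionary keyed by the output field names above; `group_by_category` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_category(rows: list[dict]) -> dict[str, list[str]]:
    """Map each tech_category to a sorted list of its tech_name values."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        grouped[row["tech_category"]].append(row["tech_name"])
    return {category: sorted(names) for category, names in grouped.items()}
```

The grouped view makes checks such as "does this prospect use any CRM?" a simple dictionary lookup.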


Find URL Redirect Destination

Resolves a URL to its final destination and returns the full redirect chain with HTTP status codes.

Credit cost: 1 credit per execution

Use case: Check where shortened URLs or affiliate links actually point. Verify redirect chains for SEO audits or link validation.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | URL | Yes | The original URL to resolve and check for redirects. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| final_status_code | Final Status Code | number |
| is_redirected | Is Redirected | boolean |
| original_link | Original Link | url |
| redirect_count | Redirect Count | number |
| redirect_link | Final Redirect Link | url |
| resolved_successfully | Resolved Successfully | boolean |
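
The output fields can be read as a summary of a redirect chain. The sketch below shows how such a summary could be derived from an ordered list of hops. It is an interpretation rather than TexAu's implementation: `summarize_redirects` is a hypothetical helper, and treating a 2xx final status as "resolved successfully" is an assumption.

```python
def summarize_redirects(hops: list[tuple[str, int]]) -> dict:
    """Summarize ordered (url, status_code) hops, starting at the original URL."""
    if not hops:
        raise ValueError("at least one hop is required")
    original_url, _ = hops[0]
    final_url, final_status = hops[-1]
    redirect_count = len(hops) - 1
    return {
        "original_link": original_url,
        "redirect_link": final_url,
        "redirect_count": redirect_count,
        "is_redirected": redirect_count > 0,
        "final_status_code": final_status,
        # Assumption: "resolved" means the last hop answered with a 2xx status.
        "resolved_successfully": 200 <= final_status < 300,
    }
```

For example, a short link that answers 301, then 302, then 200 yields a redirect_count of 2 and a final_status_code of 200.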

Extract Sitemap URLs

Discovers and extracts URLs from a website's sitemap via sitemap.xml, robots.txt, or crawl discovery.

Credit cost: 1 credit per execution

Use case: Get a complete list of pages on a website for SEO analysis, content auditing, or building a list of pages to scrape individually.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| url | Website URL | Yes | The target website URL to discover sitemaps for. |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| extracted_url | Discovered URL | url |
| total_count | Total URLs Found | number |
| source | Discovery Source | text |

This action returns multiple results per execution.
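
Two of the discovery sources named above have simple, standard formats: a sitemap.xml lists pages in `<loc>` elements, and a robots.txt file can point to sitemaps with `Sitemap:` lines. The sketch below parses both from strings using the Python standard library. It illustrates the formats only and is not TexAu's implementation; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt file."""
    found = []
    for line in robots_txt.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found
```

A sitemap index uses the same `<loc>` element to point at child sitemaps, so the first function also works as a first pass over an index file.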


Find Website Tech Stack (Utility)

Identifies technologies used on a website including analytics tools, CMS, and JavaScript frameworks. This is the utility version that returns categorized results.

Credit cost: 2 credits per execution

Use case: Run a technology audit on a prospect's website. Identify their programming languages, frameworks, analytics tools, CRM, and live chat solutions.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| site_url | Site URL | Yes | Website URL or domain (e.g., texau.com). |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| programming_languages | Programming Languages | text |
| paas | Platform as a Service | text |
| javascript_frameworks | JavaScript Frameworks | text |
| web_frameworks | Web Frameworks | text |
| web_servers | Web Servers | text |
| static_site_generator | Static Site Generator | text |
| security | Security | text |
| tag_managers | Tag Managers | text |
| cdn | CDN | text |
| analytics | Analytics | text |
| live_chat | Live Chat | text |
| crm | CRM | text |
| font_scripts | Font Scripts | text |
| webmail | Webmail | text |
| email | Email | text |
| miscellaneous | Miscellaneous | text |

Get Slack Channel Members

Retrieves the member list and their profile details from a public Slack channel using workspace credentials.

Credit cost: 2 credits per execution

Use case: Extract member lists from public Slack communities for lead generation. Get names, emails, titles, and timezone data for each member.

Input parameters

| Parameter | Display name | Required | Description |
| --- | --- | --- | --- |
| workspace | Workspace | Yes | The Slack workspace slug (e.g., 'texauhq'). |
| channel_id | Channel ID | Yes | The Slack Channel ID. |
| cookie | Slack Auth Cookie | Yes | Your Slack authentication cookie (starts with 'xoxd-'). |
| token | Slack API Token | Yes | Your Slack API token (starts with 'xoxc-'). |

Output fields

| Field | Display name | Type |
| --- | --- | --- |
| workspace | Workspace | text |
| channel_id | Channel ID | text |
| total_members | Total Members Extracted | number |
| user_ids | User IDs | text |
| real_names | Real Names | text |
| display_names | Display Names | text |
| emails | Emails | text |
| titles | Titles | text |
| phones | Phone Numbers | text |
| status_texts | Status Texts | text |
| timezones | Timezones | text |
| is_admins | Is Admin Status | text |
| is_owners | Is Owner Status | text |
| avatar_urls | Avatar URLs | url |

This action returns multiple results per execution.
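
The cookie and token are easy to swap by accident, and each has a distinct prefix, so a quick check before running the action can save a failed execution. The sketch below validates the four inputs using only the prefixes stated in the parameter table. `validate_slack_inputs` is a hypothetical helper, and it does not check whether the credentials are still valid.

```python
def validate_slack_inputs(workspace: str, channel_id: str, cookie: str, token: str) -> list[str]:
    """Return a list of problems found; an empty list means the inputs look plausible."""
    problems = []
    if not workspace.strip():
        problems.append("workspace is empty")
    if not channel_id.strip():
        problems.append("channel_id is empty")
    if not cookie.startswith("xoxd-"):
        problems.append("cookie should start with 'xoxd-'")
    if not token.startswith("xoxc-"):
        problems.append("token should start with 'xoxc-'")
    return problems
```

If the two values are swapped, both prefix checks fail, which makes the mistake obvious.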


Troubleshooting

Action returns empty results: Verify that the URL is publicly accessible. Pages behind login walls, paywalls, or CAPTCHAs cannot be scraped.

Slow response times: Website Intelligence and Extract Web Emails crawl multiple pages, so they take longer than single-page actions. Allow up to 30 seconds for results.

No emails found on a website: Some websites do not display email addresses in their HTML. Try the Website Intelligence action, which checks multiple sources including DNS MX records.

Tech stack returns few results: Websites that use server-side rendering or have minimal client-side JavaScript may show fewer detected technologies. This is expected behavior.

Sitemap returns zero URLs: Not all websites have a sitemap.xml file. The action also checks robots.txt and attempts crawl discovery, but some sites may have no discoverable sitemap.