Web Scraping and Utility Actions

TexAu includes a set of built-in actions for extracting data from websites, discovering technology stacks, finding emails, and pulling structured metadata. These actions do not require third-party API keys. The one exception is Get Slack Channel Members, which needs your own Slack cookie and token.

Before you begin

You need an active TexAu account with available credits.
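Most actions in this reference take a single website URL as their input. The exception is Get Slack Channel Members, which takes a workspace, a channel ID, and Slack credentials instead.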
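Because every action has a fixed credit cost per execution, you can estimate how many credits a batch will consume before you run it. The sketch below uses the costs listed for each action in this reference; the dictionary and function names are illustrative and are not part of TexAu.

```python
# Credit cost per execution, copied from the action descriptions in this reference.
CREDIT_COST = {
    "Full Page Scrape": 1,
    "Extract Web Emails": 2,
    "Extract Social Links": 1,
    "Get Web Meta Tags": 1,
    "Get Web JSON-LD": 1,
    "Website Intelligence": 3,
    "Get Website Tech Stack": 1,
    "Find URL Redirect Destination": 1,
    "Extract Sitemap URLs": 1,
    "Find Website Tech Stack (Utility)": 2,
    "Get Slack Channel Members": 2,
}


def estimate_credits(planned_runs: dict[str, int]) -> int:
    """Return total credits for a plan of {action name: number of executions}."""
    return sum(CREDIT_COST[action] * runs for action, runs in planned_runs.items())


# Hypothetical plan: audit 50 prospect sites and collect emails from each.
plan = {"Website Intelligence": 50, "Extract Web Emails": 50}
print(estimate_credits(plan))  # 50*3 + 50*2 = 250
```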

Full Page Scrape

Scrapes a URL and returns its content in HTML, Markdown, or as a list of extracted links.

Credit cost: 1 credit per execution

Use case: Pull the full content of a landing page or blog post for analysis, content repurposing, or competitive research.

Input parameters

Parameter	Display name	Required	Description
url	Page URL	Yes	The target URL to scrape.

Output fields

Field	Display name	Type
title	Page Title	text
markdown	Markdown Content	text
html	HTML Content	text
links	Extracted Links	text
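If you need to filter or normalize links yourself, you can also derive them from the html field. The following is a minimal sketch using Python's standard library. The sample HTML and base URL are made up, and this is not necessarily how TexAu builds its own links output.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative paths against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html: str, base_url: str) -> list[str]:
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Invented sample standing in for the html output field.
sample_html = '<a href="/pricing">Pricing</a> <a href="https://example.org/blog">Blog</a>'
print(extract_links(sample_html, "https://example.com/"))
```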

Extract Web Emails

Crawls a website and extracts all email addresses found across multiple pages.

Credit cost: 2 credits per execution

Use case: Find contact emails from a company website for outreach. The crawler visits up to 10 pages to collect all emails it finds.

Input parameters

Parameter	Display name	Required	Description
url	Website URL	Yes	The target URL to crawl for emails.

Output fields

Field	Display name	Type
email	Email Address	email
email_source	Source Page	text
email_type	Email Type	text
email_confidence	Confidence Level	text

This action returns multiple results per execution.
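Since the crawler visits several pages, the same address can appear in more than one result row. The sketch below groups rows by source page and removes duplicates. It assumes the results have been exported as a list of dictionaries keyed by the output field names above; the export shape and the sample values are assumptions.

```python
from collections import defaultdict


def group_emails_by_source(rows: list[dict]) -> dict[str, list[str]]:
    """Map each source page to the unique, lowercased emails found on it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        email = row["email"].strip().lower()
        if email not in grouped[row["email_source"]]:
            grouped[row["email_source"]].append(email)
    return dict(grouped)


# Hypothetical rows; the values are invented for illustration.
rows = [
    {"email": "Sales@example.com", "email_source": "https://example.com/contact"},
    {"email": "sales@example.com", "email_source": "https://example.com/contact"},
    {"email": "press@example.com", "email_source": "https://example.com/about"},
]
print(group_emails_by_source(rows))
```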

Extract Social Links

Extracts all social media profile links from a website URL, grouped by platform.

Credit cost: 1 credit per execution

Use case: Quickly find a company's LinkedIn, Twitter, Facebook, Instagram, YouTube, and GitHub profiles from their website.

Input parameters

Parameter	Display name	Required	Description
url	Website URL	Yes	The target URL to extract social links from.

Output fields

Field	Display name	Type
linkedin_links	LinkedIn Links	text
twitter_links	Twitter / X Links	text
facebook_links	Facebook Links	text
instagram_links	Instagram Links	text
youtube_links	YouTube Links	text
github_links	GitHub Links	text
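To illustrate what "grouped by platform" means, the sketch below sorts a list of URLs into the output fields above by host name. The host-to-field mapping is an assumption made for this example and is not TexAu's actual rule set; for instance, it does not handle subdomains other than www.

```python
from urllib.parse import urlparse

# Host names per output field. This mapping is an assumption for illustration.
PLATFORM_HOSTS = {
    "linkedin_links": {"linkedin.com"},
    "twitter_links": {"twitter.com", "x.com"},
    "facebook_links": {"facebook.com"},
    "instagram_links": {"instagram.com"},
    "youtube_links": {"youtube.com"},
    "github_links": {"github.com"},
}


def group_by_platform(urls: list[str]) -> dict[str, list[str]]:
    """Return {output field: [matching URLs]}; unrecognized URLs are dropped."""
    grouped: dict[str, list[str]] = {field: [] for field in PLATFORM_HOSTS}
    for url in urls:
        host = (urlparse(url).hostname or "").lower().removeprefix("www.")
        for field, hosts in PLATFORM_HOSTS.items():
            if host in hosts:
                grouped[field].append(url)
    return grouped


# Invented sample links.
found = group_by_platform([
    "https://www.linkedin.com/company/acme",
    "https://x.com/acme",
    "https://example.com/about",
])
print(found["linkedin_links"], found["twitter_links"])
```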

Get Web Meta Tags

Fetches SEO, Open Graph, Twitter, and other meta tags for a given web page URL.

Credit cost: 1 credit per execution

Use case: Audit a page's SEO setup, check Open Graph tags before sharing on social media, or compare meta tag strategies across competitor pages.

Input parameters

Parameter	Display name	Required	Description
url	Page URL	Yes	The URL of the web page to extract meta tags from.

Output fields

Field	Display name	Type
favicon	Favicon Path	url
hreflang_lang	Hreflang lang	text
hreflang_url	Hreflang Url	text
og_title	Open Graph Title	text
og_description	Open Graph Description	text
og_image	Open Graph Image URL/Path	url
og_url	Open Graph URL	url
og_type	Open Graph Type	text
og_site_name	Open Graph Site Name	text
og_locale	Open Graph Locale	text
seo_title	SEO Title	text
seo_description	SEO Description	text
seo_canonical	Canonical URL	url
seo_charset	Charset	text
seo_language	Language	text
seo_robots	Robots Directive	text
seo_author	SEO Author	text
seo_keywords	SEO Keywords	text
seo_viewport	Viewport	text
twitter_card	Twitter Card Type	text
twitter_title	Twitter Title	text
twitter_description	Twitter Description	text
twitter_image	Twitter Image URL/Path	url
twitter_site	Twitter Site Handle	text
twitter_creator	Twitter Creator Handle	text
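The og_ and twitter_ fields correspond to meta tags in the page's HTML head. The sketch below shows one way to read such tags with Python's standard library, turning og:title into og_title and so on. It is an illustration only: it does not reproduce TexAu's seo_ field naming, and the sample HTML is invented.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta> tags keyed by their property or name attribute."""

    def __init__(self):
        super().__init__()
        self.tags: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name")
        if key and attr.get("content") is not None:
            self.tags[key.lower()] = attr["content"]


def read_meta_tags(html: str) -> dict[str, str]:
    parser = MetaTagParser()
    parser.feed(html)
    # Rename "og:title" to "og_title", "twitter:card" to "twitter_card", etc.
    return {key.replace(":", "_"): value for key, value in parser.tags.items()}


# Invented sample page head.
sample_head = (
    '<meta property="og:title" content="Acme">'
    '<meta name="twitter:card" content="summary">'
    '<meta name="description" content="Tools for teams">'
)
print(read_meta_tags(sample_head))
```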

Get Web JSON-LD

Fetches and parses JSON-LD structured data (Schema.org) from a given web page URL.

Credit cost: 1 credit per execution

Use case: Extract product pricing, ratings, FAQ content, or organization details from a page's structured data for competitive intelligence or data enrichment.

Input parameters

Parameter	Display name	Required	Description
url	Page URL	Yes	The URL of the web page to extract JSON-LD data from.

Output fields

Field	Display name	Type
item_context	Schema Context	url
item_type	Schema Type (@type)	text
item_name	Item Name	text
item_description	Description	text
item_url	URL	url
item_image	Image URL	url
item_logo	Logo URL	url
item_brand	Brand	text
item_founding_date	Founding Date	text
item_keywords	Keywords	text
item_aggregate_rating_value	Rating Value	number
item_aggregate_rating_count	Rating Count	number
item_offers_price	Offer Price	number
item_offers_currency	Offer Currency	text
item_offers_availability	Offer Availability	text
item_offers_url	Offer URL	url
item_same_as	Same As (Social Links)	text
item_significant_link	Significant Links	text
item_list_element	Breadcrumb List Elements	text
item_main_entity	Main Entity (FAQ Items)	text
item_raw	Raw JSON-LD Item Object	text
total_count	Total Items Found	number

This action returns multiple results per execution.
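JSON-LD lives in script tags of type application/ld+json, and a page can contain several of them, which is why the action can return more than one item. The sketch below shows the general technique with Python's standard library. It is not TexAu's implementation, and the sample HTML is invented.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the text of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_json_ld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False


def extract_json_ld(html: str) -> list[dict]:
    parser = JsonLdParser()
    parser.feed(html)
    items: list[dict] = []
    for block in parser.blocks:
        data = json.loads(block)
        # A block may hold one object or a list of objects.
        items.extend(data if isinstance(data, list) else [data])
    return items


# Invented sample page fragment.
sample_page = (
    '<script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization", "name": "Acme"}'
    "</script>"
)
items = extract_json_ld(sample_page)
print(items[0]["@type"], items[0]["name"])
```

In this sketch, the @type value plays the role of the item_type field and the whole parsed object plays the role of item_raw.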

Website Intelligence

Generates a comprehensive website report covering performance, SEO, tech stack, security headers, DNS, SSL, social links, emails, and JSON-LD data in a single call.

Credit cost: 3 credits per execution

Use case: Run a full audit of a prospect's website before a sales call. Get their tech stack, estimated tech spend, SEO score, security posture, and social media links in one action.

Input parameters

Parameter	Display name	Required	Description
url	Website URL	Yes	The target URL to run the comprehensive intelligence report on.

Output fields (key fields)

Field	Display name	Type
url	Original URL	url
final_url	Final Resolved URL	url
http_status	HTTP Status Code	number
page_load_time_ms	Page Load Time (ms)	number
seo_score	Overall SEO Score	number
seo_issues	SEO Issues	text
og_title	OG Title	text
og_description	OG Description	text
seo_title	SEO Title	text
seo_description	SEO Description	text
tech_stack_technologies	Detected Technologies	text
tech_stack_categories	Tech Categories	text
gtm_company_stage	Company Stage	text
gtm_estimated_spend	Est. Monthly Tech Spend	number
gtm_buying_signals	Buying Signals	text
pixels	Tracking Pixels	text
total_pixels_found	Total Pixels Found	number
social_facebook	Facebook Links	text
social_twitter	Twitter Links	text
social_linkedin	LinkedIn Links	text
emails	Emails Found	text
json_ld_items	JSON-LD Schema Items	text
headers_security_score	Header Security Score	number
ssl_issuer	SSL Issuer	text
ssl_grade	SSL Grade	text
ssl_days_until_expiry	SSL Days Until Expiry	number
dns_email_provider	Email Provider	text
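Before a sales call, a report row can be reduced to a short list of talking points. The sketch below assumes one result row is available as a dictionary keyed by the field names above. The 30-day SSL warning threshold and the sample values are assumptions chosen for this example.

```python
def audit_flags(report: dict, ssl_warning_days: int = 30) -> list[str]:
    """Return human-readable flags from one Website Intelligence result row."""
    flags = []
    if report.get("http_status", 200) >= 400:
        flags.append(f"site returned HTTP {report['http_status']}")
    if report.get("final_url") and report["final_url"] != report.get("url"):
        flags.append(f"redirects to {report['final_url']}")
    days = report.get("ssl_days_until_expiry")
    if days is not None and days <= ssl_warning_days:
        flags.append(f"SSL certificate expires in {days} days")
    return flags


# Hypothetical row; the values are invented for illustration.
report = {
    "url": "https://example.com",
    "final_url": "https://www.example.com/",
    "http_status": 200,
    "ssl_days_until_expiry": 12,
}
print(audit_flags(report))
```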

Get Website Tech Stack

Identifies the technologies, tools, and GTM signals used by a website.

Credit cost: 1 credit per execution

Use case: Check what CRM, analytics, or marketing tools a prospect uses. Use this data to personalize outreach or qualify leads based on their technology choices.

Input parameters

Parameter	Display name	Required	Description
url	Website URL	Yes	The target URL to detect technologies on.

Output fields

Field	Display name	Type
tech_name	Technology Name	text
tech_category	Technology Category	text
tech_confidence	Detection Confidence	number
tech_summary	Tech Stack Summary	text

This action returns multiple results per execution.

Note: Get Website Tech Stack returns one row per detected technology, with its category and a detection confidence. Find Website Tech Stack (Utility) instead returns a single breakdown with one field per category, such as analytics, CRM, CDN, and JavaScript frameworks.
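Because Get Website Tech Stack returns one row per technology, a common next step is to group the rows by category and drop low-confidence detections. The sketch below assumes rows are dictionaries keyed by the output field names. The 0 to 100 confidence scale and the sample values are assumptions, so adjust the threshold to whatever scale your results use.

```python
from collections import defaultdict


def technologies_by_category(rows: list[dict], min_confidence: float = 0) -> dict[str, list[str]]:
    """Group technology names by category, keeping rows at or above min_confidence."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        if row["tech_confidence"] >= min_confidence:
            grouped[row["tech_category"]].append(row["tech_name"])
    return dict(grouped)


# Hypothetical rows; names, categories, and scores are invented.
tech_rows = [
    {"tech_name": "HubSpot", "tech_category": "CRM", "tech_confidence": 90},
    {"tech_name": "Google Analytics", "tech_category": "Analytics", "tech_confidence": 100},
    {"tech_name": "Intercom", "tech_category": "Live Chat", "tech_confidence": 40},
]
print(technologies_by_category(tech_rows, min_confidence=50))
```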

Find URL Redirect Destination

Resolves a URL to its final destination and returns the full redirect chain with HTTP status codes.

Credit cost: 1 credit per execution

Use case: Check where shortened URLs or affiliate links actually point. Verify redirect chains for SEO audits or link validation.

Input parameters

Parameter	Display name	Required	Description
url	URL	Yes	The original URL to resolve and check for redirects.

Output fields

Field	Display name	Type
final_status_code	Final Status Code	number
is_redirected	Is Redirected	boolean
original_link	Original Link	url
redirect_count	Redirect Count	number
redirect_link	Final Redirect Link	url
resolved_successfully	Resolved Successfully	boolean
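To show how the output fields relate to a redirect chain, the sketch below follows Location headers hop by hop. The fetch function is injected so the example runs without network access. The max_hops limit and the rule that resolved_successfully means a final 2xx status are assumptions for this example, not documented TexAu behavior.

```python
from typing import Callable, Optional
from urllib.parse import urljoin

# fetch(url) returns (status code, Location header or None) without following redirects.
Fetch = Callable[[str], tuple[int, Optional[str]]]


def resolve_redirects(url: str, fetch: Fetch, max_hops: int = 10) -> dict:
    """Follow redirects from url and summarize the chain."""
    original, hops = url, 0
    status, location = fetch(url)
    while 300 <= status < 400 and location and hops < max_hops:
        url = urljoin(url, location)  # the Location header may be relative
        hops += 1
        status, location = fetch(url)
    return {
        "original_link": original,
        "redirect_link": url,
        "redirect_count": hops,
        "is_redirected": hops > 0,
        "final_status_code": status,
        "resolved_successfully": 200 <= status < 300,  # assumed definition
    }


# Simulated responses standing in for real HTTP requests.
fake_responses = {
    "https://short.example/abc": (301, "https://example.com/landing"),
    "https://example.com/landing": (302, "/landing/"),
    "https://example.com/landing/": (200, None),
}
result = resolve_redirects("https://short.example/abc", fake_responses.__getitem__)
print(result["redirect_link"], result["redirect_count"])
```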

Extract Sitemap URLs

Discovers and extracts URLs from a website's sitemap via sitemap.xml, robots.txt, or crawl discovery.

Credit cost: 1 credit per execution

Use case: Get a complete list of pages on a website for SEO analysis, content auditing, or building a list of pages to scrape individually.

Input parameters

Parameter	Display name	Required	Description
url	Website URL	Yes	The target website URL to discover sitemaps for.

Output fields

Field	Display name	Type
extracted_url	Discovered URL	url
total_count	Total URLs Found	number
source	Discovery Source	text

This action returns multiple results per execution.
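The two file-based discovery sources mentioned above follow public conventions: sitemap files list pages in loc elements, and robots.txt can declare sitemap locations on Sitemap: lines. The sketch below parses both with Python's standard library. It illustrates the formats only and is not TexAu's implementation; the sample content is invented.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared on 'Sitemap:' lines in robots.txt."""
    urls = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Invented sample files.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/pricing</loc></url>"
    "</urlset>"
)
sample_robots = "User-agent: *\nDisallow: /admin\nSitemap: https://example.com/sitemap.xml\n"

print(sitemap_urls(sample_sitemap))
print(sitemaps_from_robots(sample_robots))
```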

Find Website Tech Stack (Utility)

Identifies technologies used on a website including analytics tools, CMS, and JavaScript frameworks. This is the utility version that returns categorized results.

Credit cost: 2 credits per execution

Use case: Run a technology audit on a prospect's website. Identify their programming languages, frameworks, analytics tools, CRM, and live chat solutions.

Input parameters

Parameter	Display name	Required	Description
site_url	Site URL	Yes	Website URL or domain (e.g., texau.com).

Output fields

Field	Display name	Type
programming_languages	Programming Languages	text
paas	Platform as a Service	text
javascript_frameworks	JavaScript Frameworks	text
web_frameworks	Web Frameworks	text
web_servers	Web Servers	text
static_site_generator	Static Site Generator	text
security	Security	text
tag_managers	Tag Managers	text
cdn	CDN	text
analytics	Analytics	text
live_chat	Live Chat	text
crm	CRM	text
font_scripts	Font Scripts	text
webmail	Webmail	text
email	Email	text
miscellaneous	Miscellaneous	text

Get Slack Channel Members

Retrieves the member list and their profile details from a public Slack channel using workspace credentials.

Credit cost: 2 credits per execution

Use case: Extract member lists from public Slack communities for lead generation. Get names, emails, titles, and timezone data for each member.

Input parameters

Parameter	Display name	Required	Description
workspace	Workspace	Yes	The Slack workspace slug (e.g., 'texauhq').
channel_id	Channel ID	Yes	The Slack Channel ID.
cookie	Slack Auth Cookie	Yes	Your Slack authentication cookie (starts with 'xoxd-').
token	Slack API Token	Yes	Your Slack API token (starts with 'xoxc-').
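The cookie and token are easy to mix up because their prefixes differ by one letter. The sketch below checks the inputs against the prefixes stated in the table before you submit them. The function name and messages are illustrative, and the check says nothing about whether the credentials are still valid.

```python
def check_slack_inputs(workspace: str, channel_id: str, cookie: str, token: str) -> list[str]:
    """Return a list of problems found in the four required inputs."""
    problems = []
    if not workspace.strip():
        problems.append("workspace is empty")
    if not channel_id.strip():
        problems.append("channel_id is empty")
    if cookie.startswith("xoxc-") and token.startswith("xoxd-"):
        problems.append("cookie and token appear to be swapped")
    else:
        if not cookie.startswith("xoxd-"):
            problems.append("cookie should start with 'xoxd-'")
        if not token.startswith("xoxc-"):
            problems.append("token should start with 'xoxc-'")
    return problems


# Placeholder values, not real credentials.
print(check_slack_inputs("texauhq", "C0000000000", "xoxc-placeholder", "xoxd-placeholder"))
```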

Output fields

Field	Display name	Type
workspace	Workspace	text
channel_id	Channel ID	text
total_members	Total Members Extracted	number
user_ids	User IDs	text
real_names	Real Names	text
display_names	Display Names	text
emails	Emails	text
titles	Titles	text
phones	Phone Numbers	text
status_texts	Status Texts	text
timezones	Timezones	text
is_admins	Is Admin Status	text
is_owners	Is Owner Status	text
avatar_urls	Avatar URLs	url

This action returns multiple results per execution.

Troubleshooting

Action returns empty results: Verify that the URL is publicly accessible. Pages behind login walls, paywalls, or CAPTCHAs cannot be scraped.

Slow response times: Website Intelligence and Extract Web Emails crawl multiple pages, so they take longer than single-page actions. Allow up to 30 seconds for results.

No emails found on a website: Some websites do not display email addresses in their HTML. Try the Website Intelligence action, which checks multiple sources including DNS MX records.

Tech stack returns few results: Websites that use server-side rendering or have minimal client-side JavaScript may show fewer detected technologies. This is expected behavior.

Sitemap returns zero URLs: Not all websites have a sitemap.xml file. The action also checks robots.txt and attempts crawl discovery, but some sites may have no discoverable sitemap.