Web Scraping and Utility Actions
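TexAu includes a set of built-in actions for extracting data from websites, discovering technology stacks, finding emails, and pulling structured metadata. These actions do not require third-party API keys. The one exception is Get Slack Channel Members, which needs your own Slack authentication cookie and token.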
Before you begin
- You need an active TexAu account with available credits.
- Every action in this reference except Get Slack Channel Members takes a website URL (or domain) as its primary input.
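Because each action has a fixed credit cost per execution, you can estimate how many credits a batch needs before you run it. The sketch below copies the costs from the action sections in this reference. The `CREDIT_COST` table and the `estimate_credits` helper are illustrative names, not part of TexAu.

```python
# Credit cost per execution, copied from the action sections in this reference.
CREDIT_COST = {
    "Full Page Scrape": 1,
    "Extract Web Emails": 2,
    "Extract Social Links": 1,
    "Get Web Meta Tags": 1,
    "Get Web JSON-LD": 1,
    "Website Intelligence": 3,
    "Get Website Tech Stack": 1,
    "Find URL Redirect Destination": 1,
    "Extract Sitemap URLs": 1,
    "Find Website Tech Stack (Utility)": 2,
    "Get Slack Channel Members": 2,
}


def estimate_credits(plan: dict[str, int]) -> int:
    """Return total credits for a plan mapping action name -> number of executions."""
    unknown = set(plan) - set(CREDIT_COST)
    if unknown:
        raise ValueError(f"Unknown action(s): {sorted(unknown)}")
    return sum(CREDIT_COST[action] * runs for action, runs in plan.items())
```

For example, running Website Intelligence and Extract Web Emails once each for 50 prospects, `estimate_credits({"Website Intelligence": 50, "Extract Web Emails": 50})`, comes to 250 credits.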
Full Page Scrape
Scrapes a URL and returns its title, its content as HTML and Markdown, and a list of extracted links.
Credit cost: 1 credit per execution
Use case: Pull the full content of a landing page or blog post for analysis, content repurposing, or competitive research.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Page URL | Yes | The target URL to scrape. |
Output fields
| Field | Display name | Type |
|---|---|---|
| title | Page Title | text |
| markdown | Markdown Content | text |
| html | HTML Content | text |
| links | Extracted Links | text |
Extract Web Emails
Crawls a website and extracts all email addresses found across multiple pages.
Credit cost: 2 credits per execution
Use case: Find contact emails from a company website for outreach. The crawler visits up to 10 pages to collect all emails it finds.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Website URL | Yes | The target URL to crawl for emails. |
Output fields
| Field | Display name | Type |
|---|---|---|
| | Email Address | |
| email_source | Source Page | text |
| email_type | Email Type | text |
| email_confidence | Confidence Level | text |
This action returns multiple results per execution.
Extract Social Links
Extracts all social media profile links from a website URL, grouped by platform.
Credit cost: 1 credit per execution
Use case: Quickly find a company's LinkedIn, Twitter, Facebook, Instagram, YouTube, and GitHub profiles from their website.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Website URL | Yes | The target URL to extract social links from. |
Output fields
| Field | Display name | Type |
|---|---|---|
| linkedin_links | LinkedIn Links | text |
| twitter_links | Twitter / X Links | text |
| facebook_links | Facebook Links | text |
| instagram_links | Instagram Links | text |
| youtube_links | YouTube Links | text |
| github_links | GitHub Links | text |
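To illustrate what "grouped by platform" means, the following sketch sorts a list of URLs into the output fields above by hostname. It is not TexAu's implementation: the `PLATFORM_HOSTS` mapping and the `group_social_links` function are assumptions made for this example, and the real matching rules may differ.

```python
from urllib.parse import urlparse

# Hypothetical hostname -> output field mapping; the real matching rules may differ.
PLATFORM_HOSTS = {
    "linkedin.com": "linkedin_links",
    "twitter.com": "twitter_links",
    "x.com": "twitter_links",
    "facebook.com": "facebook_links",
    "instagram.com": "instagram_links",
    "youtube.com": "youtube_links",
    "github.com": "github_links",
}


def group_social_links(urls):
    """Group URLs into per-platform lists keyed by the output field names."""
    grouped = {field: [] for field in PLATFORM_HOSTS.values()}
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Drop a leading "www." so www.linkedin.com matches linkedin.com.
        if host.startswith("www."):
            host = host[4:]
        field = PLATFORM_HOSTS.get(host)
        # Ignore non-social hosts and skip exact duplicates.
        if field and url not in grouped[field]:
            grouped[field].append(url)
    return grouped
```

Links on other hosts are dropped, and platforms with no match keep an empty list.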
Get Web Meta Tags
Fetches SEO, Open Graph, Twitter, and other meta tags for a given web page URL.
Credit cost: 1 credit per execution
Use case: Audit a page's SEO setup, check Open Graph tags before sharing on social media, or compare meta tag strategies across competitor pages.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Page URL | Yes | The URL of the web page to extract meta tags from. |
Output fields
| Field | Display name | Type |
|---|---|---|
| favicon | Favicon Path | url |
| hreflang_lang | Hreflang lang | text |
| hreflang_url | Hreflang Url | text |
| og_title | Open Graph Title | text |
| og_description | Open Graph Description | text |
| og_image | Open Graph Image URL/Path | url |
| og_url | Open Graph URL | url |
| og_type | Open Graph Type | text |
| og_site_name | Open Graph Site Name | text |
| og_locale | Open Graph Locale | text |
| seo_title | SEO Title | text |
| seo_description | SEO Description | text |
| seo_canonical | Canonical URL | url |
| seo_charset | Charset | text |
| seo_language | Language | text |
| seo_robots | Robots Directive | text |
| seo_author | SEO Author | text |
| seo_keywords | SEO Keywords | text |
| seo_viewport | Viewport | text |
| twitter_card | Twitter Card Type | text |
| twitter_title | Twitter Title | text |
| twitter_description | Twitter Description | text |
| twitter_image | Twitter Image URL/Path | url |
| twitter_site | Twitter Site Handle | text |
| twitter_creator | Twitter Creator Handle | text |
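The field names above follow the tags they come from: `og:*` properties become `og_*`, `twitter:*` names become `twitter_*`, and standard tags such as `description` become `seo_*`. The sketch below shows that mapping with Python's standard `html.parser`. It covers only a subset of the fields and is an illustration of the naming, not TexAu's implementation; `MetaTagParser` and `extract_meta_tags` are hypothetical names.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <title>, <meta>, and canonical <link> tags into field names
    modeled on the output table (a subset only)."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            content = attrs.get("content")
            if not key or content is None:
                return
            key = key.lower()
            if key.startswith("og:"):
                self.fields["og_" + key[3:]] = content
            elif key.startswith("twitter:"):
                self.fields["twitter_" + key[8:]] = content
            elif key in ("description", "robots", "author", "keywords", "viewport"):
                self.fields["seo_" + key] = content
        elif tag == "link" and attrs.get("rel") == "canonical":
            self.fields["seo_canonical"] = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so append rather than assign.
        if self._in_title:
            self.fields["seo_title"] = self.fields.get("seo_title", "") + data


def extract_meta_tags(html):
    """Parse an HTML string and return the collected fields as a dict."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.fields
```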
Get Web JSON-LD
Fetches and parses JSON-LD structured data (Schema.org) from a given web page URL.
Credit cost: 1 credit per execution
Use case: Extract product pricing, ratings, FAQ content, or organization details from a page's structured data for competitive intelligence or data enrichment.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Page URL | Yes | The URL of the web page to extract JSON-LD data from. |
Output fields
| Field | Display name | Type |
|---|---|---|
| item_context | Schema Context | url |
| item_type | Schema Type (@type) | text |
| item_name | Item Name | text |
| item_description | Description | text |
| item_url | URL | url |
| item_image | Image URL | url |
| item_logo | Logo URL | url |
| item_brand | Brand | text |
| item_founding_date | Founding Date | text |
| item_keywords | Keywords | text |
| item_aggregate_rating_value | Rating Value | number |
| item_aggregate_rating_count | Rating Count | number |
| item_offers_price | Offer Price | number |
| item_offers_currency | Offer Currency | text |
| item_offers_availability | Offer Availability | text |
| item_offers_url | Offer URL | url |
| item_same_as | Same As (Social Links) | text |
| item_significant_link | Significant Links | text |
| item_list_element | Breadcrumb List Elements | text |
| item_main_entity | Main Entity (FAQ Items) | text |
| item_raw | Raw JSON-LD Item Object | text |
| total_count | Total Items Found | number |
This action returns multiple results per execution.
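JSON-LD lives in `<script type="application/ld+json">` blocks, and a single page can contain several items, which is why the action returns multiple results. The sketch below shows the general technique with the Python standard library. It is an illustration under stated assumptions, not TexAu's implementation: `JsonLdParser` and `extract_json_ld` are hypothetical names, and only three of the output fields are populated.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # list of text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def extract_json_ld(html):
    """Return one record per JSON-LD item found in the HTML string."""
    parser = JsonLdParser()
    parser.feed(html)
    items = []
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the whole page
        # A block may hold one object, a list of objects, or an @graph wrapper.
        if isinstance(data, dict) and isinstance(data.get("@graph"), list):
            candidates = data["@graph"]
        elif isinstance(data, list):
            candidates = data
        else:
            candidates = [data]
        for item in candidates:
            if isinstance(item, dict):
                items.append({
                    "item_type": item.get("@type"),
                    "item_name": item.get("name"),
                    "item_url": item.get("url"),
                })
    return items
```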
Website Intelligence
Generates a comprehensive website report covering performance, SEO, tech stack, security headers, DNS, SSL, social links, emails, and JSON-LD data in a single call.
Credit cost: 3 credits per execution
Use case: Run a full audit of a prospect's website before a sales call. Get their tech stack, estimated tech spend, SEO score, security posture, and social media links in one action.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Website URL | Yes | The target URL to run the comprehensive intelligence report on. |
Output fields (key fields)
| Field | Display name | Type |
|---|---|---|
| url | Original URL | url |
| final_url | Final Resolved URL | url |
| http_status | HTTP Status Code | number |
| page_load_time_ms | Page Load Time (ms) | number |
| seo_score | Overall SEO Score | number |
| seo_issues | SEO Issues | text |
| og_title | OG Title | text |
| og_description | OG Description | text |
| seo_title | SEO Title | text |
| seo_description | SEO Description | text |
| tech_stack_technologies | Detected Technologies | text |
| tech_stack_categories | Tech Categories | text |
| gtm_company_stage | Company Stage | text |
| gtm_estimated_spend | Est. Monthly Tech Spend | number |
| gtm_buying_signals | Buying Signals | text |
| pixels | Tracking Pixels | text |
| total_pixels_found | Total Pixels Found | number |
| social_facebook | Facebook Links | text |
| social_twitter | Twitter Links | text |
| social_linkedin | LinkedIn Links | text |
| emails | Emails Found | text |
| json_ld_items | JSON-LD Schema Items | text |
| headers_security_score | Header Security Score | number |
| ssl_issuer | SSL Issuer | text |
| ssl_grade | SSL Grade | text |
| ssl_days_until_expiry | SSL Days Until Expiry | number |
| dns_email_provider | Email Provider | text |
Get Website Tech Stack
Identifies the technologies, tools, and GTM signals used by a website.
Credit cost: 1 credit per execution
Use case: Check what CRM, analytics, or marketing tools a prospect uses. Use this data to personalize outreach or qualify leads based on their technology choices.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Website URL | Yes | The target URL to detect technologies on. |
Output fields
| Field | Display name | Type |
|---|---|---|
| tech_name | Technology Name | text |
| tech_category | Technology Category | text |
| tech_confidence | Detection Confidence | number |
| tech_summary | Tech Stack Summary | text |
This action returns multiple results per execution.
Note: Get Website Tech Stack returns basic technology categories. Find Website Tech Stack (Utility) provides a more detailed breakdown with version numbers and confidence scores.
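Get Website Tech Stack returns one result per detected technology, so a common follow-up step is to group those results by category. The sketch below assumes each result is available as a dict keyed by the output field names and that `tech_confidence` is a number where higher means more certain. Both are assumptions, as is the sample data in the usage note; `group_by_category` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_category(rows, min_confidence=0):
    """Group technology names by category, skipping rows below min_confidence.

    Assumes each row is a dict with tech_name, tech_category, and a numeric
    tech_confidence (the scale is not documented, so pick the threshold
    after inspecting real results).
    """
    grouped = defaultdict(list)
    for row in rows:
        if row["tech_confidence"] >= min_confidence:
            grouped[row["tech_category"]].append(row["tech_name"])
    return dict(grouped)
```

With made-up rows such as `{"tech_name": "HubSpot", "tech_category": "CRM", "tech_confidence": 90}`, calling `group_by_category(rows, min_confidence=50)` yields a dict like `{"CRM": ["HubSpot"]}`.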
Find URL Redirect Destination
Resolves a URL to its final destination and returns the full redirect chain with HTTP status codes.
Credit cost: 1 credit per execution
Use case: Check where shortened URLs or affiliate links actually point. Verify redirect chains for SEO audits or link validation.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | URL | Yes | The original URL to resolve and check for redirects. |
Output fields
| Field | Display name | Type |
|---|---|---|
| final_status_code | Final Status Code | number |
| is_redirected | Is Redirected | boolean |
| original_link | Original Link | url |
| redirect_count | Redirect Count | number |
| redirect_link | Final Redirect Link | url |
| resolved_successfully | Resolved Successfully | boolean |
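The output fields summarize a redirect chain. As a rough sketch of how they relate to one another, the function below builds the same record from an ordered list of hops. How TexAu actually derives each field is not documented here; in particular, treating `resolved_successfully` as "the chain ended on a 2xx response" is an assumption, and `summarize_redirects` is a hypothetical helper.

```python
def summarize_redirects(hops):
    """Build a result record from an ordered list of (url, status_code) hops.

    The first hop is the original URL; the last hop is where resolution stopped.
    """
    if not hops:
        raise ValueError("hops must contain at least the original URL")
    original_link, _ = hops[0]
    final_link, final_status = hops[-1]
    redirect_count = len(hops) - 1
    return {
        "original_link": original_link,
        "redirect_link": final_link,
        "redirect_count": redirect_count,
        "is_redirected": redirect_count > 0,
        "final_status_code": final_status,
        # Assumption: "resolved" means the chain ended on a 2xx response.
        "resolved_successfully": 200 <= final_status < 300,
    }
```

A short link that passes through one intermediate URL before landing on a page produces three hops, so `redirect_count` is 2 and `is_redirected` is true.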
Extract Sitemap URLs
Discovers and extracts URLs from a website's sitemap via sitemap.xml, robots.txt, or crawl discovery.
Credit cost: 1 credit per execution
Use case: Get a complete list of pages on a website for SEO analysis, content auditing, or building a list of pages to scrape individually.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| url | Website URL | Yes | The target website URL to discover sitemaps for. |
Output fields
| Field | Display name | Type |
|---|---|---|
| extracted_url | Discovered URL | url |
| total_count | Total URLs Found | number |
| source | Discovery Source | text |
This action returns multiple results per execution.
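Two of the discovery sources mentioned above can be sketched with the Python standard library: `Sitemap:` lines in robots.txt point to sitemap files, and each sitemap lists its pages in `<loc>` elements. The helper names below are hypothetical and the code works on text you have already downloaded; it illustrates the file formats rather than TexAu's implementation.

```python
import xml.etree.ElementTree as ET


def sitemaps_from_robots(robots_txt):
    """Return sitemap URLs declared in robots.txt via 'Sitemap:' lines."""
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own "https:" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def urls_from_sitemap(xml_text):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    # Compare local tag names so namespaced and bare documents both work.
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]
```

A sitemap index uses the same `<loc>` element to point at child sitemaps, so the URLs returned for an index need to be fetched and parsed again to reach the page URLs.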
Find Website Tech Stack (Utility)
Identifies technologies used on a website including analytics tools, CMS, and JavaScript frameworks. This is the utility version that returns categorized results.
Credit cost: 2 credits per execution
Use case: Run a technology audit on a prospect's website. Identify their programming languages, frameworks, analytics tools, CRM, and live chat solutions.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| site_url | Site URL | Yes | Website URL or domain (e.g., texau.com). |
Output fields
| Field | Display name | Type |
|---|---|---|
| programming_languages | Programming Languages | text |
| paas | Platform as a Service | text |
| javascript_frameworks | JavaScript Frameworks | text |
| web_frameworks | Web Frameworks | text |
| web_servers | Web Servers | text |
| static_site_generator | Static Site Generator | text |
| security | Security | text |
| tag_managers | Tag Managers | text |
| cdn | CDN | text |
| analytics | Analytics | text |
| live_chat | Live Chat | text |
| crm | CRM | text |
| font_scripts | Font Scripts | text |
| webmail | Webmail | text |
| | | text |
| miscellaneous | Miscellaneous | text |
Get Slack Channel Members
Retrieves the member list and their profile details from a public Slack channel using workspace credentials.
Credit cost: 2 credits per execution
Use case: Extract member lists from public Slack communities for lead generation. Get names, emails, titles, and timezone data for each member.
Input parameters
| Parameter | Display name | Required | Description |
|---|---|---|---|
| workspace | Workspace | Yes | The Slack workspace slug (e.g., 'texauhq'). |
| channel_id | Channel ID | Yes | The Slack Channel ID. |
| cookie | Slack Auth Cookie | Yes | Your Slack authentication cookie (starts with 'xoxd-'). |
| token | Slack API Token | Yes | Your Slack API token (starts with 'xoxc-'). |
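The cookie and the token are easy to swap by mistake, so it can help to check the inputs before spending credits. The sketch below checks only what the table above states: all four parameters are required, the cookie starts with `xoxd-`, and the token starts with `xoxc-`. `validate_slack_inputs` is a hypothetical helper, not a TexAu function.

```python
def validate_slack_inputs(workspace, channel_id, cookie, token):
    """Return a list of problems with the inputs; an empty list means they look valid."""
    problems = []
    for name, value in (
        ("workspace", workspace),
        ("channel_id", channel_id),
        ("cookie", cookie),
        ("token", token),
    ):
        if not value or not value.strip():
            problems.append(f"{name} is required")
    # Prefix checks come from the parameter descriptions in the table above.
    if cookie and not cookie.strip().startswith("xoxd-"):
        problems.append("cookie should start with 'xoxd-'")
    if token and not token.strip().startswith("xoxc-"):
        problems.append("token should start with 'xoxc-'")
    return problems
```

Passing the token where the cookie belongs, and the cookie where the token belongs, returns both prefix problems, which usually means the two values were pasted into the wrong fields.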
Output fields
| Field | Display name | Type |
|---|---|---|
| workspace | Workspace | text |
| channel_id | Channel ID | text |
| total_members | Total Members Extracted | number |
| user_ids | User IDs | text |
| real_names | Real Names | text |
| display_names | Display Names | text |
| emails | Emails | text |
| titles | Titles | text |
| phones | Phone Numbers | text |
| status_texts | Status Texts | text |
| timezones | Timezones | text |
| is_admins | Is Admin Status | text |
| is_owners | Is Owner Status | text |
| avatar_urls | Avatar URLs | url |
This action returns multiple results per execution.
Troubleshooting
Action returns empty results
Verify that the URL is publicly accessible. Pages behind login walls, paywalls, or CAPTCHAs cannot be scraped.
Slow response times
Website Intelligence and Extract Web Emails crawl multiple pages, so they take longer than single-page actions. Allow up to 30 seconds for results.
No emails found on a website
Some websites do not display email addresses in their HTML. Try the Website Intelligence action, which checks multiple sources including DNS MX records.
Tech stack returns few results
Websites that use server-side rendering or have minimal client-side JavaScript may show fewer detected technologies. This is expected behavior.
Sitemap returns zero URLs
Not all websites have a sitemap.xml file. The action also checks robots.txt and attempts crawl discovery, but some sites may have no discoverable sitemap.