Turn any web page into LLM-ready content

Enter a URL. Siphon renders the page in a headless browser, executes JavaScript, scrolls to load lazy content, and extracts clean Markdown, JSON, or plain text.

Siphon Playground

Try it on pages that web_fetch can't handle: enter a URL and click Extract to compare the Siphon (headless browser) output with the web_fetch (plain HTTP) result.

SPA Rendering

Executes JavaScript via headless Chromium, waits for async data to load, and extracts the real content from React / Vue / Angular single-page apps.
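
To see why a plain HTTP fetch falls short on these apps, consider the HTML a typical SPA server returns before any JavaScript runs. The sketch below uses only Python's standard library; the `SPA_SHELL` markup is a made-up example rather than output from a real site, and `VisibleText` and `visible_text` are hypothetical helpers.

```python
from html.parser import HTMLParser

# Hypothetical HTML shell, as a plain HTTP client would receive it
# before any JavaScript has run.
SPA_SHELL = """<!doctype html>
<html>
  <head><title>My App</title><script src="/assets/app.js"></script></head>
  <body><div id="app"></div></body>
</html>"""


class VisibleText(HTMLParser):
    """Collects text outside <script>, <style>, and <title> elements."""

    SKIPPED = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


print(repr(visible_text(SPA_SHELL)))  # prints ''
```

The shell contains an empty mount point and a script tag, so there is no body text to extract until a browser has executed the script.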

Scroll & Lazy Load

Automatically scrolls the page to trigger lazy-loaded images, infinite scroll feeds, comment sections, and other dynamically loaded content.
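
One common way to implement this kind of auto-scrolling is a bounded loop that stops once the page height no longer grows. The sketch below illustrates that general technique and is not Siphon's actual implementation; `Page`, `FakePage`, and `scroll_until_stable` are hypothetical names, and `FakePage` stands in for a real browser page so the loop can run without a browser.

```python
from typing import Protocol


class Page(Protocol):
    """Minimal interface the loop needs from a browser page (hypothetical)."""

    def height(self) -> int: ...
    def scroll_to_bottom(self) -> None: ...


class FakePage:
    """Stand-in page that grows by one batch per scroll until content runs out."""

    def __init__(self, batches: int, batch_height: int = 1000):
        self._remaining = batches
        self._batch_height = batch_height
        self._height = batch_height

    def height(self) -> int:
        return self._height

    def scroll_to_bottom(self) -> None:
        if self._remaining > 0:
            self._remaining -= 1
            self._height += self._batch_height


def scroll_until_stable(page: Page, max_scrolls: int = 5) -> int:
    """Scroll until the height stops growing or max_scrolls is reached.

    Returns the number of scrolls performed.
    """
    scrolls = 0
    last_height = page.height()
    while scrolls < max_scrolls:
        page.scroll_to_bottom()
        scrolls += 1
        # A real implementation would wait here for network activity to settle.
        new_height = page.height()
        if new_height == last_height:
            break  # nothing new loaded, so the page is exhausted
        last_height = new_height
    return scrolls


print(scroll_until_stable(FakePage(batches=2)))   # 3: two loads, then one no-op scroll
print(scroll_until_stable(FakePage(batches=50)))  # 5: capped by max_scrolls
```

The cap matters for infinite feeds, where the height never stabilizes and the loop would otherwise run indefinitely.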

Click to Expand

Clicks "Load More" buttons, expands collapsed sections, and switches tabs to capture content hidden behind user interactions.

Precision Extraction

Use CSS selectors to target specific content and exclude noise like navigation bars, ads, and footers. Keep only what you need.

AI Enhanced

Optionally use an LLM to restructure extracted content, handle complex tables and multilingual pages, and extract custom fields.

Structured Data

Automatically extracts embedded JSON-LD, Open Graph, and Microdata metadata from the page with zero configuration.
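
As a rough illustration of what this metadata looks like inside a page, the sketch below pulls JSON-LD blocks and Open Graph tags out of an HTML string using only Python's standard library. It is not Siphon's extractor; `MetadataParser`, `extract_metadata`, and the `SAMPLE` markup are hypothetical, and Microdata is left out for brevity.

```python
import json
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects JSON-LD script blocks and Open Graph <meta> tags."""

    def __init__(self):
        super().__init__()
        self.json_ld = []
        self.open_graph = {}
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "meta":
            prop = attrs.get("property") or ""
            if prop.startswith("og:") and attrs.get("content") is not None:
                self.open_graph[prop] = attrs["content"]

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD blocks


def extract_metadata(html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {"json_ld": parser.json_ld, "open_graph": parser.open_graph}


# Made-up sample page for demonstration.
SAMPLE = """<html><head>
<meta property="og:title" content="Example Article">
<meta property="og:type" content="article">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example Article"}
</script>
</head><body><p>Body text</p></body></html>"""

metadata = extract_metadata(SAMPLE)
print(metadata["open_graph"]["og:title"])  # Example Article
print(metadata["json_ld"][0]["@type"])     # Article
```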

API Usage

# Simple call: get Markdown directly (quote the URL so the shell does not interpret "?")
curl "http://siphon.world/extract?url=https://example.com"
# SPA page: wait for the content to render, then extract only the targeted element
curl -X POST http://siphon.world/extract \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://vuejs.org/guide/introduction.html",
    "wait_for_selector": ".content",
    "target_selector": ".content",
    "exclude_selectors": [".VPSidebarNav"]
  }'
# Lazy-load page: auto scroll
curl -X POST http://siphon.world/extract \
  -H "Content-Type: application/json" \
  -d '{"url": "https://news.ycombinator.com", "scroll_to_bottom": true, "max_scrolls": 5}'
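
The same calls can be made from Python. The sketch below uses only the standard library and mirrors the curl examples above; the helper names `simple_extract_url`, `build_extract_request`, and `fetch` are hypothetical, and only the endpoint and request fields shown in the curl examples are assumed to exist. Percent-encoding the `url` query parameter is a precaution for target URLs that carry their own query strings, and decoding the response as UTF-8 text is likewise an assumption.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://siphon.world/extract"  # endpoint taken from the curl examples


def simple_extract_url(page_url: str) -> str:
    """Build the GET URL, percent-encoding the target so its own '?' and '&' survive."""
    return BASE_URL + "?" + urllib.parse.urlencode({"url": page_url})


def build_extract_request(page_url: str, **options) -> urllib.request.Request:
    """Build a POST request with a JSON body; options are passed through unchanged."""
    body = json.dumps({"url": page_url, **options}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch(request_or_url, timeout: float = 60.0) -> str:
    """Send the request and return the body as text (performs network I/O).

    Assumes the service responds with UTF-8 text.
    """
    with urllib.request.urlopen(request_or_url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(simple_extract_url("https://example.com/search?q=a&page=2"))
    request = build_extract_request(
        "https://news.ycombinator.com", scroll_to_bottom=True, max_scrolls=5
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # markdown = fetch(request)  # uncomment to actually call the service
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.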