AI stacks
Live web data for LangChain with Piloterr
Build LangChain tools and retrievers that call Piloterr REST APIs. Feed agents fresh, structured JSON from protected sites, without maintaining browser farms or proxy pools.
- Custom tools wrapping any Piloterr endpoint
- Structured JSON ideal for agent reasoning
- Python and JavaScript SDK-compatible HTTP calls
- Anti-bot bypass for RAG and agent pipelines
At a glance
Tools
agent actions
JSON
structured output
400+
data sources
REST
HTTP API
Why connect LangChain
Agent tools
Wrap Piloterr endpoints as LangChain tools so agents can scrape, enrich, and retrieve live web data on demand.
RAG pipelines
Fetch clean Markdown or JSON from tough targets and chunk into vector stores without parsing raw HTML.
No browser maintenance
Agents call Piloterr instead of spinning up Playwright or Puppeteer, anti-bot handled server-side.
Predictable costs
Credit-based billing per successful request, forecast spend as agent call volume grows.
LangChain + Piloterr use cases
From research agents to production RAG systems.
Research agents
Agents scrape SERP, news, and company data to answer questions with fresh sources.
Enrichment chains
Sequential chains that enrich leads with LinkedIn, company, and domain data.
Vector ingestion
Load structured page content into Pinecone, Weaviate, or pgvector.
Multi-tool agents
Combine scrape, extract, and search tools in a single agent executor.
Why agents need Piloterr instead of raw fetch
| Approach | DIY | Piloterr |
|---|---|---|
| requests / fetch | Blocked on protected sites | 94%+ pass rate on WAF targets |
| Playwright tool | Slow, expensive, fragile | Managed browser + JSON |
| HTML parsing | Agent wastes tokens on markup | Structured JSON fields |
| Ops burden | Proxy rotation, CAPTCHA farms | Single API integration |
Connect LangChain in four steps
Step 1
Install LangChain
pip install langchain langchain-openai requests
Step 2
Get your API key
Copy x-api-key from the Piloterr dashboard.
Get your API keyStep 3
Create a custom tool
Wrap a Piloterr HTTP call in a @tool function (langchain.tools).
Step 4
Run the agent
Use create_agent with your model and tools, the LLM decides when to scrape.
Workflow recipes
Research agent with live SERP
Agent searches Google, reads results via Piloterr, and synthesizes an answer with citations.
Lead enrichment chain
Sequential chain: domain → company info → LinkedIn profile → CRM-ready JSON.
RAG over competitor pages
Scrape competitor pricing pages, chunk JSON, embed into vector store for Q&A.
Support bot with live docs
Agent scrapes your help-center pages on demand and answers customer questions with fresh citations.
When to use LangChain + Piloterr
Scenario
LLM agents need live web data
Recommendation: LangChain tools
Scenario
Batch ETL pipelines
Recommendation: Python SDK directly
Scenario
No-code orchestration
Recommendation: n8n or Make
Scenario
Multi-agent systems
Recommendation: CrewAI
LangChain tool example
A minimal tool that scrapes Google News via Piloterr.
import os
import requests
from langchain.tools import tool
PILOTERR_KEY = os.environ["PILOTERR_API_KEY"]
BASE = "https://api.piloterr.com/v2"
@tool
def search_google_news(query: str, location: str = "Paris, FR") -> dict:
"""Search Google News for recent articles matching a query."""
response = requests.post(
f"{BASE}/google/news",
headers={"x-api-key": PILOTERR_KEY, "Content-Type": "application/json"},
json={"query": query, "location": location, "page": 1},
timeout=60,
)
response.raise_for_status()
return response.json()See also
Transparent credit pricing
Pay only for successful requests. Start with +500 credits, then scale with plans from $49/mo.
Premium
$49/mo
18,000 credits
Premium+
$99/mo
40,000 credits
Startup
$249/mo
110,000 credits