Firecrawl:开源API,大规模搜索、抓取并与网页交互
Firecrawl 是一个开源 API,提供高可靠性的网页搜索、抓取和交互功能,可将网页内容转化为 LLM 可用的 Markdown 或结构化数据,支持实时代理和动态应用。
用于大规模搜索、抓取和与网络交互的 API。🔥 网页上下文 API,用于查找来源、提取内容,并将其转化为您的代理可以使用的干净 Markdown 或结构化数据。开源并作为托管服务提供。
嘿,你,加入我们的观星者吧 :)- 行业领先的可靠性:覆盖 96% 的网络,包括大量 JS 页面——无需代理烦恼,只需干净的数据(请参阅基准测试)- 极快速度:P95 延迟为 3.4 秒,跨越数百万个页面,专为实时代理和动态应用构建- LLM 就绪输出:
干净的 Markdown、
结构化 JSON、屏幕截图等——花费更少的 token,构建更好的 AI 应用- 我们处理难题:轮换代理、编排、速率限制、JS 阻止的内容等——零配置- 代理就绪:只需一个命令即可将 Firecrawl 连接到任何 AI 代理或 MCP 客户端- 媒体解析:
从网络托管的 PDF、DOCX 等中解析和提取内容- 操作:在提取内容之前单击、滚动、写入、等待和按下- 开源:透明且协作地开发——加入我们的社区核心端点更多在 firecrawl.dev 注册以获取您的 API 密钥。
尝试 playground 来测试它。搜索网络并从结果中获取完整内容。
from firecrawl import Firecrawlapp = Firecrawl(api_key="fc-YOUR_API_KEY")search_result = app.search("firecrawl",
limit=5)Node.js / cURL / CLINode.jsimport { Firecrawl } from 'firecrawl';const app = new Firecrawl({apiKey: "fc-YOUR_API_KEY"});
app.
search("firecrawl",{ limit:5 })cURLcurl -X POST 'https://api.firecrawl.dev/v2/search' \-H 'Authorization:
Bearer fc-YOUR_API_KEY' \-H 'Content-Type:
application/json' \-d '{"query":"firecrawl","limit":5}'CLIfirecrawl search "firecrawl" --limit 5输出:[{"url":"https://firecrawl.dev",
"title":"Firecrawl","markdown":"Turn websites into..."},{"url":"https://docs.firecrawl.dev","title":"Firecrawl Docs","markdown":
"# Getting Started..."}]从任何网站获取 LLM 就绪数据——Markdown、JSON、屏幕截图等。
from firecrawl import Firecrawlapp = Firecrawl(api_key="fc-YOUR_API_KEY")result = app.scrape('firecrawl.dev')Node.js / cURL / CLINode.js
import { Firecrawl } from 'firecrawl';const app = new Firecrawl({ apiKey: "fc-YOUR_API_KEY" });
app.scrape('firecrawl.dev')cURLcurl -X POST 'https://api.firecrawl.dev/v2/scrape' \-H 'Authorization:
Bearer fc-YOUR_API_KEY' \-H 'Content-Type:
application/json' \-d '{"url":"firecrawl.dev"}'CLIfirecrawl scrape https://firecrawl.devfirecrawl https:
//firecrawl.dev --only-main-content输出:# FirecrawlFirecrawl helps AI systems search,scrape,and interact with the web.## Features- Search:
Find information across the web- Scrape:
Clean data from any page- Interact:Click,navigate,and operate pages- Agent:Autonomous data gathering抓取页面,然后使用 AI 提示或代码与其交互。
from firecrawl import Firecrawlapp = Firecrawl(api_key="fc-YOUR_API_KEY")result = app.scrape("https:
//amazon.com")scrape_id = result.metadata.scrape_idapp.interact(scrape_id,
prompt="Search for 'mechanical keyboard'")app.interact(scrape_id,
prompt="Click the first result")Node.js / cURL / CLINode.jsimport { Firecrawl } from 'firecrawl';
const app = new Firecrawl({apiKey: "fc-YOUR_API_KEY"});const result = await app.scrape("https://amazon.com");
await app.interact(result.metadata.scrapeId, {prompt: "Search for 'mechanical keyboard'"});
await app.interact(result.metadata.scrapeId, {prompt: "Click the first result"});
cURL# 1. Scrape the pagecurl -X POST 'https://api.firecrawl.dev/v2/scrape' \-H 'Authorization:Bearer fc-YOUR_API_KEY' \-H 'Content-Type:
application/json' \-d '{"url":"https://amazon.com"}'# 2. Interact with the page (use scrapeId from step 1)curl -X POST 'https:
//api.firecrawl.dev/v2/scrape/SCRAPE_ID/interact' \-H 'Authorization:
Bearer fc-YOUR_API_KEY' \-H 'Content-Type:application/json' \-d '{"prompt":"Search for mechanical keyboard"}'CLIfirecrawl scrape https:
//amazon.comfirecrawl interact exec --prompt "Search for 'mechanical keyboard'"firecrawl interact exec --prompt "Click the first result"
输出:{"success":true,"output":"Keyboard available at $100","liveViewUrl":"https:
//liveview.firecrawl.dev/..."}几分钟内将 Firecrawl 连接到任何 AI 代理或 MCP 客户端。
只需一个命令,即可让您的代理轻松访问实时 Web 数据。npx -y firecrawl-cli@latest init --all --browser安装后重新启动代理。与 Claude Code、Antigravity、OpenCode 等兼容。
几秒钟内将任何 MCP 兼容客户端连接到 Web。
{"mcpServers":{"firecrawl-mcp":{"command":"npx","args":["-y","firecrawl-mcp"],"env":{"FIRECRAWL_API_KEY":"fc-YOUR_API_KEY"}}}}您是 AI 代理吗?
获取此技能以注册您的用户、获取 API 密钥并开始使用 Firecrawl 进行构建。curl -s https://firecrawl.dev/agent-onboarding/SKILL.md有关所有可用命令,请参阅 Skill + CLI 文档。
对于 MCP,请参阅 firecrawl-mcp-server。从网络获取数据的最简单方法。描述您需要什么,我们的 AI 代理会搜索、导航和检索它。无需 URL。Agent 是我们 /extract 端点的进化:更快、更可靠,并且不需要您提前了解 URL。
curl -X POST 'https://api.firecrawl.dev/v2/agent' \-H 'Authorization:Bearer fc-YOUR_API_KEY' \-H 'Content-Type:
application/json' \-d '{"prompt":
"Find the pricing plans for Notion"}'响应:{"success":true,"data":{"result":"Notion offers the following pricing plans:
\n\n1. Free - $0/month...\n2. Plus - $10/seat/month...\n3. Business - $18/seat/month...","sources":["https://www.notion.
so/pricing"]}}使用模式获取结构化数据:from firecrawl import Firecrawlfrom pydantic import BaseModel,Fieldfrom typing import List,
Optionalapp = Firecrawl(api_key="fc-YOUR_API_KEY")class Founder(BaseModel):
name:str = Field(description="Full name of the founder")role:Optional[str] = Field(None,
description="Role or position")class FoundersSchema(BaseModel):
founders:List[Founder] = Field(description="List of founders")result = app.agent(prompt="Find the founders of Firecrawl",
schema=FoundersSchema)print(result.data){"founders":
[{"name":"Eric Ciarla","role":"Co-founder"},{"name":"Nicolas Camara","role":"Co-founder"},{"name":"Caleb Peffer","role":
"Co-founder"}]}让代理关注特定页面:result = app.agent(urls=["https://docs.firecrawl.dev","https://firecrawl.dev/pricing"],
prompt="Compare the features and pricing information")根据您的需求在两种模型之间进行选择:
result = app.agent(prompt="Compare enterprise features across Firecrawl,Apify,and ScrapingBee",model="spark-1-pro")何时使用 Pro:
- 比较多个网站的数据- 从具有复杂导航或认证的网站中提取- 代理需要探索多种路径的研究任务- 准确性至关重要的关键数据在我们的 Agent 文档中了解有关 Spark 模型的更多信息。爬取整个网站并获取所有页面的内容。
curl -X POST 'https://api.firecrawl.
dev/v2/crawl' \-H 'Authorization:Bearer fc-YOUR_API_KEY' \-H 'Content-Type:application/json' \-d '{"url":"https://docs.firecrawl.dev",
"limit":100,"scrapeOptions":{"formats":["markdown"]}}'返回一个作业 ID:{"success":true,"id":"123-456-789","url":"https:
//api.firecrawl.dev/v2/crawl/123-456-789"}curl -X GET 'https:
//api.firecrawl.dev/v2/crawl/123-456-789' \-H 'Authorization:Bearer fc-YOUR_API_KEY'{"status":"completed","total":50,"completed":50,
"creditsUsed":
50,"data":[{"markdown":"# Page Title\n\nContent...","metadata":{"title":"Page Title","sourceURL":"https://..."}}]}注意:SDK 会自动处理轮询,
以获得更好的开发体验。立即发现网站上的所有 URL。
curl -X POST 'https://api.firecrawl.dev/v2/map' \-H 'Authorization: Bearer fc-YOUR_API_KEY' \-H 'Content-Type: application/json' \-d '{"
本文为机器翻译辅以 AI 润色,仅供参考。原始事实以原文为准。