精选· 重要性 4/5

Firecrawl：开源API，大规模搜索、抓取并与网页交互

GitHub Trending (AI repos)·8 天前·firecrawl·约 5 分钟阅读

Hacker News 131103 分

中文导读

Firecrawl 是一个开源 API，提供高可靠性的网页搜索、抓取和交互功能，可将网页内容转化为 LLM 可用的 Markdown 或结构化数据，支持实时代理和动态应用。

用于大规模搜索、抓取和与网络交互的 API。🔥 网页上下文 API，用于查找来源、提取内容，并将其转化为您的代理可以使用的干净 Markdown 或结构化数据。开源并作为托管服务提供。

嘿，你，加入我们的观星者吧 :)- 行业领先的可靠性：覆盖 96% 的网络，包括大量 JS 页面——无需代理烦恼，只需干净的数据（请参阅基准测试）- 极快速度：P95 延迟为 3.4 秒，跨越数百万个页面，专为实时代理和动态应用构建- LLM 就绪输出：

干净的 Markdown、

结构化 JSON、屏幕截图等——花费更少的 token，构建更好的 AI 应用- 我们处理难题：轮换代理、编排、速率限制、JS 阻止的内容等——零配置- 代理就绪：只需一个命令即可将 Firecrawl 连接到任何 AI 代理或 MCP 客户端- 媒体解析：

从网络托管的 PDF、DOCX 等中解析和提取内容- 操作：在提取内容之前单击、滚动、写入、等待和按下- 开源：透明且协作地开发——加入我们的社区核心端点更多在 firecrawl.dev 注册以获取您的 API 密钥。

尝试 playground 来测试它。搜索网络并从结果中获取完整内容。

from firecrawl import Firecrawlapp = Firecrawl(api_key="fc-YOUR_API_KEY")search_result = app.search("firecrawl",

limit=5)Node.js / cURL / CLINode.jsimport { Firecrawl } from 'firecrawl';const app = new Firecrawl({apiKey: "fc-YOUR_API_KEY"});

app.

search("firecrawl",{ limit:5 })cURLcurl -X POST 'https://api.firecrawl.dev/v2/search' \-H 'Authorization:

Bearer fc-YOUR_API_KEY' \-H 'Content-Type:

application/json' \-d '{"query":"firecrawl","limit":5}'CLIfirecrawl search "firecrawl" --limit 5输出：[{"url":"https://firecrawl.dev",

"title":"Firecrawl","markdown":"Turn websites into..."},{"url":"https://docs.firecrawl.dev","title":"Firecrawl Docs","markdown":

"# Getting Started..."}]从任何网站获取 LLM 就绪数据——Markdown、JSON、屏幕截图等。

from firecrawl import Firecrawlapp = Firecrawl(api_key="fc-YOUR_API_KEY")result = app.scrape('firecrawl.dev')Node.js / cURL / CLINode.js

import { Firecrawl } from 'firecrawl';const app = new Firecrawl({ apiKey: "fc-YOUR_API_KEY" });

app.scrape('firecrawl.dev')cURLcurl -X POST 'https://api.firecrawl.dev/v2/scrape' \-H 'Authorization:

Bearer fc-YOUR_API_KEY' \-H 'Content-Type:

application/json' \-d '{"url":"firecrawl.dev"}'CLIfirecrawl scrape https://firecrawl.devfirecrawl https:

//firecrawl.dev --only-main-content输出：# FirecrawlFirecrawl helps AI systems search,scrape,and interact with the web.## Features- Search:

Find information across the web- Scrape:

Clean data from any page- Interact:Click,navigate,and operate pages- Agent:Autonomous data gathering抓取页面，然后使用 AI 提示或代码与其交互。

from firecrawl import Firecrawlapp = Firecrawl(api_key="fc-YOUR_API_KEY")result = app.scrape("https:

//amazon.com")scrape_id = result.metadata.scrape_idapp.interact(scrape_id,

prompt="Search for 'mechanical keyboard'")app.interact(scrape_id,

prompt="Click the first result")Node.js / cURL / CLINode.jsimport { Firecrawl } from 'firecrawl';

const app = new Firecrawl({apiKey: "fc-YOUR_API_KEY"});const result = await app.scrape("https://amazon.com");

await app.interact(result.metadata.scrapeId, {prompt: "Search for 'mechanical keyboard'"});

await app.interact(result.metadata.scrapeId, {prompt: "Click the first result"});

cURL# 1. Scrape the pagecurl -X POST 'https://api.firecrawl.dev/v2/scrape' \-H 'Authorization:Bearer fc-YOUR_API_KEY' \-H 'Content-Type:

application/json' \-d '{"url":"https://amazon.com"}'# 2. Interact with the page (use scrapeId from step 1)curl -X POST 'https:

//api.firecrawl.dev/v2/scrape/SCRAPE_ID/interact' \-H 'Authorization:

Bearer fc-YOUR_API_KEY' \-H 'Content-Type:application/json' \-d '{"prompt":"Search for mechanical keyboard"}'CLIfirecrawl scrape https:

//amazon.comfirecrawl interact exec --prompt "Search for 'mechanical keyboard'"firecrawl interact exec --prompt "Click the first result"

输出：{"success":true,"output":"Keyboard available at $100","liveViewUrl":"https:

//liveview.firecrawl.dev/..."}几分钟内将 Firecrawl 连接到任何 AI 代理或 MCP 客户端。

只需一个命令，即可让您的代理轻松访问实时 Web 数据。npx -y firecrawl-cli@latest init --all --browser安装后重新启动代理。与 Claude Code、Antigravity、OpenCode 等兼容。

几秒钟内将任何 MCP 兼容客户端连接到 Web。

{"mcpServers":{"firecrawl-mcp":{"command":"npx","args":["-y","firecrawl-mcp"],"env":{"FIRECRAWL_API_KEY":"fc-YOUR_API_KEY"}}}}您是 AI 代理吗？

获取此技能以注册您的用户、获取 API 密钥并开始使用 Firecrawl 进行构建。curl -s https://firecrawl.dev/agent-onboarding/SKILL.md有关所有可用命令，请参阅 Skill + CLI 文档。

对于 MCP，请参阅 firecrawl-mcp-server。从网络获取数据的最简单方法。描述您需要什么，我们的 AI 代理会搜索、导航和检索它。无需 URL。Agent 是我们 /extract 端点的进化：更快、更可靠，并且不需要您提前了解 URL。

curl -X POST 'https://api.firecrawl.dev/v2/agent' \-H 'Authorization:Bearer fc-YOUR_API_KEY' \-H 'Content-Type:

application/json' \-d '{"prompt":

"Find the pricing plans for Notion"}'响应：{"success":true,"data":{"result":"Notion offers the following pricing plans:

\n\n1. Free - $0/month...\n2. Plus - $10/seat/month...\n3. Business - $18/seat/month...","sources":["https://www.notion.

so/pricing"]}}使用模式获取结构化数据：from firecrawl import Firecrawlfrom pydantic import BaseModel,Fieldfrom typing import List,

Optionalapp = Firecrawl(api_key="fc-YOUR_API_KEY")class Founder(BaseModel):

name:str = Field(description="Full name of the founder")role:Optional[str] = Field(None,

description="Role or position")class FoundersSchema(BaseModel):

founders:List[Founder] = Field(description="List of founders")result = app.agent(prompt="Find the founders of Firecrawl",

schema=FoundersSchema)print(result.data){"founders":

[{"name":"Eric Ciarla","role":"Co-founder"},{"name":"Nicolas Camara","role":"Co-founder"},{"name":"Caleb Peffer","role":

"Co-founder"}]}让代理关注特定页面：result = app.agent(urls=["https://docs.firecrawl.dev","https://firecrawl.dev/pricing"],

prompt="Compare the features and pricing information")根据您的需求在两种模型之间进行选择：

result = app.agent(prompt="Compare enterprise features across Firecrawl,Apify,and ScrapingBee",model="spark-1-pro")何时使用 Pro：

- 比较多个网站的数据- 从具有复杂导航或认证的网站中提取- 代理需要探索多种路径的研究任务- 准确性至关重要的关键数据在我们的 Agent 文档中了解有关 Spark 模型的更多信息。爬取整个网站并获取所有页面的内容。

curl -X POST 'https://api.firecrawl.

dev/v2/crawl' \-H 'Authorization:Bearer fc-YOUR_API_KEY' \-H 'Content-Type:application/json' \-d '{"url":"https://docs.firecrawl.dev",

"limit":100,"scrapeOptions":{"formats":["markdown"]}}'返回一个作业 ID：{"success":true,"id":"123-456-789","url":"https:

//api.firecrawl.dev/v2/crawl/123-456-789"}curl -X GET 'https:

//api.firecrawl.dev/v2/crawl/123-456-789' \-H 'Authorization:Bearer fc-YOUR_API_KEY'{"status":"completed","total":50,"completed":50,

"creditsUsed":

50,"data":[{"markdown":"# Page Title\n\nContent...","metadata":{"title":"Page Title","sourceURL":"https://..."}}]}注意：SDK 会自动处理轮询，

以获得更好的开发体验。立即发现网站上的所有 URL。

curl -X POST 'https://api.firecrawl.dev/v2/map' \-H 'Authorization: Bearer fc-YOUR_API_KEY' \-H 'Content-Type: application/json' \-d '{"

原文出处

firecrawl/firecrawl: The API to search, scrape, and interact with the web at scale. 🔥

本文为机器翻译辅以 AI 润色，仅供参考。原始事实以原文为准。

Firecrawl：开源API，大规模搜索、抓取并与网页交互

相关阅读

Claude Desktop每次启动强制创建1.8GB虚拟机，即使仅用于聊天

Apache Burr：构建可靠AI代理与应用的Python框架

0.01欧元转账可攻陷银行AI助手：间接提示注入漏洞分析