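This is a Model Context Protocol (MCP) server that provides web content fetching. It uses browser automation, OCR, and multiple extraction methods so that large language models (LLMs) can retrieve and process content from web pages, even pages that require JavaScript rendering or employ measures against simple scraping.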
To install and run mcp-server-fetch with Docker, follow these steps:
docker build -t mcp-server-fetch .
docker run --rm -i mcp-server-fetch
fetch
- Fetches a URL from the internet using browser automation and multi-method extraction (including OCR).
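- url (string, required): The URL to fetch.
- raw (boolean, optional): Return the actual HTML content instead of the simplified content (default: false).

The server uses multiple methods to extract content: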
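The server uses a scoring system to select the best result, taking the following factors into account:
The scoring system ensures that the most reliable, highest-quality content is selected regardless of which extraction method produced it. Debug logs are provided to trace scoring decisions.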
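The selection step can be pictured with a minimal sketch. The actual scoring factors are not listed here, so everything below is an assumption made for illustration: the `Candidate` type, the method labels, and the two heuristics (share of readable characters and a capped length bonus) are hypothetical and are not the server's implementation.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("fetch.scoring")  # hypothetical logger name


@dataclass
class Candidate:
    """One extraction attempt. Fields are illustrative assumptions."""
    method: str  # e.g. "dom" or "ocr" (hypothetical labels)
    text: str    # text produced by that method


def score(candidate: Candidate) -> float:
    """Toy heuristic: favor longer text made mostly of readable characters."""
    text = candidate.text.strip()
    if not text:
        return 0.0
    readable = sum(ch.isalnum() or ch.isspace() for ch in text)
    density = readable / len(text)              # low for markup or OCR noise
    length_bonus = min(len(text), 5000) / 5000  # saturates on long pages
    return density * length_bonus


def pick_best(candidates: list[Candidate]) -> Candidate:
    """Return the highest-scoring candidate and log every score."""
    if not candidates:
        raise ValueError("no extraction candidates to choose from")
    for c in candidates:
        logger.debug("method=%s score=%.4f", c.method, score(c))
    return max(candidates, key=score)
```

Calling `pick_best` with the outputs of each extraction method returns the one with the highest score. Enabling `logging.DEBUG` on the logger shows how each candidate was rated, which mirrors the debug logging described above.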
- url (string, required): The URL to fetch.

Add the following to your Claude settings:
{
  "mcpServers": {
    "fetch": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "mcp-server-fetch"
      ],
      "disabled": false,
      "alwaysAllow": []
    }
  }
}
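Once the server is registered, a client invokes the tool through the MCP `tools/call` method. The sketch below only builds the JSON-RPC request body for the `fetch` tool with the `url` and `raw` parameters described above. The helper name `build_fetch_request` is hypothetical, and a real session must complete the MCP initialization handshake before sending this message.

```python
import json


def build_fetch_request(url: str, raw: bool = False, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for the `fetch` tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fetch",
            # `url` is required; `raw` defaults to false per the tool description.
            "arguments": {"url": url, "raw": raw},
        },
    }
    return json.dumps(request)


if __name__ == "__main__":
    # One message per line, as it would be written to the server's stdin.
    print(build_fetch_request("https://example.com"))
```

Because the container is started with `-i`, the server communicates over standard input and output, so an MCP client library would normally handle this framing for you.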
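By default, depending on whether the request came from the model (via a tool) or was initiated by the user (via a prompt), the server uses one of the following user agents: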
ModelContextProtocol/1.0 (Autonomous; +https://github.com/modelcontextprotocol/servers)
or
ModelContextProtocol/1.0 (User-Specified; +https://github.com/modelcontextprotocol/servers)
The user agent string can be customized by adding the --user-agent argument to the args list.
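As a rough illustration, the following sketch appends the flag to the `args` list of an existing configuration. The helper name `with_user_agent` is hypothetical, and the `--user-agent=VALUE` form is an assumption, since only the flag name is given above. The flag is placed after the image name so that Docker passes it to the server rather than interpreting it itself.

```python
import json


def with_user_agent(config: dict, user_agent: str, server: str = "fetch") -> dict:
    """Return a copy of `config` whose server args end with a --user-agent flag."""
    updated = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    args = updated["mcpServers"][server]["args"]
    # Drop any earlier --user-agent entry so the flag is not duplicated.
    args[:] = [a for a in args if not a.startswith("--user-agent")]
    # Assumed flag syntax: --user-agent=VALUE
    args.append(f"--user-agent={user_agent}")
    return updated


if __name__ == "__main__":
    base = {
        "mcpServers": {
            "fetch": {
                "command": "docker",
                "args": ["run", "--rm", "-i", "mcp-server-fetch"],
            }
        }
    }
    print(json.dumps(with_user_agent(base, "MyBot/1.0"), indent=2))
```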
If you need to configure browser automation, refer to the following example:
{
  "mcpServers": {
    "fetch": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "mcp-server-fetch"
      ],
      "customHeaders": {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
      },
      "disableCache": true,
      "screenshots": {
        "enabled": true,
        "path": "/screenshots"
      }
    }
  }
}
The following advanced options are available:
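- proxy: sets the proxy.
- timeout: sets the request timeout.
- maxRedirects: sets the maximum number of redirects.
- retry: sets the number of retries.
- delay: sets the delay between fetches.

We welcome community contributions! Please see the contributing guide.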
This project is licensed under the MIT License:
MIT License
Copyright (c) 2023 Your Name
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Thank you for using the Fetch MCP server! If you have any questions or suggestions, feel free to contact us.