nginx反爬虫配置
新建文件agent_deny.conf,添加一下内容:
#禁止Scrapy等工具的抓取
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
return 403;
}
#禁止指定UA及UA为空的访问
if ($http_user_agent ~* "FeedDemon|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|^$" ) {
return 403;
}
#禁止非GET|HEAD|POST方式的抓取
if ($request_method !~ ^(GET|HEAD|POST)$) {
return 403;
}
修改Nginx配置文件,在需要处理的server中包含该文件,include agent_deny.conf(注意路径)
附上常见的爬虫UA,这个是在github上找到别人整理好的,还支持验证,大叫有兴趣的话可以自己去看看
部分UA:
"Googlebot/2.1 (+http://www.google.com/bot.html)" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/537.36 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Safari/537.36"
您可能感兴趣的文章
- 02-02hadoop动态增加和删除节点方法介绍
- 02-02干货 | Linux新手入门好书推荐
- 02-02linux系统下MongoDB单节点安装教程
- 02-02Linux下nginx生成日志自动切割的实现方法
- 02-02Centos 6中编译配置httpd2.4的多种方法详解
- 02-02CentOS7 下安装telnet服务的实现方法
- 02-02分布式Hibernate search详解
- 02-02Hadoop对文本文件的快速全局排序实现方法及分析
- 02-02CentOS6.3添加nginx系统服务的实例详解
- 02-02Hadoop编程基于MR程序实现倒排索引示例


阅读排行
推荐教程
- 12-07Tomcat启动报错:严重: Unable to process Jar entry [m
- 12-07解决tomcat启动报错:一个或多个listeners启动失败问题
- 12-07一文教你怎么选择Tomcat对应的JDK版本
- 12-07Tomcat配置IPV6的实现步骤
- 12-07tomcat启动报错jar not loaded的问题
- 02-02CentOS7 下安装telnet服务的实现方法
- 12-11docker存储目录迁移示例教程
- 12-15Docker-Compose搭建Spark集群的实现方法
- 12-07Tomcat部署war包并成功访问网页详细图文教程
- 01-07windows server 2008安装配置DNS服务器




