This post was last edited by great_andy_ on 2019-9-15 19:59
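Hello Jun Ge!
I ran into an error while adding anti-crawler rules to a virtual host and would like to ask how to fix it.
I installed the latest LNMP 1.6, and the nginx version is nginx/1.16.1.
The virtual host was created successfully. I added the anti-crawler rules as follows: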
# cd /usr/local/nginx/conf/vhost
# vi agent_deny.conf
Then I added the following code and saved the file:
if ($http_user_agent ~* "qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|MSNBot|ia_archiver|Tomato Bot|Catall Spider|AcoiRobot")
{
return 403;
}
if ($http_user_agent ~ "WinHttp|WebZIP|FetchURL|node-superagent|java/|FeedDemon|Jullo|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|Java|Feedly|Apache-HttpAsyncClient|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|BOT/0.1|YandexBot|FlightDeckReports|Linguee Bot|iaskspider^$")
{
return 403;
}
if ($request_method !~ ^(GET|HEAD|POST)$)
{
return 403;
}
if ($http_user_agent ~* (Python|Java|Wget|Scrapy|Curl|HttpClient|Spider))
{
return 403;
}
Finally, I included the agent_deny.conf file above in the virtual host's configuration file:
server {
listen 80;
server_name *******;
access_log /home/wwwlogs/access.log main;
## This is the anti-crawler file
include /usr/local/nginx/conf/vhost/agent_deny.conf;
location / {
........
But when I restart LNMP, I get this message:
Stoping nginx... nginx: [emerg] "if" directive is not allowed here in /usr/local/nginx/conf/vhost/agent_deny.conf:1 failed. Use force-quit
What is the correct way to add these rules? Thanks, Jun Ge!
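A possible explanation and fix, sketched under one assumption: that the stock LNMP nginx.conf loads every file in conf/vhost/ with `include vhost/*.conf;` inside the http block. If so, agent_deny.conf is parsed once at http level, where nginx does not permit `if` (the directive is only valid in server and location contexts), which would produce exactly the error at line 1 of that file. Moving the file out of vhost/ and including it only from inside the server block may avoid this. The new path and the `example.com` name below are placeholders, not values from the setup above.

```nginx
# Assumption: agent_deny.conf has been moved out of vhost/, for example
#   mv /usr/local/nginx/conf/vhost/agent_deny.conf /usr/local/nginx/conf/agent_deny.conf
# so that an http-level "include vhost/*.conf;" no longer picks it up.
server {
    listen 80;
    server_name example.com;    # placeholder; use the real domain
    access_log /home/wwwlogs/access.log main;

    # "if" is valid in server context, so the rules are accepted here
    include /usr/local/nginx/conf/agent_deny.conf;

    location / {
        # existing location settings stay unchanged
    }
}
```

Running `/usr/local/nginx/sbin/nginx -t` before restarting should report whether the configuration now parses cleanly.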