If you are experiencing malicious behavior (aren't we all?) that is routed via, filtered through, or coming from Yandex, I'd suggest banning their entire IP block using iptables or something similar. This is a fast and effective way to ignore vast IP blocks entirely, and it does not tax your web app at all because the packets are dropped before they ever reach it. E.g., if you do a whois lookup for 202.46.58.136, you can see that there's a big block allocated to ShenZhen Sunrise Technology:
inetnum: 202.46.32.0 - 202.46.63.255
netname: SUNRISE
descr: ShenZhen Sunrise Technology Co.,Ltd.
descr: 2002 Jiabin Road,Luohu District,ShenZhen,China
country: CN
admin-c: MM546-AP
tech-c: MM546-AP
mnt-by: MAINT-CNNIC-AP
mnt-irt: IRT-CNNIC-CN
mnt-routes: MAINT-CNNIC-AP
status: ALLOCATED PORTABLE
changed: hm-changed@apnic.net 20050705
changed: hm-changed@apnic.net 20151202
source: APNIC
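It seems to me you could formulate an iptables rule to drop all requests from this company fairly easily. Using a CIDR utility, you can enter the starting address of the range, 202.46.32.0, and the ending address, 202.46.63.255, and the tool will tell you that the range corresponds to CIDR 202.46.32.0/19. You can then drop all requests from this company with the command below. The 30 is the position in the INPUT chain at which the rule is inserted; it cannot be greater than the number of existing rules plus one, so adjust it for your ruleset or omit it to insert at the top of the chain:
sudo iptables -I INPUT 30 -s 202.46.32.0/19 -j DROP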
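If you'd rather not rely on an online CIDR utility, the same range-to-CIDR conversion can be sketched with Python's standard `ipaddress` module. The helper name `range_to_cidrs` is my own, not part of any tool mentioned above; the two addresses are the inetnum bounds from the whois output.

```python
import ipaddress


def range_to_cidrs(first_ip: str, last_ip: str) -> list[str]:
    """Return the CIDR blocks that exactly cover first_ip..last_ip (inclusive)."""
    first = ipaddress.ip_address(first_ip)
    last = ipaddress.ip_address(last_ip)
    # summarize_address_range yields the minimal set of networks for the range.
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


# Bounds taken from the "inetnum" line of the whois record.
cidrs = range_to_cidrs("202.46.32.0", "202.46.63.255")
print(cidrs)  # ['202.46.32.0/19']
```

A range that does not line up on a power-of-two boundary comes back as several blocks, in which case you would need one iptables rule per block.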
I realize that you cannot use this approach to actively detect new, fresh bad guys, but against the ranges you already know about it's almost perfectly effective and would not tax your server much at all.
If you need to detect malicious behavior, it's helpful if ALL of your page requests are routed through a single PHP script -- this is how some frameworks (like CodeIgniter) work. You have a rewrite rule that maps SEO-friendly URLs onto this one file (e.g., index.php), with query string params to select the correct functionality. If you have this, you can easily install any sort of every-page processing you need. You can have a function to scour user agents, a function to block IP addresses, or some kind of heuristic sniffer to detect malicious activity and then apply some ban method -- although I must warn you that a huge IP block like 202.46.32.0/19 covers 8,192 addresses, and banning them one at a time might result in a LOT of IP addresses ending up in a ban_table somewhere, which can be quite inefficient. You might consider sending a 403 Forbidden or 410 Gone response or something. If the attack is automated, this is sort of like 'playing possum'.
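To illustrate the every-page check without a giant ban_table, here is a minimal sketch of a block test that stores whole networks rather than individual addresses. It is written in Python for brevity rather than PHP, and `BLOCKED_NETWORKS` and `status_for` are hypothetical names of mine; a real front controller would call something like this before doing any other work.

```python
import ipaddress

# Hypothetical blocklist: one entry per CIDR block instead of one row per address.
BLOCKED_NETWORKS = [ipaddress.ip_network("202.46.32.0/19")]


def status_for(remote_addr: str) -> int:
    """Return the HTTP status the front controller should answer with."""
    try:
        client = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address at all: treat it as a bad request.
        return 400
    if any(client in network for network in BLOCKED_NETWORKS):
        return 403  # Forbidden: refuse before any real page logic runs
    return 200


print(status_for("202.46.58.136"))
print(status_for("8.8.8.8"))
```

With this layout the whole /19 costs a single list entry and one membership test per request, whereas a per-address table would need thousands of rows for the same coverage.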
I had an issue where I was getting millions of SQL injection attempts from bad guys the world over -- there was no rhyme or reason to the IP addresses the attack came from, so I presume they were using a botnet. I added a filter that used a regex to sniff out the telltale SQL injection tricks by looking for SELECT.*UNION or CONCAT or various other patterns that would never be sent by a legitimate request. I hesitate to say that this has truly solved the problem, but it shut the bad guys down pretty fast and got my server back to normal functioning.
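A rough sketch of that kind of regex filter, again in Python rather than PHP, might look like the following. The two patterns are the ones named above; requiring an opening parenthesis after CONCAT is my own assumption to cut down on false positives, and `looks_like_sqli` is a hypothetical name. Treat it as a crude tripwire, not a substitute for parameterized queries.

```python
import re
from urllib.parse import unquote_plus

# Patterns from the description above; case-insensitive, and DOTALL so that
# ".*" also spans newlines an attacker might embed in the request.
SUSPICIOUS = re.compile(
    r"\bSELECT\b.*\bUNION\b|\bCONCAT\s*\(",
    re.IGNORECASE | re.DOTALL,
)


def looks_like_sqli(query_string: str) -> bool:
    """Return True if the URL-decoded query string matches a suspicious pattern."""
    decoded = unquote_plus(query_string)  # undo %XX escapes and '+' for space
    return SUSPICIOUS.search(decoded) is not None


print(looks_like_sqli("id=1+AND+(SELECT+1+FROM+x+UNION+SELECT+2)"))
print(looks_like_sqli("page=about&lang=en"))
```

Decoding before matching matters, because automated attacks usually URL-encode their payloads and a regex run against the raw query string would miss them.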
I've run into Apache modules that do this kind of thing in a pretty serious way. I believe it was ModSecurity, and it sometimes caused problems of its own -- e.g., it might interfere with legitimate site functions. It was difficult to track down the source of certain problems until I realized that Apache was using pattern matching to automatically send 4xx responses whenever the request URL matched some suspicious-looking pattern.