
Detection improvement idea #287

Open
gugu opened this issue Feb 23, 2025 · 2 comments
gugu commented Feb 23, 2025

Steps to reproduce

Right now, most malicious bot authors copy the user-agent string from their browser. Since it is identical to a real browser's user agent, there is no way to distinguish between them by the string alone. However, there is a way to detect such bots (and that is how we detect them in our link shortener).

Since most browsers are evergreen and auto-update, their version numbers keep moving forward. We use the caniuse.com database to look up browser usage share (it is available for download in JSON format).
This is how we do it:

If the browser's version is old and its usage share is below a threshold (0.1% for us), we mark that browser as a bot.
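
A minimal sketch of that check, assuming the caniuse data has been downloaded as JSON (for example via the `caniuse-db` npm package, whose `data.json` lists per-version usage under `agents.<browser>.usage_global`); the threshold constant and function name here are illustrative, not part of any existing library:

```ts
import caniuse from "caniuse-db/data.json"; // assumption: caniuse usage data available locally

// Global usage share (percent) below which an old browser version is treated as suspicious.
const USAGE_THRESHOLD = 0.1;

function isSuspiciouslyOldBrowser(browser: string, version: string): boolean {
  const agent = (caniuse as any).agents?.[browser];
  if (!agent) return false; // unknown browser: defer to other checks

  const usage = agent.usage_global?.[version];
  if (usage === undefined) return false; // version not tracked by caniuse

  return usage < USAGE_THRESHOLD; // e.g. Chrome 76 in Feb 2025 -> true, Chrome 133 -> false
}
```

The data file, not the code, encodes which versions are current, so the same check naturally starts flagging a version once its usage share drops.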

Expected behaviour

Chrome 133 returns isbot = false in Feb 2025, but would return isbot = true in Feb 2026, once its usage share has dropped below the threshold.

Actual behaviour

It marks Chrome 76 as a human, even though virtually no human uses it anymore.

Additional details

While we don't have the resources to open-source our solution, I would like to share the idea with you. It is up to you whether you would like to implement this or whether it is out of scope for the module.

@sagrawal31

Interesting, but I'm not sure how optimised this approach would be.

omrilotan (Owner) commented Feb 24, 2025

That's a very interesting idea. Finding old browsers suspicious can be statistically efficient. That said, in my experience there are quite a few organisations that are limited in the browsers they can use, for various reasons.

Your solution sounds like something that could come in useful in extensive parsers like ua-parser-js, as a "modern" / "dated" browser flag.

However, this library is solely focused on "good bots" - ones that intentionally identify themselves as automated services.

What does "isbot" do?

This package aims to identify "good bots": those that voluntarily identify themselves by sending a unique, preferably descriptive, user agent, usually via a dedicated request header.

What doesn't "isbot" do?

It does not try to recognise malicious bots or programs disguising themselves as real users.
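
For illustration, a quick sketch of that scope, using the named `isbot` export from recent versions of the package; the user-agent strings are just examples:

```ts
import { isbot } from "isbot";

// A self-declared crawler is detected by its descriptive user agent:
isbot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"); // true

// A plain browser user agent is not flagged, no matter how old the version is:
isbot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"); // false
```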

See Definitions and Clarifications sections in the readme.

As a rule, I don't recommend using this tool as a security feature. Malicious traffic will either try to disguise itself as a legitimate browser, in which case the user agent string is not enough, or as a web crawler (like Googlebot), in which case you might wrongfully allow it into your network.
