
Detection improvement idea #287

Open
gugu opened this issue Feb 23, 2025 · 2 comments
gugu commented Feb 23, 2025

Steps to reproduce

Right now, most malicious bot authors copy the user-agent string from their browser. Since it is identical to a real browser's user agent, there is no way to distinguish between them by the string alone. However, there is a way to detect such bots (and that is how we detect them in our link shortener).

Since most browsers are evergreen and auto-update, their version numbers keep moving forward. We use the caniuse.com database to look up browser usage share (it is available for download in JSON format).
This is how we do it:

If the browser's version is old and its usage share is below a threshold (0.1% for us), we mark that browser as a bot.
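
A minimal sketch of that check, assuming the caniuse data has been downloaded as JSON (for example via the `caniuse-db` npm package, whose `data.json` lists per-version usage under `agents.<browser>.usage_global`); the threshold constant and function name here are illustrative, not part of any existing library:

```ts
import caniuse from "caniuse-db/data.json"; // assumption: caniuse usage data available locally

// Global usage share (percent) below which an old browser version is treated as suspicious.
const USAGE_THRESHOLD = 0.1;

function isSuspiciouslyOldBrowser(browser: string, version: string): boolean {
  const agent = (caniuse as any).agents?.[browser];
  if (!agent) return false; // unknown browser: defer to other checks

  const usage = agent.usage_global?.[version];
  if (usage === undefined) return false; // version not tracked by caniuse

  return usage < USAGE_THRESHOLD; // e.g. Chrome 76 in Feb 2025 -> true, Chrome 133 -> false
}
```

The data file, not the code, encodes which versions are current, so the same check naturally starts flagging a version once its usage share drops.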

Expected behaviour

Chrome 133 returns isbot = false in Feb 2025, but would return isbot = true in Feb 2026, once its usage share has dropped below the threshold.

Actual behaviour

It marks Chrome 76 as a human, even though virtually no human uses it anymore.

Additional details

While we don't have the resources to open-source our solution, I would like to share the idea with you. It is up to you whether you would like to implement this or whether it is out of scope for the module.

@sagrawal31

Interesting, but I'm not sure how optimised this approach would be.

omrilotan (Owner) commented Feb 24, 2025

That's a very interesting idea. Finding old browsers suspicious can be statistically efficient. That said, in my experience there are quite a few organisations that are limited in the browsers they can use, for various reasons.

Your solution sounds like something that could come in useful in extensive parsers like ua-parser-js, as a "modern" / "dated" browser flag.

However, this library is solely focused on "good bots" - ones that intentionally identify themselves as automated services.

What does "isbot" do?

This package aims to identify "good bots": those that voluntarily identify themselves by sending a unique, preferably descriptive, user agent, usually via a dedicated request header.

What doesn't "isbot" do?

It does not try to recognise malicious bots or programs disguising themselves as real users.
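
For illustration, a quick sketch of that scope, using the named `isbot` export from recent versions of the package; the user-agent strings are just examples:

```ts
import { isbot } from "isbot";

// A self-declared crawler is detected by its descriptive user agent:
isbot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"); // true

// A plain browser user agent is not flagged, no matter how old the version is:
isbot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"); // false
```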

See Definitions and Clarifications sections in the readme.

As a rule, I don't recommend using this tool as a security feature. Malicious traffic will either try to disguise itself as a legitimate browser, in which case the user agent string is not enough, or as a web crawler (like Googlebot), in which case you might wrongfully allow it into your network.
