Protection against spam #299
Comments
Currently I use the following script, which generates Python code that deletes users along with all their content:

from django.contrib.auth.models import User
from textwrap import wrap

lines = []
for user in User.objects.order_by('-date_joined')[:100]:
    # One entry per user: primary key, then username and email as a trailing comment.
    lines += ['', f"{user.pk:>6}, # {user.username:<20} {user.email:<40}"]
    # Include the user's most recent comment (if any) to help spot spam.
    last_comment = getattr(user.st_comments.order_by('-date').first(), 'comment', None)
    if last_comment:
        lines += [''] + wrap(
            last_comment,
            initial_indent=(' ' * 8) + ' # ',
            subsequent_indent=(' ' * 8) + ' # ',
            max_lines=8,
            width=72,
        )
lines = '\n'.join(lines)
print(f"\nUser.objects.filter(pk__in=[\n{lines}\n]).delete()\n\n")

I then review the generated script, remove all non-spam users from the list, and run it.
Deleting users is probably never a good idea. They can just register again with the same email. You should deactivate their accounts instead. Hard-deleting topics and comments will break notifications, bookmarks, and possibly other things. There is a very simple registration protection that may help against bots, but humans will bypass any protection anyway. A few things may help, like not allowing new users to post links, or having a queue of messages that trusted users/mods can review and approve (à la Stack Overflow); but things like captchas are annoying and useless against humans. There should be a way to soft-delete all topics/comments by a user. I wonder, what kind of spam did you get?
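The idea above of not letting new users post links could be sketched roughly as follows. This is only an illustration in plain Python, not Spirit's code: `may_post`, `is_trusted`, `MIN_ACCOUNT_AGE`, and `MIN_APPROVED_COMMENTS` are hypothetical names and thresholds, and the link regex is deliberately crude.

```python
import re
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
MIN_ACCOUNT_AGE = timedelta(days=3)
MIN_APPROVED_COMMENTS = 5

# Deliberately crude link detector: explicit schemes and "www." prefixes.
LINK_RE = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def is_trusted(date_joined, approved_comments, now):
    """A user counts as trusted once the account is old and active enough."""
    return (now - date_joined >= MIN_ACCOUNT_AGE
            and approved_comments >= MIN_APPROVED_COMMENTS)


def may_post(text, date_joined, approved_comments, now=None):
    """Return True if the comment can be published without review."""
    now = now or datetime.now()
    if is_trusted(date_joined, approved_comments, now):
        return True
    # New users may post plain text, but anything with a link is held back.
    return LINK_RE.search(text) is None
```

Comments for which `may_post` returns False could be placed in the review queue mentioned above instead of being rejected outright.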
Added reCAPTCHA, will see if it helps: sirex/ubuntu.lt@bb00c03...959453e Spam is generated not by humans but by bots; somehow they easily get through all the email verification. And email addresses are not the issue, they use a random email every time. Here are a few examples:
Quite quickly I reached the point where most users were spam users; more precisely, there were about 6000 real users and more than 6000 spam users. As for messages, some topics have 3 posts by real users and 1000 posts by fake spam users. If I marked those posts as deleted, I would see endless pages of deleted posts. So this is a case where most of the content is generated by fake spam users, and there is no point keeping that spam in the database. I hope reCAPTCHA will help. The next thing is to somehow fix the incorrect comment counts and the redirects to nonexistent pages. The forum in question is https://ubuntu.lt/. The registration form with reCAPTCHA looks like this: https://ubuntu.lt/user/register/
I would be really surprised if all this spam were generated by real humans. My guess is that spam bots have just become very sophisticated. Without serious anti-spam protection, they can ruin a community forum in days. And the ubuntu.lt community forum has existed for more than 15 years.
Am I supposed to see the captcha right away? I see there is a "captcha" label, but that's it... there is no captcha.
If they are bots then the captcha should help. Let me know how it goes; it may be worth adding it as an optional feature.
That's a good point.
There are several reCAPTCHA versions. I'm using the latest, v3, which somehow detects bots automagically without showing an image or anything like that. Older versions show some images and ask you to enter what is in them.
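For context, reCAPTCHA v3 returns a score rather than showing a challenge: the browser obtains a token, and the server sends it to Google's `siteverify` endpoint, which replies with JSON containing `success`, `score` (0.0 to 1.0, higher meaning more likely human) and `action`. A minimal server-side sketch follows. The helper names, the `"register"` action name, and the 0.5 threshold are assumptions for illustration; this is not the code from the ubuntu.lt commit.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"
SCORE_THRESHOLD = 0.5  # assumption: tune for your own traffic


def fetch_verdict(secret_key, token, timeout=5):
    """POST the client token to siteverify and return the decoded JSON reply."""
    data = urllib.parse.urlencode({"secret": secret_key, "response": token}).encode()
    with urllib.request.urlopen(VERIFY_URL, data=data, timeout=timeout) as resp:
        return json.load(resp)


def looks_human(verdict, expected_action="register", threshold=SCORE_THRESHOLD):
    """Decide from a siteverify reply whether to accept the registration."""
    return (
        verdict.get("success") is True
        and verdict.get("action") == expected_action
        and verdict.get("score", 0.0) >= threshold
    )
```

A registration view would call `looks_human(fetch_verdict(secret, token))` and reject or queue the signup when it returns False.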
I think I managed to fix the comment count and last active date for topics with this query:

from spirit.topic.models import Topic
from spirit.comment.models import Comment
from django.db.models import Case, When, Value, Exists, OuterRef, Subquery, Count, Max

Topic.objects.update(
    # Recount non-removed regular comments per topic; topics with none get 0.
    comment_count=Case(
        When(
            Exists(
                Comment.objects.
                filter(
                    topic_id=OuterRef('id'),
                    is_removed=False,
                    action=Comment.COMMENT,
                )
            ),
            then=Subquery(
                Comment.objects.
                filter(
                    topic_id=OuterRef('id'),
                    is_removed=False,
                    action=Comment.COMMENT,
                ).
                values('topic_id').
                order_by('topic_id').
                annotate(comment_count=Count('*')).
                values('comment_count')[:1]
            ),
        ),
        default=Value(0),
    ),
    # Date of the most recent comment in each topic.
    last_active=Subquery(
        Comment.objects.
        filter(topic_id=OuterRef('id')).
        values('topic_id').
        order_by('topic_id').
        annotate(last_active=Max('date')).
        values('last_active')[:1]
    ),
)
The last fix is for bookmarks pointing to a comment number that no longer exists. The fix is not perfect: it only ensures that the comment number is not greater than the total number of comments in a topic. So it does not guarantee that a bookmark points to the correct last-seen comment, but it ensures that you do not end up on a 404 page when the comment number points to a nonexistent comment. The query used was:

from spirit.comment.bookmark.models import CommentBookmark
from django.db.models import Case, When, OuterRef, Subquery, Count, F, PositiveIntegerField

CommentBookmark.objects.update(
    comment_number=Subquery(
        CommentBookmark.objects.
        filter(id=OuterRef('id')).
        values('user_id', 'topic_id', 'comment_number').
        order_by('user_id', 'topic_id', 'comment_number').
        annotate(
            # Number of comments currently in the bookmark's topic.
            total_comments=Count('topic__comment'),
        ).
        annotate(
            # Clamp the bookmark so it never exceeds that total.
            comment_number_=Case(
                When(
                    comment_number__gt=F('total_comments'),
                    then=F('total_comments'),
                ),
                default=F('comment_number'),
                output_field=PositiveIntegerField(),
            ),
        ).
        values('comment_number_')[:1],
    ),
)
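To make the limitation concrete, here is the same clamping rule in plain Python (a sketch only; `clamp_bookmark` is a hypothetical helper, not part of Spirit). A bookmark is pulled back only when it points past the end of the topic, so a bookmark that is still in range may now point at a different comment than the one the user last saw.

```python
def clamp_bookmark(comment_number, total_comments):
    """Return a bookmark position that never exceeds the topic's comment count."""
    return min(comment_number, total_comments)


# A topic that had 1003 comments and has 3 left after spam removal:
print(clamp_bookmark(1000, 3))  # bookmark past the end is pulled back to 3
print(clamp_bookmark(2, 3))     # bookmark still in range is left unchanged
```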
So in summary, in order to improve protection against spam, Spirit needs the following things:
If these features were available, spam bots would not be able to attack the forum on such a massive scale as happened with the ubuntu.lt community forum.
Now more than a month has passed, and during that time I found 4 new spam users; it looks like at least two of them were created manually. So it looks like reCAPTCHA did the job.
I just finished cleaning up spam users. It looks like Spirit does not have any protection against spam, because I had to clean up about 6000 spam users with random usernames and random emails.
I will try to hack something together to add protection against spam, but Spirit should definitely have this built in.
Also, since Spirit does not have good moderation tools, I had to delete spam users directly from the Python shell, but now I have incorrect comment numbers and errors when Spirit tries to jump to a page that no longer exists. So it would be nice to have a script that would update all that.