
Member activity report (star helper automation) #58

Open
vcarl opened this issue Jul 22, 2024 · 4 comments · Fixed by #73
vcarl commented Jul 22, 2024

We currently administer the Star Helper program manually, querying metrics resources for member activity and using spreadsheets to build a report for grading.

The metrics we track are quite simple: for a list of channels, rank the top 100 members by a combination score of

  • # of messages over the entire grading period
  • # of channels over the entire grading period
  • # of channels per day within the grading period

This should actually be

```diff
- # of messages over the entire grading period
+ volume of material posted over the entire grading period
  # of channels over the entire grading period
  # of channels per day within the grading period
```

because counting messages treats a 1-word reply and a 3-paragraph treatise as equivalent. We should count words or characters. (words, and characters/word separately?? :galaxybrain:)
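A rough sketch of what a word/character volume metric could look like (function and field names here are assumptions, not a settled design):

```typescript
// Hypothetical volume metric: count words and average characters per word,
// so a one-word reply and a three-paragraph answer score differently.
function volumeStats(content: string): { words: number; charsPerWord: number } {
  const words = content.trim().split(/\s+/).filter(Boolean);
  const chars = words.reduce((sum, w) => sum + w.length, 0);
  return {
    words: words.length,
    charsPerWord: words.length === 0 ? 0 : chars / words.length,
  };
}
```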

Our current scoring thresholds, which could probably use some updating:

| Messages | 0 | 200 | 400 | 800 | 1500 |
|---|---|---|---|---|---|
| Score | 0 | 1 | 2 | 3 | 4 |

| Channels | 0 | 2 | 5 | 10 | 20 |
|---|---|---|---|---|---|
| Score | 0 | 0.25 | 1 | 2 | 3 |

| Channels/day | 0 | 1 | 3 | 6 | 8 |
|---|---|---|---|---|---|
| Score | 0 | 0.25 | 0.5 | 1 | 2 |
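Read literally, each row maps a value to the score of the highest threshold it reaches. A minimal sketch of that scoring, assuming the three component scores are simply summed (the issue doesn't specify how they combine):

```typescript
// Threshold/score bands taken from the tables in this issue.
type Band = { min: number; score: number };

const MESSAGE_BANDS: Band[] = [
  { min: 0, score: 0 },
  { min: 200, score: 1 },
  { min: 400, score: 2 },
  { min: 800, score: 3 },
  { min: 1500, score: 4 },
];

const CHANNEL_BANDS: Band[] = [
  { min: 0, score: 0 },
  { min: 2, score: 0.25 },
  { min: 5, score: 1 },
  { min: 10, score: 2 },
  { min: 20, score: 3 },
];

const CHANNELS_PER_DAY_BANDS: Band[] = [
  { min: 0, score: 0 },
  { min: 1, score: 0.25 },
  { min: 3, score: 0.5 },
  { min: 6, score: 1 },
  { min: 8, score: 2 },
];

// Return the score of the highest band the value reaches.
function bandScore(value: number, bands: Band[]): number {
  let score = 0;
  for (const band of bands) {
    if (value >= band.min) score = band.score;
  }
  return score;
}

// Combined score; summation is an assumption, not the documented formula.
function combinedScore(
  messages: number,
  channels: number,
  channelsPerDay: number,
): number {
  return (
    bandScore(messages, MESSAGE_BANDS) +
    bandScore(channels, CHANNEL_BANDS) +
    bandScore(channelsPerDay, CHANNELS_PER_DAY_BANDS)
  );
}
```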


vcarl commented Jul 24, 2024

Doing this maximally right would probably mean finding an appropriate time-series datastore for the metrics data, but if we generate reports and then delete the underlying data, that's probably sufficient for our needs here, because the data is very unlikely to scale beyond megabytes. When feasible (privacy/scalability concerns are relevant here), we should store data that allows derived values to be recalculated; i.e., it seems better to me to track messages than to update a score on each message.
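A sketch of the "store raw events, derive values at report time" idea (the event and report shapes here are assumptions, not a final schema):

```typescript
// Minimal per-message row kept during the grading period.
interface MessageEvent {
  authorId: string;
  channelId: string;
  timestamp: number; // epoch ms
}

// Derived per-member report; recalculable as long as the raw rows exist.
interface MemberReport {
  authorId: string;
  messages: number;
  channels: number;
  channelsPerDay: number;
}

function buildReport(events: MessageEvent[], days: number): MemberReport[] {
  const byAuthor = new Map<string, MessageEvent[]>();
  for (const e of events) {
    if (!byAuthor.has(e.authorId)) byAuthor.set(e.authorId, []);
    byAuthor.get(e.authorId)!.push(e);
  }
  return [...byAuthor.entries()].map(([authorId, list]) => {
    // Distinct channels touched, per calendar day and overall.
    const perDay = new Map<number, Set<string>>();
    for (const e of list) {
      const day = Math.floor(e.timestamp / 86_400_000);
      if (!perDay.has(day)) perDay.set(day, new Set());
      perDay.get(day)!.add(e.channelId);
    }
    let channelDays = 0;
    for (const s of perDay.values()) channelDays += s.size;
    return {
      authorId,
      messages: list.length,
      channels: new Set(list.map((e) => e.channelId)).size,
      channelsPerDay: channelDays / days,
    };
  });
}
```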

This is likely aided by completing #60 and making it easier to work with the database first, but if we use e.g. a Cloudflare Analytics Engine that may not be necessary.

This is going in mod-bot instead of reactibot because this relates to context over time and a mods-eye view of the server: cultural curation, vs specialized code for Reactiflux.


vcarl commented Aug 1, 2024

This feature, fully implemented, should allow for different groups of channels to be configured. "Channel category" is a good shortcut for selecting many channels, but arbitrary channels should be able to be grouped together for scoring purposes.

To describe it abstractly, I want this to reflect the social reality that people tend to select into distinct social groups based on what channels they participate in vs read vs never use. More concretely, this should allow us to set up qualitative participation metrics for help channels (Star Helpers), career channels, and social channels.
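One way the grouping described above could be configured (a sketch; the names and shape are assumptions):

```typescript
// Hypothetical scoring-group config: a group can mix whole channel
// categories with arbitrary individual channels.
interface ScoringGroup {
  name: string; // e.g. "Star Helpers", "career", "social"
  categoryIds: string[]; // Discord category (channel-parent) IDs
  channelIds: string[]; // extra channels outside those categories
}

// A channel belongs to the group if its own id is listed, or if its
// parent category is listed.
function inGroup(
  group: ScoringGroup,
  channelId: string,
  parentId?: string,
): boolean {
  return (
    group.channelIds.includes(channelId) ||
    (parentId !== undefined && group.categoryIds.includes(parentId))
  );
}
```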

Part of why it'd be advantageous to store data and calculate derived values for reports is to allow some flexibility in how those reports are generated. I'm not sure, e.g., what thresholds will best capture participation, so I want to track enough metadata to leave room to play with. It might make sense in the future, once those lessons about thresholds and modes of participation are better understood, to reduce the amount of data stored by calculating scores in real time.


vcarl commented Aug 1, 2024

I've been thinking of this lately:

> We should count words or characters. (words, and characters/word separately?? :galaxybrain:)

I think there's a lot of value in getting the metadata right, as a means of generating signal for describing participation. Some thoughts, as a wishlist:

  • Author (duh), person replied to, (if in a thread) thread owner (for Reactibot threads, author of first message)
  • Channel/channel category
  • Emoji count + variety
  • Message length + wordcount
  • Inferred language/locale
  • Complex linguistic evaluations, like Flesch Kincaid, Lexile, or Gunning fog index
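The wishlist above could be captured in a per-message record along these lines (field names are assumptions; readability scores would be computed by a separate library):

```typescript
// Hypothetical per-message metadata record covering the wishlist.
interface MessageRecord {
  authorId: string;
  repliedToId?: string; // person replied to, if a reply
  threadOwnerId?: string; // thread owner; for Reactibot threads, the
  // author of the first message
  channelId: string;
  channelCategoryId?: string;
  emojiCount: number;
  emojiVariety: number; // number of distinct emoji used
  charCount: number;
  wordCount: number;
  locale?: string; // inferred language/locale
  readability?: { fleschKincaid?: number; gunningFog?: number };
}

// Sketch of emoji count + variety for Unicode emoji. Note: Discord custom
// emoji are text like `<:name:id>` and would need separate handling.
function emojiStats(content: string): { count: number; variety: number } {
  const matches = content.match(/\p{Extended_Pictographic}/gu) ?? [];
  return { count: matches.length, variety: new Set(matches).size };
}
```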

@polar-sh polar-sh bot added the Fund label Aug 2, 2024
This was referenced Sep 6, 2024
@vcarl vcarl closed this as completed in #73 Oct 8, 2024
vcarl pushed a commit that referenced this issue Oct 8, 2024

vcarl commented Oct 24, 2024

Whoops: #72 closed #76, not this issue. Going to reopen and mark #76 as a sub-issue of this.

@vcarl vcarl reopened this Oct 24, 2024