
RFC: feat: Input type must be Default+Serde but not POD #82

Merged
merged 1 commit into main from no-bytemuck-bounds-on-input on Dec 14, 2024

Conversation

@caspark (Contributor) commented Nov 13, 2024

Change Description

(from commit msg)

Remove the bytemuck bounds on Config::Input, and replace them with Default and Serialize + DeserializeOwned.

  1. Default is a less strict bound than Zeroable, and it's all we need - so it's more ergonomic.
  2. NoUninit + CheckedBitPattern are also fairly stringent requirements, but it turns out that the only place that we actually rely on them is when we serialize or deserialize the input type. So we can just rely on the relevant Serde trait bounds instead, which again is more ergonomic.

Both changes are backwards incompatible, though they should be quite easy to resolve for existing users of GGRS (unless they have a philosophical objection to using Serde directly, I suppose).
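
For illustration, here is the kind of input type the relaxed bounds would permit. The specific type, its fields, and the extra derives (Clone, Copy, PartialEq) are assumptions for this sketch, not something the PR prescribes:

```rust
use serde::{Deserialize, Serialize};

// Hypothetical input type: enums with fields and plain bools become usable
// once the bytemuck bounds are replaced with Default + Serde.
#[derive(Clone, Copy, Default, PartialEq, Serialize, Deserialize)]
enum Action {
    #[default]
    Idle,
    Move { dx: i8, dy: i8 }, // an enum with fields, as in #40
    Jump,
}

#[derive(Clone, Copy, Default, PartialEq, Serialize, Deserialize)]
struct PlayerInput {
    action: Action,
    fire: bool, // no NoUninit / CheckedBitPattern needed for bools
}
```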

However, number 2 does have some significant tradeoffs associated with it; in order of least to most bad:

  • A) Serde becomes part of our public API.
  • B) Bincode's chosen serialization scheme may interfere with the "XOR with previous input then run length encode the result" trick we use to shrink the size of inputs when we send them to connected endpoints (see the sketch after this list). If this is a problem for a user, they can always manually serialize to a byte array in such a way that the run length encoding works better for their data.
  • C) Users may bloat their input struct to the point where it (or N instances of it, where N == number of local players) is larger than the maximum size we limit ourselves to for a single UDP packet. This is particularly problematic since right now we just panic (due to a failed assertion) when this happens.
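
To make tradeoff B more concrete, here is a minimal sketch of the XOR-then-run-length-encode idea (an illustration of the principle only, not GGRS's actual encoder). A serialization layout that keeps rarely-changing bytes together leaves long zero runs in the delta against the previous input, which compress well; a layout that scatters them does not.

```rust
// XOR each byte of this frame's input against the previous frame's bytes.
// Unchanged (or zero-padded) regions become runs of zeros.
fn xor_delta(prev: &[u8], curr: &[u8]) -> Vec<u8> {
    curr.iter().zip(prev).map(|(c, p)| c ^ p).collect()
}

// Run-length encode as (value, run length) pairs; runs over 255 are split.
fn run_length_encode(bytes: &[u8]) -> Vec<(u8, u8)> {
    let mut out: Vec<(u8, u8)> = Vec::new();
    for &b in bytes {
        match out.last_mut() {
            Some((value, count)) if *value == b && *count < u8::MAX => *count += 1,
            _ => out.push((b, 1)),
        }
    }
    out
}
```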

Discussion

Firstly, I am aware that you can already do "custom serialization of an arbitrary input type" (hereafter referred to as W) by declaring Config::Input as [u8; N], with some carefully-chosen-to-be-big-enough N; the XOR + RLE that's done on input bytes should make the cost of any extra padding bytes (over and above what's needed to store the arbitrary input type) close to zero, as any long run of trailing zeros would be compressed to just a couple of bytes.
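
A rough sketch of that workaround (W); the buffer size, the "real" input type, and the use of bincode here are assumptions for illustration:

```rust
use serde::{Deserialize, Serialize};

const INPUT_BYTES: usize = 32; // carefully chosen to be "big enough"

#[derive(Serialize, Deserialize)]
struct RealInput {
    buttons: u16,
    stick_x: i8,
    stick_y: i8,
}

// Pack the real input into the fixed-size array declared as Config::Input.
// Unused trailing bytes stay zero, so XOR + RLE compresses them to almost nothing.
fn pack(input: &RealInput) -> [u8; INPUT_BYTES] {
    let mut buf = [0u8; INPUT_BYTES];
    let encoded = bincode::serialize(input).expect("input serializes");
    buf[..encoded.len()].copy_from_slice(&encoded);
    buf
}

fn unpack(buf: &[u8; INPUT_BYTES]) -> RealInput {
    bincode::deserialize(buf).expect("input deserializes")
}
```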

Now having said that, requiring various Bytemuck bounds on Input is quite a strong stipulation, and it sure would be nice if you could e.g. use enums with fields ( #40 ), or even just plain old bools in your input type.

So that's what this PR gets you: any type you can serialize, you can stick into your Config::Input struct - which is nice and convenient.

But: I am not exactly sure this is a wise PR to merge, mainly because of tradeoff C listed above (B seems manageable, A seems negligible). In fact, I don't think it would make sense to merge this if we don't come up with a good way to mitigate the impact of C, because clearly "crash when enough people press enough buttons" is pretty terrible behavior.

One approach (let's call it X) for mitigating C would be to treat the current assert as an error and propagate it all the way up to user code for handling. But I suspect that would be quite messy to do, and what do you do if you get that error at that point? I think there isn't really much in the way of meaningful error handling.

Another approach I can think of (Y) is that when you set a local player's input, GGRS encodes it together with the other inputs gathered so far this frame to check the total size, and if "size of these extra bytes plus the size of other local inputs for this frame so far" exceeds the maximum packet size, then it returns an error. The user could then choose to drop the input on the floor (or defer it to a later frame) by replacing it with a default input, which would (hopefully) fit... but there's still a possibility that it wouldn't fit, and then the only sensible thing to do is to crash anyway.

(And besides, the "shape" of solution Y looks very similar to what's possible today via W, but at least in that approach the user is explicitly forced to think about the size of their byte array, rather than attempting to deal with their data not fitting later. Y does open up the possibility of larger input structs since the check could be done after the XOR + RLE, but that would also make it less predictable when errors occur.)
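
For concreteness, Y might be shaped something like the following (all names are hypothetical, not GGRS API, and the size check here runs before any XOR + RLE compression):

```rust
struct InputTooLarge;

// Hypothetical check performed when a local input is added for this frame.
fn try_add_local_input<T: serde::Serialize>(
    pending_frame_bytes: &mut Vec<u8>,
    input: &T,
    max_packet_payload: usize,
) -> Result<(), InputTooLarge> {
    let encoded = bincode::serialize(input).map_err(|_| InputTooLarge)?;
    if pending_frame_bytes.len() + encoded.len() > max_packet_payload {
        // Caller can substitute a default input, defer to a later frame, or give up.
        return Err(InputTooLarge);
    }
    pending_frame_bytes.extend_from_slice(&encoded);
    Ok(())
}
```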

So... that's where I'm at. I started down this rabbit hole thinking this approach probably wouldn't work, then it turned out that it works quite nicely, but it potentially encourages running into a currently-quite-niche edge case - and I haven't got a nice design solution for that.

Still, I figured I'd write up and raise the PR to give others a chance to weigh in (and there's always the chance I've misunderstood something; I'm still new to this crate).

@caspark force-pushed the no-bytemuck-bounds-on-input branch 2 times, most recently from f90605e to 6dcf1c1 on November 13, 2024 02:38
@gschup (Owner) commented Nov 20, 2024

Hi, thanks for the PR! I understand the sentiment behind trying to get away from bytemuck. From an ease-of-use perspective, I prefer the serde bounds.

My only concern is performance, which was the motivation for picking bytemuck in the first place. It might be over-optimization and probably doesn't matter, but I would like to see some numbers in order to understand whether this approach is similar in performance.

@caspark (Contributor, Author) commented Nov 20, 2024

Regarding performance, the bytemuck code path is (afaik) approximately C's calloc + Rust's transmute - it is basically impossible to beat that. But, well...

  • we are talking about initializing/serializing/deserializing structs that are strictly less than 467 bytes, typically closer to 10-100 bytes (before XOR+RLE), and doing that perhaps 1-4 times in the context of doing network IO in the usual case or 1-4x multiplied by up to 8 advance-frames of the main game simulation in the worst case.
  • using the log benchmark from the rust serialization benchmark as a (probably pessimistic?) proxy for serializing and deserializing some game input, I figure it's approximately 300 nanoseconds to serialize and deserialize a single log entry using bincode1 (set bandwidth to many many tb, CPU to 1, dataset to log, mode to roundtrip, untick zlib & zstd, find bincode1 entry, divide 1 second by 3421986 logs/second -> approx 292 nanoseconds per log).
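
(For anyone who does want a local number rather than extrapolating from the published benchmark, a throwaway roundtrip timing loop would be enough; the input struct, iteration count, and use of bincode 1.x below are assumptions.)

```rust
use std::hint::black_box;
use std::time::Instant;
use serde::{Deserialize, Serialize};

#[derive(Default, Serialize, Deserialize)]
struct Input {
    buttons: u32,
    stick: (i8, i8),
}

fn main() {
    let input = Input::default();
    let iterations: u32 = 1_000_000;
    let start = Instant::now();
    for _ in 0..iterations {
        let bytes = bincode::serialize(black_box(&input)).unwrap();
        let decoded: Input = bincode::deserialize(&bytes).unwrap();
        black_box(decoded);
    }
    println!("~{:?} per serialize+deserialize roundtrip", start.elapsed() / iterations);
}
```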

I personally don't think it's worth doing any benchmarking or number crunching beyond those back of the envelope calculations?

I think point C I discussed above is much more of an issue.

FWIW I think even if we don't¹ merge this particular PR due to that concern, replacing the Zeroable bound with a Default bound would still improve ergonomics.

Footnotes

  1. I am actually fine with not merging this PR - I have applied the "input is a fixed size byte array" workaround myself and it was easy enough and works fine. So I don't really need to have the Bytemuck bounds relaxed myself.

@gschup (Owner) commented Dec 12, 2024

I thought a little bit about this and I think it might be best to just simplify these bounds as you did in the PR. If users start to use unreasonably large input structs, then we can help them optimize afterwards.

If you clean up the merge conflicts, I will merge this and make a new GGRS release. We've got enough changes by now :)

@caspark (Contributor, Author) commented Dec 13, 2024

Will do - don't have time today but hope to find some time over the weekend.

Re new release, #79 introduced 2 bugs that I still need to PR:

  • rolling back to the furthest frame in the prediction window currently panics - easy to trigger on bad connections. I have a fix for this already.
  • disconnects aren't handled properly. I fixed this for non-lockstep code but lockstep mode still crashes so I need to find another hour or two to investigate more.

I haven't raised issues for these yet¹, but still, I would recommend holding off on releasing for a bit :)

Footnotes

  1. The new issue template for this repo is extensive enough that by the time I've gathered enough info to fill out the template then I'll have figured out the bug well enough to just raise the PR. Sometimes that's a good thing though.

@gschup (Owner) commented Dec 13, 2024

  1. The new issue template for this repo is extensive enough that by the time I've gathered enough info to fill out the template then I'll have figured out the bug well enough to just raise the PR. Sometimes that's a good thing though.

This is simply the default template proposed by github. Please don't feel restricted by it. We could also change it!

@caspark force-pushed the no-bytemuck-bounds-on-input branch from 6dcf1c1 to a06b63b on December 14, 2024 06:37
@caspark (Contributor, Author) commented Dec 14, 2024

I've rebased this change onto main now and fixed the merge conflicts. With #87 "fixing" problem C, I think it should be ready for merge.

Re the issue template being the default GitHub one - ha, TIL! I've not turned it on for any projects so I had no idea :)
