Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data object encoding #692

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

nevans
Copy link

@nevans nevans commented Oct 30, 2024

The first commit adds basic encoding/decoding for ruby 3.2 Data objects in (what seems to me) the most straightforward way: Visitor::YamlTree copies all of the members out into a mapping, and Visitor::ToRuby calls #initialize with that mapping. This could lead to side effects, if users have overridden #initialize in weird ways, but that's true of Struct and the attr writer methods and many of the types in Visitors::ToRuby, too.

The second commit handles instance variables, and I feel less happy with that implementation. It sets the ivars before calling initialize, which feels wrong. But Data doesn't give us any mechanism for setting the members other than 1) initialize, or 2) drop down into the C API. Since initialize freezes the object, we need to set the ivars before that. I think this is a reasonable compromise—if users need better handling, they can implement their own encode_with and init_with. But it will lead to unhappy surprises for some users.

Alternatively, maybe we could use the C API, similarly to Marshal. Psych is already using the C API for path2class and build_exception. This would be the least surprising behavior for users, I think.

Edited to add: I added ToRuby#init_struct(obj, members) to the C extension, which delegates to rb_struct_initialize.

This sets the ivars _before_ calling initialize, which feels wrong.  But
Data doesn't give us any mechanism for setting the members other than 1)
initialize, or 2) drop down into the C API.  Since initialize freezes
the object, we need to set the ivars before that.  I think this is a
reasonable compromise—if users need better handling, they can implement
their own `encode_with` and `init_with`.  But it will lead to unhappy
surprises for some users.

Alternatively, we could use the C API, similarly to Marshal.  Psych _is_
already using the C API for path2class and build_exception.  This would
be the least surprising behavior for users, I think.
@nevans
Copy link
Author

nevans commented Nov 1, 2024

FYI: I haven't put any guards in yet to make this work for older versions of ruby (before Data was defined).

@nevans
Copy link
Author

nevans commented Nov 5, 2024

I'll update this to pass for earlier versions of ruby.

But I'm also looking for feedback on the approach. The more I think about it, the more I'm thinking this should use rb_struct_initialize from the C API, then copy over any ivars, then freeze. Unless I'm misreading the C code (very possible), that will skip any user-defined #initialize override, and just use the underlying Struct implementation... and, although that's not identical to rb_data_initialize_m, its close enough and it's exported and in the headers. That's what Marshal does, since Marshal just treats Data like Struct... which also means that Marshal does not freeze Data objects!

1. `defined?(::Data)` is insufficient: there was an earlier `::Data`
   class before the new one was added.  Checking for `Data.define`.
2. arg forwarding with `def foo(...)` wasn't introduced until 2.7.3.
3. implicit hash/kwparam values wasn't added until 3.1.
@nevans
Copy link
Author

nevans commented Nov 8, 2024

For what it's worth, I've updated the PR to:

  1. pass with ruby versions down to 2.5 (at least on my machine)
  2. use rb_struct_initialize from the C API

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant