-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dtype choice in step type/functions #256
Comments
I think that would be a nice addition, happy to review it 😄 |
To be honest I think the discounts should probably be booleans while they are stored in the timestep because for me they just indicated end of episode, but I think this would add nice flexibility |
I'm fine with both, as long as it doesn't take too much space. I go with argument set by default to boolean ? or just boolean ? |
My only issue is that this strays from the original dm_env api where it is a float so it can represent both RL discount (gamma) and done. Let's definitely add it as an argument, but for the default I'm not sure if boolean or float32 is best @clement-bonnet any thoughts on this? |
To my knowledge, having the discount as a float is more common than as a boolean for the reasons you mentioned @sash-a. I would keep it a float unless there are strong reasons to do otherwise :) |
Great then if you could add the argument with a default of float32, I'm happy to accept the PR |
The PR is available for review :) |
Is your feature request related to a problem? Please describe
In a personal project, I had to be very efficient in the memory management, my reward were taking a lot of space. In my case I needed to change the reward and the discount to float16 instead of float32. I had to copy over the types file to do the modification locally. However I feel like some use case may need this flexibility.
Describe the solution you'd like
Give an extra parameter to the step functions to give dtype. (Example with one of them)
I would be happy to do the PR.
Misc
The text was updated successfully, but these errors were encountered: