livesim: Directly render from GPU data instead of round-tripping #3

Open
7 tasks
HadrienG2 opened this issue Jul 7, 2023 · 0 comments
Labels
enhancement New feature or request

HadrienG2 commented Jul 7, 2023

Because livesim is implemented using the same API as the GPU-based compute backends, it can apply certain optimizations when running with those backends. Of those, API context sharing has already been implemented, but we're still missing a fast data path where we directly use the GPU images produced by the GPU simulation, instead of downloading their data to host memory and then uploading it to VRAM again.

Implementation plan:

  • In data::concentration, modify Species and Evolving to handle N+1 versions of the concentration, with round robin flipping. Use this in livesim to ask for 3+1 versions of the concentration.
  • Split the livesim::pipeline code into a part that stays the same and a part that is specific to running with a buffer input. Call the latter livesim::pipeline::buffer, and disable it in GPU rendering mode.
  • Add a new livesim::pipeline::image module, enabled in GPU rendering mode, which implements a rendering pipeline based on a storage image. Share shader code using the same include trick as the compute_gpu backends, except this time we're sharing most of the main function and it's only the input access that's pipeline-specific.
  • In livesim::input, comment out the upload code and change the Input typedef to Arc<StorageImage>.
  • In livesim::frames, when running under a GPU backend, disable the entire notion of upload buffers and eager inout descriptor set allocation. Instead, prepare a descriptor set cache that will be lazily initialized for each observed (Input, SwapchainImage) combination. Flush the cache any time the swapchain is recreated. In GPU mode, the rendering callback will only take an inout descriptor set, but Frames::process_frame will take an additional Species parameter. Furthermore, we can now avoid eagerly awaiting futures.
  • In the main module, use a GPU-optimized rendering code path in GPU mode.
  • Test, and hopefully observe better rendering performance.
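
As a rough illustration of two steps in the plan above, here is a minimal sketch of round-robin flipping over N+1 versions and of a lazily initialized cache that is flushed on swapchain recreation. The names `RoundRobin` and `DescriptorSetCache` are hypothetical and not taken from the codebase; plain values stand in for storage images and descriptor sets, and the exact meaning of "N+1" (N readable versions plus one write target) is an assumption.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for N+1 versions of a concentration table.
/// Assumption: one version is the current write target, the others are
/// older results that may still be read (e.g. by in-flight frames).
pub struct RoundRobin<T> {
    versions: Vec<T>,
    /// Index of the version that the next simulation step writes to.
    write_idx: usize,
}

impl<T> RoundRobin<T> {
    /// `versions[0]` is assumed to hold the initial state.
    pub fn new(versions: Vec<T>) -> Self {
        assert!(versions.len() >= 2, "need at least one input and one output");
        Self { versions, write_idx: 1 }
    }

    /// Most recently completed version.
    pub fn latest(&self) -> &T {
        let len = self.versions.len();
        &self.versions[(self.write_idx + len - 1) % len]
    }

    /// Version that the next simulation step should write to.
    pub fn write_target(&mut self) -> &mut T {
        &mut self.versions[self.write_idx]
    }

    /// After a step: the write target becomes the latest version and the
    /// oldest version becomes the new write target.
    pub fn flip(&mut self) {
        self.write_idx = (self.write_idx + 1) % self.versions.len();
    }
}

/// Hypothetical lazily initialized cache, keyed by whatever identifies an
/// (input, swapchain image) combination.
pub struct DescriptorSetCache<K, V> {
    sets: HashMap<K, V>,
}

impl<K: Eq + Hash, V> DescriptorSetCache<K, V> {
    pub fn new() -> Self {
        Self { sets: HashMap::new() }
    }

    /// Return the cached value for `key`, creating it on first use only.
    pub fn get_or_create(&mut self, key: K, create: impl FnOnce() -> V) -> &V {
        self.sets.entry(key).or_insert_with(create)
    }

    /// Drop every cached value, e.g. after the swapchain was recreated.
    pub fn flush(&mut self) {
        self.sets.clear();
    }

    /// Number of cached values.
    pub fn len(&self) -> usize {
        self.sets.len()
    }
}
```

In this sketch, asking for 3+1 versions would correspond to `RoundRobin::new` with four elements, and the cache key could be a pair of indices identifying the input image and the swapchain image. Real descriptor sets would also have to be kept alive until the GPU is done with them, which the sketch does not model.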
@HadrienG2 HadrienG2 added the enhancement New feature or request label Jan 25, 2024