Question on simplex bijector implementation #283

Open
Red-Portal opened this issue Aug 12, 2023 · 9 comments

@Red-Portal
Member

Hi,

It appears that torch.probability simply uses softmax for the simplex bijector.
Is there a reason our simplex transform is much more complicated?
I was also thinking about a GPU-friendly implementation, which seems hard to achieve with the current implementation.

@torfjelde
Member

Softmax isn't bijective. The one we have now is (it maps from d to d-1 dimensions).
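(A quick sketch of why, assuming NNlib's softmax is available: the map is invariant to shifting all logits by a constant, so distinct inputs collapse to the same simplex point.)

```julia
using NNlib: softmax

y = [0.3, -1.2, 2.0]
softmax(y) ≈ softmax(y .+ 5.0)   # true: two different inputs, one output
```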

@devmotion
Member

See also #51.

@Red-Portal
Member Author

Betanalpha does discuss a bijective softmax obtained by arbitrarily fixing the endpoint logit. Any experience with this?
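For reference, a minimal sketch of that idea in plain Julia (hypothetical helper names, not an existing Bijectors.jl API): pin the last logit to zero so the remaining d − 1 logits map bijectively onto the interior of the d-simplex.

```julia
using NNlib: softmax

# Forward: R^{d-1} -> interior of the d-simplex, with the last logit fixed at 0.
fixed_softmax(y::AbstractVector) = softmax(vcat(y, zero(eltype(y))))

# Inverse: log-ratios against the pinned component recover the free logits.
fixed_softmax_inv(x::AbstractVector) = log.(x[1:end-1]) .- log(x[end])
```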

@devmotion
Member

That's supported, e.g., in GPLikelihoods (see perhaps also the discussion in JuliaGaussianProcesses/GPLikelihoods.jl#55).

@Red-Portal
Member Author

Red-Portal commented Aug 13, 2023

Good to know, thanks. Though, back to my original intention: I really wish our simplex bijector could play nicely with GPUs out of the box. Among the non-NF bijectors, the simplex bijector seems like it will be the big challenge in that direction. Do we have any plans for how to pursue this? It does seem to me that the softmax approach would make this much easier.

@Red-Portal
Member Author

Actually, never mind. I just wrote a stick-breaking bijector using array operations, based on the implementations in numpyro and tensorflow. If this were to be added to Bijectors.jl, we'd probably have to add a CUDA array specialization. Let me know how to proceed on this.
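Not the code referenced in this comment, but a rough CPU sketch of the cumsum-based stick-breaking forward map along the lines of the numpyro/TensorFlow Probability implementations (a real GPU version would also need to avoid scalar indexing):

```julia
using LogExpFunctions: logistic

# Unconstrained y ∈ R^{K-1} -> point on the K-simplex, with no sequential loop.
function stickbreak(y::AbstractVector)
    K = length(y) + 1
    z = logistic.(y .- log.(K .- (1:K-1)))              # break proportions in (0, 1)
    logrem = vcat(zero(eltype(z)), cumsum(log1p.(-z)))  # log of the stick remaining
    return vcat(z .* exp.(logrem[1:end-1]), exp(logrem[end]))
end

stickbreak(zeros(3))  # ≈ [0.25, 0.25, 0.25, 0.25]
```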

@devmotion
Member

On Julia >= 1.9, a CUDA specialization could be put in an extension (it could possibly even just be an extension on GPUArrays).
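A rough sketch of what that could look like (file and module names are hypothetical); it would be paired with matching [weakdeps] and [extensions] entries in Project.toml:

```julia
# ext/BijectorsGPUArraysExt.jl -- hypothetical weak-dependency extension
module BijectorsGPUArraysExt

using Bijectors
using GPUArraysCore: AbstractGPUArray

# GPU-friendly method specializations (e.g. for the stick-breaking transform)
# would be defined here, dispatching on AbstractGPUArray inputs.

end
```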

@Red-Portal
Member Author

I do have the feeling that this will have to wait until the batch operation interface is finalized. @torfjelde Do we have an expectation on when that would be?

@sethaxen
Member

There are three main ways to use softmax for simplex transforms. One uses parameter expansion to retain bijectivity: f(y) = [softmax(y); logsumexp(y)]. The other two come from the compositional data analysis literature and are called the additive log-ratio, f(y) = softmax(vcat(y, 0)), and the isometric log-ratio, f(y) = softmax(V * y), for a particular choice of semi-orthogonal matrix V. I'm currently testing the performance of each of these against stick-breaking.
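For concreteness, a rough sketch of the expanded-softmax and ILR maps in plain Julia (not an existing Bijectors.jl API; the additive log-ratio map is just the fixed-logit softmax discussed above):

```julia
using LinearAlgebra
using NNlib: softmax
using LogExpFunctions: logsumexp

# Parameter-expanded softmax: R^d -> (d-simplex) × R, bijective.
expanded_softmax(y) = vcat(softmax(y), logsumexp(y))
expanded_softmax_inv(z) = log.(z[1:end-1]) .+ z[end]

# A semi-orthogonal V with V'V = I and V'1 = 0, taken from the full Q factor
# of a QR factorization of the all-ones column.
function ilr_basis(d)
    Q = qr(ones(d, 1)).Q * Matrix{Float64}(I, d, d)  # materialize the full d×d Q
    return Q[:, 2:end]                               # d × (d-1)
end

# Isometric log-ratio: R^{d-1} -> interior of the d-simplex.
ilr(y, V) = softmax(V * y)
ilr_inv(x, V) = V' * log.(x)   # valid because V'V = I and V'1 = 0
```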
