-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimize vector component access and use GLM_ASSERT_LENGTH in dual_quaternion #1308
base: master
Are you sure you want to change the base?
Conversation
I'm the developer of the hit game 4D Miner (which uses GLM). The vector operator[] calls are getting inlined as switch statements, which aren't getting optimized. There is a lot of code in the game that deals with component indices, so there will probably be a considerable speedup for certain functions if this is merged, and some functions will actually be inlined just because of the changes in this pr. |
but this techincally has UB though because the standard allows padding of members of same data type although no implementation does |
I would argue that the consistency and performance improvements far outweigh the risks of UB here then. If this can't be merged, then this optimization should probably be removed from If any major compiler ever breaks this, preprocessor checks can be added in the future for that compiler version, or this optimization can be removed altogether. For now at least, considering the large performance gains which have already been demonstrated for a real-world use case, and that this optimization has already been used elsewhere in the project for years, I see no reason this shouldn't be merged. |
we already have UB because of union type punning but all compilers define it like C behavior and all compiler defines your issue's behavior and even if not we can add a fallback so this is worth it to implement, I agree with you . sad that speezing out performance is technically "ub" |
to note the thing we lose here is constexpr evaluation for any index but 0. but I would say the performance benefits is worth the change |
I would agree. From my experience, the vast majority of use cases for component indexing are when you have to determine which component you are accessing at runtime. Constexpr component indexing seems like a very niche use case that could be done manually with a switch statement if it is ever needed. |
@Mashpoe this actually has a huge impact on other constexpr functions like all of glm matrix multiplications so alot of code needs to be rewritten template<typename T, qualifier Q>
GLM_FUNC_QUALIFIER GLM_CONSTEXPR mat<4, 2, T, Q> operator*(mat<2, 2, T, Q> const& m1, mat<4, 2, T, Q> const& m2)
{
return mat<4, 2, T, Q>(
m1[0][0] * m2[0][0] + m1[1][0] * m2[0][1],
m1[0][1] * m2[0][0] + m1[1][1] * m2[0][1],
m1[0][0] * m2[1][0] + m1[1][0] * m2[1][1],
m1[0][1] * m2[1][0] + m1[1][1] * m2[1][1],
m1[0][0] * m2[2][0] + m1[1][0] * m2[2][1],
m1[0][1] * m2[2][0] + m1[1][1] * m2[2][1],
m1[0][0] * m2[3][0] + m1[1][0] * m2[3][1],
m1[0][1] * m2[3][0] + m1[1][1] * m2[3][1]);
} this function is no longer constexpr and needs to be rewritten in terms of accessing members |
@ZXShady we have tested that exact function in GCC, Clang, and MSVC, and Could you possibly tell us which compiler you are using, as well as any compiler flags that might affect the outcome? Thanks. |
Here’s a little godbolt link to an example of how to achieve something similar, while still being able to use constexpr, including for c++ versions before c++20: |
your code doesn't work if you change the active member of a union constexpr int f() {
vec v{0};
v.x = 1;
return v[0]; // reads from data[0] which is not the active member no longer constexpr
}
static_assert(f() == 1); // not valid constant expression the simplest solution is
but doesn't apply it to 14 and 17 to not break API #if defined __cpp_constexpr && __cpp_constexpr >= 201301L
constexpr
#endif
int operator[](int i)
#if CXX14 || CXX17
#if CXX20
if (std::is_constant_evaluted()) {
#endif
switch(i) {
case 0: return x;
case 1: return y;
case 2: return z;
}
#if CXX20
} else {
#endif
#else
return (&x)[i];
#endif // CXX14 || CXX17
#if CXX20
}
#endif
#endif bit the above still doesn't work because of vec memberd r,g,b,a and t,u,p,q changing the active member will break the constexpr here so I think the firdt option is the best. actually this problem is too complicated because of the aliasing memebrs r,g,b,a and others. look at this code for example if your change was implemented constexpe bool f()
{
glm::vec3 v = {1,2,3};
v.x = 5;
return glm::all(glm::equal(v,{5,2,3}));
}
static_assert(f()); // non constexlr function
// rrading from inactive member `data` I specifically choose yo use glm::equal instead of == to demonstrate that your solution can break with the glm algorithms that use indexing as a way to iterate they all will be no longer constexpr. |
@ZXShady I guess this would break reference-indexing in constexpr contexts, not sure how tolerable that is, because I'd imagine most uses of |
the title says it all
the explanation for the optimization:
switch/case
(which is also clearly visible if you ever try to RE some code that uses thatoperator[]
)type_quat
so i don't understand why it wasn't done intype_vec
salso
dual_quaternion
operator[]
for some reason didn't use theGLM_ASSERT_LENGTH
macro, so i fixed that.