v4.4.0
CUDA v4.4.0
Closed issues:
- Unreachable control flow leads to illegal divergent barriers (#1746)
- CUBLAS fails on new CUDA.jl v4 (#1852)
- Sort fails on Lovelace (sm8.9) GPUs (#1874)
- gesvd! crashes on Pascal and v12.0 (#1932)
- No effect for calling "nsys launch" (#1938)
- Basic math operations with nested adjoint and transpose (#1940)
- CPU and GPU implementations return results at dissimilar scales, even in double precision arithmetics (#1950)
- Failed CUDA.jl initialization breaks Flux? (#1952)
- Recent `mul!` changes break multiplication with matrices that have `StaticArray` elements (#1953)
- Test infrastructure: define test groups (#1961)
- Strange `rand` errors when sampling large matrices (#1963)
- Add aqua tests (#1964)
- Support of Orin GPU from Nvidia? (#1966)
- Crash in LLVM (#1971)
- Warning cuDNN Convolution (#1972)
- Strange behaviour when installed at system level (#1973)
Merged pull requests:
- Update benchmarks for 1.8 and 1.9 (#1933) (@maleadt)
- CUSOLVER: Explicitly pass NULL when not requesting svd outputs. (#1934) (@maleadt)
- Detect and complain about loading system libraries. (#1935) (@maleadt)
- Update manifest (#1936) (@github-actions[bot])
- Avoid stack overflow with early OOM reporting. (#1937) (@maleadt)
- [CUSPARSE] Improved support for UniformScaling and Diagonal (#1941) (@albertomercurio)
- Update manifest (#1949) (@github-actions[bot])
- Update GPUCompiler to fix unreachable control flow. (#1951) (@maleadt)
- Allow StaticArray eltype in matmat{vec,mul} (#1954) (@lcw)
- Bump CUDNN to v8.9. (#1959) (@maleadt)
- Bump CUTENSOR to v1.7. (#1960) (@maleadt)
- Add and fix some aqua tests (#1965) (@charleskawczynski)
- Fix compatibility of CUDA 11.4 to support Orin. (#1967) (@maleadt)
- Don't use Int32 indices in rand kernels. (#1969) (@maleadt)
- CI simplifications (#1970) (@maleadt)
- Use Base.pkgversion on 1.9. (#1974) (@maleadt)
- Update to LLVM.jl 6. (#1976) (@maleadt)
- fix launch config bug in bitonic sort (#1979) (@xaellison)
- Update manifest (#1980) (@github-actions[bot])