FP16 isNaN, isFinite, isInfinite intrinsics without Reinterpret nodes #1242

Draft: wants to merge 2 commits into base: lworld+fp16


@Bhavana-Kilambi (Contributor) commented Sep 12, 2024

This patch removes the ReinterpretS2HF nodes in the mid-end during the generation of the isNaNHF, isFiniteHF and isInfiniteHF nodes.
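
For readers unfamiliar with the FP16 encoding, the three predicates can be expressed directly on the raw 16-bit pattern (1 sign bit, 5 exponent bits, 10 mantissa bits). The sketch below is only an illustration of those semantics on a `short`; the class name `Fp16Classify` and its constants are hypothetical, and this is not the code the intrinsics or the C2 nodes actually use.

```java
// Illustrative sketch only: IEEE 754 binary16 classification on raw bits.
// Not the HotSpot implementation; names here are hypothetical.
final class Fp16Classify {
    private static final int ABS_MASK = 0x7FFF; // all bits except the sign bit
    private static final int EXP_MASK = 0x7C00; // five exponent bits; all ones means Inf or NaN

    // NaN: exponent all ones and a non-zero mantissa.
    static boolean isNaN(short bits) {
        return (bits & ABS_MASK) > EXP_MASK;
    }

    // Infinity: exponent all ones and a zero mantissa (either sign).
    static boolean isInfinite(short bits) {
        return (bits & ABS_MASK) == EXP_MASK;
    }

    // Finite: exponent is anything other than all ones.
    static boolean isFinite(short bits) {
        return (bits & EXP_MASK) != EXP_MASK;
    }

    public static void main(String[] args) {
        short one = (short) 0x3C00;    // 1.0
        short posInf = (short) 0x7C00; // +Infinity
        short negInf = (short) 0xFC00; // -Infinity
        short nan = (short) 0x7E00;    // a quiet NaN
        System.out.println(isFinite(one) + " " + isInfinite(posInf) + " "
                + isInfinite(negInf) + " " + isNaN(nan)); // prints "true true true true"
    }
}
```

Because every predicate depends only on the bit pattern, a `short` input carries all the information needed to answer it.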

Performance results for this patch on an aarch64 machine:

Benchmark               Gain over baseline      Gain over default
FP16Ops.isFiniteHF      1.29                    1.85
FP16Ops.isInfiniteHF    1.28                    1.90
FP16Ops.isNaNHF         1.45                    1.58

The baseline patch generates FP16 floating-point instructions; the default is the configuration in which no FP16 intrinsics are used and FP32 instructions are generated instead.

Gain = throughput (thrpt) of this patch / throughput of the baseline or the default, respectively.
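
Since both gain columns share the same numerator (the throughput of this patch), dividing one by the other gives the implied speedup of the baseline over the default. The sketch below works that out from the figures in the table; the class and method names are hypothetical, and it assumes all three configurations were measured under the same conditions on the same aarch64 machine.

```java
import java.util.Locale;

// Illustrative arithmetic only; names are hypothetical.
final class GainRatios {
    // gainOverBaseline = patch / baseline, gainOverDefault = patch / default,
    // so their quotient is baseline / default.
    static double baselineOverDefault(double gainOverBaseline, double gainOverDefault) {
        return gainOverDefault / gainOverBaseline;
    }

    public static void main(String[] args) {
        String[] names = {"FP16Ops.isFiniteHF", "FP16Ops.isInfiniteHF", "FP16Ops.isNaNHF"};
        double[] overBaseline = {1.29, 1.28, 1.45}; // from the table above
        double[] overDefault = {1.85, 1.90, 1.58};  // from the table above
        for (int i = 0; i < names.length; i++) {
            double ratio = baselineOverDefault(overBaseline[i], overDefault[i]);
            System.out.println(String.format(Locale.ROOT, "%-22s %.2f", names[i], ratio));
        }
    }
}
```

By this arithmetic the baseline is itself roughly 1.43x, 1.48x and 1.09x faster than the default for isFiniteHF, isInfiniteHF and isNaNHF respectively.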

The FP16Ops.isInfiniteHF benchmark was also run on x86, where this patch performs 2.6x better than the default case (which converts FP16 to FP32 and uses the vfpclass instruction).
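
To make the shape of the default path concrete, the sketch below widens an FP16 bit pattern to a `float` and then classifies the `float`, and compares that with a test on the 16-bit pattern alone. The `halfToFloat` helper is a hand-written software stand-in for the hardware conversion, and all names are hypothetical; nothing here reflects the actual generated code.

```java
// Illustrative sketch only; names are hypothetical.
final class Fp16DefaultPath {
    // Software stand-in for a hardware FP16 -> FP32 conversion.
    static float halfToFloat(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits & 0x8000) << 16;
        int exp = (bits >>> 10) & 0x1F;
        int man = bits & 0x3FF;
        int out;
        if (exp == 0x1F) {
            // Infinity or NaN: exponent all ones, mantissa carried over.
            out = sign | 0x7F800000 | (man << 13);
        } else if (exp == 0) {
            if (man == 0) {
                out = sign; // signed zero
            } else {
                // Subnormal: normalize the mantissa and adjust the exponent.
                int e = 113;
                while ((man & 0x400) == 0) {
                    man <<= 1;
                    e--;
                }
                out = sign | (e << 23) | ((man & 0x3FF) << 13);
            }
        } else {
            // Normal number: rebias the exponent from 15 to 127.
            out = sign | ((exp + 112) << 23) | (man << 13);
        }
        return Float.intBitsToFloat(out);
    }

    // "Default" shape: widen to FP32 first, then classify.
    static boolean isInfiniteViaFloat(short h) {
        return Float.isInfinite(halfToFloat(h));
    }

    // Direct shape: classify the 16-bit pattern itself.
    static boolean isInfiniteDirect(short h) {
        return (h & 0x7FFF) == 0x7C00;
    }

    public static void main(String[] args) {
        int mismatches = 0;
        for (int i = 0; i <= 0xFFFF; i++) {
            short h = (short) i;
            if (isInfiniteViaFloat(h) != isInfiniteDirect(h)) {
                mismatches++;
            }
        }
        System.out.println("mismatches: " + mismatches);
    }
}
```

Both shapes give the same answer for every input; the difference lies in the extra conversion step that the default path performs before classifying.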

The JMH tests are added in this patch.


Progress

  • Change must not contain extraneous whitespace

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1242/head:pull/1242
$ git checkout pull/1242

Update a local copy of the PR:
$ git checkout pull/1242
$ git pull https://git.openjdk.org/valhalla.git pull/1242/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1242

View PR using the GUI difftool:
$ git pr show -t 1242

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1242.diff

This patch adds intrinsic support for FP16 isNaN, isFinite and
isInfinite methods and also adds aarch64 backend for these intrinsics.

Tested all FP16 related tests successfully on aarch64.
This patch removes the ReinterpretS2HF nodes in the mid-end during the
generation of the isNaNHF, isFiniteHF and isInfiniteHF nodes.

Performance results for this patch on an aarch64 machine -

Benchmark               Gain over baseline      Gain over default
FP16Ops.isFiniteHF      1.29                    1.85
FP16Ops.isInfiniteHF    1.28                    1.90
FP16Ops.isNaNHF         1.45                    1.58

The baseline patch generates floating point FP16 instructions and the
default is where no FP16 intrinsics are used and FP32 instructions are
generated.

Gain : thrpt of this patch / thrpt of either baseline or default

Tested FP16Ops.isInfiniteHF test on x86 and the performance is 2.6x
better over the default case (which converts FP16 to FP32 and uses the
vfpclass instruction).

The JMH tests are added in this patch.

bridgekeeper bot commented Sep 12, 2024

👋 Welcome back bkilambi! A progress list of the required criteria for merging this PR into lworld+fp16 will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.


openjdk bot commented Sep 12, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@Bhavana-Kilambi changed the title from "JDK-8339473: FP16 isNaN, isFinite, isInfinite intrinsics without Reinterpret nodes" to "FP16 isNaN, isFinite, isInfinite intrinsics without Reinterpret nodes" Sep 12, 2024