Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8350835: C2 SuperWord: assert/wrong result when using Float.float16ToFloat with byte instead of short input #23939

Closed
wants to merge 5 commits into from

Conversation

sviswa7
Copy link

@sviswa7 sviswa7 commented Mar 7, 2025

Float.float16ToFloat generates wrong vectorized code in product build and asserts in fastdebug/debug when argument is of type byte, int, or long array. The short term solution is to not auto vectorize in these cases.

Review comments are welcome.

Best Regards,
Sandhya


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8350835: C2 SuperWord: assert/wrong result when using Float.float16ToFloat with byte instead of short input (Bug - P2)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23939/head:pull/23939
$ git checkout pull/23939

Update a local copy of the PR:
$ git checkout pull/23939
$ git pull https://git.openjdk.org/jdk.git pull/23939/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23939

View PR using the GUI difftool:
$ git pr show -t 23939

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23939.diff

Using Webrev

Link to Webrev Comment

Sorry, something went wrong.

sviswa7 added 2 commits March 6, 2025 17:17
@bridgekeeper
Copy link

bridgekeeper bot commented Mar 7, 2025

👋 Welcome back sviswanathan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Mar 7, 2025

@sviswa7 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8350835: C2 SuperWord: assert/wrong result when using Float.float16ToFloat with byte instead of short input

Reviewed-by: epeter, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 120 new commits pushed to the master branch:

  • 4c6a523: 8352096: Test jdk/jfr/event/profiling/TestFullStackTrace.java shouldn't be executed with -XX:+DeoptimizeALot
  • d68775d: 8351995: JFR: Leftovers from removal of Security Manager
  • e62becc: 8350964: Add an ArtifactResolver.fetch(clazz) method
  • dbf47d6: 8351876: RISC-V: enable and fix some float round tests
  • d207ed3: 8352066: JVM.commit() and JVM.flush() exhibit race conditions against JFR epochs
  • 0450ba9: 8351999: JFR: Incorrect scaling of throttled values
  • e5666f5: 8351976: assert(vthread_epoch == current_epoch) failed: invariant
  • 2eecf15: 8351967: JFR: AnnotationIterator should handle num_annotations = 0
  • c8913d2: 8345555: Improve layout of search results
  • e29d405: 8352110: [BACKOUT] C2: Print compilation bailouts with PrintCompilation compile command
  • ... and 110 more: https://git.openjdk.org/jdk/compare/08929134b3533362133139c4e964b1b28de6ebfb...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Mar 7, 2025

@sviswa7 The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Mar 7, 2025
@sviswa7 sviswa7 marked this pull request as ready for review March 7, 2025 02:20
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 7, 2025
@mlbridge
Copy link

mlbridge bot commented Mar 7, 2025

Webrevs

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 7, 2025
Copy link
Member

@jatin-bhateja jatin-bhateja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sviswa7, The Fix looks reasonable to me. Kindly consider including applicable suggestions.

Best Regards

goldL = testLongKernel(aL);
}

@Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Test
@Test
@IR(failOn = { IRNode.VECTOR_CAST_HF2F }, applyIfCPUFeatureOr = { "avx512vl", "true", "f16c", "true" })

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left out the IR check because we do intend to vectorize this going forward. Instead the bug fix is verified by checkResult. Also the fix is not specific to Intel platform so if we do add IR check it will need to be generic.
@eme64 your thoughts please? Would you like to see an IR check here that vectorization is not happening?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I generally prefer to have failOn IR rules, if we expect that at least for now there should be no vectorization. But add a comment why we expect no vectorization, so that if it ever does vectorize and the IR rule fails the person has a hint, and does not have to reverse-engineer too much. And if it turns out that we should one day vectorize, then we already have all these tests ready to just flip the failOn into count.

return res;
}

@Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Test
@Test
@IR(counts = { IRNode.VECTOR_CAST_HF2F, " >0 " }, applyIfCPUFeatureOr = { "avx512vl", "true", "f16c", "true" })

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used the IR test from compiler/vectorization/TestFloatConversionsVector.java to include other architectures as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sviswa7 , these operations are also supported on PPC.

return res;
}

@Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Test
/*
* C2 handles i2s conversion by constraining the value range of the integral argument; thus
* argument fed to ConvHF2F is of type T_INT. Fix for JDK-8350835 skips over vectorizing such a case
* for now.
*/
@Test
@IR(failOn = { IRNode.VECTOR_CAST_HF2F }, applyIfCPUFeatureOr = { "avx512vl", "true", "f16c", "true" })

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any harm in including the above suggested comment as you mentioned we plan to support these auto vrctoriizations in future

return res;
}

@Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Test
/*
* C2 handles this in two steps: l2i handling creates ConvL2I IR ,followed by i2s conversion which onstrains the
* value range of the integral argument; thus, the argument fed to ConvHF2F is of type T_INT. Fix for
* JDK-8350835 skip over vectorizing such a case for now.
*/
@Test
@IR(failOn = { IRNode.VECTOR_CAST_HF2F }, applyIfCPUFeatureOr = { "avx512vl", "true", "f16c", "true" })

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added the generic failOn tests for byte, int, and long test cases as it is applicable across architectures.

@sviswa7
Copy link
Author

sviswa7 commented Mar 7, 2025

Thanks a lot @vnkozlov for the review and approval.

* @key randomness
* @bug 8350835
* @summary Test bug fix for JDK-8350835 discovered through Template Framework
* @requires vm.compiler2.enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @requires vm.compiler2.enabled

Is this restriction necessary? I generally prefer running tests on all platforms, and only restricting IR rules. That way we can get more test coverage with result verification.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The restriction is not necessary, removed.

* @summary Test bug fix for JDK-8350835 discovered through Template Framework
* @requires vm.compiler2.enabled
* @library /test/lib /
* @run main/othervm -XX:-TieredCompilation -XX:CompileOnly=compiler.vectorization.TestFloat16ToFloatConv::test* compiler.vectorization.TestFloat16ToFloatConv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the additional flags really necessary for reproducing the bug? I would suspect not really. The IR framework already takes care of ensuring we run with C2 compilation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the additional flags.

Comment on lines 43 to 47
private static byte[] aB = new byte[SIZE];
private static short[] aS = new short[SIZE];
private static int[] aI = new int[SIZE];
private static long[] aL = new long[SIZE];
private static float[] goldB, goldS, goldI, goldL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you testing for char as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test for char.

Comment on lines 50 to 55
for (int i = 0; i < aB.length; i++) {
aB[i] = (byte)RANDOM.nextInt();
aS[i] = (short)RANDOM.nextInt();
aI[i] = RANDOM.nextInt();
aL[i] = RANDOM.nextLong();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we could start using Generators. There is a fill method for arrays. It generates more "interesting" values. It is not super relevant here, but it would be nice if we made this common practice now ;)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Generators don't support the bytes, shorts, and chars yet so ended up not using them for this PR.

goldL = testLongKernel(aL);
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I generally prefer to have failOn IR rules, if we expect that at least for now there should be no vectorization. But add a comment why we expect no vectorization, so that if it ever does vectorize and the IR rule fails the person has a hint, and does not have to reverse-engineer too much. And if it turns out that we should one day vectorize, then we already have all these tests ready to just flip the failOn into count.

@eme64
Copy link
Contributor

eme64 commented Mar 10, 2025

@sviswa7 thanks for looking at this! The fix looks good, there are just a few comments about the test :)

@sviswa7
Copy link
Author

sviswa7 commented Mar 10, 2025

@eme64 @jatin-bhateja Your review comments are addressed, please take a look.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Mar 10, 2025
@Test
@IR(counts = {IRNode.VECTOR_CAST_HF2F, "> 0"},
applyIfOr = {"UseCompactObjectHeaders", "false", "AlignVector", "false"},
applyIfPlatformOr = {"x64", "true", "aarch64", "true", "riscv64", "true"},
Copy link
Member

@jatin-bhateja jatin-bhateja Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you kindly justify the need for compressed object header usage, it will mainly impact the pre-loop trip count compuation. AlignVector should be sufficient since it's a whitelisted option

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is taken from compiler/vectorization/TestFloatConversionsVector.java which also has float16 conversion tests to be in sync.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove UseCompressedHeaders check then the test will start failing for folks working on compressed headers so good to keep it there and as I mentioned before it is good to be in sync with other Float16ToFloat conversion test.

Copy link
Contributor

@eme64 eme64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better :)

You are right Generators are missing cases for short, byte, char. You could leave those cases with regular Random, but the int and long cases with Generators, to make sure interesing values are added to the mix more frequently.

}

@Test
// Not vectorized due to JDK-8350835
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's very non-descriptive. Actually, that is the current bug, so this is not even a future RFE that intends to fix it.
Can you please say why it is not vectorizing now, and what might be possible conditions when it would be ok to vectorize in the future? Could we even file an RFE for this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added descriptive comments. RFE filed at https://bugs.openjdk.org/browse/JDK-8352093.

* @bug 8350835
* @summary Test bug fix for JDK-8350835 discovered through Template Framework
* @library /test/lib /
* @run main/othervm compiler.vectorization.TestFloat16ToFloatConv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @run main/othervm compiler.vectorization.TestFloat16ToFloatConv
* @run driver compiler.vectorization.TestFloat16ToFloatConv

I don't think you need a new VM if you have no additional flags ;)

@sviswa7
Copy link
Author

sviswa7 commented Mar 15, 2025

@eme64 @jatin-bhateja Your review comments are handled. Please take a look.

@eme64
Copy link
Contributor

eme64 commented Mar 17, 2025

@sviswa7 The code and test looks good to me now. I'm re-running testing. Please ping me in a day for the results :)

Copy link
Contributor

@eme64 eme64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing. Approved.
@sviswa7 Thanks for the work 😊

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 17, 2025
@sviswa7
Copy link
Author

sviswa7 commented Mar 17, 2025

Thanks a lot @eme64.

@sviswa7
Copy link
Author

sviswa7 commented Mar 17, 2025

/integrate

@openjdk
Copy link

openjdk bot commented Mar 17, 2025

Going to push as commit 3239919.
Since your change was applied there have been 123 commits pushed to the master branch:

  • 47c1960: 8351689: -Xshare:dump with default classlist fails on static JDK
  • 6b82b42: 8348598: Update Libpng to 1.6.47
  • 2674a31: 8351891: Disable TestBreakSignalThreadDump.java#with_jsig and XCheckJSig.java on static JDK
  • 4c6a523: 8352096: Test jdk/jfr/event/profiling/TestFullStackTrace.java shouldn't be executed with -XX:+DeoptimizeALot
  • d68775d: 8351995: JFR: Leftovers from removal of Security Manager
  • e62becc: 8350964: Add an ArtifactResolver.fetch(clazz) method
  • dbf47d6: 8351876: RISC-V: enable and fix some float round tests
  • d207ed3: 8352066: JVM.commit() and JVM.flush() exhibit race conditions against JFR epochs
  • 0450ba9: 8351999: JFR: Incorrect scaling of throttled values
  • e5666f5: 8351976: assert(vthread_epoch == current_epoch) failed: invariant
  • ... and 113 more: https://git.openjdk.org/jdk/compare/08929134b3533362133139c4e964b1b28de6ebfb...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 17, 2025
@openjdk openjdk bot closed this Mar 17, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 17, 2025
@openjdk
Copy link

openjdk bot commented Mar 17, 2025

@sviswa7 Pushed as commit 3239919.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

None yet

4 participants