8300865: C2: product reduction in ProdRed_Double is not vectorized #14065
Conversation
👋 Welcome back sviswanathan! A progress list of the required criteria for merging this PR into
Webrevs
src/hotspot/share/opto/superword.cpp
Outdated
// Matcher::min_vector_size may return 1 in some cases, e.g. double for x86.
// For the vector reduction implemented check we need at least two elements.
int min_vec_size = MAX2(Matcher::min_vector_size(bt), 2);
if (ReductionNode::implemented(use->Opcode(), min_vec_size, bt)) {
Hi @sviswa7, can we use superword_max_vector_size() as the input here? MAX2(Matcher::min_vector_size(bt), 2) may not tally with the actual situation on other 64-bit platforms. WDYT?
@fg1417 Thanks a lot for the review. Yes, we could use superword_max_vector_size here. I have made the change.
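For reference, here is a minimal sketch of how the revised check could read after this change. It is not the final patch: the surrounding context in superword.cpp, and the assumption that the helper is spelled Matcher::superword_max_vector_size(bt), are inferred from the snippet quoted above.

// Sketch only: query the reduction with the SuperWord maximum vector size for
// this basic type instead of the platform minimum, which can be 1 (e.g. double
// on x86) and would make the query fail spuriously.
int vec_size = Matcher::superword_max_vector_size(bt);  // assumed spelling of the helper
if (ReductionNode::implemented(use->Opcode(), vec_size, bt)) {
  // the reduction is supported at this vector length, so keep it as a candidate
}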
LGTM
@sviswa7 Thanks for taking care of this. Looks good, but let me run testing at commit 4. I will report back.
Thanks a lot @eme64.
@sviswa7 testing to tier5 and stress testing looks good. Out of curiosity: do you have a benchmark that shows a speedup with this change? Would be nice to add it.
Thanks for the fix! Looks good.
Thanks a lot @eme64. There was an existing JMH benchmark for vector reduction. I have updated it to add a double reduction case.
The performance numbers on my desktop are: with the PR, reduction succeeds and vectorization of the loop happens when superword is enabled:
@sviswa7 Thanks for adding the benchmark. The win is small, but that was to be expected given that the double reduction has to be performed in linear order, and hence has quite a large latency.
/integrate
@sviswa7 This pull request has not yet been marked as ready for integration.
@vnkozlov Could you also please review this PR?
Looks good. You may regress for the next case on KNL (which lacks avx512dq) when using the max vector size (the code is in x86.ad), which I think is fine:
case Op_MinReductionV:
case Op_MaxReductionV:
  if (UseAVX > 2 && (!VM_Version::supports_avx512dq() && size_in_bits == 512)) {
    return false;
  }
@sviswa7 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 167 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the
Yes, that should be ok. Thanks a lot for the review @vnkozlov.
/integrate
Going to push as commit f9ad7df.
Your commit was automatically rebased without conflicts.
This PR fixes the problem with double reduction on x86_64.
In the test compiler.loopopts.superword.ProdRed_Double, the product reduction loop in prodReductionImplement() was not getting vectorized when run as follows:
jtreg -XX:CompileCommand=PrintAssembly,compiler.loopopts.superword.ProdRed_Double::prodReductionImplement compiler/loopopts/superword/ProdRed_Double.java
The PrintAssembly output generated in pid-xxx.log in the JTwork/scratch directory did not show any vector_reduction_double node.
This was happening because ReductionNode::implemented was being passed a vector size of one element, whereas the reduction-implemented check needs a vector size of at least two elements.
With this PR, the vector_reduction_double node is generated.
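To make the reasoning concrete, here is a small standalone C++ illustration. It is not HotSpot code: min_lanes_for_double() and reduction_implemented() are hypothetical stand-ins for Matcher::min_vector_size(T_DOUBLE) and ReductionNode::implemented(), chosen only to show why a query made with a single element rejects the reduction while a query made with at least two elements accepts it.

#include <algorithm>
#include <iostream>

// Hypothetical stand-ins; the real HotSpot APIs differ.
int min_lanes_for_double() { return 1; }                      // e.g. double on x86
bool reduction_implemented(int lanes) { return lanes >= 2; }  // a reduction needs >= 2 lanes

int main() {
  int raw     = min_lanes_for_double();
  int clamped = std::max(raw, 2);  // query with at least two elements
  std::cout << std::boolalpha
            << "query at " << raw     << " lane(s): " << reduction_implemented(raw)     << '\n'   // false
            << "query at " << clamped << " lane(s): " << reduction_implemented(clamped) << '\n';  // true
  return 0;
}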
Please review.
Best Regards,
Sandhya
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14065/head:pull/14065
$ git checkout pull/14065
Update a local copy of the PR:
$ git checkout pull/14065
$ git pull https://git.openjdk.org/jdk.git pull/14065/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 14065
View PR using the GUI difftool:
$ git pr show -t 14065
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14065.diff
Webrev
Link to Webrev Comment