8348907: Stress times out when is executed with ZGC #24209

Closed
wants to merge 4 commits into from

Conversation


@mgronlun mgronlun commented Mar 24, 2025

Greetings,

Here is a proposed solution to the intricate deadlock issues involving virtual threads, ZGC load barriers, and JFR.

A JFR event can be allocated and committed in specific sensitive contexts, such as inside mutex-protected load barriers. If the thread is a virtual thread, JFR determines its thread name by loading the oop from the thread (jt->vthread()) as part of the event commit.

Loading that oop triggers the load barrier again. Because the barrier takes a non-reentrant lock that the thread already holds, the thread effectively deadlocks with itself.

So, for specific sensitive event sites, JFR mustn't recurse or reenter into the same event site as part of the event commit.
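To make the cycle concrete, here is a toy model of it. This is not HotSpot code: `ToyLoadBarrier`, `commit_event`, `load_oop`, and `barrier_reenters` are hypothetical names, and a boolean flag stands in for the real lock so the example records the reentry instead of hanging.

```cpp
#include <cassert>

// Toy model only: none of these names exist in HotSpot.
struct ToyLoadBarrier {
  bool lock_held = false;             // stands in for the non-reentrant lock
  bool reentered = false;             // set where a real thread would block forever
  bool commit_loads_name_oop = true;  // does event commit load the thread-name oop?

  // Event commit: for a virtual thread, resolving the name loads an oop.
  void commit_event() {
    if (commit_loads_name_oop) {
      load_oop();
    }
  }

  // Barrier slow path: takes the lock, then commits an event while holding it.
  void load_oop() {
    if (lock_held) {
      reentered = true;  // a real non-reentrant lock would self-deadlock here
      return;
    }
    lock_held = true;
    commit_event();
    lock_held = false;
  }
};

// Returns true if a single barrier entry ends up reentering the barrier.
bool barrier_reenters(bool commit_loads_name_oop) {
  ToyLoadBarrier barrier;
  barrier.commit_loads_name_oop = commit_loads_name_oop;
  barrier.load_oop();
  assert(!barrier.lock_held);  // the outer call always releases the lock
  return barrier.reentered;
}
```

In this model, `barrier_reenters(true)` reports the reentry, while `barrier_reenters(false)` does not, which is the property a sensitive event site needs.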

After a few iterations and prototypes, which failed because they eventually ended up touching some oop, I came up with the following.

From a user perspective, an event (site) can now be marked as "non-reentrant" by wrapping it in a helper class.

The wrapper guarantees that JFR will not reenter the site as part of event.commit().

The tradeoff is that we cannot write the virtual thread name for these sensitive event sites; we will instead report "" as the virtual thread name, which is the default virtual thread name in Java. All other information about the thread, such as the thread ID, virtual thread, etc., will still be reported.
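As a rough illustration of the wrapper idea and its tradeoff, here is a minimal, self-contained sketch. It is not the code in this PR: `NonReentrantSketch`, `ToyEvent`, `load_virtual_thread_name`, and the thread-local flag are all hypothetical, and the assumption that the guard works by setting a per-thread flag during commit is mine, made only to show the observable behavior (no barrier entry, empty thread name).

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical per-thread flag: true while a guarded event is being committed.
thread_local bool in_non_reentrant_commit = false;

// Counts how often the (simulated) load barrier is entered.
int barrier_entries = 0;

// Stand-in for loading the virtual thread's name oop through the barrier.
std::string load_virtual_thread_name() {
  ++barrier_entries;
  return "worker-1";
}

// Toy event: commit normally resolves the thread name via the barrier.
struct ToyEvent {
  std::string thread_name;
  void commit() {
    thread_name = in_non_reentrant_commit ? std::string()  // report ""
                                          : load_virtual_thread_name();
  }
};

// Hypothetical wrapper: marks the event site as non-reentrant for the commit.
template <typename Event>
class NonReentrantSketch : public Event {
 public:
  void commit() {
    const bool previous = in_non_reentrant_commit;
    in_non_reentrant_commit = true;
    Event::commit();
    in_non_reentrant_commit = previous;  // restore the flag for nested use
  }
};

// Usage: commits a guarded toy event and returns the barrier entries it caused.
int barrier_entries_for_guarded_commit() {
  const int before = barrier_entries;
  NonReentrantSketch<ToyEvent> event;
  event.commit();
  assert(event.thread_name.empty());  // the name is reported as ""
  return barrier_entries - before;
}

}  // namespace sketch
```

The point of the sketch is only the shape of the tradeoff: the guarded commit never touches the simulated barrier, at the cost of an empty thread name, while a plain `ToyEvent` still resolves the name as before.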

I believe it is a reasonable tradeoff and a general solution for sensitive JFR event sites, which are rare in practice, with minimal impact on event programming.

Testing: jdk_jfr, stress testing

Let me know what you think.

Thanks
Markus


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8348907: Stress times out when is executed with ZGC (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24209/head:pull/24209
$ git checkout pull/24209

Update a local copy of the PR:
$ git checkout pull/24209
$ git pull https://git.openjdk.org/jdk.git pull/24209/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24209

View PR using the GUI difftool:
$ git pr show -t 24209

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24209.diff

Using Webrev

Link to Webrev Comment



bridgekeeper bot commented Mar 24, 2025

👋 Welcome back mgronlun! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.


openjdk bot commented Mar 24, 2025

@mgronlun This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8348907: Stress times out when is executed with ZGC

Reviewed-by: egahlin, aboldtch, eosterlund

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 29 new commits pushed to the master branch:

  • 79bffe2: 8349361: C2: RShiftL should support all applicable transformations that RShiftI does
  • eef6aef: 8352623: MultiExchange should cancel exchange impl if responseFilters throws
  • e2a461b: 8351332: Line breaks in search tag descriptions corrupt JSON search index
  • c14bbea: 8352740: Introduce new factory method HtmlTree.IMG
  • 84d3dc7: 8352965: [BACKOUT] 8302459: Missing late inline cleanup causes compiler/vectorapi/VectorLogicalOpIdentityTest.java IR failure
  • b4dc364: 8346931: Replace divisions by zero in sharedRuntimeTrans.cpp
  • bc5cde1: 8352692: Add support for extra jlink options
  • 059f190: 8352490: Fatal error message for unhandled bytecode needs more detail
  • ee710fe: 8345169: Implement JEP 503: Remove the 32-bit x86 Port
  • eb6e828: 8351002: com/sun/management/OperatingSystemMXBean cpuLoad tests fail intermittently
  • ... and 19 more: https://git.openjdk.org/jdk/compare/bab93729c26907dc51d15dbb5651f860f0cb58ab...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 24, 2025

openjdk bot commented Mar 24, 2025

@mgronlun The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Mar 24, 2025
@mgronlun
Author

/label remove hotspot

@mgronlun
Author

/label add hotspot-gc

@mgronlun
Author

/label add hotspot-jfr

@openjdk openjdk bot removed the hotspot hotspot-dev@openjdk.org label Mar 24, 2025

openjdk bot commented Mar 24, 2025

@mgronlun
The hotspot label was successfully removed.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Mar 24, 2025

openjdk bot commented Mar 24, 2025

@mgronlun
The hotspot-gc label was successfully added.

@openjdk openjdk bot added the hotspot-jfr hotspot-jfr-dev@openjdk.org label Mar 24, 2025

openjdk bot commented Mar 24, 2025

@mgronlun
The hotspot-jfr label was successfully added.


mlbridge bot commented Mar 24, 2025


openjdk bot commented Mar 25, 2025

⚠️ @mgronlun This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).


openjdk bot commented Mar 25, 2025

@mgronlun Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 25, 2025
Member

@xmas92 xmas92 left a comment


Looks like a pragmatic solution.

I am not reviewing the implications of not setting the epoch, as my understanding here is a bit lacking.

The name JfrNonReentrant seems a little general for how tightly coupled the property is to running on a virtual thread and loading an oop. At the same time this is currently the only interaction which exhibits problems with reentry, and I am not sure if there is a better name.

Contributor

@fisk fisk left a comment


This seems like a pragmatic workaround for the problem. We might want to revisit this at some point to make it easier to use JFR in the GC code, but I think this is an appropriate fix for the bug right now. Thanks for fixing this, Markus.

@mgronlun
Author

/integrate


openjdk bot commented Mar 26, 2025

Going to push as commit c2a4fed.
Since your change was applied there have been 32 commits pushed to the master branch:

  • 5392674: 8352766: Problemlist hotspot tier1 tests requiring tools that are not included in static JDK
  • 1d205f5: 8352716: (tz) Update Timezone Data to 2025b
  • a2a64da: 8352588: GenShen: Enabling JFR asserts when getting GCId
  • 79bffe2: 8349361: C2: RShiftL should support all applicable transformations that RShiftI does
  • eef6aef: 8352623: MultiExchange should cancel exchange impl if responseFilters throws
  • e2a461b: 8351332: Line breaks in search tag descriptions corrupt JSON search index
  • c14bbea: 8352740: Introduce new factory method HtmlTree.IMG
  • 84d3dc7: 8352965: [BACKOUT] 8302459: Missing late inline cleanup causes compiler/vectorapi/VectorLogicalOpIdentityTest.java IR failure
  • b4dc364: 8346931: Replace divisions by zero in sharedRuntimeTrans.cpp
  • bc5cde1: 8352692: Add support for extra jlink options
  • ... and 22 more: https://git.openjdk.org/jdk/compare/bab93729c26907dc51d15dbb5651f860f0cb58ab...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 26, 2025
@openjdk openjdk bot closed this Mar 26, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 26, 2025

openjdk bot commented Mar 26, 2025

@mgronlun Pushed as commit c2a4fed.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-gc (hotspot-gc-dev@openjdk.org), hotspot-jfr (hotspot-jfr-dev@openjdk.org), integrated (Pull request has been integrated)
4 participants