8287281: adjust guarantee in Handshake::execute for the case of target thread being current #8992

jdksjolen · 2022-06-02T13:47:23Z

Please review this PR for fixing JDK-8287281.

If a thread is handshake safe we immediately execute the closure, instead of going through the regular Handshake process.

Finally: Should VirtualThreadGetThreadClosure and its do_thread() body be inlined instead? We can do this in this PR, imho, but I'm hoping to get some input on this.

Passes tier1. Running tier2-5.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8287281: adjust guarantee in Handshake::execute for the case of target thread being current

Reviewers

Robbin Ehn (@robehn - Reviewer) ⚠️ Review applies to bf75d4c8
Patricio Chilano Mateo (@pchilano - Reviewer) ⚠️ Review applies to bf75d4c8
David Holmes (@dholmes-ora - Reviewer)
Daniel D. Daugherty (@dcubed-ojdk - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/8992/head:pull/8992
$ git checkout pull/8992

Update a local copy of the PR:
$ git checkout pull/8992
$ git pull https://git.openjdk.org/jdk pull/8992/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8992

View PR using the GUI difftool:
$ git pr show -t 8992

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/8992.diff

bridgekeeper · 2022-06-02T13:48:49Z

👋 Welcome back jdksjolen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2022-06-02T13:51:12Z

@jdksjolen The following labels will be automatically applied to this pull request:

hotspot
serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

robehn

I agree, thanks for fixing.

openjdk · 2022-06-02T13:53:10Z

@jdksjolen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8287281: adjust guarantee in Handshake::execute for the case of target thread being current

Reviewed-by: rehn, pchilanomate, dholmes, dcubed

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 346 new commits pushed to the master branch:

2728770: 8288589: Files.readString ignores encoding errors for UTF-16
ef17ee4: 8288515: (ch) Unnecessary use of Math.addExact() in java.nio.channels.FileLock.overlaps()
72f286a: 8287580: (se) CancelledKeyException during channel registration
b8db0c3: 6980847: (fs) Files.copy needs to be "tuned"
d579916: 8288740: Change incorrect documentation for sjavac flag
26c03c1: 8288719: [arm32] SafeFetch32 thumb interleaving causes random crashes
a802b98: 8287760: --do-not-resolve-by-default gets overwritten if --warn-if-resolved flags is used
bf0623b: 8286314: Trampoline not created for far runtime targets outside small CodeCache
5b583e4: Merge
6458ebc: 8288988: ProblemList serviceability/jvmti/vthread/ContStackDepthTest/ContStackDepthTest.java in -Xcomp mode
... and 336 more: https://git.openjdk.org/jdk/compare/6e55a72f25f7273e3a8a19e0b9a97669b84808e9...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@robehn, @pchilano, @dholmes-ora, @dcubed-ojdk) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

mlbridge · 2022-06-02T13:55:56Z

Webrevs

jdksjolen · 2022-06-03T07:37:19Z

The tests failed and my assumption was wrong: there are other instances of threads handshaking with themselves as the target. We reverse the strategy and call do_thread directly in Handshake::execute.

robehn

Thanks for the update, looks good.

Remember to re-run the tests!

pchilano

Looks good to me.

Thanks,
Patricio

jdksjolen · 2022-06-05T16:01:25Z

Passed tier1-5.

/integrate

openjdk · 2022-06-05T16:03:12Z

@jdksjolen
Your change (at version bf75d4c) is now ready to be sponsored by a Committer.

dholmes-ora

Hi Johan,

I like the idea of this, but am not clear on all the details for all possible cases - see below.

It also makes me wonder about the async case, where Handshake::execute(AsyncHandshakeClosure*, ...) never processes the handshake directly even if it is for the current thread. The async case seems to be a two phase protocol:

1. Install async op on yourself.
2. At some later handshake state poll, discover the op you previously installed.
3. ??
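
As a rough illustration of the two-phase shape described above, here is a minimal sketch. It is not HotSpot code: `ToyThread`, `ToyAsyncOp`, `install_async` and `poll_handshakes` are hypothetical names invented for this example, and the real handshake state is considerably more involved.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

struct ToyThread;
// Hypothetical async operation type; stands in for an async handshake closure.
using ToyAsyncOp = std::function<void(ToyThread&)>;

// Hypothetical stand-in for a thread plus its handshake state.
struct ToyThread {
  std::deque<ToyAsyncOp> pending;  // ops installed but not yet run
  int executed = 0;                // number of ops this thread has run
};

// Phase 1: install the op on the target (which may be the caller itself).
// Nothing is executed here, even when the target is the current thread.
inline void install_async(ToyThread& target, ToyAsyncOp op) {
  assert(op != nullptr);
  target.pending.push_back(std::move(op));
}

// Phase 2: at some later poll, the thread discovers and runs what was
// installed earlier. Returns how many ops were run by this poll.
inline int poll_handshakes(ToyThread& self) {
  int ran = 0;
  while (!self.pending.empty()) {
    ToyAsyncOp op = std::move(self.pending.front());
    self.pending.pop_front();
    op(self);
    ++self.executed;
    ++ran;
  }
  return ran;
}
```

In this toy model, installing an op on yourself leaves `executed` unchanged until the next call to `poll_handshakes`, which is the deferred behaviour the question is about.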

There are a few minor nits/suggestions below as well.

Thanks.

src/hotspot/share/runtime/handshake.cpp

src/hotspot/share/prims/jvmtiEnvThreadState.cpp

dholmes-ora · 2022-06-06T07:17:15Z

src/hotspot/share/prims/jvmtiEnvThreadState.cpp

+      Handshake::execute(&op, thread);
+      guarantee(op.completed(), "Handshake failed. Target thread is not alive?");


I much prefer that the current-thread case is internalised by Handshake::execute now. The code creating the handshake op shouldn't have to worry about current thread or not.

src/hotspot/share/prims/jvmtiEventController.cpp

dholmes-ora · 2022-06-06T07:28:07Z

src/hotspot/share/runtime/handshake.cpp

+  if (target->is_handshake_safe_for(self)) {
+    hs_cl->do_thread(target);
+    return;


I like the idea of doing this, but I can't quite convince myself that it will always be safe when the target is not the current thread. ??

dcubed-ojdk

This bug/PR is specifically about this block of code:

  if (tlh == nullptr) {
    guarantee(Thread::is_JavaThread_protected_by_TLH(target),
              "missing ThreadsListHandle in calling context.");
    target->handshake_state()->add_operation(&op);

and the bug makes the claim that we need to adjust this
guarantee(). Okay, but this proposed fix is indirectly changing
the guarantee() by inserting this block of code before the
guarantee():

  if (target->is_handshake_safe_for(self)) {
    hs_cl->do_thread(target);
    return;
  }

so we still have the original guarantee() that checks a
specific state with respect to ThreadsListHandles and
we replace it with a check, the is_handshake_safe_for()
call, that has nothing to do with ThreadsListHandles!

The original purpose of this logic block:

  if (tlh == nullptr) {
    guarantee(Thread::is_JavaThread_protected_by_TLH(target),
              "missing ThreadsListHandle in calling context.");
    target->handshake_state()->add_operation(&op);

is to require a protecting ThreadsListHandle to be in place
somewhere in the calling context since we have not
passed in a ThreadsListHandle from the calling context.

When I added the above block of code, I intentionally
updated all of the call sites that reached the new strict
check with ThreadsListHandles. This included call sites
where the caller was the current thread. This was an
intentional change on my part to make sure that all the
JavaThreads being operated on (including current) are
protected by ThreadsListHandles.

When the Loom project was being developed, a number
of these carefully placed ThreadsListHandles were moved
and unprotected code paths were introduced. We believe
that these unprotected code paths are safe because we
believe that they are only used by the current thread and
the current thread does not really need a ThreadsListHandle.
That might be true, but it certainly complicates the reasoning
about the code paths.

The bug talks about adjusting the guarantee() to allow the
current thread to be unprotected by a ThreadsListHandle, but
the logic that we have switched to:

  // A JavaThread can always safely operate on it self and other threads
  // can do it safely if they are the active handshaker.
  bool is_handshake_safe_for(Thread* th) const {
    return _handshake.active_handshaker() == th || this == th;
  }

does more than that. It also allows the target to be unprotected
by a ThreadsListHandle if the calling thread is the active handshaker.
I'm not (yet) convinced that is a good policy.
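
To make the difference between the two predicates concrete, here is a small sketch. `ToyThread` and `is_current_target` are hypothetical stand-ins written for this illustration; only the body of `is_handshake_safe_for` mirrors the snippet quoted above.

```cpp
#include <cassert>

// Hypothetical stand-in for JavaThread with just the field needed here.
struct ToyThread {
  const ToyThread* active_handshaker = nullptr;  // who is handshaking us, if anyone

  // Same shape as the quoted predicate: true for the thread itself
  // and also for whichever thread is currently its active handshaker.
  bool is_handshake_safe_for(const ToyThread* th) const {
    return active_handshaker == th || this == th;
  }
};

// The narrower check that the bug title asks for: target is the current thread.
inline bool is_current_target(const ToyThread* current, const ToyThread* target) {
  assert(current != nullptr && target != nullptr);
  return current == target;
}
```

With `target.active_handshaker` set to some other thread `h`, `target.is_handshake_safe_for(&h)` is true while `is_current_target(&h, &target)` is false. That extra case is the one that would skip the ThreadsListHandle guarantee under the broader predicate.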

dcubed-ojdk · 2022-06-06T16:32:33Z

src/hotspot/share/prims/jvmtiEnvThreadState.cpp

+      Handshake::execute(&op, thread);
+      guarantee(op.completed(), "Handshake failed. Target thread is not alive?");


Having Handshake::execute() handle the current-thread case will certainly
allow us to make the code consistent in all the callers of Handshake::execute().

src/hotspot/share/prims/jvmtiEventController.cpp

dcubed-ojdk · 2022-06-06T17:03:32Z

src/hotspot/share/runtime/handshake.cpp

+  if (target->is_handshake_safe_for(self)) {
+    hs_cl->do_thread(target);
+    return;


Because we're pushing the special case handling for current-thread down
into the three parameter version of Handshake::execute(), we'll also
directly execute the closure's do_thread() function in other calls to the
three parameter version of Handshake::execute() where we didn't change
the calling code site in this patch:

src/hotspot/share/classfile/javaClasses.cpp: async_get_stack_trace()

src/hotspot/share/prims/jvmtiExtensions.cpp: GetCarrierThread()

src/hotspot/share/prims/whitebox.cpp: WB_HandshakeReadMonitors(), WB_HandshakeWalkStack()

src/hotspot/share/runtime/handshake.cpp: execute(HandshakeClosure* hs_cl, JavaThread* target)
Of course, since the two parameter version of Handshake::execute() is
now a changed code path, that means that all callers to the two parameter
version of Handshake::execute() are also affected. No, I'm not going to
list all those call sites.

This is a change in behavior and I'm not saying that this is wrong, but it's
not clear to me that the repercussions are understood and discussed in
this PR.

What I'm mumbling about here might be the same thing that @dholmes-ora is
worried about, but I'm just being more verbose about it. :-)

jdksjolen · 2022-06-07T12:19:00Z

It seems that we have at least two choices here:

1. Change the is_handshake_safe_for to current == target and be done with it.
2. Investigate whether is_handshake_safe_for is OK to be used in this context.

Is there anything I am missing?

I'm fine with going for option 1 but unless we need to get this change in quickly (it's a P3 bug, not sure what that entails) I'd like to wait for @robehn's input.

dholmes-ora · 2022-06-07T13:31:39Z

I have to agree with Dan. This is supposed to only be about targeting the current thread, but we are now no longer ensuring the target is protected by a TLH when the current thread is the active_handshaker. So I would vote for:

Change the is_handshake_safe_for to current == target and be done with it.

robehn · 2022-06-13T07:45:41Z

The only way to become an active handshaker is to handshake another thread (target); when that happens we verify that target is ThreadsList safe.
Thus the active handshaker is guaranteed that the target is already verified on a ThreadsList.
As long as we are the active handshaker the target is blocked, i.e. target is safepoint safe.

The reason I think handshake safe is good is because we have 3 (4) cases:
1: Current != Target (Not 3 and not 4)
2: Current == Target
3: Current != Target, but already executing a handshake for target
4: Current != Target, but we are in a safepoint (still not internally handled)
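
One way to read the four cases above is as a classification over (current, target, safepoint state). The sketch below is only an illustration: `ToyThread`, `HandshakeCase` and `classify` are hypothetical names, and the order in which the checks are made is an assumption, not something stated in this thread.

```cpp
#include <cassert>

// Hypothetical stand-in for JavaThread with just the field needed here.
struct ToyThread {
  const ToyThread* active_handshaker = nullptr;
};

// Numbered to match the list above.
enum class HandshakeCase {
  Remote = 1,             // Current != Target (not 3 and not 4)
  Self = 2,               // Current == Target
  AlreadyHandshaker = 3,  // Current is already executing a handshake for Target
  AtSafepoint = 4         // Current != Target, but we are in a safepoint
};

// Assumed check order: self first, then active handshaker, then safepoint.
inline HandshakeCase classify(const ToyThread* current,
                              const ToyThread* target,
                              bool at_safepoint) {
  assert(current != nullptr && target != nullptr);
  if (current == target) return HandshakeCase::Self;
  if (target->active_handshaker == current) return HandshakeCase::AlreadyHandshaker;
  if (at_safepoint) return HandshakeCase::AtSafepoint;
  return HandshakeCase::Remote;
}
```

In these terms, is_handshake_safe_for covers cases 2 and 3, while a plain current == target check covers only case 2.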

dholmes-ora · 2022-06-13T11:42:43Z

@robehn can you explain to me how the current thread can both be the active handshaker of the target and at the same time be executing another handshake with the target? This is making my head spin.

This change has deviated quite considerably from the issue that caused a bug to be filed. And Dan still has concerns that the current thread should still be protected by a TLH even if not strictly necessary. Maybe we actually need to backtrack and restore an invariant that there is always a TLH even for the current thread and fix the JVMTI code that did things differently?

robehn · 2022-06-13T16:44:02Z

@dholmes-ora it can't.

The point was that the code was originally written and truly tested only for the case:
1: Current != Target

The other cases, 2-4, used to be handled externally.
There was a suggestion that 2 (current == target) should be handled internally (let it slide past the guarantee).

In all cases current and target (even if they are the same) must be present on some ThreadsList (e.g. main list when current == target).
I.e. they may not be terminated, e.g. since a handshake operation may use handles, thus must be processed by GC.
The same goes for newly created threads before they are added to a ThreadsList. (@dcubed-ojdk is correct)

So we are letting a "new" case happen when 'just' adjusting the guarantee, maybe this case works fine, I don't know.

sspitsyn · 2022-06-14T02:26:51Z

Maybe we actually need to backtrack and restore an invariant that there is always a TLH even for the current thread and fix the JVMTI code that did things differently?

This will make JVMTI code unnecessarily ugly in a couple of spots.
But I'm okay with that if keeping this invariant is important.
I can help with fixing JVMTI if needed.

jdksjolen · 2022-06-21T19:22:55Z

The scope of the ticket is precisely for the case when Thread::current() == target and as such the fix only checks for this particular case now.
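
For readers following along, the narrowed behaviour can be pictured roughly as below. This is a sketch based only on the description above and the commit title "Check only for target == current"; it is not the actual patch. `ToyThread`, `ToyClosure`, `Outcome` and `toy_execute` are hypothetical names, the `protected_by_tlh` flag stands in for the real ThreadsListHandle check, and an exception stands in for guarantee().

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

struct ToyClosure;

// Hypothetical stand-in for a JavaThread and its queue of handshake ops.
struct ToyThread {
  std::vector<ToyClosure*> pending;
};

// Hypothetical stand-in for a HandshakeClosure.
struct ToyClosure {
  int runs = 0;
  void do_thread(ToyThread&) { ++runs; }
};

enum class Outcome { RanDirectly, Queued };

// Assumed shape: run the closure in place only when the target is the
// current thread; every other target still needs TLH protection.
inline Outcome toy_execute(ToyClosure& cl, ToyThread& current, ToyThread& target,
                           bool protected_by_tlh) {
  if (&current == &target) {
    cl.do_thread(target);
    return Outcome::RanDirectly;
  }
  if (!protected_by_tlh) {
    // Stand-in for the guarantee() failure in the real code.
    throw std::logic_error("missing ThreadsListHandle in calling context.");
  }
  target.pending.push_back(&cl);  // the regular handshake path would take over here
  return Outcome::Queued;
}
```

Unlike the is_handshake_safe_for variant, an active handshaker targeting another thread gets no shortcut here and still has to satisfy the TLH check.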

@dcubed-ojdk, does this change look good to you?

dcubed-ojdk · 2022-06-21T21:31:03Z

@jdksjolen - I've reread all the comments in the PR and the latest version of the code
and I'm okay with the latest version. Please clarify what testing has been done on this
latest version of the fix.

dcubed-ojdk

Thumbs up. Please see my query about the latest testing.

dcubed-ojdk · 2022-06-21T21:37:16Z

Just to be clear:

@dholmes-ora wrote this:

Maybe we actually need to backtrack and restore an invariant that there is always a TLH even for the current thread and fix the JVMTI code that did things differently?

@sspitsyn wrote this:

This will make JVMTI code unnecessarily ugly in a couple of spots.
But I'm okay with that if keeping this invariant is important.
I can help with fixing JVMTI if needed.

The current version of the fix does NOT restore the invariant that there is always a TLH
even for the current thread. I'm (mostly) okay with this.

dholmes-ora

I'm okay with this in current form.

I'll leave it to Dan to decide whether he thinks restoring the old TLH "invariant" should be done in a separate RFE.

Thanks.

jdksjolen · 2022-06-23T16:04:50Z

Cheers. Feel free to sponsor this.

/integrate

openjdk · 2022-06-23T16:05:53Z

@jdksjolen
Your change (at version cc6736c) is now ready to be sponsored by a Committer.

dholmes-ora · 2022-06-24T04:59:20Z

/sponsor

openjdk · 2022-06-24T05:01:09Z

Going to push as commit 9dc9a64.
Since your change was applied there have been 351 commits pushed to the master branch:

64782a7: 8288623: Move Continuation classes out of javaClasses.hpp
c8cc94a: 8288979: Improve CLDRConverter run time
740169c: 8285521: Minor improvements in java.net.URI
13cbb3a: 8289073: (fs) UnsatisfiedLinkError for sun.nio.fs.UnixCopyFile.bufferedCopy0()
b206d2d: 8289006: Cleanup from thread.hpp split
2728770: 8288589: Files.readString ignores encoding errors for UTF-16
ef17ee4: 8288515: (ch) Unnecessary use of Math.addExact() in java.nio.channels.FileLock.overlaps()
72f286a: 8287580: (se) CancelledKeyException during channel registration
b8db0c3: 6980847: (fs) Files.copy needs to be "tuned"
d579916: 8288740: Change incorrect documentation for sjavac flag
... and 341 more: https://git.openjdk.org/jdk/compare/6e55a72f25f7273e3a8a19e0b9a97669b84808e9...master

Your commit was automatically rebased without conflicts.

openjdk · 2022-06-24T05:01:39Z

@dholmes-ora @jdksjolen Pushed as commit 9dc9a64.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

dcubed-ojdk · 2022-06-24T15:41:34Z

This PR has been backed out because it caused failures in Mach5 Tier[1-4].
See JDK-8289129 [BACKOUT] JDK-8287281 adjust guarantee in Handshake::execute for the case of target thread being current

Disallow handshaking with self

14bee58

openjdk bot added the rfr Pull request is ready for review label Jun 2, 2022

openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Jun 2, 2022

robehn approved these changes Jun 2, 2022

openjdk bot added the ready Pull request is ready to be integrated label Jun 2, 2022

Call do_thread if handshaking with current thread as target

b7b4c08

jdksjolen added 4 commits June 3, 2022 09:41

Use is_handshake_safe_for and add the return

b40577f

Switch order of handshake check

8ddb442

Remove checks for is_handshake_for, instead call Handshake::execute

e356932

do_thread(target) not self

bf75d4c

robehn approved these changes Jun 3, 2022

pchilano approved these changes Jun 3, 2022

openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 5, 2022

dholmes-ora suggested changes Jun 6, 2022

dcubed-ojdk suggested changes Jun 6, 2022

Use current instead of self as name for current thread

d4aea87

openjdk bot removed the sponsor Pull request is ready to be sponsored label Jun 7, 2022

Remove unused variable

c476d30

Move assert up and remove other assert, remove unused var

f2646e0

Check only for target == current

cc6736c

dcubed-ojdk approved these changes Jun 21, 2022

dholmes-ora approved these changes Jun 22, 2022

openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 23, 2022

openjdk bot added the integrated Pull request has been integrated label Jun 24, 2022

openjdk bot closed this Jun 24, 2022

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jun 24, 2022
