Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8305670: Performance regression in LockSupport.unpark with lots of idle threads #13519

Closed
wants to merge 22 commits into from

Conversation

dcubed-ojdk
Copy link
Member

@dcubed-ojdk dcubed-ojdk commented Apr 18, 2023

Address the performance regression in LockSupport.unpark() with the following changes:

  • Add FastThreadsListHandle helper class to threadSMR.[ch]pp for quickly determining if a JavaThread* is protected by the ThreadsListHandle embedded in the new helper object.
  • Update Unsafe_Unpark() to pass java_lang_Thread::thread_acquire(thread_oop) to a new instance of the FastThreadsListHandle object and only do the unpark work if the target JavaThread* is protected.

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8305670: Performance regression in LockSupport.unpark with lots of idle threads

Reviewers

Contributors

  • Robbin Ehn <rehn@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13519/head:pull/13519
$ git checkout pull/13519

Update a local copy of the PR:
$ git checkout pull/13519
$ git pull https://git.openjdk.org/jdk.git pull/13519/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13519

View PR using the GUI difftool:
$ git pr show -t 13519

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13519.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 18, 2023

👋 Welcome back dcubed! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 18, 2023

@dcubed-ojdk The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Apr 18, 2023
@dcubed-ojdk
Copy link
Member Author

/label add hotspot-runtime
/label remove serviceability
/label remove hotspot

These changes were tested with a new test that's in the bug report and
with Mach5 Tier[1-8] testing.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Apr 18, 2023
@openjdk
Copy link

openjdk bot commented Apr 18, 2023

@dcubed-ojdk
The hotspot-runtime label was successfully added.

@openjdk openjdk bot removed the serviceability serviceability-dev@openjdk.org label Apr 18, 2023
@openjdk
Copy link

openjdk bot commented Apr 18, 2023

@dcubed-ojdk
The serviceability label was successfully removed.

@openjdk openjdk bot removed the hotspot hotspot-dev@openjdk.org label Apr 18, 2023
@openjdk
Copy link

openjdk bot commented Apr 18, 2023

@dcubed-ojdk
The hotspot label was successfully removed.

@dcubed-ojdk dcubed-ojdk marked this pull request as ready for review April 18, 2023 21:18
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 18, 2023
@dcubed-ojdk
Copy link
Member Author

@dholmes-ora and @robehn - The merge of my (mostly comments) changes with @robehn's
"quick_mode" changes are tested and ready for review.

@mlbridge
Copy link

mlbridge bot commented Apr 18, 2023

@apangin
Copy link

apangin commented Apr 18, 2023

@dcubed-ojdk Thank you for working on this. The changes make total sense to me (I'm not a reviewer though).
I tested the patch with the UnparkRegression benchmark from the bug report and it now produces 151K calls per second comparing to 70K without the patch. That is a quite good improvement.

Copy link
Contributor

@robehn robehn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dan, thanks!

Looks good to me.

@openjdk
Copy link

openjdk bot commented Apr 19, 2023

@dcubed-ojdk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8305670: Performance regression in LockSupport.unpark with lots of idle threads

Co-authored-by: Robbin Ehn <rehn@openjdk.org>
Reviewed-by: rehn, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 31 new commits pushed to the master branch:

  • ceca198: 8307068: store a JavaThread* in the java.lang.Thread object after the JavaThread* is added to the main ThreadsList
  • 5e26e64: 8307067: remove broken EnableThreadSMRExtraValidityChecks option
  • e54051a: 8307935: Class space argument processing can be simplified
  • 46e3d24: 8155191: Specify that SecureRandom.nextBytes(byte[]) throws NullPointerException when byte array is null
  • 3bf3876: 8307297: Move some DnD tests to open
  • d8afc7b: 8300204: Sealed-class hierarchy graph missing nodes
  • 38838b3: 8307480: Improve SA "transported core" documentation for windows
  • 9842ff4: 8306607: Apply 80-column output to javac supported version output
  • d809823: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop()
  • 4441a23: 6714245: [Col] Collator - Faster Comparison for identical strings.
  • ... and 21 more: https://git.openjdk.org/jdk/compare/54c06d2d915e57d822136b403ea7a7675325d6fb...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 19, 2023
Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry Dan but it seems to me that if quick_mode is true then we don't even need to call cv_internal_thread_to_JavaThread in the first place.

p->unpark();
oop thread_oop = JNIHandles::resolve_non_null(jthread);
if (java_lang_Thread::thread(thread_oop) != nullptr) {
ThreadsListHandle tlh; // Provides memory barrier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what the comment means - what kind of memory barrier?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store side do:
Store to threadslist
Store to eetop

Read side do:
Read eetop
Read threadslist
Read eetop again

We want them to stay in this order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the comment should change to:

      ThreadsListHandle tlh; // Also is a barrier between the two JavaThread* queries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we need these ordering constraints, but I'm not clear what is actually guaranteeing them on either the reader or writer side. I imagine there are a number of "barriers" in relation to TLH construction, but as adding to the threadsList is done under a lock (as is setting eetop) then I'm not sure what explicit barriers exist there. Maybe java_lang_Thread::set_thread needs to use release_address_field_put?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are ~76 ThreadsListHandle variables in the code base. These are the only ones with comments:

src/hotspot/share/runtime/threadSMR.cpp:    ThreadsListHandle tlh;  // make the current ThreadsList safe for reporting
src/hotspot/share/prims/unsafe.cpp:      ThreadsListHandle tlh; // Provides memory barrier
src/hotspot/share/compiler/compileBroker.cpp:        ThreadsListHandle tlh;  // name() depends on the TLH.
src/hotspot/share/compiler/compileBroker.cpp:        ThreadsListHandle tlh;  // name() depends on the TLH.
src/hotspot/share/compiler/compileBroker.cpp:        ThreadsListHandle tlh;  // name() depends on the TLH.
src/hotspot/share/compiler/compileBroker.cpp:        ThreadsListHandle tlh;  // name() depends on the TLH.

The new "// Provides memory barrier" is the only one that even talks about barriers.
I'm inclined to delete the comment given that these current lines of code:

    if (java_lang_Thread::thread(thread_oop) != nullptr) {
      // Try to capture the live JavaThread in a ThreadsListHandle:
      ThreadsListHandle tlh;
      if (java_lang_Thread::thread(thread_oop) != nullptr) {
        // The still live JavaThread is protected by the ThreadsListHandle.
        JavaThread* thr = nullptr;

I think the comments added in the previous comment are enough to make this code clear.
@robehn and @dholmes-ora - are you both good with this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind if the "memory barrier" comment is there or not. I'm more concerned about whether the actual required ordering constraints are in fact enforced. I think java_lang_Thread::set_thread may need to have release semantics (and the matching acquire is (implicitly) within the TLH construction).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dholmes-ora - I'm sorry that I forgot to reply to this earlier comment in this thread:
#13519 (comment)

but as adding to the threadsList is done under a lock (as is setting eetop)

Adding a new JavaThread* to the main ThreadsList is done under a lock.
The java_lang_Thread::set_thread() calls are less clean:

  • there's one call fromcreate_initial_thread() that doesn't have the Threads_lock, but I also don't think it needs it.

  • there's one call from JavaThread::allocate_threadObj() which happens during JNI attach that doesn't have the Threads_lock, but I don't think it needs it

  • the call from JavaThread::prepare() has the Threads_lock

  • the call from JavaThread::start_internal_daemon() has the Threads_lock

Of course, here's our favorite call from ensure_join():

static void ensure_join(JavaThread* thread) {
  // We do not need to grab the Threads_lock, since we are operating on ourself.
<snip>
  OrderAccess::release();
  java_lang_Thread::set_thread(threadObj(), nullptr);

Obviously someone thought that call to java_lang_Thread::set_thread() needed a "release".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I need to address part of this comment:
#13519 (comment)

I think java_lang_Thread::set_thread may need to have release semantics
(and the matching acquire is (implicitly) within the TLH construction).

I agree we should put release semantics in java_lang_Thread::set_thread() and
I'll work on that next.

This part is puzzling me:

the matching acquire is (implicitly) within the TLH construction

TLH construction does nothing with the JavaThread* stored in the java.lang.Thread object
unless I'm really missing something here. I would think the matching acquire would be in
any call to java_lang_Thread::thread().

So I also think that java_lang_Thread::thread() needs to use address_field_acquire()
unless I'm missing something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right that makes things simpler and much neater. We end up with two acquires in the path:

Read eetop
Read threadslist
Read eetop again

but that is okay.

HOTSPOT_THREAD_UNPARK((uintptr_t) p);
p->unpark();
oop thread_oop = JNIHandles::resolve_non_null(jthread);
if (java_lang_Thread::thread(thread_oop) != nullptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion for a comment after this line:

// We initially have a live JavaThread, so try to capture it in a ThreadsListHandle

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added:

    if (java_lang_Thread::thread(thread_oop) != nullptr) {
      // Try to capture the live JavaThread in a ThreadsListHandle:
      ThreadsListHandle tlh; // Provides memory barrier

oop thread_oop = JNIHandles::resolve_non_null(jthread);
if (java_lang_Thread::thread(thread_oop) != nullptr) {
ThreadsListHandle tlh; // Provides memory barrier
if (java_lang_Thread::thread(thread_oop) != nullptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion for comment after this line:

// We still have a live JavaThread, now protected by the ThreadsListHandle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added:

      if (java_lang_Thread::thread(thread_oop) != nullptr) {
        // The still live JavaThread is protected by the ThreadsListHandle.
        JavaThread* thr = nullptr;

// We verified that java_lang_Thread::thread(thread_oop) != nullptr
// before and after creating tlh above so quick_mode can be used in
// this conversion call.
if (tlh.cv_internal_thread_to_JavaThread(jthread, &thr, nullptr, true /* quick_mode */)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm having a mental blank here now. Why do we even need to call this if we can access java_lang_Thread::thread(thread_oop) after the TLH was created?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a caller of cv_internal_thread_to_JavaThread we do not know the implementation of quick_mode.

As the implementation is now, we still want the asserts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular we want this assert from cv_internal_thread_to_JavaThread():

  assert(includes(java_thread), "must be");

because that verifies our assumptions from this call-site that quick_mode is okay.

Also, it is possible that a JavaThread that's racing to exit will make it past ensure_join()
after our check on L789 above and run into this block in the conversion function:

  JavaThread *java_thread = java_lang_Thread::thread(thread_oop);
  if (java_thread == nullptr) {
    // The java.lang.Thread does not contain a JavaThread* so it has not
    // run enough to be put on a ThreadsList or it has exited enough to
    // make it past ensure_join() where the JavaThread* is cleared.
    return false;
  }

Because we (the observer thread) captured the target JavaThread in the
ThreadsListHandle, it's safe for us to use the JavaThread*, but I've always
been a fan of bailing on operations like these as soon as we detect that the
JavaThread* is exiting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But ignoring the asserts cv_internal_thread_to_JavaThread() serves absolutely no purpose when we could request quick-mode - and we don't need to know what cv_internal_thread_to_JavaThread does - we know how thread startup and exit is managed and we know what a TLH guarantees . Sure we can bail sooner but there is no need to, so that is not required. In quick-mode, barring the assertion for includes, it all boils down to nothing! We don't need the Thread_oop passed out as we already have it before we start; we don't even need the JavaThread* passed out as we already have it too! So what does it actually do for us? Nothing. We could just have:

UNSAFE_ENTRY(void, Unsafe_Unpark(JNIEnv *env, jobject unsafe, jobject jthread)) {
  if (jthread != nullptr) {
    oop thread_oop = JNIHandles::resolve_non_null(jthread);
    if (java_lang_Thread::thread(thread_oop) != nullptr) {
      // Try to capture the live JavaThread in a ThreadsListHandle:
      ThreadsListHandle tlh; // Provides memory barrier
      JavaThread* thr;
      if ( (thr = java_lang_Thread::thread(thread_oop)) != nullptr) {
        assert(tlh.includes(thr), "must be");
        // The still live JavaThread is protected by the ThreadsListHandle
        // so is safe to access.
        Parker* p = thr->parker();
        HOTSPOT_THREAD_UNPARK((uintptr_t) p);
        p->unpark();
      }
    } // ThreadsListHandle is destroyed here.
  }
} UNSAFE_END

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we went down the road of creating this quick_mode variant of the
conversion function, I added this comment about "API indigestion" to the bug report:

https://bugs.openjdk.org/browse/JDK-8305670?focusedCommentId=14573647&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14573647

@robehn's proposed quick_mode changes were an attempt to make the indigestion
a little less bothersome (Tums anyone?), but @dholmes-ora's comments have made a
good case here for what we've done is jump thru hoops to keep a call to the (slightly
modified) conversion function when it's clear that the general purpose conversion
function just isn't the right tool for this particular job.

I'm going to take another look at backing out portions of the quick_mode fix while
trying to keep the performance improvements. This will result in Unsafe_Unpark()
not trying to use the general purpose conversion function.

On a related note, we're having a bit of trouble trying to reproduce the performance
improvements on the machines in our performance lab.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also possible to create a special version of ThreadsListHandle which takes an Thread object in constructor.
Do the eetop check and then safe fetch the ThreadsList.
If you call include() with the same JavaThread* as the eetop was before creating the ThreadsList, you can return true directly, i.e. quick mode.

SpecialThreadsListHandle::SpecialThreadsListHandle(oop thread_oop) {
  _javaThread_from_eetop = java_lang_Thread::thread(thread_oop);
  _threadList = ...
}

bool includes(JavaThread* jt) {
  if (jt == _javaThread_from_eetop && jt != nullptr) return true;
  return _threadsList->includes(jt);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could create a special version of ThreadsListHandle that would allow us
to encapsulate this special purpose logic in threadsSMR source files where it
belongs. Of course, because I can't stand the 'eetop' name, I'll use some other
naming here. :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... it occurs me that this special version of ThreadsListHandle will need a
way to communicate with the call-site that created it and return the protected
JavaThread*. If it returns nullptr, then the thread_oop couldn't be
converted into a protected JavaThread*.

Perhaps stlh.protected_java_thread()...

@dcubed-ojdk
Copy link
Member Author

dcubed-ojdk commented Apr 19, 2023

@apangin - thanks for taking a look at the code. Even as a committer you can chose to
add your review (plus or minus) to a PR. It would still require a (R)eviewer, but I always
count all reviews. Also thanks for the testing!

@robehn - Thanks for the review and thanks for chiming in with replies to @dholmes-ora
queries. I may add more to those threads.

@dholmes-ora - Thanks for the review! I'll start going thru your comments shortly.

@dcubed-ojdk
Copy link
Member Author

dcubed-ojdk commented Apr 21, 2023

v02 has been tested with the usual local tests for this project.
Mach5 Tier1 passed with no failures. I'm doing additional Mach5 testing.
Mach5 Tier2 and Tier3 also passed (Tier2 had 1 unrelated test failure).
Since I'm going to make more changes, I'm stopping testing on v02.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unpark changes look good. :) Thanks.

Comment on lines -952 to -957
product(bool, EnableThreadSMRExtraValidityChecks, true, DIAGNOSTIC, \
"Enable Thread SMR extra validity checks") \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still want to make this change as part of this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would. All the analysis is recorded in this bug and the PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm going to change my mind on this topic. I'll likely split the
EnableThreadSMRExtraValidityChecks changes off into a sub-task.
This will allow for easier reviewing and backport decisions (if any for
this particular fix).

@dcubed-ojdk
Copy link
Member Author

dcubed-ojdk commented Apr 21, 2023

v03 has been tested with the usual local tests for this project.
Mach5 Tier1 passed with no failures. I'm doing additional Mach5 testing.
Update: Mach5 Tier[2-7] passed with no failures. Mach5 Tier8 has just
has the usual unrelated failures that we see with a project based on
jdk-21+18.

So Mach5 testing looks good.

Copy link
Contributor

@robehn robehn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Works for me!

@dcubed-ojdk
Copy link
Member Author

@robehn - Thanks for the re-review!

@dcubed-ojdk
Copy link
Member Author

@dholmes-ora - Please re-review this PR now that two pieces have been refactored
out into JDK-8307067 and JDK-830-7068.

@robehn - Any chance you are around for a re-review?
@coleenp - Any chance you can review this part since you already reviewed JDK-8307067?
@apangin - You looked at this fix when it was part of #13519. Any chance you can take another look?

Copy link
Contributor

@robehn robehn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good! Thanks!

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have reservations about the utility of FastTLH given it requires a load-acquire by the caller but can't check/enforce that. But I can live with this.

One nit below.

Thanks.

if (java_thread != nullptr) {
// We captured a non-nullptr JavaThread* before the _tlh was created
// so that covers the early life stage of the target JavaThread.
_protected_java_thread = java_lang_Thread::thread_acquire(thread_oop);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller must use thread_acquire so it serves no purpose doing it again here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to disagree since the initialization and construction of this helper object
is racing with the target JavaThread's termination code path in ensure_join().

The caller's thread_acquire() and the thread_acquire() on L847 are racing with
this code block in ensure_join():

 // Clear the native thread instance - this makes isAlive return false and allows the join()
  // to complete once we've done the notify_all below. Needs a release() to obey Java Memory Model
  // requirements.
  java_lang_Thread::release_set_thread(threadObj(), nullptr);
  lock.notify_all(thread);
  // Ignore pending exception, since we are exiting anyway
  thread->clear_pending_exception();

If the caller's thread_acquire() happens before the call to release_set_thread(),
then we'll execute the code block controlled by L844 if (java_thread != nullptr) {
and we will get to the thread_acquire() call on L847. Let's say that the target
JavaThread has now made it past the release_set_thread() call so now the
JavaThread* stored in the java.lang.Thread object is now nullptr. We want the
thread_acquire() on L847 to match up with the release_set_thread() so that
we see the fact that the JavaThread* stored in the java.lang.Thread object is now
nullptr.

Please let me know if I missed something here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That release in ensure_join is for Java volatile semantics for threads calling is_alive(). It is there to ensure that any Java writes made by the thread before it terminated are seen by the thread that sees is_alive() return false. The VM code we are dealing with here does not need to see any such writes by the terminating thread, nor are there any interesting writes in the VM code which must be seen if thread() returned null. The code we are dealing with here only needs to synchronize with the thread startup logic and in particular its addition to the ThreadsList. The thread_acquire in the caller ensures that if we see a non-null value then we also see the writes to the ThreadsList as required. The subsequent re-reading of that value will return null or non-null but it does not need to synchronize with any other memory writes, hence acquire semantics are not needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, but I will have to disagree again. Here's the first thread_acquire() call
along with the comment that explains that this first call is for the early life stage of
the target JavaThread:

    // Get the JavaThread* stored in the java.lang.Thread object _before_
    // the embedded ThreadsListHandle is constructed so we know if the
    // early life stage of the JavaThread* is protected.
    FastThreadsListHandle ftlh(thread_oop, java_lang_Thread::thread_acquire(thread_oop));

At this point in the code, all we have done is potentially gather the JavaThread*
that's stored in the java.lang.Thread object. That value we gathered, might be
nullptr and it might be non-nullptr. For the first thread_acquire call, I see it
as matching up with the release_set_thread calls at the end of thread startup.

Here's the helper's constructor code:

FastThreadsListHandle::FastThreadsListHandle(oop thread_oop, JavaThread* java_thread) : _protected_java_thread(nullptr) {
  assert(thread_oop != nullptr, "must be");
  if (java_thread != nullptr) {
    // We captured a non-nullptr JavaThread* before the _tlh was created
    // so that covers the early life stage of the target JavaThread.
    _protected_java_thread = java_lang_Thread::thread_acquire(thread_oop);

We check the java_thread parameter and if it's non-nullptr, then we know
that the JavaThread* stored in the java.lang.Thread object was non-nullptr
before we constructed the embedded ThreadsListHandle (_tlh). Now we can
check the second half of the question...

Here's the second thread_acquire() call again along with the assert() and the
comment that explains that this second call is for end life stage of the target
JavaThread*.

    _protected_java_thread = java_lang_Thread::thread_acquire(thread_oop);
    assert(_protected_java_thread == nullptr || _tlh.includes(_protected_java_thread), "must be");
    // If we captured a non-nullptr JavaThread* after the _tlh was created
    // then that covers the end life stage of the target JavaThread and we
    // we know that _tlh protects the JavaThread*.

The subsequent re-reading of that value will return null or non-null but it does
not need to synchronize with any other memory writes

The second thread_acquire most certainly does need to synchronize with the
code in ensure_join() that does a release_set_thread of a nullptr. If that code
set a nullptr before we could read a non-nullptr value a second time, then we can
make no inferences about what _tlh contains. So I see the second thread_acquire
as matching up with the release_set_thread in ensure_join.

If the _protected_java_thread field is nullptr, then we don't know that _tlh has
captured the target JavaThread* and the JavaThread* is NOT protected.
If _protected_java_thread != nullptr, then we do know that _tlh contains the
target JavaThread* and JavaThread* is protected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second thread_acquire most certainly does need to synchronize with the
code in ensure_join() that does a release_set_thread of a nullptr. If that code
set a nullptr before we could read a non-nullptr value a second time, then we can
make no inferences about what _tlh contains. So I see the second thread_acquire
as matching up with the release_set_thread in ensure_join.

The only memory value of interest is the read of eetop. The release/acquire have no affect on the reading or writing of that field itself. We would only need acquire semantics when reading the potentially null eetop if we were then to read another memory value that would be written prior to eetop being set to null. That is not the case hence the acquire semantics are unnecessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read thru this thread of comments three times and I think I see where and how my
brain got confused by what we are doing here. Let me see if I can unwind this mess that
I have made...

Our new FastThreadsListHandle helper object is used by Unsafe_Unpark() to quickly
determine whether our target thread (jthread parameter) is protected by the embedded
_tlh in the helper object. The Unsafe_Unpark() call is racing against two different
situations in the target thread's life-cycle:

  1. The first situation is the early life stage of the target thread. It is during this stage that
    the target JavaThread* gets added to the main ThreadsList AND the JavaThread*
    gets stored in the java.lang.Thread object.

    From the Unsafe_Unpark() callers POV, we need to see two things happen: the target
    JavaThread* getting added to the main ThreadsList and a non-nullptr value being
    stored in the java.lang.Thread object. Seeing both memory operations requires that we
    use java_lang_Thread::thread_acquire() for the query in Unsafe_Unpark().

  2. The second situation is the end life stage of the target thread. It is during this stage that
    nullptr gets stored in the java.lang.Thread object. The target JavaThread* will also
    get removed from the main ThreadsList later, but we don't care about that part since the
    setting of the nullptr in the java.lang.Thread object is a perfect bailout sentinel.

    From the FastThreadsListHandle helper object's POV, we only care about whether we
    see a non-nullptr value or a nullptr value. Since we only care about a single memory
    operation, we use java_lang_Thread::thread() which is implemented with an underlying
    atomic load operation. Acquire semantics are not needed for this query.

Okay so those are the two situations of interest in the target thread. Now let's talk about
Unsafe_Unpark(), its use of the FastThreadsListHandle helper object and how that
applies to the target thread.

  1. The first query of the JavaThread* in the java.lang.Thread object that is made in
    Unsafe_Unpark() is trying to determine if the target thread is past the early life stage.
    A non-nullptr value means that the _tlh embedded in the FastThreadsListHandle
    might be protecting the JavaThread*. Since this code is primarily targeted at the
    early life stage, we have to use java_lang_Thread::thread_acquire() in order to get
    a proper answer from that code path in the target thread.

    Of course, this code might also execute when the target thread is in the end life stage,
    but we'll still get a valid answer from it so no harm, no foul.

  2. Unsafe_Unpark() passes that first JavaThread* value to the FastThreadsListHandle
    helper object and that constructor does the remainder of the work if the value passed
    in is non-nullptr. The constructor does the second query of the JavaThread* in the
    java.lang.Thread object and it only cares about whether we see a nullptr value or a
    non-nullptr value so we can use just java_lang_Thread::thread() here to fetch the
    value. A non-nullptr value means that the embedded _tlh does protect the
    JavaThread* and Unsafe_Unpark() can proceed to use that JavaThread* to do
    its work. A nullptr value means that the target JavaThread* has exited enough to
    make it past ensure_join() where the JavaThread* is cleared from the
    java.lang.Thread object.

    This second query does not execute if the target thread is in the early life stage and a
    nullptr value is returned by the first query so there is no opportunity for the
    java_lang_Thread::thread() query to return an invalid answer from an early life stage.

@dcubed-ojdk
Copy link
Member Author

@robehn and @dholmes-ora - Thanks for the re-reviews!

…hread() instead of java_lang_Thread::thread_acquire(); clarify comments.
@dcubed-ojdk
Copy link
Member Author

dholmes CR - FastThreadsListHandle ctr should use java_lang_Thread::thread() instead of java_lang_Thread::thread_acquire(); clarify comments.

Passes local build and the usual local tests on my MBP13.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Dan - looks good.

@dcubed-ojdk
Copy link
Member Author

@dholmes-ora - Thanks for the re-review and approval.

@dcubed-ojdk dcubed-ojdk changed the base branch from pr/13723 to master May 13, 2023 14:41
@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 13, 2023
@dcubed-ojdk
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented May 13, 2023

Going to push as commit f030937.
Since your change was applied there have been 31 commits pushed to the master branch:

  • ceca198: 8307068: store a JavaThread* in the java.lang.Thread object after the JavaThread* is added to the main ThreadsList
  • 5e26e64: 8307067: remove broken EnableThreadSMRExtraValidityChecks option
  • e54051a: 8307935: Class space argument processing can be simplified
  • 46e3d24: 8155191: Specify that SecureRandom.nextBytes(byte[]) throws NullPointerException when byte array is null
  • 3bf3876: 8307297: Move some DnD tests to open
  • d8afc7b: 8300204: Sealed-class hierarchy graph missing nodes
  • 38838b3: 8307480: Improve SA "transported core" documentation for windows
  • 9842ff4: 8306607: Apply 80-column output to javac supported version output
  • d809823: 8306471: Add virtual threads support to JDWP ThreadReference.Stop and JDI ThreadReference.stop()
  • 4441a23: 6714245: [Col] Collator - Faster Comparison for identical strings.
  • ... and 21 more: https://git.openjdk.org/jdk/compare/54c06d2d915e57d822136b403ea7a7675325d6fb...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 13, 2023
@openjdk openjdk bot closed this May 13, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 13, 2023
@openjdk
Copy link

openjdk bot commented May 13, 2023

@dcubed-ojdk Pushed as commit f030937.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dcubed-ojdk dcubed-ojdk deleted the JDK-8305670 branch June 9, 2023 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
4 participants