-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8306767: Concurrent repacking of extra data in MethodData is potentially unsafe #16840
Changes from 21 commits
04e6522
9336c4f
5b819f1
54f2c49
83cda01
c126f34
465c381
e0fc8d1
70327d3
30e5aeb
0ec5371
2bda9d7
68ca3bd
1c19953
6647e11
b0ff5d1
e1e9174
825a3d8
671ead2
78a2cdb
4dbfe9a
f8a81cd
c408f22
3dad293
5f471ff
ff581b0
96af505
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
* Copyright (c) 2001, 2023, Oracle and/or its affiliates. All rights reserved. | ||
* Copyright (c) 2001, 2024, Oracle and/or its affiliates. All rights reserved. | ||
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. | ||
* | ||
* This code is free software; you can redistribute it and/or modify it | ||
|
@@ -82,13 +82,24 @@ class PrepareExtraDataClosure : public CleanExtraDataClosure { | |
return _safepoint_tracker.safepoint_state_changed(); | ||
} | ||
|
||
bool finish() { | ||
bool finish(NoSafepointVerifier* no_safepoint) { | ||
if (_uncached_methods.length() == 0) { | ||
// Preparation finished iff all Methods* were already cached. | ||
return true; | ||
} | ||
// Holding locks through safepoints is bad practice. | ||
MutexUnlocker mu(_mdo->extra_data_lock()); | ||
// We are currently holding the extra_data_lock and ensuring | ||
// no safepoint breaks the lock. | ||
_mdo->check_extra_data_locked(); | ||
|
||
// We now want to cache some method data. This could cause a safepoint. | ||
// We temporarily release the lock and allow safepoints, and revert that | ||
// at the end of the scope. This is safe, since we currently do not hold | ||
// any extra_method_data: finish is called only after clean_extra_data, | ||
// and the outer scope that first aquired the lock should not hold any | ||
// extra_method_data while cleaning is performed, as the offsets can change. | ||
MutexUnlocker mu(_mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); | ||
PauseNoSafepointVerifier pause_no_safepoints(no_safepoint); | ||
|
||
for (int i = 0; i < _uncached_methods.length(); ++i) { | ||
if (has_safepointed()) { | ||
// The metadata in the growable array might contain stale | ||
|
@@ -104,14 +115,14 @@ class PrepareExtraDataClosure : public CleanExtraDataClosure { | |
} | ||
}; | ||
|
||
void ciMethodData::prepare_metadata() { | ||
void ciMethodData::prepare_metadata(NoSafepointVerifier* no_safepoint) { | ||
MethodData* mdo = get_MethodData(); | ||
|
||
for (;;) { | ||
ResourceMark rm; | ||
PrepareExtraDataClosure cl(mdo); | ||
mdo->clean_extra_data(&cl); | ||
if (cl.finish()) { | ||
if (cl.finish(no_safepoint)) { | ||
// When encountering uncached metadata, the Compile_lock might be | ||
// acquired when creating ciMetadata handles, causing safepoints | ||
// which requires a new round of preparation to clean out potentially | ||
|
@@ -123,12 +134,15 @@ void ciMethodData::prepare_metadata() { | |
|
||
void ciMethodData::load_remaining_extra_data() { | ||
MethodData* mdo = get_MethodData(); | ||
MutexLocker ml(mdo->extra_data_lock()); | ||
|
||
// Lock to read ProfileData, and ensure lock is not unintentionally broken by a safepoint | ||
MutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there anyway to have MutexLocker take care of verifying that the there's no safepoint? It would be nice to replace:
by:
only. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @fisk @tkrodriguez what do you suggest for that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should I create a wrapper object for it? Maybe a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rwestrel I introduced a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But in this specific instance I keep them separate, because I need to pass the |
||
NoSafepointVerifier no_safepoint; | ||
|
||
// Deferred metadata cleaning due to concurrent class unloading. | ||
prepare_metadata(); | ||
prepare_metadata(&no_safepoint); | ||
// After metadata preparation, there is no stale metadata, | ||
// and no safepoints can introduce more stale metadata. | ||
NoSafepointVerifier no_safepoint; | ||
|
||
assert((mdo->data_size() == _data_size) && (mdo->extra_data_size() == _extra_data_size), "sanity, unchanged"); | ||
assert(extra_data_base() == (DataLayout*)((address) _data + _data_size), "sanity"); | ||
|
@@ -562,6 +576,9 @@ void ciMethodData::set_argument_type(int bci, int i, ciKlass* k) { | |
VM_ENTRY_MARK; | ||
MethodData* mdo = get_MethodData(); | ||
if (mdo != nullptr) { | ||
// Lock to read ProfileData, and ensure lock is not broken by a safepoint | ||
NoSafepointMutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); | ||
|
||
ProfileData* data = mdo->bci_to_data(bci); | ||
if (data != nullptr) { | ||
if (data->is_CallTypeData()) { | ||
|
@@ -586,6 +603,9 @@ void ciMethodData::set_return_type(int bci, ciKlass* k) { | |
VM_ENTRY_MARK; | ||
MethodData* mdo = get_MethodData(); | ||
if (mdo != nullptr) { | ||
// Lock to read ProfileData, and ensure lock is not broken by a safepoint | ||
NoSafepointMutexLocker ml(mdo->extra_data_lock(), Mutex::_no_safepoint_check_flag); | ||
|
||
ProfileData* data = mdo->bci_to_data(bci); | ||
if (data != nullptr) { | ||
if (data->is_CallTypeData()) { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
* Copyright (c) 2015, 2023, Oracle and/or its affiliates. All rights reserved. | ||
* Copyright (c) 2015, 2024, Oracle and/or its affiliates. All rights reserved. | ||
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. | ||
* | ||
* This code is free software; you can redistribute it and/or modify it | ||
|
@@ -687,12 +687,15 @@ address CompiledMethod::continuation_for_implicit_exception(address pc, bool for | |
ResourceMark rm(thread); | ||
CodeBlob* cb = CodeCache::find_blob(pc); | ||
assert(cb != nullptr && cb == this, ""); | ||
ttyLocker ttyl; | ||
tty->print_cr("implicit exception happened at " INTPTR_FORMAT, p2i(pc)); | ||
print(); | ||
method()->print_codes(); | ||
print_code(); | ||
print_pcs(); | ||
|
||
// Keep tty output consistent. To avoid ttyLocker, we buffer in stream, and print all at once. | ||
stringStream ss; | ||
ss.print_cr("implicit exception happened at " INTPTR_FORMAT, p2i(pc)); | ||
print_on(&ss); | ||
method()->print_codes_on(&ss); | ||
print_code_on(&ss); | ||
print_pcs_on(&ss); | ||
tty->print("%s", ss.as_string()); // print all at once | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems like these ttyLocker changes should be checked in as a different cleanup, ie removing ttyLocker is a really good thing. Can you make these changes a separate patch? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I filed the RFE: |
||
} | ||
#endif | ||
if (cont_offset == 0) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only do this in the
finish
method, where we hold no reference to any profiled-data anymore. We only really need to hold the lock duringclean_extra_data
andis_live
. But after those are done, we can quickly release the lock so that we can callget_method
.Does that make sense to you? I'm not super happy with the general pattern here... I basically kept the old pattern.
I wonder, maybe there is a way to move the scope of the lock, such that we only need to lock inside of
clean_extra_data
, and do not hold it before we enterclean_extra_data
. Do you think that would be preferable?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rwestrel I still believe this is safe. But maybe also ugly.
I looked into making the locking more fine-grained, so that we could avoid unlocking the lock temporarily.
The biggest problem is in
ciMethodData::load_remaining_extra_data
. Here we first (iteratively) clean, and then assume that we still hold the lock when we copy it for theciMethodData
. Hence, it seems the lock has to be held at this outer scope, but then temporarily unlocked to allow calls toget_method
inPrepareExtraDataClosure::finish
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatives to make it prettier:
Make
prepare_metadata
lock, and pass out an object that holds that lock, i.e. widen the scope of theMutexLocker
. Maybe this can be done with return-value-optimization? But I'm not sure this is a great idea. Another idea @chhagedorn and I thought about was having some Locker object that you can call lock/unlock on, repeatedly. But once the Locker goes out of scope, it checks if it is in the locked state, and only unlocks then. Or maybe it asserts that it is in the locked state, and then unlocks.Because essencially we need to allow the retry-logic to unlock in between tries. But we also still need to access the
uncached_methods
array that is filled inside the locked region.I'm not sure such a refactoring is worth it. Let me know what you think @rwestrel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that logic existed, I think you can leave it as is but a comment that explains why it is safe would be useful.