8325670: GenShen: Allow old to expand at end of each GC #394

kdnilsen · 2024-02-12T17:36:45Z

At the end of GC, we set aside collector reserves to satisfy anticipated needs of the next GC.

This PR reverts a change that accidentally prevents old-gen from being enlarged by this action. The observed failure was that mixed evacuations could not be performed because old-gen was not large enough to receive the results of the desired evacuations.

Progress

Change must not contain extraneous whitespace
Commit message must refer to an issue
Change must be properly reviewed (1 review required, with at least 1 Committer)

Issue

JDK-8325670: GenShen: Allow old to expand at end of each GC (Bug - P3)

Reviewers

Y. Srinivas Ramakrishna (@ysramakrishna - Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/shenandoah.git pull/394/head:pull/394
$ git checkout pull/394

Update a local copy of the PR:
$ git checkout pull/394
$ git pull https://git.openjdk.org/shenandoah.git pull/394/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 394

View PR using the GUI difftool:
$ git pr show -t 394

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/shenandoah/pull/394.diff

Webrev

Link to Webrev Comment

bridgekeeper · 2024-02-12T17:37:31Z

👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

mlbridge · 2024-02-12T17:41:23Z

Webrevs

01: Full - Incremental (e0761e82)
00: Full (f583f713)

ysramakrishna · 2024-02-12T21:20:21Z

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp

+  // In the case that ShenandoahOldEvacRatioPercent equals 100, max_old_reserve is limited only by xfer_limit.
+  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
+    old_available + xfer_limit: (young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent);


I guess I don't understand two things here:

Why do we special-case ShenandoahOldEvacRatioPercent == 100 here? When it's less than 100, we consider xfer_limit only in the deficit calculations below. Should we be adding xfer_limit to the result of the above calculation irrespective of the setting of ShenandoahOldEvacRatioPercent?

Where was this adjustment being made in the code before the changes of 8321939: [GenShen] ShenandoahOldEvacRatioPercent=100 fails with divide-by-zero #369 ?

We special case ShenandoahOldEvacRatioPercent==100 because the "other case" has divide by (100 - ShenandoahOldEvacRatioPercent), which becomes divide by zero.

To generalize the form of the other expression, if ShenandoahOldEvacRatioPercent is 100, then there is no bound on maximum_old_evacuation_reserve. Or in other words, the bound is infinity times maximum_young_evacuation_reserve.
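As an illustration of the special case described above, here is a minimal sketch of the ratio-based bound with a guard for the zero divisor. It is not the HotSpot code: the function name `ratio_bound_on_old_reserve` and the parameter `ratio_percent` (standing in for ShenandoahOldEvacRatioPercent) are hypothetical, and using the largest `size_t` value to represent "no bound" is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical standalone helper, not part of HotSpot.
// ratio_percent stands in for ShenandoahOldEvacRatioPercent (0..100).
// Returns the bound on the old reserve implied by the ratio alone.
inline std::size_t ratio_bound_on_old_reserve(std::size_t young_reserve,
                                              std::size_t ratio_percent) {
  assert(ratio_percent <= 100);
  if (ratio_percent == 100) {
    // The general formula would divide by (100 - 100) == 0.
    // Assumption for this sketch: represent "unbounded" as the max size_t.
    return std::numeric_limits<std::size_t>::max();
  }
  return (young_reserve * ratio_percent) / (100 - ratio_percent);
}
```

With a young reserve of 1000 and a ratio of 75, the bound is 1000 * 75 / 25 = 3000. At a ratio of 100 the sketch returns the "unbounded" sentinel instead of dividing by zero.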

In the original code, before the referenced change, if we can get past the divide-by-zero issue, we would find expansion of old to be limited by the xfer_limit at line 1265:
  if (old_region_deficit > max_old_region_xfer) {
    old_region_deficit = max_old_region_xfer;
  }

We still ultimately limit expansion by xfer_limit.

I may have misunderstood your questions. Please let me know if I missed the mark.
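The clamp on old_region_deficit quoted above can be sketched as a small standalone function. Only the final clamp comes from the thread. The way the deficit is derived here (shortfall of the reserve over old_available, rounded up to whole regions) and the names `clamped_old_region_deficit` and `region_size` are assumptions for illustration, not the actual HotSpot computation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical sketch, not the HotSpot implementation.
// Returns how many regions old-gen would receive from young.
inline std::size_t clamped_old_region_deficit(std::size_t old_reserve,
                                              std::size_t old_available,
                                              std::size_t region_size,
                                              std::size_t max_old_region_xfer) {
  assert(region_size > 0);
  if (old_reserve <= old_available) {
    return 0;  // old already has enough memory; nothing to transfer
  }
  // Assumed derivation: shortfall in bytes, rounded up to whole regions.
  const std::size_t shortfall = old_reserve - old_available;
  const std::size_t deficit =
      shortfall / region_size + (shortfall % region_size != 0 ? 1 : 0);
  // The clamp from the thread: never transfer more than young will give up.
  return std::min(deficit, max_old_region_xfer);
}
```

Under these assumptions, even an arbitrarily large reserve request results in at most max_old_region_xfer regions being moved, which is the sense in which expansion is still ultimately limited by xfer_limit.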

> We special case ShenandoahOldEvacRatioPercent==100 because the "other case" has divide by (100 - ShenandoahOldEvacRatioPercent), which becomes divide by zero.

Yes, that I realize. I was asking about the addition of xfer_limit in just this case and not otherwise.

> To generalize the form of the other expression, if ShenandoahOldEvacRatioPercent is 100, then there is no bound on maximum_old_evacuation_reserve. Or in other words, the bound is infinity times maximum_young_evacuation_reserve.

Correct. So I bounded it by max available. You corrected it to max_available + xfer_limit. It seems as if you want to bound everything by (max_available + xfer_limit).

> In the original code, before the referenced change, if we can get past the divide-by-zero issue, we would find expansion of old to be limited by the xfer_limit at line 1265:
>   if (old_region_deficit > max_old_region_xfer) {
>     old_region_deficit = max_old_region_xfer;
>   }

That's still the case with old_region_deficit without your current change.

> We still ultimately limit expansion by xfer_limit.

I think that happened before as well, except that now, because of your change, we treat SOERP=100 specially (but nothing else).

> I may have misunderstood your questions. Please let me know if I missed the mark.

What I am suggesting is that where we used to do:

  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
    old_available :
    MIN2((young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent), old_available);

instead of doing what you suggest above, viz.:

  // In the case that ShenandoahOldEvacRatioPercent equals 100, max_old_reserve is limited only by xfer_limit.
  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
    old_available + xfer_limit: (young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent);

that we do:

  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
    (old_available + xfer_limit) :
    MIN2((young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent), old_available + xfer_limit);

Effectively, you are using old_available + xfer_limit for what we can ever have for the maximum size of old_reserve. Otherwise, for suitably large values of ShenandoahOldEvacRatioPercent, you'll use a larger value of max_old_reserve than you have available even after using the transfer from young.

I guess I am not understanding enough of the subsequent bounds; I am just looking at the equivalence of the old code (before my change), the current code (after my previous change), and your proposed change which basically appears to say that we must augment whatever is available in old with whatever young is willing to transfer to old. That should happen irrespective of what the combination of young_reserve and SOERP happens to be, not just special casing the extremal case that the previous fix handled. (Think about what happens in the usual cases where this value is left at the default: your proposed change would have no effect as far as I can see.)
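To make the comparison concrete, the three variants of max_old_reserve quoted above can be written side by side as plain functions. This is a sketch for illustration only: the function names are hypothetical, `ratio_percent` stands in for ShenandoahOldEvacRatioPercent, `std::min` replaces HotSpot's MIN2, and the sample values in the note below are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Variant in place after #369: always bounded by old_available.
inline std::size_t reserve_after_369(std::size_t young_reserve,
                                     std::size_t old_available,
                                     std::size_t ratio_percent) {
  assert(ratio_percent <= 100);
  if (ratio_percent == 100) return old_available;
  return std::min((young_reserve * ratio_percent) / (100 - ratio_percent),
                  old_available);
}

// First revision of this PR: bounded only in the ratio == 100 case.
inline std::size_t reserve_initial_pr(std::size_t young_reserve,
                                      std::size_t old_available,
                                      std::size_t xfer_limit,
                                      std::size_t ratio_percent) {
  assert(ratio_percent <= 100);
  if (ratio_percent == 100) return old_available + xfer_limit;
  return (young_reserve * ratio_percent) / (100 - ratio_percent);
}

// Reviewer's suggestion: always bounded by old_available + xfer_limit.
inline std::size_t reserve_reviewer_suggestion(std::size_t young_reserve,
                                               std::size_t old_available,
                                               std::size_t xfer_limit,
                                               std::size_t ratio_percent) {
  assert(ratio_percent <= 100);
  const std::size_t cap = old_available + xfer_limit;
  if (ratio_percent == 100) return cap;
  return std::min((young_reserve * ratio_percent) / (100 - ratio_percent), cap);
}
```

With made-up values young_reserve = 100, old_available = 50, xfer_limit = 30 and a ratio of 75, the ratio formula alone gives 300. The three variants then yield 50, 300 and 80 respectively, so only the first revision of the PR asks for more than old could hold even after the transfer from young. At a ratio of 25 all three yield 33.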

With this PR, I was attempting to restore the "normal-case behavior" (when ShenandoahOldEvacRatioPercent != 100) to how it behaved before #369.

Before that change, this line of code did not impose any restriction on the size of old_evacuation_reserve based on old_available:

size_t maximum_old_evacuation_reserve = maximum_young_evacuation_reserve * ShenandoahOldEvacRatioPercent / (100 - ShenandoahOldEvacRatioPercent);

For this new code, I invented an "artificial limit" to replace "infinity" in the case that ShenandoahOldEvacRatioPercent equals 100.

Having studied this issue in the current implementation, I am inclined to pursue an even more aggressive change in a distinct PR, which allows OLD to grow not only by stealing memory from the Mutator's excesses, but also by borrowing from the Young Collector's reserves. So I'd prefer not to place more restrictions on the allowed growth of old at this line of code.

If you feel more comfortable, I can put the MIN2 expression into the normal case handling. But I'll be wanting to take it out in the upcoming complementary PR.

Context: xfer_limit will be zero if there is no "planned" GC idle time. The allocation runway goes to zero if we have experienced recent degens and/or full GCs, because penalties accumulate which cause us to immediately trigger young GCs.

I am beginning to better understand what you were trying to achieve, but I am still not quite there.

Is there a natural sensible limit at which max_old_reserve can be bounded? It would seem then that, since you were not previously bounding the computation of max_old_reserve in any manner and you don't want to bound it to old_available + xfer_limit, that a more natural and essentially largest possible value would be the sum of what young can promote and what old can evacuate, which would look something like heap->max_capacity(), since it would effectively be morally equivalent to imposing no limits on max_old_reserve.

Alternatively, if you are considering changing this whole thing anyway, perhaps we just do that directly. If you expect that PR to take a while and you just want to restore old behaviour, I'd suggest bounding the calculation of max_old_reserve to heap->max_capacity(), since that is a natural limit irrespective of what SOERP happens to be (and not artificial and confusing like the one that you suggested covering that one case of SOERP sending the value to NaN but not otherwise bounding it and allowing it to grow arbitrarily large and wrapping around).

In other words, I suggest using:

const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ? heap->max_capacity() : MIN2((young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent), heap->max_capacity());

Let me know if that makes sense.
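The heap->max_capacity() bound suggested above can be sketched as a standalone function. This is an illustration under stated assumptions rather than the committed code: the function name is hypothetical, `heap_max_capacity` stands in for heap->max_capacity(), `ratio_percent` stands in for ShenandoahOldEvacRatioPercent, and `std::min` replaces MIN2.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the reviewer's proposal, not the HotSpot code.
inline std::size_t max_old_reserve_capped_by_heap(std::size_t young_reserve,
                                                  std::size_t ratio_percent,
                                                  std::size_t heap_max_capacity) {
  assert(ratio_percent <= 100);
  if (ratio_percent == 100) {
    // No ratio-based bound exists, so fall back to the natural limit.
    return heap_max_capacity;
  }
  // The same natural limit applies to every other ratio as well.
  return std::min((young_reserve * ratio_percent) / (100 - ratio_percent),
                  heap_max_capacity);
}
```

The point of this shape is that the cap no longer depends on which branch is taken: a ratio of 99 with a large young reserve is limited by the same value as a ratio of 100.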

> ... covering that one case of SOERP sending the value to NaN but not otherwise bounding it and allowing it to grow arbitrarily large and wrapping around).

I realize that there is in fact a natural bound to that value when SOERP < 100, viz. when it's 99 (since it's not a float): young_reserve * 99/(100-99), i.e. 99 * young_reserve. I guess the simpler thing to do then is to just avoid this completely and declare ShenandoahOldEvacRatioPercent to be range(1,99) and be done, throwing away the protection for the lone case of SOERP=100 -- after all we don't allow SOERP=0, so by symmetry it looks like we shouldn't allow 100 either, just range(1,99).
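The natural bound described above can be checked with a short sketch that evaluates the integer formula over the proposed range(1,99). The function names are hypothetical and `ratio_percent` stands in for ShenandoahOldEvacRatioPercent; the sketch assumes young_reserve is small enough that young_reserve * 99 does not overflow `size_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Ratio-based bound for ratio_percent in 1..99 (divisor is never zero).
inline std::size_t ratio_bound(std::size_t young_reserve,
                               std::size_t ratio_percent) {
  assert(ratio_percent >= 1 && ratio_percent <= 99);
  return (young_reserve * ratio_percent) / (100 - ratio_percent);
}

// Largest value the formula can take over the whole range 1..99.
inline std::size_t largest_bound_in_range_1_99(std::size_t young_reserve) {
  std::size_t largest = 0;
  for (std::size_t r = 1; r <= 99; ++r) {
    largest = std::max(largest, ratio_bound(young_reserve, r));
  }
  return largest;
}
```

Because r / (100 - r) increases with r, the maximum is reached at r = 99, where the divisor is 1 and the bound equals 99 * young_reserve.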

Thanks for these suggestions. I'll see if I can stabilize a solution that works for all cases.

I've committed a new revision of this code. Does this make it more clear?

(I'll still keep the sharing of Collector reserve code in a different PR. That's a bit more subtle, and my first attempt at that code has introduced some regressions, which I'm debugging.)

ysramakrishna

LGTM!

openjdk · 2024-02-22T23:47:31Z

@kdnilsen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8325670: GenShen: Allow old to expand at end of each GC

Reviewed-by: ysr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 61 new commits pushed to the master branch:

c7bcd74: Merge
8cb9b47: 8321282: RISC-V: SpinPause() not implemented
1aae980: 8323994: gtest runner repeats test name for every single gtest assertion
810daf8: 8325910: Rename jnihelper.h
22e8181: 8325682: Rename nsk_strace.h
b823fa4: 8325574: Shenandoah: Simplify and enhance reporting of requested GCs
09d4936: 8252136: Several methods in hotspot are missing "static"
f6e2851: 8316340: (bf) Missing {@inheritdoc} for exception in MappedByteBuffer::compact
53878ee: 8325643: G1: Refactor G1FlushHumongousCandidateRemSets
130f429: 8325403: Add SystemGC JMH benchmarks
... and 51 more: https://git.openjdk.org/shenandoah/compare/8f4e6e226de7cb08f60bfd8dbbede466463d5b9d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

kdnilsen · 2024-02-26T19:11:55Z

/integrate

openjdk · 2024-02-26T19:12:32Z

Going to push as commit 2f1fa6d.
Since your change was applied there have been 61 commits pushed to the master branch:

c7bcd74: Merge
8cb9b47: 8321282: RISC-V: SpinPause() not implemented
1aae980: 8323994: gtest runner repeats test name for every single gtest assertion
810daf8: 8325910: Rename jnihelper.h
22e8181: 8325682: Rename nsk_strace.h
b823fa4: 8325574: Shenandoah: Simplify and enhance reporting of requested GCs
09d4936: 8252136: Several methods in hotspot are missing "static"
f6e2851: 8316340: (bf) Missing {@inheritdoc} for exception in MappedByteBuffer::compact
53878ee: 8325643: G1: Refactor G1FlushHumongousCandidateRemSets
130f429: 8325403: Add SystemGC JMH benchmarks
... and 51 more: https://git.openjdk.org/shenandoah/compare/8f4e6e226de7cb08f60bfd8dbbede466463d5b9d...master

Your commit was automatically rebased without conflicts.

openjdk · 2024-02-26T19:12:40Z

@kdnilsen Pushed as commit 2f1fa6d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Allow old-gen to expand when mutator memory is available

f583f71

openjdk bot added the rfr label Feb 12, 2024

ysramakrishna reviewed Feb 12, 2024


Refine calculation of max_old_reserve


e0761e8

ysramakrishna approved these changes Feb 22, 2024


openjdk bot added the ready label Feb 22, 2024

openjdk bot added the integrated label Feb 26, 2024

openjdk bot closed this Feb 26, 2024

openjdk bot removed ready rfr labels Feb 26, 2024
