Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8325670: GenShen: Allow old to expand at end of each GC #394

Closed
wants to merge 2 commits into from

Conversation

kdnilsen
Copy link
Contributor

@kdnilsen kdnilsen commented Feb 12, 2024

At the end of GC, we set aside collector reserves to satisfy anticipated needs of the next GC.

This PR reverts a change that accidentally prevents old-gen from being enlarged by this action. The observed failure condition was that mixed evacuations were not able to be performed, because old-gen was not large enough to receive the results of the desired evacuations.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Issue

  • JDK-8325670: GenShen: Allow old to expand at end of each GC (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/shenandoah.git pull/394/head:pull/394
$ git checkout pull/394

Update a local copy of the PR:
$ git checkout pull/394
$ git pull https://git.openjdk.org/shenandoah.git pull/394/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 394

View PR using the GUI difftool:
$ git pr show -t 394

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/shenandoah/pull/394.diff

Webrev

Link to Webrev Comment

Sorry, something went wrong.

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 12, 2024

👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 12, 2024
@mlbridge
Copy link

mlbridge bot commented Feb 12, 2024

Webrevs

Comment on lines 1273 to 1275
// In the case that ShenandoahOldEvacRatioPercent equals 100, max_old_reserve is limited only by xfer_limit.
const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
old_available + xfer_limit: (young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't understand two things here:

  1. Why do we special-case ShenandoahOldEvacRationPercent == 100 here? When it's less that 100, we consider xfer_limit only in the deficit calculations below. Should we be adding xfer_limit to the result of the above calculation irrespective of the setting of ShenandoahOldEvacRationPercent ?
  2. Where was this adjustment being made in the code before the changes of 8321939: [GenShen] ShenandoahOldEvacRatioPercent=100 fails with divide-by-zero #369 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We special case ShenandoahOldEvacRatioPercent==100 because the "other case" has divide by (100 - ShenandoahOldEvacRatioPercent), which becomes divide by zero.

To generalize the form of the other expression, if ShenandoahOldEvacRatioPercent is 100, then there is no bound on maximum_old_evacuation_reserve. Or in other words, the bound is infinity times maximum_young_evacuation_reserve.

In the original code, before the referenced change, if we can get past the divide-by-zero issue, we would find expansion of old to be limited by the xfer_limit at line 1265:
if (old_region_deficit > max_old_region_xfer) {
old_region_deficit = max_old_region_xfer;
}

We still ultimately limit expansion by xfer_limit.

I may have misunderstood your questions. Please let me know if I missed the mark.

Copy link
Member

@ysramakrishna ysramakrishna Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We special case ShenandoahOldEvacRatioPercent==100 because the "other case" has divide by (100 - ShenandoahOldEvacRatioPercent), which becomes divide by zero.

Yes, that I realize. I was asking about the addition of xfer_limit in just this case and not otherwise.

To generalize the form of the other expression, if ShenandoahOldEvacRatioPercent is 100, then there is no bound on maximum_old_evacuation_reserve. Or in other words, the bound is infinity times maximum_young_evacuation_reserve.

Correct. So I bounded it by max available. You corrected it to max_available + xfer_limit. It seems as if you want to bound everything by (max_available + xfer_limit).

In the original code, before the referenced change, if we can get past the divide-by-zero issue, we would find expansion of old to be limited by the xfer_limit at line 1265: if (old_region_deficit > max_old_region_xfer) { old_region_deficit = max_old_region_xfer; }

That's still the case with old_region_deficit without your current change.

We still ultimately limit expansion by xfer_limit.

I think that happened before as well, except when now because of your change we treat SOERP=100 specially (but nothing else).

I may have misunderstood your questions. Please let me know if I missed the mark.

What I am suggesting is that where we used to do:

  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
     old_available : MIN2((young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent),
                          old_available);

instead of doing what you suggest above, viz.:

  // In the case that ShenandoahOldEvacRatioPercent equals 100, max_old_reserve is limited only by xfer_limit.
  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
    old_available + xfer_limit: (young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent);

that we do:

  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
     (old_available + xfer_limit) : MIN2((young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent),
                                         old_available + xfer_limit);

Effectively, you are using old_available + xfer_limit for what we can ever have for the maximum size of old_reserve. Otherwise, for suitably large values of ShenandoahOldEvacRatioPercent, you'll use a larger value of max_old_reserve than you have available even after using the transfer from young.

I guess I am not understanding enough of the subsequent bounds; I am just looking at the equivalence of the old code (before my change), the current code (after my previous change), and your proposed change which basically appears to say that we must augment whatever is availabe in old with whatever young is willing to transfer to old. That should happen irrespective of the what the combination of young_reserve and SOERP happens to be, not just special casing the extremal case that the previous fix handled. (Think about what happens in the usual cases where this value is left at the default: your proposed change would have no effect as far as I can see.)

Copy link
Contributor Author

@kdnilsen kdnilsen Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this PR, I was attempting to restore the "normal-case behavior" (when ShenandoahOldEvacRatioPercenet != 100) to how it behaved before #369

Before that change, this line of code did not impose any restriction on the size of old_evacuation_reserve based on old_available:

 size_t maximum_old_evacuation_reserve =  maximum_young_evacuation_reserve * ShenandoahOldEvacRatioPercent / (100 - ShenandoahOldEvacRatioPercent);

For this new code, I invented an "artificial limit" to replace "infiinity" in the case that ShenandoahOldEvacRatioPercent equals 100.

Having studied this issue in the current implementaiton, I am inclined to pursue an even more aggressive change in a distinct PR, which allows OLD to grow not only by stealing memory from the Mutator's excesses, but also by borrowing from the Young Collector's reserves. So I'd prefer not to place more restrictions on the allowed growth of old at this line of code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel more comfortable, I can put the MIN2 expression into the normal case handling. But I'll be wanting to take it out in the upcoming complementary PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context: xfer_limit will be zero if there is no "planned" GC idle time. The allocation runway goes to zero if we have experienced recent degens and/or full GCs, because penalties accumulate which cause us to immediately trigger young GCs.

Copy link
Member

@ysramakrishna ysramakrishna Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am beginning to better understand what you were trying to achieve, but I am still not quite there.

Is there a natural sensible limit at which max_old_reserve can be bounded? It would seem then that, since you were not previously bounding the computation of max_old_reserve in any manner and you don't want to bound it to old_available + xfer_limit, that a more natural and essentially largest possible value would be the sum of what young can promote and what old can evacuate, which would look something like heap->max_capacity(), since it would effectively be morally equivalent to imposing no limits on max_old_reserve.

Alternatively, if you are considering changing this whole thing anyway, perhaps we just do that directly. If you expect that PR to take a while and you just want to restore old behaviour, I'd suggest bounding the calculation of max_old_reserve to heap->max_capacity(), since that is a natural limit irrespective of what SOERP happens to be (and not artificial and confusing like the one that you suggested covering that one case of SOERP sending the value to NaN but not otherwise bounding it and allowing it to grow arbitrarily large and wrapping around).

In other words, I suggest using:

  const size_t max_old_reserve = (ShenandoahOldEvacRatioPercent == 100) ?
    heap->max_capacity() : MIN2((young_reserve * ShenandoahOldEvacRatioPercent) / (100 - ShenandoahOldEvacRatioPercent),
                                heap->max_capacity());

Let me know if that makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... covering that one case of SOERP sending the value to NaN but not otherwise bounding it and allowing it to grow arbitrarily large and wrapping around).

I realize that there is in fact a natural bound to that value when SOERP < 100, viz. when it's 99 (since it's not a float): young_reserve * 99/(100-99), i.e. 99 * young_reserve. I guess the simpler thing to do then is to just avoid this completely and declare ShenandoahEvactReserve to range(1,99) and be done, throwing awy the protection for the lone case of SOERP=100 -- after all we don't allow SOERP=0, so by symmetry it looks like we shouldn't allow 100 either, just range(1,99).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these suggestions. I'll see if I can stabilize a solution that works for all cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've committed a new revision of this code. Does this make it more clear?

(I'll still keep the sharing of Collector reserve code in a different PR. That's a bit more subtle, and my first attempt at that code has introduced some regressions, which I'm debugging.)

Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@openjdk
Copy link

openjdk bot commented Feb 22, 2024

@kdnilsen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8325670: GenShen: Allow old to expand at end of each GC

Reviewed-by: ysr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 61 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 22, 2024
@kdnilsen
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Feb 26, 2024

Going to push as commit 2f1fa6d.
Since your change was applied there have been 61 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 26, 2024
@openjdk openjdk bot closed this Feb 26, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 26, 2024
@openjdk
Copy link

openjdk bot commented Feb 26, 2024

@kdnilsen Pushed as commit 2f1fa6d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

None yet

2 participants