-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8346920: Serial: Support allocation in old generation before GC #22899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back ayang! A progress list of the required criteria for merging this PR into |
@albertnetymk This change is no longer ready for integration - check the PR body for details. |
@albertnetymk The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement.
// Note that only large objects get a shot at being | ||
// allocated in later generations. | ||
bool first_only = !should_try_older_generation_allocation(size); | ||
|
||
bool first_only = !should_try_older_generation_allocation(size) | ||
&& is_long_enough_from_prev_gc_pause_end(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the first_only
actually young_only
? It means that the allocation is only attempted in young generation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. I believe Serial used to support >2 generations, so some generic names are left as is. Fixing it requires updating in multiple places, such as the method name older_generation
, arg in attempt_allocation
, etc. How about we do that invasive but pure refactoring in its own PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, such refactoring should be finished in other issues.
|
||
Ticks prev_gc_pause_end; | ||
Tickspan gc_pause; | ||
if (full_gc_pause_end < young_gc_pause_end ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A unnecessary space after young_gc_pause_end
.
@@ -50,6 +50,7 @@ class STWGCTimer; | |||
|
|||
class DefNewGeneration: public Generation { | |||
friend class VMStructs; | |||
friend class SerialHeap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really know whether it is a good way. Maybe we can export a method DefNewGeneration::gc_timer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
As far as I understand, the test program expands that The problem to be solved in this case is how quickly the collector goes OOM; it also is related to the C2 compiler doing reallocation of C2 stack-allocated objects (when I ran the test with Current code, since the typical allocation here is small, prefers doing a GC pause to allocation in old gen because the allocation is small (that The reason why the application will fail faster is that it allows many more increments of that (It would be nice if the title and description of the CR somehow reflected this - it reads like there is no support of direct allocation in old gen for Serial GC which is in fact not the case). Incidentally, Parallel GC has the same issue, running hundreds of full gcs before terminating (I did not check how many it takes, and probably never also waited for Serial GC). As far as I understand the change, this change prevents "back-to-back" GCs, opting for allocating everything into the old generation (if not full) instead. I wonder if that has not negative impact (more full gcs) for applications that legitimately allocate lots of (short-living) objects as this will fill up the old generation (much) faster than before. Particularly applications that use relatively small heap for that. Did you run some other benchmarks with this change? (Maybe the smaller dacapo benchmarks at "reasonably" sized heaps will show something?) I'm actually a bit conflicted about this change. It seems to fix a (rare?) policy issue, but at the same time adds a new mechanic (use of Hth, |
Mailing list message from Kirk Pepperdine on hotspot-gc-dev: Hi, One of the more important goals when tuning generational collectors and especially when tuning the the serial and parallel is to minimize the number of transients that end up in tenured. With that in mind I?m nervous about allocating in tenured to avoid a collection cycle. First, the probability that these (prematurely promoted) allocations are transit is likely > 90%. More over, transients in tenured tend to create zombies. Zombies put further pressure on tenure. Each of these ?conditions? will increase the frequency of full collections and fulls are way more costly than young collections. IME, techniques that have been tried in the past to avoid collections end up being more expensive than simply triggering the collection when it is needed. Worst case, I?d allocate in the active survivor space as that should be about 50% empty by default. At least that space will be collected by the next young gen thus avoiding the problems that come with having transients in tenured. Kind regards, |
I would not say this patch prefers allocation-in-old-gen over running gc. The behavior is governed by
Tried dacapo bms but no perf diff using 2x-live-size heap sizes. (This patch mostly affects the back-to-back gc case, which should be rare in properly configured system/bms.)
Currently, there is no way to tune Serial based on app-vs-gc time, and this patch essentially adds that support. |
Mailing list message from Kirk Pepperdine on hotspot-gc-dev: Hi,
This is counter to my experience. If young gen is properly tuned, GC overhead is generally minimalized. IME, this is the third most optimal way to control GC overhead. Using GCTimeRatio is way down on that list given the current implementations. Kind regards, |
That approach is inflexible and requires trial and error, as it depends on knowing the allocation rate. The GC percentage (
Serial doesn't support |
Superseded by #23270, where it doesn't use |
This PR introduces a new strategy to determine whether an allocation should be attempted in the old generation or if a GC cycle should be initiated, based on the
GCTimeRatio
. With this change, the benchmark attached to the ticket now completes in ~13 GC, a significant improvement compared to the >1000 GC observed previously.Test: tier1-3
Progress
Integration blocker
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22899/head:pull/22899
$ git checkout pull/22899
Update a local copy of the PR:
$ git checkout pull/22899
$ git pull https://git.openjdk.org/jdk.git pull/22899/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 22899
View PR using the GUI difftool:
$ git pr show -t 22899
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22899.diff
Using Webrev
Link to Webrev Comment