8346920: Serial: Support allocation in old generation before GC #22899

albertnetymk · 2025-01-02T06:30:00Z

This PR introduces a new strategy to determine whether an allocation should be attempted in the old generation or if a GC cycle should be initiated, based on the GCTimeRatio. With this change, the benchmark attached to the ticket now completes in ~13 GC, a significant improvement compared to the >1000 GC observed previously.

Test: tier1-3

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Integration blocker

⚠️ Title mismatch between PR and JBS for issue JDK-8346920

Issue

JDK-8346920: Serial: Support allocation in old generation when heap is almost full (Enhancement - P4) ⚠️ Title mismatch between PR and JBS.

Reviewers

Guoxiong Li (@lgxbslgx - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22899/head:pull/22899
$ git checkout pull/22899

Update a local copy of the PR:
$ git checkout pull/22899
$ git pull https://git.openjdk.org/jdk.git pull/22899/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22899

View PR using the GUI difftool:
$ git pr show -t 22899

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22899.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-01-02T06:30:57Z

👋 Welcome back ayang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-01-02T06:31:23Z

@albertnetymk This change is no longer ready for integration - check the PR body for details.

openjdk · 2025-01-02T06:31:56Z

@albertnetymk The following label will be automatically applied to this pull request:

hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2025-01-02T06:35:02Z

Webrevs

01: Full - Incremental (52aa4ce4)
00: Full (ee6300d7)

lgxbslgx

Nice improvement.

lgxbslgx · 2025-01-06T14:05:04Z

src/hotspot/share/gc/serial/serialHeap.cpp

-      // Note that only large objects get a shot at being
-      // allocated in later generations.
-      bool first_only = !should_try_older_generation_allocation(size);
+
+      bool first_only = !should_try_older_generation_allocation(size)
+                     && is_long_enough_from_prev_gc_pause_end();


Is the first_only actually young_only? It means that the allocation is only attempted in young generation?

That's true. I believe Serial used to support >2 generations, so some generic names are left as is. Fixing it requires updating in multiple places, such as the method name older_generation, arg in attempt_allocation, etc. How about we do that invasive but pure refactoring in its own PR?

Yes, such refactoring should be finished in other issues.

lgxbslgx · 2025-01-06T14:08:57Z

src/hotspot/share/gc/serial/serialHeap.cpp

+
+  Ticks prev_gc_pause_end;
+  Tickspan gc_pause;
+  if (full_gc_pause_end < young_gc_pause_end ) {


A unnecessary space after young_gc_pause_end.

lgxbslgx · 2025-01-06T14:13:44Z

src/hotspot/share/gc/serial/defNewGeneration.hpp

@@ -50,6 +50,7 @@ class STWGCTimer;

 class DefNewGeneration: public Generation {
  friend class VMStructs;
+  friend class SerialHeap;


Not really know whether it is a good way. Maybe we can export a method DefNewGeneration::gc_timer.

lgxbslgx

Looks good.

tschatzl · 2025-01-20T14:55:05Z

As far as I understand, the test program expands that ArrayList with new Short objects, allocating until OOM.

The problem to be solved in this case is how quickly the collector goes OOM; it also is related to the C2 compiler doing reallocation of C2 stack-allocated objects (when I ran the test with -XX:TieredStopAtLevel=3 Serial GC also quickly OOMEd reproducably).

Current code, since the typical allocation here is small, prefers doing a GC pause to allocation in old gen because the allocation is small (that Short object). The GC pause finds out that it should do a full gc only because its predictions indicate that the young gc is "unsafe" anyway (it's not "unsafe" as in not able to do it, but likely to end up in an evacuation failure).

The reason why the application will fail faster is that it allows many more increments of that list array without GC. Without the change, as the that array gets larger, the frequency of the GCs will get larger without noticable more promotion going on as the amount of possible reallocations of list per GC gets smaller. And it will do full gcs all the time because of the predictions indicating an unsafe young gc (which is correct as the list array will be > to-space quickly).

(It would be nice if the title and description of the CR somehow reflected this - it reads like there is no support of direct allocation in old gen for Serial GC which is in fact not the case).

Incidentally, Parallel GC has the same issue, running hundreds of full gcs before terminating (I did not check how many it takes, and probably never also waited for Serial GC).

As far as I understand the change, this change prevents "back-to-back" GCs, opting for allocating everything into the old generation (if not full) instead. I wonder if that has not negative impact (more full gcs) for applications that legitimately allocate lots of (short-living) objects as this will fill up the old generation (much) faster than before. Particularly applications that use relatively small heap for that.
It seems like this behavior may be advantageous for startup when supposedly initializing/loading data structures too (just fill up the whole heap and compact completely).

Did you run some other benchmarks with this change? (Maybe the smaller dacapo benchmarks at "reasonably" sized heaps will show something?)

I'm actually a bit conflicted about this change. It seems to fix a (rare?) policy issue, but at the same time adds a new mechanic (use of GCTimeRatio) for Serial GC.
So for now I would like to wait for measurements on potential impact on "more realistic" applications.

Hth,
Thomas

mlbridge · 2025-01-20T19:18:28Z

Mailing list message from Kirk Pepperdine on hotspot-gc-dev:

Hi,

One of the more important goals when tuning generational collectors and especially when tuning the the serial and parallel is to minimize the number of transients that end up in tenured. With that in mind I?m nervous about allocating in tenured to avoid a collection cycle. First, the probability that these (prematurely promoted) allocations are transit is likely > 90%. More over, transients in tenured tend to create zombies. Zombies put further pressure on tenure. Each of these ?conditions? will increase the frequency of full collections and fulls are way more costly than young collections.

IME, techniques that have been tried in the past to avoid collections end up being more expensive than simply triggering the collection when it is needed. Worst case, I?d allocate in the active survivor space as that should be about 50% empty by default. At least that space will be collected by the next young gen thus avoiding the problems that come with having transients in tenured.

Kind regards,
Kirk Pepperdine

albertnetymk · 2025-01-20T20:02:53Z

As far as I understand the change, this change prevents "back-to-back" GCs, opting for allocating everything into the old generation (if not full) instead.

I would not say this patch prefers allocation-in-old-gen over running gc. The behavior is governed by GCTimeRatio. One can get back the original behavior with a small GCTimeRatio.

Did you run some other benchmarks with this change? (Maybe the smaller dacapo benchmarks at "reasonably" sized heaps will show something?)

Tried dacapo bms but no perf diff using 2x-live-size heap sizes. (This patch mostly affects the back-to-back gc case, which should be rare in properly configured system/bms.)

It seems to fix a (rare?) policy issue, but at the same time adds a new mechanic (use of GCTimeRatio) for Serial GC.

Currently, there is no way to tune Serial based on app-vs-gc time, and this patch essentially adds that support.

mlbridge · 2025-01-20T20:27:28Z

Mailing list message from Kirk Pepperdine on hotspot-gc-dev:

Hi,

On Jan 20, 2025, at 9:05?PM, Albert Mingkun Yang <ayang at openjdk.org> wrote:

On Mon, 20 Jan 2025 14:55:05 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

As far as I understand the change, this change prevents "back-to-back" GCs, opting for allocating everything into the old generation (if not full) instead.

I would not say this patch prefers allocation-in-old-gen over running gc. The behavior is governed by `GCTimeRatio`. One can get back the original behavior with a small `GCTimeRatio`.

Did you run some other benchmarks with this change? (Maybe the smaller dacapo benchmarks at "reasonably" sized heaps will show something?)

Tried dacapo bms but no perf diff using 2x-live-size heap sizes. (This patch mostly affects the back-to-back gc case, which should be rare in properly configured system/bms.)

It seems to fix a (rare?) policy issue, but at the same time adds a new mechanic (use of GCTimeRatio) for Serial GC.

Currently, there is no way to tune Serial based on app-vs-gc time, and this patch essentially adds that support.

This is counter to my experience. If young gen is properly tuned, GC overhead is generally minimalized. IME, this is the third most optimal way to control GC overhead. Using GCTimeRatio is way down on that list given the current implementations.

Kind regards,
Kirk

albertnetymk · 2025-01-21T09:19:55Z

If young gen is properly tuned, GC overhead is generally minimalized.

That approach is inflexible and requires trial and error, as it depends on knowing the allocation rate. The GC percentage (GCTimeRatio) provides a more direct way to express that requirement or intention.

Using GCTimeRatio is way down on that list given the current implementations.

Serial doesn't support GCTimeRatio; this PR aims to add support for it in Serial. In the long run, as I envision it, once Serial implements auto-heap-resizing, MaxGCPauseMillis and GCTimeRatio will become the two primary flags for controlling GC.

albertnetymk · 2025-01-23T15:13:47Z

Superseded by #23270, where it doesn't use GCTimeRatio and affects only tight-heap configurations.

s1-gc-time-ratio

ee6300d

openjdk bot changed the title ~~8346920~~ 8346920: Serial: Support allocation in old generation before GC Jan 2, 2025

openjdk bot added the rfr label Jan 2, 2025

openjdk bot added the hotspot-gc label Jan 2, 2025

lgxbslgx suggested changes Jan 6, 2025

View reviewed changes

albertnetymk added 2 commits January 6, 2025 15:23

Merge branch 'master' into s1-gc-time-ratio

453dcc3

review

52aa4ce

lgxbslgx approved these changes Jan 7, 2025

View reviewed changes

openjdk bot added the ready label Jan 7, 2025

openjdk bot removed the ready label Jan 23, 2025

albertnetymk closed this Jan 23, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

8346920: Serial: Support allocation in old generation before GC #22899

8346920: Serial: Support allocation in old generation before GC #22899

albertnetymk commented Jan 2, 2025 •

edited by openjdk bot

Loading

bridgekeeper bot commented Jan 2, 2025

openjdk bot commented Jan 2, 2025 •

edited

Loading

openjdk bot commented Jan 2, 2025

mlbridge bot commented Jan 2, 2025 •

edited

Loading

lgxbslgx left a comment

lgxbslgx Jan 6, 2025

albertnetymk Jan 6, 2025

lgxbslgx Jan 7, 2025

lgxbslgx Jan 6, 2025

lgxbslgx Jan 6, 2025

lgxbslgx left a comment

tschatzl commented Jan 20, 2025

mlbridge bot commented Jan 20, 2025

albertnetymk commented Jan 20, 2025

mlbridge bot commented Jan 20, 2025

albertnetymk commented Jan 21, 2025

albertnetymk commented Jan 23, 2025

8346920: Serial: Support allocation in old generation before GC #22899

8346920: Serial: Support allocation in old generation before GC #22899

Conversation

albertnetymk commented Jan 2, 2025 • edited by openjdk bot Loading

Progress

Integration blocker

Issue

Reviewers

Reviewing

bridgekeeper bot commented Jan 2, 2025

openjdk bot commented Jan 2, 2025 • edited Loading

openjdk bot commented Jan 2, 2025

mlbridge bot commented Jan 2, 2025 • edited Loading

Webrevs

lgxbslgx left a comment

Choose a reason for hiding this comment

lgxbslgx Jan 6, 2025

Choose a reason for hiding this comment

albertnetymk Jan 6, 2025

Choose a reason for hiding this comment

lgxbslgx Jan 7, 2025

Choose a reason for hiding this comment

lgxbslgx Jan 6, 2025

Choose a reason for hiding this comment

lgxbslgx Jan 6, 2025

Choose a reason for hiding this comment

lgxbslgx left a comment

Choose a reason for hiding this comment

tschatzl commented Jan 20, 2025

mlbridge bot commented Jan 20, 2025

albertnetymk commented Jan 20, 2025

mlbridge bot commented Jan 20, 2025

albertnetymk commented Jan 21, 2025

albertnetymk commented Jan 23, 2025

albertnetymk commented Jan 2, 2025 •

edited by openjdk bot

Loading

openjdk bot commented Jan 2, 2025 •

edited

Loading

mlbridge bot commented Jan 2, 2025 •

edited

Loading