Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8305898: Alternative self-forwarding mechanism #17755

Closed
wants to merge 5 commits into from

Conversation

rkennke
Copy link
Contributor

@rkennke rkennke commented Feb 7, 2024

Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers to indicate promotion failure. This is problematic for compact object headers (JDK-8294992) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the currently unused 3rd header bit (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored.

This is a trimmed-down/simplified version of the original proposal #13779:

  • It doesn't use/introduce any flags and avoids the associated branching.
  • It doesn't (need to) deal with displaced headers. (Current code would preserve header if necessary, Lilliput code would not use displaced headers and set the 3rd bit directly in existing header.)

Testing:

  • hotspot_gc
  • tier1
  • tier2

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Integration blocker

 ⚠️ Whitespace errors (failed with updated jcheck configuration in pull request)

Issue

  • JDK-8305898: Alternative self-forwarding mechanism (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17755/head:pull/17755
$ git checkout pull/17755

Update a local copy of the PR:
$ git checkout pull/17755
$ git pull https://git.openjdk.org/jdk.git pull/17755/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17755

View PR using the GUI difftool:
$ git pr show -t 17755

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17755.diff

Webrev

Link to Webrev Comment

Sorry, something went wrong.

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 7, 2024

👋 Welcome back rkennke! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 7, 2024
@openjdk
Copy link

openjdk bot commented Feb 7, 2024

@rkennke The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Feb 7, 2024
@mlbridge
Copy link

mlbridge bot commented Feb 7, 2024

Webrevs

@rkennke
Copy link
Contributor Author

rkennke commented Feb 7, 2024

/label add hotspot-gc

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Feb 7, 2024
@openjdk
Copy link

openjdk bot commented Feb 7, 2024

@rkennke
The hotspot-gc label was successfully added.

Copy link
Member

@stefank stefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a branch to oopDesc::forwardee. Do you have an GC time measurements to show that this isn't causing a regression?

@rkennke
Copy link
Contributor Author

rkennke commented Feb 9, 2024

This adds a branch to oopDesc::forwardee. Do you have an GC time measurements to show that this isn't causing a regression?

I just ran the (young-GC-heavy) derby benchmark, with more iterations, which does about 1600 GC cycles. Those are the average GC times of young GCs, running with G1:
baseline: 2.88624ms
selffwd: 2.88000ms

With Parallel GC, I get:
baseline: 2.80709
selffwd: 2.74724

When repeating the experiment I get similar results, sometimes baseline is slightliy better, sometimes selffwd is slightly better, but I don't see anything that would support a 'regression' with selffwd.

// Align at least to 8 bytes, otherwise the lowest address bit
// could be interpreted as 'self-forwarded' when encoded as
// forwardee in the mark-word.
alignas(BytesPerLong) oopDesc _oop;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why this is required? (I'd expect oopDesc to be properly aligned already, since it's >= 8 bytes.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a problem with x86_32 builds, where oopDesc is only 4 byte. It is only a problem in this test, in real world, objects in the heap are always at least 8-byte-aligned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even on 32bits system, shouldn't _mark = 4 bytes and _metadata = 4 bytes, so the total size of oopDesc is 8 bytes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment of a struct must just ensure that the alignments requirement of its fields are satisfied (in the general case) So a struct of two 4 byte ints will have an alignof 4 bytes and a sizeof 8 bytes

#include <cstdint>
struct S {
    uint32_t a;
    uint32_t b;
};

static_assert(alignof(S) == 4);
static_assert(sizeof(S) == 8);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation; I mistakenly thought a struct is aligned to its size...

Copy link
Contributor

@tschatzl tschatzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please close out PR#13779.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@rkennke This change is no longer ready for integration - check the PR body for details.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 15, 2024
@rkennke
Copy link
Contributor Author

rkennke commented Feb 26, 2024

A potential improvement to this might be that we could avoid preserving the header around self-forwarding/promo-failure (and therefore remove a bunch of code/complexity). What's needed for this is a fast way to detect is_forwarded() that includes both scenarios normal-forwarded and self-forwarded.

Currently I also set normal-forwarded bits when self-forwarding to facilitate simple (v & 3) == 3 test. We would need something like ((v & 4) != 0) || ((v & 3) == 3) test, but ideally without the branch. We could test ((v & 7) > 2. That is same number of ops as the current test, so wouldn't add any overhead.

Then, instead of init-ing the self-forwarded header, and restoring any 'interesting' marks (hashed, locked), we could simply flip-back the self-fwd bit and that'd be it, and would not need any header-preservation in the promo-failure path.

  1. Do we want this?
  2. Do we want this in this PR, or better in a follow-up PR (including the removal of header-preservation around promo-failures)?

@albertnetymk
Copy link
Member

I'm leaning towards doing that in a follow-up PR.

@tschatzl
Copy link
Contributor

I'm leaning towards doing that in a follow-up PR.

+1

@rkennke
Copy link
Contributor Author

rkennke commented Feb 27, 2024

I'm leaning towards doing that in a follow-up PR.

Ok. But maybe this would be an incentive to do this change ahead of the Lilliput main JEP, in JDK23? This PR, as it is proposed now doesn't provide much value in itself, but with the proposed follow-up, it does improve/simplify promotion-failure handling, and restricts usage of the PreservedMarks stuff to the full-GCs only.

@tschatzl
Copy link
Contributor

We often do preparatory changes separately from the main improvement for several reasons, one of them to reduce complexity of the follow-up.

@openjdk
Copy link

openjdk bot commented Apr 10, 2024

@rkennke this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8305898-v2
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 10, 2024
@rkennke
Copy link
Contributor Author

rkennke commented Apr 29, 2024

Will re-open when ready in Lilliput

@rkennke rkennke closed this Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org merge-conflict Pull request has merge conflict with target branch
Development

Successfully merging this pull request may close these issues.

None yet

5 participants