8312182: THPs cause huge RSS due to thread start timing issue #1679

Closed
wants to merge 1 commit into from

tstuefe
Member

@tstuefe tstuefe commented Aug 21, 2023

Unclean composite backport to jdk17u. Fixes JDK-8312182 - "THPs cause huge RSS due to thread start timing issue" (https://bugs.openjdk.org/browse/JDK-8312182)

Problem:

On a machine with transparent huge pages (THP) unconditionally enabled (/sys/kernel/mm/transparent_hugepage/enabled = "always"), the JVM may show a huge memory footprint (RSS) and degraded thread start performance.
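To check which mode a given machine is in: the kernel lists the available modes in that sysfs file and puts the active one in brackets (for example `always [madvise] never`). The following is a minimal sketch of reading it, not code from the patch; the function names are made up for illustration.

```python
from pathlib import Path

# sysfs file named in the problem description above
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs text, or None."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode; None if the file is missing or unreadable."""
    try:
        return parse_thp_mode(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```

A result of "always" means the machine is in the configuration this PR is about.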

The following factors make the problem more severe and more likely:

  • thread stack size of 2M (on arm64 or x64) or larger
  • many threads, or high thread creation churn
  • a slow or overloaded machine (since part of the problem is timing-dependent)

For a detailed discussion of the underlying problem, please see openjdk/jdk#14919.


In JDK head, the issue was fixed with a sequence of patches:

  • JDK-8303215 "Make thread stacks not use huge pages"
  • JDK-8312182 "THPs cause huge RSS due to thread start timing"

However, JDK-8312182 itself needed one preparatory fix:

  • JDK-8310233 "Fix THP detection on Linux"

and then we had several corner-case test problems which are fixed with:

  • JDK-8312394 "[linux] SIGSEGV if kernel was built without hugepage support"
  • JDK-8312620 "WSL Linux build crashes after JDK-8310233"
  • JDK-8314139 "TEST_BUG: runtime/os/THPsInThreadStackPreventionTest.java could fail on machine with large number of cores"

and finally, we decided to rename the switch that turns off the THP mitigation with a final patch:

  • JDK-8312585 "Rename DisableTHPStackMitigation flag to THPStackMitigation"

Instead of downporting these 7 patches verbatim, I prepared a composite patch containing only the necessary mitigation and mitigation tests.

This is similar to the jdk11u downport, but in jdk17u, JDK-8303215 had already been backported. Therefore there are some minor differences.

This patch does:

  • make sure that all thread stacks have at least one glibc guard page to prevent clustering of adjacent thread stacks into one VMA
  • change the default size of stacks to be not aligned to 2MB to prevent intra-stack THPs from forming
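The second point can be illustrated with a small sketch. It assumes a THP size of 2 MiB and a base page size of 4 KiB, and it assumes that "not aligned" is achieved by adding a single small page; the description above does not spell out the exact adjustment, so treat the function below as a hypothetical model rather than the patch's code.

```python
MIB = 1024 * 1024


def unaligned_stack_size(stack_size, thp_page_size=2 * MIB, page_size=4096):
    """Bump a stack size by one small page if it is a multiple of the THP size.

    Assumption for illustration: one extra small page is enough to break
    the alignment. A thp_page_size of 0 means "THP size unknown", in which
    case the size is returned unchanged.
    """
    if thp_page_size > 0 and stack_size % thp_page_size == 0:
        return stack_size + page_size
    return stack_size


if __name__ == "__main__":
    for size in (1 * MIB, 2 * MIB, 8 * MIB):
        print(size, "->", unaligned_stack_size(size))
```

With these assumptions, a 2 MiB or 8 MiB stack grows by 4 KiB, while a 1 MiB stack is left alone because it is not a multiple of the THP size to begin with.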

The patch needs some infrastructure, but I downported only the necessary parts: the helper class "HugePages", which is used in head to scan the operating system for information about THP settings. I only included the parts to do with THPs and left the rest out.
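As a rough idea of the kind of information such a helper gathers, the sketch below reads the THP page size from sysfs. It assumes the kernel exposes the `hpage_pmd_size` file under `/sys/kernel/mm/transparent_hugepage`, and it returns 0 when that file is missing, as can happen on a kernel built without hugepage support. The function name is hypothetical and this is not the HugePages class itself.

```python
from pathlib import Path

THP_SYSFS_ROOT = Path("/sys/kernel/mm/transparent_hugepage")


def read_thp_page_size(root=THP_SYSFS_ROOT):
    """Return the THP page size in bytes, or 0 if it cannot be determined."""
    try:
        return int((Path(root) / "hpage_pmd_size").read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a number: treat THP size as unknown.
        return 0


if __name__ == "__main__":
    print(read_thp_page_size())
```

Taking the directory as a parameter keeps the function easy to exercise against a fake sysfs tree in a test.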

The patch also includes a regression test.


Testing:

I manually tested the JVM on Linux x64 with THP=always:

Without the patch (-Xmx1g -Xms1g -XX:+AlwaysPreTouch -Xss2m, 10000 threads started), I see slow thread startup and 11 GB - 14 GB of RSS.

The patched version comes up a lot faster and only shows 1.3 GB of RSS.
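The description does not say how RSS was measured. One common way on Linux is the VmRSS line in /proc/<pid>/status; the sketch below assumes that approach, and its function names are illustrative.

```python
def parse_vm_rss_kib(status_text):
    """Extract VmRSS (reported by the kernel in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def rss_kib(pid="self"):
    """Read the current RSS of a process by pid; defaults to this process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_rss_kib(f.read())


if __name__ == "__main__":
    print(rss_kib(), "kB")
```

Calling rss_kib with the JVM's pid before and after the threads are started gives a comparable before/after figure.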


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issues

  • JDK-8312182: THPs cause huge RSS due to thread start timing issue (Bug - P3)
  • JDK-8310687: JDK-8303215 is incomplete (Bug - P4)
  • JDK-8310233: Fix THP detection on Linux (Bug - P4)
  • JDK-8312394: [linux] SIGSEGV if kernel was built without hugepage support (Bug - P3)
  • JDK-8312620: WSL Linux build crashes after JDK-8310233 (Bug - P3)
  • JDK-8314139: TEST_BUG: runtime/os/THPsInThreadStackPreventionTest.java could fail on machine with large number of cores (Bug - P4)
  • JDK-8312585: Rename DisableTHPStackMitigation flag to THPStackMitigation (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk17u-dev.git pull/1679/head:pull/1679
$ git checkout pull/1679

Update a local copy of the PR:
$ git checkout pull/1679
$ git pull https://git.openjdk.org/jdk17u-dev.git pull/1679/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1679

View PR using the GUI difftool:
$ git pr show -t 1679

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk17u-dev/pull/1679.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Aug 21, 2023

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title from "Backport 84b325b844c08809448a9c073a11443d9e3c3f8e" to "8312182: THPs cause huge RSS due to thread start timing issue" Aug 21, 2023
@openjdk

openjdk bot commented Aug 21, 2023

This backport pull request has now been updated with issues from the original commit.

@openjdk openjdk bot added the backport label Aug 21, 2023
@tstuefe tstuefe marked this pull request as ready for review August 21, 2023 16:05
@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 21, 2023
@mlbridge

mlbridge bot commented Aug 21, 2023

Webrevs

@shipilev
Member

Process-wise and for the historical record, if you do a composite patch out of several issues, you need to at least mark those issues as fixed by this PR. Tell the bot like this: /issue add 8303215,8312182,...

@tstuefe
Member Author

tstuefe commented Aug 23, 2023

/issue add 8310233 8312394 8312620 8314139 8312585

@openjdk

openjdk bot commented Aug 23, 2023

@tstuefe
Adding additional issue to issue list: 8310233: Fix THP detection on Linux.

Adding additional issue to issue list: 8312394: [linux] SIGSEGV if kernel was built without hugepage support.

Adding additional issue to issue list: 8312620: WSL Linux build crashes after JDK-8310233.

Adding additional issue to issue list: 8314139: TEST_BUG: runtime/os/THPsInThreadStackPreventionTest.java could fail on machine with large number of cores.

Adding additional issue to issue list: 8312585: Rename DisableTHPStackMitigation flag to THPStackMitigation.

@tstuefe
Member Author

tstuefe commented Aug 24, 2023

Friendly ping. It would be good to get this fixed in time for the next CPU.

@shipilev
Member

shipilev commented Aug 24, 2023

> Friendly ping. It would be good to get this fixed in time for the next CPU.

I have concerns about this, for several reasons:

  1. The fix has a considerable bugtail in the product code, which does not inspire confidence there would not be more soon;
  2. The fix is enabled by default, which means any bugs in it would be exposed to users immediately;
  3. The review is hard, because it cobbles 7 commits together, so it would take a while to disentangle if any mistakes crept in (note: the usual way to deal with this problem is to propose the PR with a commit per issue, so we can review individual commits);

This also does not look like a recent regression, but rather a long-standing bug, right?
So, is there a reason to rush it for 17.0.9 in October?

@tstuefe
Member Author

tstuefe commented Aug 24, 2023

@shipilev Thanks for looking at this. I debated with myself for a long time whether this was the right approach. I did not choose to build a composite patch out of laziness (if anything, downporting the issues separately and verbatim would have been much simpler, albeit slower).

By providing a minimal patch (not cobbled together but carefully selected) I minimize the problem surface, because I leave out code that has nothing to do with the goal of this patch (the static hugepage detection of JDK-8310233 "Fix THP detection on Linux").

> The fix has a considerable bugtail in the product code, which does not inspire confidence there will not be more soon;

We have two trailing bugs in product code:
a) JDK-8312394 "[linux] SIGSEGV if kernel was built without hugepage support"
b) JDK-8312620 "WSL Linux build crashes after JDK-8310233"

(a) JDK-8312394 is of no concern since it only affects code I explicitly left out from the patch (it does not affect THPs). It would be a concern were I to downport patches individually and verbatim.

That leaves (b), which is a bug in a super obscure context (building on Windows with WSL that carries an arguably broken Linux kernel). One bug is not a very long tail.

> The fix is enabled by default, which means any bugs in it would be exposed to users immediately;

I think this patch is just not that risky.

> The review is hard, because it cobbles 7 commits together, so it would take a while to disentangle if any mistakes crept in (note: the usual way to deal with this problem is to propose the PR with a commit per issue, so we can review individual commits);

But that process would carry more risk since I would have to downport unnecessary parts and have time windows with unfixed bugs.

This also shows a shortcoming of the review process. If I downport stuff verbatim, reviews are simple since they are mostly mechanical mental diffing; but the result is not the ideal patch, which would be small and confined.

> This also does not look like a recent regression, but rather a long-standing bug, right? So, is there a reason to rush it for 17.0.9 in October?

We have customers who run with THP always enabled and are unwilling or unable to change that; they would be happy about a fix.

I'm fine with postponing this patch (not that I have any choice since I lack reviews and the window is almost closed). But the whole discussion leaves me dissatisfied with the practice of downporting whole patch trees to get a single issue fixed. We recently had similar discussions when downporting openjdk/jdk11u-dev#2035 which ended up a far bigger change than necessary, carrying a lot of code for the sole purpose of keeping a low delta to upstream.

@tstuefe
Member Author

tstuefe commented Aug 25, 2023

okay, I'll withdraw. Let's do it piece by piece as usual.

@tstuefe tstuefe closed this Aug 25, 2023
@shipilev
Member

> okay, I'll withdraw. Let's do it piece by piece as usual.

Thanks!

Yes, piece by piece would be the right approach here. I don't actually mind clustering several changes into one PR, as long as PR commits tell the story well: what was picked, in what order, and what changes were done along the way.

I also don't mind bringing in safe-ish improvements to the common code if it resolves significant deviation from mainline. A palpable number of deviations we did over the years bit us in the back at unfortunate times...

@tstuefe tstuefe deleted the tstuefe-backport-84b325b8 branch August 25, 2023 11:48
Labels: backport, rfr
2 participants