Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8324241: Always record evol_method deps to avoid excessive method flushing #17509

Closed
wants to merge 5 commits into from

Conversation

simonis
Copy link
Member

@simonis simonis commented Jan 20, 2024

Currently we don't record dependencies on redefined methods (i.e. evol_method dependencies) in JIT compiled methods if none of the can_redefine_classes, can_retransform_classes or can_generate_breakpoint_events JVMTI capabalities is set. This means that if a JVMTI agent which requests one of these capabilities is dynamically attached, all the methods which have been JIT compiled until that point, will be marked for deoptimization and flushed from the code cache. For large, warmed-up applications this mean deoptimization and instant recompilation of thousands if not then-thousands of methods, which can lead to dramatic performance/latency drop-downs for several minutes.

One could argue that dynamic agent attach is now deprecated anyway (see JEP 451: Prepare to Disallow the Dynamic Loading of Agents) and this problem could be solved by making the recording of evol_method dependencies dependent on the new -XX:+EnableDynamicAgentLoading flag isntead of the concrete JVMTI capabilities (because the presence of the flag indicates that an agent will be loaded eventually).

But there a single, however important exception to this rule and that's JFR. JFR is advertised as low overhead profiler which can be enabled in production at any time. However, when JFR is started dynamically (e.g. through JCMD or JMX) it will silently load a HotSpot internl JVMTI agent which requests the can_retransform_classes and retransforms some classes. This will inevitably trigger the deoptimization of all compiled methods as described above.

I'd therefor like to propose to always and unconditionally record evol_method dependencies in JIT compiled code by exporting the relevant properties right at startup in init_globals():

 jint init_globals() {
   management_init();
   JvmtiExport::initialize_oop_storage();
+#if INCLUDE_JVMTI
+  JvmtiExport::set_can_hotswap_or_post_breakpoint(true);
+  JvmtiExport::set_all_dependencies_are_recorded(true);
+#endif

My measurements indicate that the overhead of doing so is minimal (around 1% increase of nmethod size) and justifies the benefit. E.g. a Spring Petclinic application started with -Xbatch -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation compiles about ~11500 methods (~9000 with C1 and ~2500 with C2) resulting in an aggregated nmethod size of around ~40bm. Additionally recording evol_method dependencies only increases this size be about 400kb. The ration remains about the same if we run the same experiment with only C1 or only C2.

Instead of unconditionally recording evol_method dependencies we could guard the recording by a new flag. But this would only make sense if that flag would be on by default and I don't know if such a flag is justified just for the rare (or non-existent?) cases where somebody wants to disable the recording of the dependencies.

Another alternative would be to remove the set_can_hotswap_or_post_breakpoint()/set_all_dependencies_are_recorded() from JvmtiExport altogether (and always record the dependencies). It is currently only used at a few call sites in the C1/C2 compiler (set_can_hotswap_or_post_breakpoint()) and once when redefining classes (set_all_dependencies_are_recorded()).


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8324241: Always record evol_method deps to avoid excessive method flushing (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17509/head:pull/17509
$ git checkout pull/17509

Update a local copy of the PR:
$ git checkout pull/17509
$ git pull https://git.openjdk.org/jdk.git pull/17509/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17509

View PR using the GUI difftool:
$ git pr show -t 17509

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17509.diff

Webrev

Link to Webrev Comment

Sorry, something went wrong.

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 20, 2024

👋 Welcome back simonis! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 20, 2024
@openjdk
Copy link

openjdk bot commented Jan 20, 2024

@simonis The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jan 20, 2024
@mlbridge
Copy link

mlbridge bot commented Jan 20, 2024

Webrevs

@coleenp
Copy link
Contributor

coleenp commented Jan 21, 2024

/label add serviceability
/label add hotspot-compiler

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Jan 21, 2024
@openjdk
Copy link

openjdk bot commented Jan 21, 2024

@coleenp
The serviceability label was successfully added.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jan 21, 2024
@openjdk
Copy link

openjdk bot commented Jan 21, 2024

@coleenp
The hotspot-compiler label was successfully added.

Copy link
Member

@eastig eastig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@openjdk
Copy link

openjdk bot commented Jan 22, 2024

@simonis This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8324241: Always record evol_method deps to avoid excessive method flushing

Reviewed-by: eastigeevich, phh, coleenp, dlong, shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 72 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@shipilev
Copy link
Member

shipilev commented Jan 22, 2024

Instead of unconditionally recording evol_method dependencies we could guard the recording by a new flag. But this would only make sense if that flag would be on by default and I don't know if such a flag is justified just for the rare (or non-existent?) cases where somebody wants to disable the recording of the dependencies.

I think introducing a diagnostic flag is sensible here. If we figure out much later that this solution comes with some other (worse) problems, the diagnostic flag gives us the options to: a) clearly point at this addition as the culprit; b) have the easily deployable solution to restore the original behavior.

For the change itself, we need to amend the comment near VM_RedefineClasses::flush_dependent_code definition that talks about this peculiar behavior, which now changes. Actually, maybe even the implementation of flush_dependent_code should now trust (and assert) that all dependencies are now recorded?

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 22, 2024
@eastig
Copy link
Member

eastig commented Jan 22, 2024

@shipilev I filed https://bugs.openjdk.org/browse/JDK-8324318 which is related to this change.

@coleenp
Copy link
Contributor

coleenp commented Jan 22, 2024

I support adding this as a diagnostic option, therefore you won't have to make the change in VM_RedefineClasses::flush_dependent_code.

@simonis
Copy link
Member Author

simonis commented Jan 22, 2024

Thanks everybody for looking at this. I've now guarded the feature with a diagnostic command, updated the source code comments around VM_RedefineClasses::flush_dependent_code and added an assertion for the new flag.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

//

void VM_RedefineClasses::flush_dependent_code() {
assert(SafepointSynchronize::is_at_safepoint(), "sanity check");
assert(AlwaysRecordEvolDependencies ? JvmtiExport::all_dependencies_are_recorded() : true, "sanity check");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just "assert all dependencies are recorded, unless we specifically requested not to do so", right?

assert(JvmtiExport::all_dependencies_are_recorded() || !AlwaysRecordEvolDependencies, "sanity check");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's true (I must confess I had to use a truth table to verify it :) I'll take your version tough, because I thinks it's simpler to understand. Do you have other assertions in mind (also see my answer to Dean above)?

Copy link
Member

@eastig eastig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@iwanowww
Copy link
Contributor

iwanowww commented Jan 23, 2024

I support keeping the logic under a flag. I have some concerns about unconditionally turning it on.

I expect significantly higher footprint overhead when an application has plenty of tiny methods and deep inlining trees. And java.lang.invoke implementation pushes it even further (with arbitrarily deep MethodHandle trees and unconditional inlining through them), so heavy users of MethodHandle API should experience higher overheads when evol dependencies are recorded.

I suggest to make the flag experimental. Once JFR implementation is improved, it can be superseded by -XX:+EnableDynamicAgentLoading check.

@simonis
Copy link
Member Author

simonis commented Jan 25, 2024

@iwanowww , @shipilev , @dean-long any more comments or concerns? Otherwise I'd like to finish this.

@dean-long
Copy link
Member

No, go ahead.

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with these changes, only a few stylistic nits.

I am on the fence where the default for this flag should stand. The 1% loss in nmethod size is probably okay, given that we gained as much with denser code improvements like JDK-8319406. Maybe there are some tricks in dependency encoding that would drive this down even more.

@simonis
Copy link
Member Author

simonis commented Jan 26, 2024

/integrate

@openjdk
Copy link

openjdk bot commented Jan 26, 2024

Going to push as commit 62b3293.
Since your change was applied there have been 72 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 26, 2024
@openjdk openjdk bot closed this Jan 26, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 26, 2024
@openjdk
Copy link

openjdk bot commented Jan 26, 2024

@simonis Pushed as commit 62b3293.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@RacerZ-fighting
Copy link

RacerZ-fighting commented Feb 28, 2024

Will this feature be incorporated into a lower version of JDK, like JDK 8?

@eastig
Copy link
Member

eastig commented Feb 28, 2024

@RacerZ-fighting

Will this feature be incorporated into a lower version of JDK, like JDK 8?

I don't see JDK 8 or 11 in affected versions of JDK-8324241, only 17 and 21.

@leomao10
Copy link

leomao10 commented Aug 5, 2024

Hi there, we are seeing this issue when we run JFR on our services under load, we see a large spike of CPU after JFR is triggered, which cause 500 errors in our service. We are currently using corretto-17 in our service.

Wondering this fix get back ported to JDK 17? As I can't find this change mentioned in JDK update or in jdk17u tag compare

Also, wondering if there is a walk around for this issue if the PR is not back ported to Java 17. XX:+EnableDynamicAgentLoading seems to only supported in Java 21, so that wouldn't help for now

@simonis
Copy link
Member Author

simonis commented Sep 12, 2024

Hi there, we are seeing this issue when we run JFR on our services under load, we see a large spike of CPU after JFR is triggered, which cause 500 errors in our service. We are currently using corretto-17 in our service.

Wondering this fix get back ported to JDK 17? As I can't find this change mentioned in JDK update or in jdk17u tag compare

Also, wondering if there is a walk around for this issue if the PR is not back ported to Java 17. XX:+EnableDynamicAgentLoading seems to only supported in Java 21, so that wouldn't help for now

@leomao10, I'm not sure if this change will ever be downported to older releases like JDK 21 or even JDK 17. I personally consider it low risk, but there have been reports of performance regressions in some cases (e.g. JDK-8336805). I couldn't reproduce them, but I can image that they will make the maintainers of LTS releases even more cautious.

The easiest way to workaround this issue in JDK 17 would be to set the can_redefine_classes, can_retransform_classes or can_generate_breakpoint_events JVMTI capabilities at startup or early in the lifetime of the JVM. There are several ways how you could do this.

  • trigger JFR right after startup. This will still invalidate all the JIT compiled methods but if you do this early enough there won'T be many of them. After you've triggered JFR for the first time, the corresponding JVMTI capabilities will be set and all dependencies will be recorded automatically so any subsequent JFR invocation won't suffer from a performance degradation any more.
  • attach any other JVMTI agent like for example async profiler which requests the corresponding JVMTI capabilities at startup.
  • write your own, trivial JVMTI agent which merely requests the corresponding JVMTI capabilities and attach it at startup with agentpath:jvmtiAgent.so. The agent can be as simple as:
/*

g++ -fPIC -shared -I $JAVA_HOME/include/ -I $JAVA_HOME/inlude/linux -o jvmtiAgent.so jvmtiAgent.cpp
 */

#include <jvmti.h>
#include <stdio.h>
#include <string.h>

extern "C"
JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM* jvm, char* options, void* reserved) {
  jvmtiEnv* jvmti = NULL;
  jvmtiCapabilities capa;
  jvmtiError error;

  jint result = jvm->GetEnv((void**) &jvmti, JVMTI_VERSION_1_1);
  if (result != JNI_OK) {
    fprintf(stderr, "Can't access JVMTI!\n");
    return JNI_ERR;
  }

  memset(&capa, 0, sizeof(jvmtiCapabilities));
  capa.can_redefine_classes = 1;
  capa.can_retransform_classes = 1;
  capa.can_generate_breakpoint_events = 1;
  if (jvmti->AddCapabilities(&capa) != JVMTI_ERROR_NONE) {
    fprintf(stderr, "Can't set capabilities!\n");
    return JNI_ERR;
  } else {
    fprintf(stdout, "Added capabilities!\n");
  }
  return JNI_OK;
}

@leomao10
Copy link

That is awesome, thanks for the response and workarounds @simonis ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

None yet

10 participants