Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8309698: [S390X] Test hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java fails after JDK-8299089 #20096

Closed
wants to merge 3 commits into from

Conversation

mmatti-sw
Copy link
Contributor

@mmatti-sw mmatti-sw commented Jul 9, 2024

On s390 the testcase "VirtualAllocCommitMerge.java" when run under Interpreter with option -Xint fails. But when run as compiled with option -Xcomp -Xbatch it Passes.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8309698: [S390X] Test hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java fails after JDK-8299089 (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20096/head:pull/20096
$ git checkout pull/20096

Update a local copy of the PR:
$ git checkout pull/20096
$ git pull https://git.openjdk.org/jdk.git pull/20096/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20096

View PR using the GUI difftool:
$ git pr show -t 20096

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20096.diff

Webrev

Link to Webrev Comment

Sorry, something went wrong.

…interpreted.
@bridgekeeper
Copy link

bridgekeeper bot commented Jul 9, 2024

👋 Welcome back mmatti-sw! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 9, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title 8309698 8309698: [S390X] Test hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java fails after JDK-8299089 Jul 9, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 9, 2024
@openjdk
Copy link

openjdk bot commented Jul 9, 2024

@mmatti-sw The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jul 9, 2024
@mlbridge
Copy link

mlbridge bot commented Jul 9, 2024

Webrevs

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the comment saying this is in alternative but we will need to test it again in our CI.

Given this test specifically chooses the compilation mode it should also exclude being run under other modes, or even specify @requires vm.flagless - as it makes no sense to have the -Xint overridden.

@dholmes-ora
Copy link
Member

Testing in our CI was good.

@tstuefe
Copy link
Member

tstuefe commented Jul 10, 2024

Did anyone actually analyze this problem? What is the root cause? Switching the JIT on and off at random should have no bearing on how NMT functions.

I see, looking at the test output, that it fails with

java.lang.RuntimeException: '\\[0x[0]*40dafa4000 - 0x[0]*40db004000\\] committed 384KB from.*' missing from stdout/stderr

But the output is there, in the virtual mapping printout:

[0x00000040dafa4000 - 0x00000040db3a4000] reserved 4096KB for Test 

	[0x00000040dafa4000 - 0x00000040db004000] committed 384KB 

But I guess the callstack is missing? So I would like to understand why, and whats preventing the callstack from being shown, and why switching on the JIT helps?

NMT is an important component. Just disabling tests that indicate problems is not a good practice.

@dholmes-ora
Copy link
Member

Switching the JIT on and off at random should have no bearing on how NMT functions.

The existing comment in the test indicates the test either needs nothing compiled, or everything, else it can't do what it intended.

I agree there is some utility in understanding why S390 interpreter behaviour doesn't meet the test's expectations.

@tstuefe
Copy link
Member

tstuefe commented Jul 11, 2024

Switching the JIT on and off at random should have no bearing on how NMT functions.

The existing comment in the test indicates the test either needs nothing compiled, or everything, else it can't do what it intended.

I agree there is some utility in understanding why S390 interpreter behaviour doesn't meet the test's expectations.

But that's not the reason the test fails. And it should have worked with -Xint.

Here, we don't have a stack at all. That is a different error. And when did it start failing? I see that some stacks are printed. Is stack printing broken for some cases on s390?

@tstuefe
Copy link
Member

tstuefe commented Jul 11, 2024

I also like Xint more, since stacks are more stable - we take the JIT out of the equation. Compile decisions could be different between different runs. If not today, it is not guaranteed never to happen - nobody guarantees exactly how the JIT optimizes and that it is bound to repeat the same decisions on every run.

@mmatti-sw
Copy link
Contributor Author

Did anyone actually analyze this problem? What is the root cause? Switching the JIT on and off at random should have no bearing on how NMT functions.

Let me have a look at this.

@tstuefe
Copy link
Member

tstuefe commented Jul 11, 2024

Did anyone actually analyze this problem? What is the root cause? Switching the JIT on and off at random should have no bearing on how NMT functions.

Let me have a look at this.

It would be interesting to know if this is something that started to happen recently, and if yes, which patch broke it.

@mmatti-sw
Copy link
Contributor Author

It would be interesting to know if this is something that started to happen recently, and if yes, which patch broke it.

According to the issue https://bugs.openjdk.org/browse/JDK-8309698, the commit c705673 broke this!

@offamitkumar
Copy link
Member

I can reproduce it on jdk11u-dev head as well. So I guess this is coming from far back than that commit.

@tstuefe
Copy link
Member

tstuefe commented Jul 11, 2024

I can reproduce it on jdk11u-dev head as well. So I guess this is coming from far back than that commit.

Huh that's weird. Is this a particular hardware configuration that is new, or a new compiler? Because I am sure we do s390 tests. I think Adoptium does too.

What happens:

We use NativeCallStack to get a stack at the point os::commit_memory is called:

MemTracker::record_virtual_memory_commit((address)addr, bytes, CALLER_PC);

CALLER_PC resolves to NativeCallStack(1), which walks the stack starting one frame above. If that fails, the stack will have 0 in its first slot (

address _stack[NMT_TrackingStackDepth];
) and printing will stop right away (
if (stack->is_empty()) {
out->cr();
} else {
out->print_cr(" from");
INDENT_BY(4, _stackprinter.print_stack(stack);)
}
).

But according to the printout, there is no general problem with stack printing, since I see call stacks ending in os::commit_memory (see the many "committed xxx from").

In this case, the stack should show the whitebox function WB_NMTCommitMemory which calls os::commit_memory() (

WB_ENTRY(void, WB_NMTCommitMemory(JNIEnv* env, jobject o, jlong addr, jlong size))
os::commit_memory((char *)(uintptr_t)addr, size, !ExecMem);
MemTracker::record_virtual_memory_type((address)(uintptr_t)addr, mtTest);
WB_END
)

So, the task would be to find out whats so special about this function that it does not show up in the callstack.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@mmatti-sw
Copy link
Contributor Author

I think the os::current_stack_pointer() implementation for s390 needs to be looked into along with NativeCallStack::NativeCallStack(int toSkip).

@TheShermanTanker
Copy link
Contributor

I think the os::current_stack_pointer() implementation for s390 needs to be looked into along with NativeCallStack::NativeCallStack(int toSkip).

I've been thinking of looking at and possibly improving the implementations of os::current_stack_pointer for each platform recently, let me know if you plan on changing the implementation of s390x

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still disagree with the patch as it is now. The error should be analyzed, understood, and fixed. E.g. I don't get the relationship to JDK-8299089. What in JDK-8299089 can have caused stack walking on s390 to fail?

This fix is not terribly urgent, since the error just affects stacks in NMT virtual allocation printouts, and only some stacks at that. So, if you want to get the s390 CI clean, it would be absolutely fine to problem-list the test for s390. That is usually how we mute problems that we cannot fix immediately.

Thanks, Thomas

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 14, 2024

@mmatti-sw This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@bridgekeeper
Copy link

bridgekeeper bot commented Oct 9, 2024

@mmatti-sw This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

None yet

5 participants