8292758: put support for UNSIGNED5 format into its own header file #10067

rose00 · 2022-08-29T18:20:42Z

Refactor code from inside of CompressedStream into its own unit.

This code is likely to be used in future refactorings, such as JDK-8292818 (replace 96-bit representation for field metadata with variable-sized streams).

Add gtests.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8292758: put support for UNSIGNED5 format into its own header file

Reviewers

Dean Long (@dean-long - Reviewer)
Coleen Phillimore (@coleenp - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10067/head:pull/10067
$ git checkout pull/10067

Update a local copy of the PR:
$ git checkout pull/10067
$ git pull https://git.openjdk.org/jdk pull/10067/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10067

View PR using the GUI difftool:
$ git pr show -t 10067

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10067.diff

bridgekeeper · 2022-08-29T18:21:20Z

👋 Welcome back jrose! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2022-08-29T18:24:57Z

@rose00 The following labels will be automatically applied to this pull request:

hotspot
serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

openjdk · 2022-08-31T16:13:47Z

@rose00 this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout compressed-stream
git fetch https://git.openjdk.org/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

rose00 · 2022-09-01T23:53:03Z

This code passes tiers 1,2,3.


mlbridge · 2022-09-02T00:00:16Z

Webrevs

openjdk-notifier · 2022-09-02T00:00:54Z

@rose00 Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information.

rose00 · 2022-09-02T00:05:49Z

The new header file presents the encoding algorithm by means of templates.
The template arguments in general are:

ARY - a logical base address for reads and writes of bytes
OFF - an integral type (of any size or signedness) providing an offset to ARY
GET and SET - function-like arguments (e.g., lambdas) that get or set a byte at an address logically of the form a[i], shaped like ARY[OFF]
GFN - a lambda used when the application requires on-the-fly resizing of an output buffer (of type ARY)

Defaults are set in such a way that any C++ types that natively support a[i] can be fully inferred, including the get/set behaviors.
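
As a rough illustration of that parameterization (not the header's actual code), the sketch below moves single bytes through ARY/OFF/GET/SET-style template arguments. The names `IndexGetSet`, `put_byte`, and `take_byte` are hypothetical and exist only for this example; the default functor assumes the storage supports a[i].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical default accessor: any type that natively supports a[i]
// can be read and written without the caller supplying GET or SET.
template<typename ARY, typename OFF>
struct IndexGetSet {
  uint8_t operator()(ARY& a, OFF i) const { return static_cast<uint8_t>(a[i]); }
  void operator()(ARY& a, OFF i, uint8_t b) const { a[i] = b; }
};

// Stores one byte at the logical address a[pos] through SET, then advances pos.
template<typename ARY, typename OFF, typename SET = IndexGetSet<ARY, OFF>>
void put_byte(uint8_t b, ARY& a, OFF& pos, SET set = SET()) {
  set(a, pos, b);
  pos = pos + 1;
}

// Loads one byte from the logical address a[pos] through GET, then advances pos.
template<typename ARY, typename OFF, typename GET = IndexGetSet<ARY, OFF>>
uint8_t take_byte(ARY& a, OFF& pos, GET get = GET()) {
  uint8_t b = get(a, pos);
  pos = pos + 1;
  return b;
}
```

With these defaults, a call such as `put_byte(7, raw, pos)` on a plain `uint8_t` array or a `std::vector<uint8_t>` needs no explicit template arguments. Passing a lambda as the last argument swaps in a different way of reaching the storage.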

In addition, there are small "gadgets" for reading a series of ints from a buffer, writing a series to a buffer, and sizing a series (which is faster than writing or reading). These are not yet used. However, prototyping of further use cases for this compression (particularly, FieldInfo) makes it clear that these are repeated tasks that "canned" templates will help with.
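
To show why sizing can be cheaper than writing, here is a stand-alone sketch of a sizing pass that never touches a buffer. It is an approximation written for this thread, not the code in the header. The constants (191 one-byte codes, 64 continuation codes, at most 5 bytes) are assumptions, and `encoded_size` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed coding constants: 191 one-byte codes, 64 continuation codes
// (2^6), and at most 5 bytes per encoded value.
const uint32_t L = 191;
const uint32_t LG_H = 6;
const int MAX_LEN = 5;

// Number of bytes the encoding of value would occupy.
inline int encoded_length(uint32_t value) {
  uint32_t limit = 0;   // how many distinct values fit in len bytes or fewer
  uint32_t combos = L;  // L * 64^(len-1)
  for (int len = 1; len < MAX_LEN; len++) {
    limit += combos;
    if (value < limit) return len;
    combos <<= LG_H;
  }
  return MAX_LEN;       // everything else takes the maximum length
}

// Hypothetical sizing pass: total bytes needed to encode a series of ints.
inline size_t encoded_size(const uint32_t* values, size_t count) {
  size_t total = 0;
  for (size_t i = 0; i < count; i++) {
    total += encoded_length(values[i]);
  }
  return total;
}
```

Under these assumptions the length boundaries fall at 191, 12415, 794751, and 50864255, so a sizing pass is a handful of comparisons per value.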

dean-long · 2022-09-02T01:50:59Z

src/hotspot/share/code/compressedStream.cpp

+  const int min_expansion = UNSIGNED5::MAX_LENGTH;
+  if (nsize < min_expansion*2)
+    nsize = min_expansion*2;


It's not clear if this is needed or just an optimization. Maybe add a comment. Also, using MAX2 might be clearer.
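
For illustration only, the MAX2 form might look like the sketch below. `MAX2` is redefined here as a stand-in so the snippet compiles on its own, `clamp_new_size` is a hypothetical helper, and the value 5 for the maximum encoded length is an assumption.

```cpp
#include <cassert>

// Stand-in for HotSpot's MAX2 helper so this sketch is self-contained.
template<typename T> inline T MAX2(T a, T b) { return (a > b) ? a : b; }

// Assumed value of UNSIGNED5::MAX_LENGTH (longest encoding, in bytes).
const int MAX_LENGTH = 5;

// Hypothetical helper: clamp a proposed buffer size from below so that it
// is never smaller than two maximum-length encodings. Whether that floor
// is required for correctness or is only an optimization is the open
// question raised in the review comment.
inline int clamp_new_size(int nsize) {
  const int min_expansion = MAX_LENGTH;
  return MAX2(nsize, min_expansion * 2);
}
```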

dean-long

I'm concerned about the "excluded bytes" change. It appears to change the encoding, which would mean that SA would not be able to read streams from earlier JVMs, which may or may not be a concern. I suggest splitting this up into the straight-forward refactor (without the excluded bytes change), then adding excluded bytes as a follow-up.

rose00 · 2022-09-03T03:34:47Z

I suggest splitting this up into the straight-forward refactor (without the excluded bytes change), then adding excluded bytes as a follow-up.

Yes, that is a slight change. Splitting is not necessary for the reason you mention, because this PR includes the corresponding SA changes. The SA change is tested by several jtreg tests: injecting bugs causes SA tests to fail, and fixing them fixes the tests. This is the case because the compressed stream data structure is used for all stack walking, so if there's a bug, we find it immediately. The same is true for SA unit tests which perform stack walking. Net result: it's safe; there is no bug, because testing coverage is robust.

The math of the encoding is the same whether there are 255 or 256 byte values available, so the adjustment is very low risk. (The math works for any underlying byte type or size; we choose 8-bit bytes excluding nulls to provide 255 distinct tokens.) The benefit is that the encoding can be accompanied by null termination, which enhances debuggability and resilience against bad counts and bad pointers.
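
A minimal sketch of that math, assuming one excluded byte value (X = 1), 64 "high" codes that signal a continuation, and the remaining 191 "low" codes. The constants and the free functions `encode` and `decode` are written for this example and are not copied from the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed constants: X excluded byte values, H "high" (continuation)
// codes, and L "low" (terminating) codes among the 256 byte values.
const uint32_t X = 1;            // byte value 0 is never emitted
const uint32_t LG_H = 6;
const uint32_t H = 1u << LG_H;   // 64
const uint32_t L = 256 - X - H;  // 191
const int MAX_LEN = 5;

// Appends the encoding of value to out, using 1 to 5 nonzero bytes.
inline void encode(uint32_t value, std::vector<uint8_t>& out) {
  uint32_t sum = value;
  for (int i = 0; ; i++) {
    if (sum < L || i == MAX_LEN - 1) {
      out.push_back(static_cast<uint8_t>(X + sum));  // low code or final byte
      return;
    }
    sum -= L;
    out.push_back(static_cast<uint8_t>(X + L + (sum % H)));  // high code
    sum >>= LG_H;                                   // 6 bits consumed
  }
}

// Decodes one value starting at in[pos] and advances pos past it.
inline uint32_t decode(const std::vector<uint8_t>& in, size_t& pos) {
  uint32_t sum = 0;
  for (int i = 0; ; i++) {
    uint32_t b = in[pos++];
    assert(b >= X);                          // excluded bytes never appear
    sum += (b - X) << (LG_H * i);            // digit i is scaled by 64^i
    if (b < X + L || i == MAX_LEN - 1) return sum;  // low code ends the item
  }
}
```

In this sketch, setting X to 0 would give L = 192 and allow zero bytes, while the loop bodies stay the same. With X = 1, the value 191 encodes as the two bytes 192 and 1.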

dean-long · 2022-09-03T07:48:50Z

Wouldn't the SA stack walk fail when attaching to a core dump from an earlier JVM that does not exclude nulls?

plummercj · 2022-09-04T00:23:13Z

Wouldn't the SA stack walk fail when attaching to a core dump from an earlier JVM that does not exclude nulls?

I'm not sure how important compatibility is with older JVMs. SA is capable of (if properly maintained) connecting to any JVM. You'll see signs of that in the code:

// The threadStatus is only present starting in 1.5
if (threadStatusField != null) {
  Oop holderOop = threadHolderField.getValue(threadOop);
  return (int) threadStatusField.getValue(holderOop);
} else {
  // All we can easily figure out is if it is alive, but that is
  // enough info for a valid unknown status.
  ...
}

However, attempts like this one are ancient, and certainly this one should be removed. There are probably 100 other reasons why attaching to a 1.5 JVM won't work.

We don't do any compatibility testing with older JVMs. For the most part I'd say SA users should always match SA with the version of the JVM they are targeting. I'd be interested in hearing if anyone feels otherwise.

coleenp

Thank you for adding the SA and debug.cpp for debugging this.

src/hotspot/share/code/compressedStream.cpp

coleenp · 2022-09-05T16:51:23Z

test/hotspot/gtest/utilities/test_unsigned5.cpp

+
+// T.I.L.
+// $ sh ./configure ... --with-gtest=<...>/googletest ...
+// $ make exploded-test TEST=gtest:unsigned5


What does T.I.L mean? This isn't the only way to run this test, so don't include this.

src/hotspot/share/utilities/unsigned5.cpp

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/Unsigned5.java

coleenp · 2022-09-05T18:17:13Z

src/hotspot/share/utilities/unsigned5.hpp

+  }
+
+  // returns the encoded byte length of an unsigned 32-bit int
+  static constexpr int encoded_length(uint32_t value) {


Should all of these functions be private?

coleenp · 2022-09-05T18:18:15Z

src/hotspot/share/utilities/unsigned5.hpp

+    Writer(const ARR& array)
+      : _array(const_cast<ARR&>(array)), _limit_ptr(NULL)
+        // note:  if _limit_ptr is NULL, the ARR& is never reassigned
+    { limit_init(); _position = 0; }


indent needed.


coleenp · 2022-09-07T17:35:38Z

src/hotspot/share/utilities/unsigned5.hpp

+
+  // reports the largest uint32_t value that can be encoded using len bytes
+  // len must be in the range [1..5]
+  static constexpr uint32_t max_encoded_in_length(uint32_t len) {


A few of the classes make the gtest test a friend so that functions that are internal to unsigned5 can be made private. The test would have to call these functions in a class though, and not directly in the test cases. Maybe this is ok.

coleenp

This looks good to me, for the current CompressedStream and for upcoming changes.

openjdk · 2022-09-07T17:46:38Z

@rose00 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8292758: put support for UNSIGNED5 format into its own header file

Reviewed-by: dlong, coleenp

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 78 new commits pushed to the master branch:

fc5f97f: 8293474: RISC-V: Unify the way of moving function pointer
2d13f53: 8293512: ProblemList serviceability/tmtools/jstat/GcNewTest.java in -Xcomp mode
f84386c: 8293182: Improve testing of CDS archive heap
51de765: 8283010: serviceability/sa/ClhsdbThread.java failed with "'Base of Stack:' missing from stdout/stderr "
8a48965: 8293514: ProblemList gc/metaspace/TestMetaspacePerfCounters.java#Epsilon-64 on all platforms
1e031e6: 8293232: Fix race condition in pkcs11 SessionManager
1080c4e: 8293508: ProblemList gc/metaspace/TestMetaspacePerfCounters.java#Epsilon-64
aff9a69: 8283224: Remove THREAD_NOT_ALIVE from possible JDWP error codes
76df73b: 8293456: runtime/os/TestTracePageSizes.java sub-tests fail with "AssertionError: No memory range found for address: NNNN"
32c7b62: 8293146: Strict DateTimeFormatter fails to report an invalid week 53
... and 68 more: https://git.openjdk.org/jdk/compare/0fb9469d93bffd662848b63792406717f7b4ec0d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

rose00 · 2022-09-08T07:35:34Z

@dean-long and @coleenp Thank you for the reviews.

The public functions in UNSIGNED5 are not internal but are designed to be useful by themselves. That is why the gtest tests them separately. For example, check_length is used inside the Reader but if you are building your own reader-like logic, you might want to use it to avoid buffer overflows. Likewise fits_in_limit is used by the sizing logic of write_uint_grow but if you are rolling your own auto-grow logic, you would want to call that function on its own, or maybe encoded_length. The fancy API points like Reader and Sizer and write_uint_grow are not intended to be primitives, but rather best practices for using the static functions, which are the real primitives.
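
As a hedged example of building reader-like logic on such a primitive, the sketch below validates the length of one encoded item before any decoding happens. `checked_length` is a hypothetical stand-in for the kind of check described above, and its constants (byte 0 excluded, 191 one-byte codes, 5-byte maximum) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const uint32_t X = 1;    // assumed: byte value 0 is excluded
const uint32_t L = 191;  // assumed: number of one-byte ("low") codes
const int MAX_LEN = 5;   // assumed: longest encoding, in bytes

// Returns the byte length of the encoded item at buf[pos], or 0 if the
// item contains an excluded byte or would run past limit (exclusive).
inline int checked_length(const uint8_t* buf, size_t pos, size_t limit) {
  for (int i = 0; i < MAX_LEN; i++) {
    if (pos + i >= limit) return 0;  // next byte would be out of bounds
    uint32_t b = buf[pos + i];
    if (b < X) return 0;             // excluded (null) byte
    if (b < X + L || i == MAX_LEN - 1) {
      return i + 1;                  // low code or last permitted byte
    }
  }
  return 0;  // not reached: the loop always returns by i == MAX_LEN - 1
}
```

A hand-rolled reader could call this first and only decode when the result is nonzero, which keeps a truncated or corrupted stream from being read past its end.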

Testing notes:

This change passes a targeted test that runs test/hotspot/jtreg/compiler/c2/Test6*.java (about 50 tests) with CONF=fastdebug and JTREG="JAVA_OPTIONS=-Xcomp -XX:+TraceDeoptimization". That run is chosen to deoptimize many times (about 39k), so as to exercise the pre-existing stack walk logic, which relies on compressed streams. Fault injection confirms that any JVM run crashes immediately if there is a bug in the encoding.

It also passes tiers 1/2/3 (in a slightly earlier version) and all gtests (in the latest version).

rose00 · 2022-09-08T07:35:40Z

/integrate

openjdk · 2022-09-08T07:37:06Z

Going to push as commit 8d3399b.
Since your change was applied there have been 80 commits pushed to the master branch:

6677227: 8293497: Build failure due to MaxVectorSize was not declared when C2 is disabled after JDK-8293254
986b834: 8293489: Accept CAs with BasicConstraints without pathLenConstraint
fc5f97f: 8293474: RISC-V: Unify the way of moving function pointer
2d13f53: 8293512: ProblemList serviceability/tmtools/jstat/GcNewTest.java in -Xcomp mode
f84386c: 8293182: Improve testing of CDS archive heap
51de765: 8283010: serviceability/sa/ClhsdbThread.java failed with "'Base of Stack:' missing from stdout/stderr "
8a48965: 8293514: ProblemList gc/metaspace/TestMetaspacePerfCounters.java#Epsilon-64 on all platforms
1e031e6: 8293232: Fix race condition in pkcs11 SessionManager
1080c4e: 8293508: ProblemList gc/metaspace/TestMetaspacePerfCounters.java#Epsilon-64
aff9a69: 8283224: Remove THREAD_NOT_ALIVE from possible JDWP error codes
... and 70 more: https://git.openjdk.org/jdk/compare/0fb9469d93bffd662848b63792406717f7b4ec0d...master

Your commit was automatically rebased without conflicts.

openjdk · 2022-09-08T07:37:19Z

@rose00 Pushed as commit 8d3399b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Aug 29, 2022

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Aug 31, 2022

8292758: put support for UNSIGNED5 format into its own header file

e938b01

rose00 force-pushed the compressed-stream branch from 67af6b1 to e938b01 Compare August 31, 2022 20:31

openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Aug 31, 2022

rose00 marked this pull request as ready for review September 1, 2022 23:53

openjdk bot added the rfr Pull request is ready for review label Sep 1, 2022

Merge branch 'master' of https://git.openjdk.org/jdk into compressed-stream

d903f2c

dean-long reviewed Sep 2, 2022

dean-long suggested changes Sep 2, 2022

add more support for SA including vmStructs; tweak some debug code

dc30f7c

coleenp reviewed Sep 5, 2022

John Rose added 2 commits September 6, 2022 11:38

respond to reviewer comments; move printing code to a better place; remove vmStructs coupling from SA

439b108

add missing "this->"

172420e

coleenp reviewed Sep 7, 2022

coleenp approved these changes Sep 7, 2022

openjdk bot added the ready Pull request is ready to be integrated label Sep 7, 2022

dean-long approved these changes Sep 8, 2022

openjdk bot added the integrated Pull request has been integrated label Sep 8, 2022

openjdk bot closed this Sep 8, 2022

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Sep 8, 2022

rose00 deleted the compressed-stream branch September 8, 2022 07:42
