Generalize PointerInvoke to benchmark by-ref segment return #788

@mcimadamore mcimadamore commented Feb 10, 2023

I've generalized an existing benchmark to test by-reference segment return in downcalls.
Ideally, we should see scalarization of the returned segment, and no GC activity.


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/panama-foreign pull/788/head:pull/788
$ git checkout pull/788

Update a local copy of the PR:
$ git checkout pull/788
$ git pull https://git.openjdk.org/panama-foreign pull/788/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 788

View PR using the GUI difftool:
$ git pr show -t 788

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/panama-foreign/pull/788.diff

mcimadamore commented Feb 10, 2023

Results I got:

Benchmark                                                              Mode  Cnt     Score     Error   Units
PointerInvoke.long_to_long                                             avgt   30    10.016 ±   0.038   ns/op
PointerInvoke.long_to_long:·gc.alloc.rate                              avgt   30     0.001 ±   0.001  MB/sec
PointerInvoke.long_to_long:·gc.alloc.rate.norm                         avgt   30    ≈ 10⁻⁵              B/op
PointerInvoke.long_to_long:·gc.count                                   avgt   30       ≈ 0            counts
PointerInvoke.long_to_ptr                                              avgt   30    14.581 ±   0.185   ns/op
PointerInvoke.long_to_ptr:·gc.alloc.rate                               avgt   30  2354.079 ±  29.942  MB/sec
PointerInvoke.long_to_ptr:·gc.alloc.rate.norm                          avgt   30    72.005 ±   0.001    B/op
PointerInvoke.long_to_ptr:·gc.churn.G1_Eden_Space                      avgt   30  2345.490 ± 102.742  MB/sec
PointerInvoke.long_to_ptr:·gc.churn.G1_Eden_Space.norm                 avgt   30    71.766 ±   3.260    B/op
PointerInvoke.long_to_ptr:·gc.churn.G1_Survivor_Space                  avgt   30     0.018 ±   0.007  MB/sec
PointerInvoke.long_to_ptr:·gc.churn.G1_Survivor_Space.norm             avgt   30     0.001 ±   0.001    B/op
PointerInvoke.long_to_ptr:·gc.count                                    avgt   30   210.000            counts
PointerInvoke.long_to_ptr:·gc.time                                     avgt   30   124.000                ms
PointerInvoke.ptr_to_long                                              avgt   30    10.712 ±   0.088   ns/op
PointerInvoke.ptr_to_long:·gc.alloc.rate                               avgt   30     0.001 ±   0.001  MB/sec
PointerInvoke.ptr_to_long:·gc.alloc.rate.norm                          avgt   30    ≈ 10⁻⁵              B/op
PointerInvoke.ptr_to_long:·gc.count                                    avgt   30       ≈ 0            counts
PointerInvoke.ptr_to_long_new_segment                                  avgt   30    11.386 ±   0.124   ns/op
PointerInvoke.ptr_to_long_new_segment:·gc.alloc.rate                   avgt   30     0.001 ±   0.001  MB/sec
PointerInvoke.ptr_to_long_new_segment:·gc.alloc.rate.norm              avgt   30    ≈ 10⁻⁵              B/op
PointerInvoke.ptr_to_long_new_segment:·gc.count                        avgt   30       ≈ 0            counts
PointerInvoke.ptr_to_ptr                                               avgt   30    15.769 ±   0.186   ns/op
PointerInvoke.ptr_to_ptr:·gc.alloc.rate                                avgt   30  2176.427 ±  25.562  MB/sec
PointerInvoke.ptr_to_ptr:·gc.alloc.rate.norm                           avgt   30    72.005 ±   0.001    B/op
PointerInvoke.ptr_to_ptr:·gc.churn.G1_Eden_Space                       avgt   30  2186.727 ± 109.313  MB/sec
PointerInvoke.ptr_to_ptr:·gc.churn.G1_Eden_Space.norm                  avgt   30    72.331 ±   3.393    B/op
PointerInvoke.ptr_to_ptr:·gc.churn.G1_Survivor_Space                   avgt   30     0.013 ±   0.007  MB/sec
PointerInvoke.ptr_to_ptr:·gc.churn.G1_Survivor_Space.norm              avgt   30    ≈ 10⁻³              B/op
PointerInvoke.ptr_to_ptr:·gc.count                                     avgt   30   187.000            counts
PointerInvoke.ptr_to_ptr:·gc.time                                      avgt   30   109.000                ms
PointerInvoke.ptr_to_ptr_new_segment                                   avgt   30    15.848 ±   0.239   ns/op
PointerInvoke.ptr_to_ptr_new_segment:·gc.alloc.rate                    avgt   30  2165.823 ±  32.328  MB/sec
PointerInvoke.ptr_to_ptr_new_segment:·gc.alloc.rate.norm               avgt   30    72.005 ±   0.001    B/op
PointerInvoke.ptr_to_ptr_new_segment:·gc.churn.G1_Eden_Space           avgt   30  2175.643 ±  93.914  MB/sec
PointerInvoke.ptr_to_ptr_new_segment:·gc.churn.G1_Eden_Space.norm      avgt   30    72.321 ±   2.808    B/op
PointerInvoke.ptr_to_ptr_new_segment:·gc.churn.G1_Survivor_Space       avgt   30     0.019 ±   0.009  MB/sec
PointerInvoke.ptr_to_ptr_new_segment:·gc.churn.G1_Survivor_Space.norm  avgt   30     0.001 ±   0.001    B/op
PointerInvoke.ptr_to_ptr_new_segment:·gc.count                         avgt   30   198.000            counts
PointerInvoke.ptr_to_ptr_new_segment:·gc.time                          avgt   30   116.000                ms

Passing segments as arguments works fine, but returning segments generates allocation (about 72 B/op). This seems to be unrelated to the latest API changes: even when I tweaked the FFM impl to always use the global scope, the allocation was still there.

bridgekeeper bot commented Feb 10, 2023

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into foreign-memaccess+abi will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Ready for review label Feb 10, 2023

mlbridge bot commented Feb 10, 2023

Webrevs

mcimadamore commented Feb 10, 2023

Doh - realized that my benchmark was escaping segments "by design" (since the benchmark methods were returning the segments). I've tweaked the benchmark to return the address() value of the returned segment instead and it all got much better :-)

Benchmark                                                  Mode  Cnt   Score    Error   Units
PointerInvoke.long_to_long                                 avgt   30   9.764 ±  0.133   ns/op
PointerInvoke.long_to_long:·gc.alloc.rate                  avgt   30   0.001 ±  0.001  MB/sec
PointerInvoke.long_to_long:·gc.alloc.rate.norm             avgt   30  ≈ 10⁻⁵             B/op
PointerInvoke.long_to_long:·gc.count                       avgt   30     ≈ 0           counts
PointerInvoke.long_to_ptr                                  avgt   30   9.832 ±  0.122   ns/op
PointerInvoke.long_to_ptr:·gc.alloc.rate                   avgt   30   0.001 ±  0.001  MB/sec
PointerInvoke.long_to_ptr:·gc.alloc.rate.norm              avgt   30  ≈ 10⁻⁵             B/op
PointerInvoke.long_to_ptr:·gc.count                        avgt   30     ≈ 0           counts
PointerInvoke.ptr_to_long                                  avgt   30  11.069 ±  0.095   ns/op
PointerInvoke.ptr_to_long:·gc.alloc.rate                   avgt   30   0.001 ±  0.001  MB/sec
PointerInvoke.ptr_to_long:·gc.alloc.rate.norm              avgt   30  ≈ 10⁻⁵             B/op
PointerInvoke.ptr_to_long:·gc.count                        avgt   30     ≈ 0           counts
PointerInvoke.ptr_to_long_new_segment                      avgt   30  11.679 ±  0.160   ns/op
PointerInvoke.ptr_to_long_new_segment:·gc.alloc.rate       avgt   30   0.001 ±  0.001  MB/sec
PointerInvoke.ptr_to_long_new_segment:·gc.alloc.rate.norm  avgt   30  ≈ 10⁻⁵             B/op
PointerInvoke.ptr_to_long_new_segment:·gc.count            avgt   30     ≈ 0           counts
PointerInvoke.ptr_to_ptr                                   avgt   30  10.822 ±  0.141   ns/op
PointerInvoke.ptr_to_ptr:·gc.alloc.rate                    avgt   30   0.001 ±  0.001  MB/sec
PointerInvoke.ptr_to_ptr:·gc.alloc.rate.norm               avgt   30  ≈ 10⁻⁵             B/op
PointerInvoke.ptr_to_ptr:·gc.count                         avgt   30     ≈ 0           counts
PointerInvoke.ptr_to_ptr_new_segment                       avgt   30  11.772 ±  0.094   ns/op
PointerInvoke.ptr_to_ptr_new_segment:·gc.alloc.rate        avgt   30   0.001 ±  0.001  MB/sec
PointerInvoke.ptr_to_ptr_new_segment:·gc.alloc.rate.norm   avgt   30  ≈ 10⁻⁵             B/op
PointerInvoke.ptr_to_ptr_new_segment:·gc.count             avgt   30     ≈ 0           counts

None of the benchmarks show allocation now.
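
The escaping issue described above can be illustrated without JMH or the FFM API. This is only a minimal sketch, assuming a hypothetical `FakeSegment` record as a stand-in for `MemorySegment` and a hypothetical `identity` method in place of the real downcall; none of these names come from the benchmark in this PR.

```java
// Hypothetical stand-in types: this is NOT the FFM API, it only mimics the
// shape of "a call that returns a small wrapper object around an address".
public class EscapeSketch {
    // Assumed stand-in for a memory segment: an address plus a size.
    record FakeSegment(long address, long byteSize) {}

    // Assumed stand-in for a downcall that returns a pointer as a segment.
    static FakeSegment identity(long addr) {
        return new FakeSegment(addr, 0L);
    }

    // Escaping shape: the wrapper object is returned to the caller (in a
    // benchmark, to the harness), so it generally has to be materialized.
    static FakeSegment returnsSegment(long addr) {
        return identity(addr);
    }

    // Non-escaping shape: only a primitive leaves the method, so the wrapper
    // becomes a candidate for scalar replacement once identity() is inlined.
    static long returnsAddress(long addr) {
        return identity(addr).address();
    }

    public static void main(String[] args) {
        System.out.println(returnsSegment(42L).address());
        System.out.println(returnsAddress(42L));
    }
}
```

Whether the allocation actually disappears depends on inlining and escape analysis in the JIT, so the sketch shows the shape of the benchmark change rather than a measurement.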

@JornVernee JornVernee left a comment

Nice beef up!

openjdk bot commented Feb 10, 2023

@mcimadamore This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

Generalize PointerInvoke to benchmark by-ref segment return

Reviewed-by: jvernee

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the foreign-memaccess+abi branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the foreign-memaccess+abi branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Ready to be integrated label Feb 10, 2023
mcimadamore commented Feb 10, 2023

/integrate

openjdk bot commented Feb 10, 2023

Going to push as commit e3a46c9.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 10, 2023
@openjdk openjdk bot closed this Feb 10, 2023
@openjdk openjdk bot removed ready Ready to be integrated rfr Ready for review labels Feb 10, 2023

openjdk bot commented Feb 10, 2023

@mcimadamore Pushed as commit e3a46c9.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
