8336831: Optimize StringConcatHelper.simpleConcat #20253

Closed
wants to merge 10 commits into from

Conversation

wenshao
Contributor

@wenshao wenshao commented Jul 19, 2024

Currently, simpleConcat is implemented using mix and prepend. In this simple scenario it can be implemented more directly, which also improves performance.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8336831: Optimize StringConcatHelper.simpleConcat (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20253/head:pull/20253
$ git checkout pull/20253

Update a local copy of the PR:
$ git checkout pull/20253
$ git pull https://git.openjdk.org/jdk.git pull/20253/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20253

View PR using the GUI difftool:
$ git pr show -t 20253

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20253.diff

Webrev

Link to Webrev Comment


@bridgekeeper

bridgekeeper bot commented Jul 19, 2024

👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jul 19, 2024

@wenshao This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8336831: Optimize StringConcatHelper.simpleConcat

Reviewed-by: liach, redestad, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 6 new commits pushed to the master branch:

  • 4c91d5c: 8322133: getParameterSpec(ECGenParameterSpec.class) on EC AlgorithmParameters does not return standard names
  • 2f2223d: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations
  • 8efcb40: 8335823: Update --release 23 symbol information for JDK 23 build 33
  • 8e1f17e: 8327054: DiagnosticCommand Compiler.perfmap does not log on output()
  • 0e555b5: 8204582: Extra spaces in jlink documentation make it incorrect.
  • a2a236f: 8335939: Hide element writing across the ClassFile API

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@liach, @cl4es, @RogerRiggs) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk

openjdk bot commented Jul 19, 2024

@wenshao The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Jul 19, 2024
@wenshao
Contributor Author

wenshao commented Jul 19, 2024

Below are the benchmark results from a MacBook M1 Pro; the patched version is 8.71% faster.

-Benchmark                             (intValue)  Mode  Cnt  Score   Error  Units
-StringConcat.concatMethodConstString        4711  avgt   15  5.499 ± 0.046  ns/op

+Benchmark                             (intValue)  Mode  Cnt  Score   Error  Units
+StringConcat.concatMethodConstString        4711  avgt   15  5.058 ± 0.012  ns/op +8.71%
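
The comment does not state how the percentage is derived. A minimal sketch, assuming it is the ratio of the baseline score to the patched score minus one (lower ns/op is better), is shown here; `SpeedupSketch` and `speedupPercent` are hypothetical names used only for illustration. The last digit may differ from the quoted figure depending on whether the value is rounded or truncated.

```java
final class SpeedupSketch {
    // Relative improvement of the patched run over the baseline for a
    // lower-is-better metric such as ns/op, expressed in percent.
    // Assumption: improvement = base / current - 1.
    static double speedupPercent(double baseNsPerOp, double currentNsPerOp) {
        return (baseNsPerOp / currentNsPerOp - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Scores taken from the benchmark table above.
        System.out.printf("%.2f%%%n", speedupPercent(5.499, 5.058));
    }
}
```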

@wenshao wenshao changed the title Optimize StringConcatHelper.simpleConcat 8336831: Optimize StringConcatHelper.simpleConcat Jul 19, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 19, 2024
@mlbridge

mlbridge bot commented Jul 19, 2024

Webrevs

@liach
Member

liach commented Jul 19, 2024

Also beware of #19927: it simplifies the prepend part. How does that patch run in your benchmark?

@wenshao
Contributor Author

wenshao commented Jul 19, 2024

I have changed it to use a shared newArray that includes the out-of-bounds check. The performance improvement is still the same as before.

-Benchmark                             (intValue)  Mode  Cnt  Score   Error  Units (master)
-StringConcat.concatMethodConstString        4711  avgt   15  5.437 ± 0.073  ns/op

+Benchmark                             (intValue)  Mode  Cnt  Score   Error  Units (38a6f5602f0f1f4a9eec4b7f0ac1b5c4e661f667)
+StringConcat.concatMethodConstString        4711  avgt   15  5.027 ± 0.080  ns/op +8.15%
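
To illustrate why an allocation helper for concatenation needs a range check, here is a hedged sketch using only public APIs. It is not the newArray from the patch: `NewArraySketch`, `allocateFor` and the use of `long` arithmetic are assumptions made for this example. The point is that the sum of two string lengths, scaled by the coder, can exceed what an `int` array length can hold.

```java
final class NewArraySketch {
    // Hypothetical stand-in for a shared, bounds-checked allocation helper.
    // Rejects lengths that cannot be represented as a Java array length.
    static byte[] newArray(long byteLength) {
        if (byteLength < 0 || byteLength > Integer.MAX_VALUE) {
            throw new OutOfMemoryError("Overflow: String length out of range");
        }
        return new byte[(int) byteLength];
    }

    // coder: 0 for one byte per char, 1 for two bytes per char.
    // The sum is computed as a long so that overflow is detected
    // instead of silently wrapping around.
    static byte[] allocateFor(int len1, int len2, int coder) {
        long byteLength = ((long) len1 + len2) << coder;
        return newArray(byteLength);
    }

    public static void main(String[] args) {
        System.out.println(allocateFor(3, 4, 1).length);
    }
}
```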

@wenshao
Contributor Author

wenshao commented Jul 19, 2024

> Also beware of #19927: it simplifies the prepend part. How does that patch run in your benchmark?

The performance comparison I made is based on the latest master branch (c25c4896ad9ef031e3cddec493aef66ff87c48a7), after #19927.

@liach
Member

liach commented Jul 19, 2024

As annoying and risky as this first appeared, this patch is actually in quite good shape: The usage of getBytes is similar to those in StringUTF16, and there's no reason to further complicate this by splitting the handling into StringLatin1 and StringUTF16. 👍

Another question for you: can you check out the original form over here too? 781fb29#diff-f8131d8a48caf7cfc908417fad241393c2ef55408172e9a28dcaa14b1d73e1fbL1968-L1981

simpleConcat is required to create new String objects, whereas String.concat can just return the argument/original when one part is empty. Is there any value in extracting a common doConcat that handles two non-empty strings and is called by both methods after they have handled the empty cases?
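
The difference in identity semantics can be observed from ordinary user code. The sketch below is illustrative only; `ConcatIdentitySketch` and `plus` are hypothetical names. It relies on two documented behaviors: the String.concat Javadoc says this String object is returned when the argument has length 0, and JLS 15.18.1 says the result of `+` is newly created unless the expression is a constant expression.

```java
final class ConcatIdentitySketch {
    // The parameters are not constant expressions, so `+` is evaluated
    // at run time and must produce a newly created String.
    static String plus(String a, String b) {
        return a + b;
    }

    public static void main(String[] args) {
        String s = "abc";
        String empty = "";
        // true: concat may hand back the receiver for an empty argument
        System.out.println(s.concat(empty) == s);
        // false: the + operator has to create a new object
        System.out.println(plus(s, empty) == s);
        // true: the contents are still equal
        System.out.println(plus(s, empty).equals(s));
    }
}
```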

@wenshao
Contributor Author

wenshao commented Jul 19, 2024

@liach is this what you want to change it to?

class String {
    public String concat(String str) {
        if (str.isEmpty()) {
            return this;
        }
        return StringConcatHelper.doConcat(this, str);
    }
}

class StringConcatHelper {
    @ForceInline
    static String simpleConcat(Object first, Object second) {
        String s1 = stringOf(first);
        String s2 = stringOf(second);
        if (s1.isEmpty()) {
            // newly created string required, see JLS 15.18.1
            return new String(s2);
        }
        if (s2.isEmpty()) {
            // newly created string required, see JLS 15.18.1
            return new String(s1);
        }
        return doConcat(s1, s2);
    }

    @ForceInline
    static String doConcat(String s1, String s2) {
        byte coder = (byte) (s1.coder() | s2.coder());
        int newLength = (s1.length() + s2.length()) << coder;
        byte[] buf = newArray(newLength);
        s1.getBytes(buf, 0, coder);
        s2.getBytes(buf, s1.length(), coder);
        return new String(buf, coder);
    }
}
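
For readers without access to JDK internals, the byte-level idea behind doConcat can be tried with public APIs only. The following is a minimal sketch, not the patch itself: `coderOf` and `putChars` are hypothetical stand-ins for the package-private String.coder() and String.getBytes(byte[], int, byte), and big-endian byte order is an assumption made so that the buffer can be decoded with a standard charset.

```java
import java.nio.charset.StandardCharsets;

final class DoConcatSketch {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    // Hypothetical stand-in for String.coder(): scans for chars above 0xFF.
    static byte coderOf(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Hypothetical stand-in for String.getBytes(byte[], int, byte):
    // writes the chars of s into dst starting at the given char index.
    static void putChars(String s, byte[] dst, int charIndex, byte coder) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int pos = (charIndex + i) << coder;
            if (coder == LATIN1) {
                dst[pos] = (byte) c;
            } else {
                dst[pos] = (byte) (c >> 8); // high byte first (assumed big-endian)
                dst[pos + 1] = (byte) c;
            }
        }
    }

    static String doConcat(String s1, String s2) {
        // The result is UTF16 as soon as either input is UTF16.
        byte coder = (byte) (coderOf(s1) | coderOf(s2));
        byte[] buf = new byte[(s1.length() + s2.length()) << coder];
        putChars(s1, buf, 0, coder);
        putChars(s2, buf, s1.length(), coder);
        return new String(buf, coder == LATIN1
                ? StandardCharsets.ISO_8859_1
                : StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(doConcat("abc", "def"));
    }
}
```

The sketch copies each input exactly once into a buffer of the final size, which is the property the proposed doConcat has as well; the real method avoids the scan and the final decode because it works on the strings' internal byte arrays directly.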

@liach
Member

liach commented Jul 19, 2024

Yep. I think this looks cleaner and avoids redundant checks on the String.concat path. What do you think? Is this slower according to your benchmarks?

@liach
Member

liach commented Jul 19, 2024

This patch looks ideal to me; but other reviewers may have different opinions on this review. I am ready to answer their questions for you.

I will approve your patch once you have fixed the null and empty checks in String.concat and the builds look good. Don't change your patch too often, as each change invalidates the previous build results.

wenshao and others added 2 commits July 20, 2024 05:37
@wenshao
Contributor Author

wenshao commented Jul 19, 2024

Below are the benchmark results for the latest version on a MacBook M1 Pro; it is 9.19% faster than the baseline.

-Benchmark                             (intValue)  Mode  Cnt  Score   Error  Units (base c25c4896ad9ef031e3cddec493aef66ff87c48a7)
-StringConcat.concatMethodConstString        4711  avgt   15  5.440 ± 0.075  ns/op

+Benchmark                             (intValue)  Mode  Cnt  Score   Error  Units (current 69901157e4dae9018abd727956f60fd11b8fa252)
+StringConcat.concatMethodConstString        4711  avgt   15  4.982 ± 0.019  ns/op +9.19%

@RogerRiggs
Contributor

I'll take a look at this next week.

@RogerRiggs
Contributor

/reviewers 2 reviewer

@openjdk

openjdk bot commented Jul 19, 2024

@RogerRiggs
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

@cl4es
Member

cl4es commented Jul 19, 2024

FWIW, one of the ideas when implementing StringConcatHelper.simpleConcat was that, by using the primitives used by the StringConcatFactory as straightforwardly as possible, the method acts as documentation of sorts, or a guide to understanding how the SCF expression trees are built up. It's not perfect, though: concatenation of a String + constant would be handled differently, for one. So perhaps its value as a guide is not high.

I'm also experimenting with replacing the MH-based strategy with spinning hidden, shareable classes instead. In that case we might actually be better off getting rid of the long indexCoder hacks and generating code more similar to the doConcat you've come up with here. The main benefit of combining coder and index/length into a single long was to reduce the MH combinator overheads, but if we're spinning code with access to optimized primitives then that isn't really needed.
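
As a rough illustration of what combining coder and index/length into a single long can look like, here is a simplified sketch. It is an assumption-laden model, not the JDK source: the bit layout (length in the low 32 bits, coder flag above them) and the names `IndexCoderSketch`, `mix`, `length`, `coder` and `isLatin1` are chosen for this example only.

```java
final class IndexCoderSketch {
    // Assumed layout: low 32 bits hold the accumulated length,
    // bit 32 holds the coder flag (0 = LATIN1, 1 = UTF16).
    static final long LATIN1 = 0L;
    static final long UTF16 = 1L << 32;

    // Public-API approximation of an "is this string Latin-1" check.
    static boolean isLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    // Folds one more argument into the running state: adds its length
    // and upgrades the coder if the argument needs two bytes per char.
    static long mix(long indexCoder, String value) {
        indexCoder += value.length();
        if (!isLatin1(value)) {
            indexCoder |= UTF16;
        }
        return indexCoder;
    }

    static int length(long indexCoder) {
        return (int) indexCoder;
    }

    static int coder(long indexCoder) {
        return (int) (indexCoder >>> 32);
    }

    public static void main(String[] args) {
        long state = mix(mix(LATIN1, "abc"), "\u4e2d\u6587");
        System.out.println(length(state) + " chars, coder " + coder(state));
    }
}
```

Because a single long carries both values, each step in a chain of method handle combinators only has to thread one value through, which matches the motivation given above. Generated code that can keep the coder and the length in separate local variables does not need this packing.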

@wenshao
Contributor Author

wenshao commented Jul 22, 2024

If code is generated through ClassFile, how can it access the private methods of classes such as String/StringConcatHelper?

In that case, you have to encode those methods as live MethodHandle objects, passed to the generated class with defineHiddenClassWithClassData (just pass the class data object). They can be retrieved in generated code with a CONDY using MethodHandles.classData. (This is how the MethodHandle LambdaForm classes refer to other MethodHandles.)

You don't really need to worry about experimenting with bytecode generation right now. This patch as is is very good, and tier 1-3 tests pass. Bytecode gen is aimed at more complex scenarios with multiple constants and various types of arguments instead of this simple concat case.

@liach Can you give some sample code?

@liach
Member

liach commented Jul 22, 2024

@wenshao Answered at #20273 (comment) since this question is more closely related to that patch.

…at_202407

# Conflicts:
#	src/java.base/share/classes/java/lang/StringConcatHelper.java
Member

@cl4es cl4es left a comment


This change seems reasonable, and we will need the newArray(int) method in future work. This patch dials back on the idea that simpleConcat is an "explainer" for what StringConcatFactory is doing, but as it was already imprecise in that regard we should favor simplicity and performance here.

@openjdk

openjdk bot commented Jul 23, 2024

@wenshao this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout optim_simple_concat_202407
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jul 23, 2024
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Jul 23, 2024
Member

@liach liach left a comment


Merge conflict fix looks good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 23, 2024
@liach
Member

liach commented Jul 23, 2024

Please hold off a few hours before integration: I will see if @RogerRiggs has any concerns, and I will run the CI tests for your patch again.

Contributor

@RogerRiggs RogerRiggs left a comment


looks good.

@liach
Member

liach commented Jul 23, 2024

Tier 1-3 tests pass.

@wenshao
Contributor Author

wenshao commented Jul 23, 2024

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jul 23, 2024
@openjdk

openjdk bot commented Jul 23, 2024

@wenshao
Your change (at version a2be39a) is now ready to be sponsored by a Committer.

@liach
Member

liach commented Jul 23, 2024

/sponsor

@openjdk

openjdk bot commented Jul 23, 2024

Going to push as commit 476d2ae.
Since your change was applied there have been 6 commits pushed to the master branch:

  • 4c91d5c: 8322133: getParameterSpec(ECGenParameterSpec.class) on EC AlgorithmParameters does not return standard names
  • 2f2223d: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations
  • 8efcb40: 8335823: Update --release 23 symbol information for JDK 23 build 33
  • 8e1f17e: 8327054: DiagnosticCommand Compiler.perfmap does not log on output()
  • 0e555b5: 8204582: Extra spaces in jlink documentation make it incorrect.
  • a2a236f: 8335939: Hide element writing across the ClassFile API

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 23, 2024
@openjdk openjdk bot closed this Jul 23, 2024
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated), rfr (Pull request is ready for review), and sponsor (Pull request is ready to be sponsored) labels Jul 23, 2024
@openjdk

openjdk bot commented Jul 23, 2024

@liach @wenshao Pushed as commit 476d2ae.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated

5 participants