Skip to content

8344706: Compiler Implementation of Compact Source Files and Instance Main Methods #24438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 36 commits into
base: master
Choose a base branch
from

Conversation

lahodaj
Copy link
Contributor

@lahodaj lahodaj commented Apr 4, 2025

This is a PR that implements JEP: Compact Source Files and Instance Main Methods. Changes include:

  • java.io.IO moved to java.lang.IO, and no longer uses System.console() to implement the methods (thanks to @stuart-marks)
  • java. ... .IO is no longer automatically imported in any compilation unit
  • the feature is finalized (i.e. no longer requires --enable-preview)

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8353437 to be approved
  • Change requires a JEP request to be targeted

Integration blocker

 ⚠️ Title mismatch between PR and JBS for issue JDK-8344706

Issues

  • JDK-8344706: Implement JEP 512: Compact Source Files and Instance Main Methods (Sub-task - P4) ⚠️ Title mismatch between PR and JBS.
  • JDK-8344699: JEP 512: Compact Source Files and Instance Main Methods (JEP)
  • JDK-8353437: Implement JEP 512: Compact Source Files and Instance Main Methods (CSR)

Reviewers

Contributors

  • Stuart Marks <smarks@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24438/head:pull/24438
$ git checkout pull/24438

Update a local copy of the PR:
$ git checkout pull/24438
$ git pull https://git.openjdk.org/jdk.git pull/24438/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24438

View PR using the GUI difftool:
$ git pr show -t 24438

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24438.diff

Using Webrev

Link to Webrev Comment

Sorry, something went wrong.

stuart-marks and others added 21 commits January 13, 2025 19:41
…spec.

Augment class spec to discuss buffering of bytes from System.in. Change
"terminates the line" with "writes a line separator".
@bridgekeeper
Copy link

bridgekeeper bot commented Apr 4, 2025

👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

⚠️ @lahodaj This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Apr 4, 2025
@lahodaj
Copy link
Contributor Author

lahodaj commented Apr 4, 2025

/contributor add @stuart-marks

@lahodaj
Copy link
Contributor Author

lahodaj commented Apr 4, 2025

/jep JDK-8344699

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

@lahodaj
Contributor Stuart Marks <smarks@openjdk.org> successfully added.

@openjdk
Copy link

openjdk bot commented Apr 4, 2025

@lahodaj
This pull request will not be integrated until the JEP JDK-8344699 has been targeted.

Comment on lines 121 to 122
System.out.print(obj);
System.out.flush();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth using a local variable to avoid calling print and flush on different streams in case System.out is reassigned in between?

var out = System.out;
out.print(obj);
out.flush();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not that perf sensitive, guess it is fine as-is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a correctness angle. If you print something to a stream you want to flush the same stream. If System.out is changed in between (unlikely, but possible) then output destined for the old stream could be stranded in its buffer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated as suggested; thanks @lukellmann .

Comment on lines +140 to +147
* @throws IOError if an I/O error occurs
*/
public static String readln() {
try {
return reader().readLine();
} catch (IOException ioe) {
throw new IOError(ioe);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was UncheckedIOException considered? It might be better for beginners to learn throwing exceptions instead of errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is not to be emulated by beginners. If an Exception.is thrown here, a user might be tempted to add handlers in their code, while this is an issue with the setup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone is bound to ask why the readln method throw but the println methods don't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IOError was carried over from Console.readLine(), which throws IOError on error. Of course we're decoupled from Console now, but that was the original reason.

My guess is that Console methods throw IOError because (a) they didn't want to make everybody catch IOException, (b) they didn't want to maintain an internal error state like PrintStream, and (c) they wanted to throw a throwable whose type conveyed the meaning that it wasn't expected to be handled. IOError and Console were both added in JDK 1.6. Maybe that reasoning still applies here.

UncheckedIOException wasn't added until JDK 1.8. Note also that UncheckedIOException has a cause of IOException, so it can't arise from some other throwable. This was used in the previous implementation, where IOError was thrown if Console was null. That doesn't apply anymore, but IOError is still somewhat more flexible than UncheckedIOException in this regard.

I think we need to say something, implicitly or explicitly, about error handling. Note that PrintStream has this internal error state so output is covered already. For input it seems like IOError is reasonable, but maybe there's a better choice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The readln methods handle malformed-input and unmappable-character errors by dropping, and using a replacement value. So erroneous input doesn't throw IOError with a CharacterCodingException as the cause.

System.in.close() would a be wild thing to do, but a case where readln would throw IOError.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Does the specification here need to state the policy around malformed input and unmappable characters? (Or, maybe in the class specification.)

People have ended up in cases where System.in is closed. It can occur when combining use of Scanner to read from System.in and applying the anything-that-you-open-must-be-closed rule, e.g.,

try (Scanner sc = new Scanner(System.in)) {
    String s = sc.nextLine();
}

which is a mistake, but it takes a long time to explain....

Copy link
Member

@stuart-marks stuart-marks Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a clause to the class specification that covers malformed/unmappable byte sequences.

@lahodaj lahodaj marked this pull request as ready for review April 7, 2025 06:57
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 7, 2025
@mlbridge
Copy link

mlbridge bot commented Apr 7, 2025

Webrevs

Copy link
Member

@sormuras sormuras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in source launcher related tests look good.

* If this property is not present, or if the charset it names cannot be loaded, then
* UTF-8 is used instead. These internal objects are created upon the first call to
* either of the {@code readln} methods and are stored for subsequent reuse by these
* methods.
Copy link
Contributor

@AlanBateman AlanBateman Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuart-marks Can we rephrase this paragraph so that it doesn't use phrase "internal objects"? The class does speak of buffering and how it might impact code that mixes use of System.in and IO.readln so I agree with that part it's just the "internal objects" phrase that is confusing to read in this class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used "internal objects" because I wanted to avoid naming concrete classes, which might or might not be used. Previous drafts mentioned BufferedReader, InputStreamReader, and CharsetDecoder.

I could replace "internal objects" with something more descriptive like "objects to handle buffering and charset decoding" but I'd still need a noun phrase to refer to them later. Maybe "buffering and decoding objects" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rephrase the 1st sentence to read:
"The readln() and readln(String) methods in this class decode bytes read from System.in into characters."

And remove the last "these internal objects" sentence.

And for the next paragraph, change the first line to:
"The readln methods may buffer additional bytes..."

So we remove references to "internal objects."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in the midst of rewriting all of this stuff. The main concern here is that applications can use System.in more or less as they please (presumably without buffering) until the first call to a readln method, at which time something happens and effectively the application cannot use System.in anymore. I'll post the revision when I've finished it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there reason to create a test subdirectory IO under java.lang.IO package?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed this comment. Fixed here:
bacc1a2

@@ -51,7 +51,7 @@
/*
* @test
* @bug 8305457 8342936 8351435
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue id needs to be added (and other test files too)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed this comment. Fixed here:
5d02ff6

Copy link
Contributor

@vicente-romero-oracle vicente-romero-oracle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javac compiler related code and tests lgtm

* Charset decoding is set up upon the first call to one of the {@code readln} methods.
* Decoding may buffer additional bytes beyond those that have been decoded to characters
* returned to the application. After the first call to one of the {@code readln} methods,
* any subsequent use of {@code System.in} results in unspecified behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add a note about mixing API usage into System.in too. There are tutorials and books that show examples build on System.in that will add buffering on that input source.

Copy link
Member

@stuart-marks stuart-marks Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea. I'll handle this separately. I don't want it to refer to a Preview API. I'll probably just use Scanner and InputStreamReader as examples. (Oh wait, this supposedly isn't a Preview API anymore...)

* The {@link #readln()} and {@link #readln(String)} methods decode bytes read from
* {@code System.in} into characters. The charset used for decoding is specified by the
* {@link System#getProperties stdin.encoding} property. If this property is not present,
* or if the charset it names cannot be loaded, then UTF-8 is used instead.
Copy link
Contributor

@AlanBateman AlanBateman Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdout.encoding and stderr.encoding are specified (in System.getProperties) to lead to unspecified behavior if started with either property set to a value other than UTF-8. We should work through the issues of introducing stdin.encoding as soon as we can, esp. the redirect cases and whether there is use cases for setting any of these properties on the command line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, believe it or not, I am still studying this...

Copy link
Contributor

@vicente-romero-oracle vicente-romero-oracle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

* <p>
* The {@link #readln()} and {@link #readln(String)} methods decode bytes read from
* {@code System.in} into characters. The charset used for decoding is specified by the
* {@link System#getProperties stdin.encoding} property. If this property is not present,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuart-marks Are you planning to propose/integrate pull/24738 in advance of the JEP update? Asking because this paragraph will need to be adjusted if pull/24738 doesn't go in first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on how close you think 24738 is to being ready. As it stands it seems ok; but I'm still looking at the potential impact on other part of the JDK. In particular are there other JDK APIs that should be adjusted to mention stdin.encoding?

As for the impact here, if 24738 doesn't go in first, then I'm not quite sure what this should say. I guess it could say "default charset" or something and then it could be amended with a bugfix later.

if (isPreview && mainMethod == null) {
if (mainMethod == null) {
//if not public method, try to lookup a non-public one
mainMethod = JLA.findMethod(cls, false, "main", String[].class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why the first findMethod can't use publicOnly=false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reason was that when publicOnly=false, there's more classloading happening (unsurprisingly, as also the non-public methods must be resolved). Which may lead to a change of behavior if the traditional public static main(String[]) exists.

IIRC there is a test relying on this behavior, but I don't have the name offhand. Will put it here when I have it.

For example:

$ cat Main.java 
public class Main {
    public static void main(String... args) {
        System.err.println("Hello!");
    }
    private Missing get() { return null; }
}
$ cat Missing.java 
public class Missing {
}
$ jdk-24/bin/javac Main.java Missing.java; rm Missing.class
$ jdk-24/bin/java -classpath . Main
Hello!
$ jdk-without-publicOnly=true/bin/java -classpath . Main
Error: Unable to initialize main class Main
Caused by: java.lang.NoClassDefFoundError: Missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests that would be affected by the extra classloading appear to be:

test/jdk/java/lang/System/SecurityManagerWarnings.java
test/jdk/java/lang/System/LoggerFinder/SignedLoggerFinderTest/SignedLoggerFinderTest.java

@@ -148,100 +147,6 @@ public Reader reader() {
throw newUnsupportedOperationException();
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copyright header will need to be updated on the changes to java.base as they seem to be first update to these sources in 2025.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here:
44fccf3

Copy link
Member

@naotoj naotoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I guess the coordination with Stuart is underway

@openjdk
Copy link

openjdk bot commented Apr 23, 2025

@lahodaj this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout finalize-concise-source-files
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Apr 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler compiler-dev@openjdk.org core-libs core-libs-dev@openjdk.org csr Pull request needs approved CSR before integration jep kulla kulla-dev@openjdk.org merge-conflict Pull request has merge conflict with target branch rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

None yet

8 participants