-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8344706: Compiler Implementation of Compact Source Files and Instance Main Methods #24438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…spec. Augment class spec to discuss buffering of bytes from System.in. Change "terminates the line" with "writes a line separator".
…concise-source-files
👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
|
/contributor add @stuart-marks |
/jep JDK-8344699 |
@lahodaj |
@lahodaj |
System.out.print(obj); | ||
System.out.flush(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth using a local variable to avoid calling print
and flush
on different streams in case System.out
is reassigned in between?
var out = System.out;
out.print(obj);
out.flush();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is not that perf sensitive, guess it is fine as-is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a correctness angle. If you print something to a stream you want to flush the same stream. If System.out is changed in between (unlikely, but possible) then output destined for the old stream could be stranded in its buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated as suggested; thanks @lukellmann .
* @throws IOError if an I/O error occurs | ||
*/ | ||
public static String readln() { | ||
try { | ||
return reader().readLine(); | ||
} catch (IOException ioe) { | ||
throw new IOError(ioe); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was UncheckedIOException
considered? It might be better for beginners to learn throwing exceptions instead of errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation is not to be emulated by beginners. If an Exception.is thrown here, a user might be tempted to add handlers in their code, while this is an issue with the setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone is bound to ask why the readln method throw but the println methods don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IOError was carried over from Console.readLine(), which throws IOError on error. Of course we're decoupled from Console now, but that was the original reason.
My guess is that Console methods throw IOError because (a) they didn't want to make everybody catch IOException, (b) they didn't want to maintain an internal error state like PrintStream, and (c) they wanted to throw a throwable whose type conveyed the meaning that it wasn't expected to be handled. IOError and Console were both added in JDK 1.6. Maybe that reasoning still applies here.
UncheckedIOException wasn't added until JDK 1.8. Note also that UncheckedIOException has a cause of IOException, so it can't arise from some other throwable. This was used in the previous implementation, where IOError was thrown if Console was null. That doesn't apply anymore, but IOError is still somewhat more flexible than UncheckedIOException in this regard.
I think we need to say something, implicitly or explicitly, about error handling. Note that PrintStream has this internal error state so output is covered already. For input it seems like IOError is reasonable, but maybe there's a better choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The readln methods handle malformed-input and unmappable-character errors by dropping, and using a replacement value. So erroneous input doesn't throw IOError with a CharacterCodingException as the cause.
System.in.close() would a be wild thing to do, but a case where readln would throw IOError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Does the specification here need to state the policy around malformed input and unmappable characters? (Or, maybe in the class specification.)
People have ended up in cases where System.in is closed. It can occur when combining use of Scanner to read from System.in and applying the anything-that-you-open-must-be-closed rule, e.g.,
try (Scanner sc = new Scanner(System.in)) {
String s = sc.nextLine();
}
which is a mistake, but it takes a long time to explain....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a clause to the class specification that covers malformed/unmappable byte sequences.
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes in source launcher related tests look good.
* If this property is not present, or if the charset it names cannot be loaded, then | ||
* UTF-8 is used instead. These internal objects are created upon the first call to | ||
* either of the {@code readln} methods and are stored for subsequent reuse by these | ||
* methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stuart-marks Can we rephrase this paragraph so that it doesn't use phrase "internal objects"? The class does speak of buffering and how it might impact code that mixes use of System.in and IO.readln so I agree with that part it's just the "internal objects" phrase that is confusing to read in this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used "internal objects" because I wanted to avoid naming concrete classes, which might or might not be used. Previous drafts mentioned BufferedReader, InputStreamReader, and CharsetDecoder.
I could replace "internal objects" with something more descriptive like "objects to handle buffering and charset decoding" but I'd still need a noun phrase to refer to them later. Maybe "buffering and decoding objects" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rephrase the 1st sentence to read:
"The readln()
and readln(String)
methods in this class decode bytes read from System.in
into characters."
And remove the last "these internal objects" sentence.
And for the next paragraph, change the first line to:
"The readln
methods may buffer additional bytes..."
So we remove references to "internal objects."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in the midst of rewriting all of this stuff. The main concern here is that applications can use System.in more or less as they please (presumably without buffering) until the first call to a readln
method, at which time something happens and effectively the application cannot use System.in anymore. I'll post the revision when I've finished it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
test/jdk/java/lang/IO/IO/IO.java
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there reason to create a test subdirectory IO
under java.lang.IO
package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed this comment. Fixed here:
bacc1a2
test/jdk/java/lang/IO/IO/IO.java
Outdated
@@ -51,7 +51,7 @@ | |||
/* | |||
* @test | |||
* @bug 8305457 8342936 8351435 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue id needs to be added (and other test files too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed this comment. Fixed here:
5d02ff6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javac compiler related code and tests lgtm
…jects."
* Charset decoding is set up upon the first call to one of the {@code readln} methods. | ||
* Decoding may buffer additional bytes beyond those that have been decoded to characters | ||
* returned to the application. After the first call to one of the {@code readln} methods, | ||
* any subsequent use of {@code System.in} results in unspecified behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should add a note about mixing API usage into System.in too. There are tutorials and books that show examples build on System.in that will add buffering on that input source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea. I'll handle this separately. I don't want it to refer to a Preview API. I'll probably just use Scanner and InputStreamReader as examples. (Oh wait, this supposedly isn't a Preview API anymore...)
* The {@link #readln()} and {@link #readln(String)} methods decode bytes read from | ||
* {@code System.in} into characters. The charset used for decoding is specified by the | ||
* {@link System#getProperties stdin.encoding} property. If this property is not present, | ||
* or if the charset it names cannot be loaded, then UTF-8 is used instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stdout.encoding and stderr.encoding are specified (in System.getProperties) to lead to unspecified behavior if started with either property set to a value other than UTF-8. We should work through the issues of introducing stdin.encoding as soon as we can, esp. the redirect cases and whether there is use cases for setting any of these properties on the command line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, believe it or not, I am still studying this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
* <p> | ||
* The {@link #readln()} and {@link #readln(String)} methods decode bytes read from | ||
* {@code System.in} into characters. The charset used for decoding is specified by the | ||
* {@link System#getProperties stdin.encoding} property. If this property is not present, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stuart-marks Are you planning to propose/integrate pull/24738 in advance of the JEP update? Asking because this paragraph will need to be adjusted if pull/24738 doesn't go in first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on how close you think 24738 is to being ready. As it stands it seems ok; but I'm still looking at the potential impact on other part of the JDK. In particular are there other JDK APIs that should be adjusted to mention stdin.encoding
?
As for the impact here, if 24738 doesn't go in first, then I'm not quite sure what this should say. I guess it could say "default charset" or something and then it could be amended with a bugfix later.
if (isPreview && mainMethod == null) { | ||
if (mainMethod == null) { | ||
//if not public method, try to lookup a non-public one | ||
mainMethod = JLA.findMethod(cls, false, "main", String[].class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why the first findMethod can't use publicOnly=false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reason was that when publicOnly=false
, there's more classloading happening (unsurprisingly, as also the non-public methods must be resolved). Which may lead to a change of behavior if the traditional public static main(String[])
exists.
IIRC there is a test relying on this behavior, but I don't have the name offhand. Will put it here when I have it.
For example:
$ cat Main.java
public class Main {
public static void main(String... args) {
System.err.println("Hello!");
}
private Missing get() { return null; }
}
$ cat Missing.java
public class Missing {
}
$ jdk-24/bin/javac Main.java Missing.java; rm Missing.class
$ jdk-24/bin/java -classpath . Main
Hello!
$ jdk-without-publicOnly=true/bin/java -classpath . Main
Error: Unable to initialize main class Main
Caused by: java.lang.NoClassDefFoundError: Missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests that would be affected by the extra classloading appear to be:
test/jdk/java/lang/System/SecurityManagerWarnings.java
test/jdk/java/lang/System/LoggerFinder/SignedLoggerFinderTest/SignedLoggerFinderTest.java
@@ -148,100 +147,6 @@ public Reader reader() { | |||
throw newUnsupportedOperationException(); | |||
} | |||
|
|||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright header will need to be updated on the changes to java.base as they seem to be first update to these sources in 2025.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed here:
44fccf3
This reverts commit 7047cf0.
…nto finalize-concise-source-files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I guess the coordination with Stuart is underway
@lahodaj this pull request can not be integrated into git checkout finalize-concise-source-files
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
This is a PR that implements JEP: Compact Source Files and Instance Main Methods. Changes include:
java.io.IO
moved tojava.lang.IO
, and no longer usesSystem.console()
to implement the methods (thanks to @stuart-marks)java. ... .IO
is no longer automatically imported in any compilation unit--enable-preview
)Progress
Integration blocker
Issues
Reviewers
Contributors
<smarks@openjdk.org>
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24438/head:pull/24438
$ git checkout pull/24438
Update a local copy of the PR:
$ git checkout pull/24438
$ git pull https://git.openjdk.org/jdk.git pull/24438/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24438
View PR using the GUI difftool:
$ git pr show -t 24438
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24438.diff
Using Webrev
Link to Webrev Comment