New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8304013: Add a fast, non-manual alternative to test/jdk/java/util/zip/ZipFile/TestTooManyEntries #12231
Conversation
…d of central directory" records. This removes the need to a create several GB big "real" ZIP64 file. To facilitate testing, the validation of CEN size > limit is moved from initCEN into findEnd. Since the test now runs fast, make it non-manual.
👋 Welcome back eirbjo! A progress list of the required criteria for merging this PR into |
… cases for regular CEN sizes exactly on and exceeding the limit. Move enforcement of the size limit back to initCEN, but before the short-circuiting of empty files.
…ze to the wanted sizes. Remove the ZIP64 logic, since the purpose of the test is to verify rejection of CEN sizes too large, not where the CEN is found.
… runs fast and non-manual
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates.
Please see the additional suggestions below
* The resulting ZIP is technically not valid, but it does allow us | ||
* to test that the large CEN size is rejected | ||
*/ | ||
private Path zipWithCenSize(String name, int cenSize) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could speed this up a bit more by just creating the array representing the original Zip 1 time and modify it each pass.(though it might not speed things up that much)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I introduced the templateZip() method which returns a byte[]. While the speed is insignificat here, I think it makes it more explicit that we're producing ZIPs from a template.
private static final int ENDHDR = 22; // End of central directory record size | ||
private static final int CEN_SIZE_OFFSET = 12; // Offset of CEN size field within ENDHDR | ||
private Path huge; // ZIP file with CEN size exceeding limit | ||
private Path big; // ZIP file with CEN size exactly on the limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might consider more meaningful names for the Paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed these fields now and instead produce the files in the test methods directly, removing the need to name the fields.
import static org.testng.Assert.assertTrue; | ||
import static org.testng.Assert.fail; | ||
|
||
public class CenSizeTooLarge { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a comment which also points to the original test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment referring to TestTooManyEntries explaining why this test exists.
*/ | ||
private void assertRejected(Path zip, String expectedMsg) throws IOException { | ||
try (ZipFile zf = new ZipFile(zip.toFile())) { | ||
fail("Expected ZipFile to throw ZipException"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could consider using expectThrows and then validation the value of getMessage()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with expectedExceptions and expectedExceptionsMessageRegExp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I finally understand what you meant by this, I've update to use expectThrows and assertEquals :-)
…ment referencing existing CenSizeTooLarge, clarify MAX_CEN_SIZE, use expectedExceptions and expectedExceptionsMessageRegExp
… declared in the End of central directory record. Add a new test which verifies the rejection of too-large CEN offset. Revert the change to END validation in ZipFile which is no longer needed.
/issue 8304013 |
Thanks for all your help and careful reviews getting this ball across the net, Lance! Feels good to have one less manual test. /integrate |
import static org.testng.Assert.*; | ||
|
||
/** | ||
* This test augments {@link TestTooManyEntries}. It creates sparse ZIPs where the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the "the" stutter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the" stutter has been removed 👍
|
||
/** | ||
* Validate that an end of central directory record with a | ||
* CEN size which exceeds the position of the EOC record is rejected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EOC is not a standard zip file technical term. Did you mean "END header" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If by "standard" you mean "APPNOTE.TXT", then that uses "end of central directory record" which is a bit long.
The Java implementation seems to use END, like in ZipFile.ENDHDR
, "END header", etc.
It is probably better to be consistent than correct here, so I've changed the test to use "END header" consistently when referring to the "end of central directory record".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, what I would suggest is adding a comment somewhere regarding end of central directory record and note the reference to ENDHDR or END Header
I don't think you want to spend too much time here just the wording reference the end of central directory record was clear and where you were abbreviating, then I would reference ENDHDR
I would be looking at the APP.NOTE when looking at this test, not the ZipFile constant verbiage, but that is just me :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither APPNOTE nor JDK doc zip terminology is great, but a test is not the place to invent something better.
(But I'm surprised no one seems to be using "EOCD")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be “EOCDR”. (Hides behind the bike shed..)
/** | ||
* Validate that an end of central directory record with a CEN offset which | ||
* is larger than the EOC position minus the CEN size is rejected | ||
* @throws IOException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think @throws
javadoc comments in tests are useful; especially checked ones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recently added a @throws
by review comment from Lance. I tend to agree that is not very useful, so I'm removing both ocurrences as per your recommendation.
If Lance wants it back, he can add a review comment saying so. I don't particularly care which professor gets the last word in this matter :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add them back before pushing with @throws IOException if an error occurs
Yes I understand they do not add much value but it will reduce noise from IDEs and if/when we add more checks for missing javadoc tags, it will save us from having to revisit this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've added @throws IOException if an error occurs
to all methods throwing IOException
.
For the record, let me state my personal (rather strong) opinion:
I think this a pretty crazy level of boilerplate noise for a test and that it affects readability negatively. My IDE (IntelliJ) does not complain about this at all. Any IDE which complains should be fixed or configured to not complain. Changing Javadoc to complain about this in a scope like this would be a poor decision.
I think the PR is ready to be sponsored after this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear you but the option is not to use javadoc comments and use block comments :-)
I think we can agree to disagree on impacting readability ;-) this comes down to personal preferences
Again thank you for making the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the PR is ready to be sponsored after this.
Please see my comment regarding the end of central directory record.
I would prefer to tweak that in a fashion similar to what I indicated as I thought your original version was clearer. I understand what Martin was indicating of the use of EOC and that could have been addressed by adding (EOC) or ENDHDR after end of central directory record to make it clearer
I will leave it up to you as to whether you want to make the change but it would be clearer as I think we took a small step backwards.
Either way you should not need a sponsor and should be good to integrated when you are ready
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear you but the option is not to use javadoc comments and use block comments :-)
I learned something new today: Javadoc comments and block comments are not the same!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I changed the references to this ultimate ZIP structure to read like this:
'End of central directory record' (END header)
Example:
/**
* Validate that an 'End of central directory record' (END header)
* where the value of the CEN size field exceeds the position of
* the END header is rejected.
This way we stick to the 'official' ZIP verbiage but also allow using the shorter 'END header'.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way you should not need a sponsor and should be good to integrated when you are ready
As a non-committer, I still think I need help from a committer to /sponsor
this change, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this a pretty crazy level of boilerplate noise for a test and that it affects readability negatively.
I'm with Eirik here.
I've maintained jdk tests for 20 years and have never had to worry about nagging to add javadoc boilerplate.
doclint should restrict itself to checking what's there, not to check for completeness, in tests and private APIs.
And it looks like the jdk makefiles are in fact set up to work this way.
I see:
./make/Docs.gmk:102:JAVADOC_DISABLED_DOCLINT_WARNINGS := missing
Especially for test methods, the contract of a test is very simple: "do nothing", but possibly throw an informative exception. If the compiler requires a throws declaration, then I like throws Throwable
which evokes this contract. A javadoc @throws
declaration is the sort of useless boilerplate that makes people hate Java.
/label remove sponsor |
@eirbjo
|
@LanceAndersen Would you like to do a final review after the changes suggested by Martin in his late review? It is mostly comment changes, and some stray variable renames. I have run the test locally after the last changes, and it passes on my machine. |
This reverts commit 623d374.
…ethods throwing IOException
…of central directory record' (END header)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the updates here Eirik. They look fine.
You are right, you do need a sponsor still and I will do so Monday morning so that Martin has a chance to provide any last input on your comment updates regarding the End of central directory record.
Thanks Lance! Allow me to underline just how much I appreciate that we can have differences of opinion while still being friends and making productive progress. Looking around the world, this is not always a given. I see this as a sign of a healty community. |
/sponsor |
@LanceAndersen The PR has been updated since the change author (@eirbjo) issued the |
/integrate |
/sponsor |
Going to push as commit 65e01da.
Your commit was automatically rebased without conflicts. |
@LanceAndersen @eirbjo Pushed as commit 65e01da. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The TestTooManyEntries test was originally added to validate that ZIP64 files with CEN sizes exceeding what ZipFile supports are rejected with a ZipException. The test does this by creating a large ZIP file (several gigabytes) with many enties. Because this is resource intensive, the test is currently tagged as manual. (See #6927)
It would be useful to have a test which asserts the CEN size enforcement, but without the CPU, disk, memory and run time requirements of TestTooManyEntries. Such a fast test can run non-manual, without the @requires and manual tags as found in TestTooManyEntries.
This PR adds the EndOfCenValidation test which creates sparse test ZIPs where the CEN is "inflated" such that is matches the size declared in the "End of central directory" records.
While thee sparse files look large, they consume very little disk space on file systems supporting sparse files:
For good measure, two new test methods are added to excercise the remaining ZipExceptions which ZipFile may throw during validation of the END record .
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12231/head:pull/12231
$ git checkout pull/12231
Update a local copy of the PR:
$ git checkout pull/12231
$ git pull https://git.openjdk.org/jdk.git pull/12231/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12231
View PR using the GUI difftool:
$ git pr show -t 12231
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12231.diff