Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8304013: Add a fast, non-manual alternative to test/jdk/java/util/zip/ZipFile/TestTooManyEntries #12231

Closed
wants to merge 29 commits into from

Conversation

eirbjo
Copy link
Contributor

@eirbjo eirbjo commented Jan 26, 2023

The TestTooManyEntries test was originally added to validate that ZIP64 files with CEN sizes exceeding what ZipFile supports are rejected with a ZipException. The test does this by creating a large ZIP file (several gigabytes) with many enties. Because this is resource intensive, the test is currently tagged as manual. (See #6927)

It would be useful to have a test which asserts the CEN size enforcement, but without the CPU, disk, memory and run time requirements of TestTooManyEntries. Such a fast test can run non-manual, without the @requires and manual tags as found in TestTooManyEntries.

This PR adds the EndOfCenValidation test which creates sparse test ZIPs where the CEN is "inflated" such that is matches the size declared in the "End of central directory" records.

While thee sparse files look large, they consume very little disk space on file systems supporting sparse files:

16 -rw-r--r--  1   2147483702 Feb  6 18:54 bad-cen-offset.zip
16 -rw-r--r--  1   2147483703 Feb  6 18:54 cen-size-too-large.zip
 8 -rw-r--r--  1          132 Feb  6 18:54 invalid-zen-size.zip

For good measure, two new test methods are added to excercise the remaining ZipExceptions which ZipFile may throw during validation of the END record .


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8304013: Add a fast, non-manual alternative to test/jdk/java/util/zip/ZipFile/TestTooManyEntries

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12231/head:pull/12231
$ git checkout pull/12231

Update a local copy of the PR:
$ git checkout pull/12231
$ git pull https://git.openjdk.org/jdk.git pull/12231/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 12231

View PR using the GUI difftool:
$ git pr show -t 12231

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12231.diff

…d of central directory" records. This removes the need to a create several GB big "real" ZIP64 file. To facilitate testing, the validation of CEN size > limit is moved from initCEN into findEnd. Since the test now runs fast, make it non-manual.
@bridgekeeper
Copy link

bridgekeeper bot commented Jan 26, 2023

👋 Welcome back eirbjo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 26, 2023

@eirbjo The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Jan 26, 2023
@eirbjo eirbjo changed the title Update TestTooManyEntries run non-manual with less resources Update TestTooManyEntries to run non-manual with less resources Jan 26, 2023
… cases for regular CEN sizes exactly on and exceeding the limit. Move enforcement of the size limit back to initCEN, but before the short-circuiting of empty files.
…ze to the wanted sizes. Remove the ZIP64 logic, since the purpose of the test is to verify rejection of CEN sizes too large, not where the CEN is found.
@eirbjo eirbjo changed the title Update TestTooManyEntries to run non-manual with less resources Add test CenSizeTooLarge as a fast, non-manual alternative to TestTooManyEntries Jan 30, 2023
Copy link
Contributor

@LanceAndersen LanceAndersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates.

Please see the additional suggestions below

* The resulting ZIP is technically not valid, but it does allow us
* to test that the large CEN size is rejected
*/
private Path zipWithCenSize(String name, int cenSize) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could speed this up a bit more by just creating the array representing the original Zip 1 time and modify it each pass.(though it might not speed things up that much)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced the templateZip() method which returns a byte[]. While the speed is insignificat here, I think it makes it more explicit that we're producing ZIPs from a template.

private static final int ENDHDR = 22; // End of central directory record size
private static final int CEN_SIZE_OFFSET = 12; // Offset of CEN size field within ENDHDR
private Path huge; // ZIP file with CEN size exceeding limit
private Path big; // ZIP file with CEN size exactly on the limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might consider more meaningful names for the Paths

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed these fields now and instead produce the files in the test methods directly, removing the need to name the fields.

import static org.testng.Assert.assertTrue;
import static org.testng.Assert.fail;

public class CenSizeTooLarge {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a comment which also points to the original test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment referring to TestTooManyEntries explaining why this test exists.

*/
private void assertRejected(Path zip, String expectedMsg) throws IOException {
try (ZipFile zf = new ZipFile(zip.toFile())) {
fail("Expected ZipFile to throw ZipException");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could consider using expectThrows and then validation the value of getMessage()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with expectedExceptions and expectedExceptionsMessageRegExp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I finally understand what you meant by this, I've update to use expectThrows and assertEquals :-)

…ment referencing existing CenSizeTooLarge, clarify MAX_CEN_SIZE, use expectedExceptions and expectedExceptionsMessageRegExp
… declared in the End of central directory record. Add a new test which verifies the rejection of too-large CEN offset. Revert the change to END validation in ZipFile which is no longer needed.
@eirbjo eirbjo changed the title Add test CenSizeTooLarge as a fast, non-manual alternative to TestTooManyEntries Add test EndOfCenValidation as a fast, non-manual alternative to TestTooManyEntries Feb 6, 2023
@eirbjo eirbjo changed the title Add test EndOfCenValidation as a fast, non-manual alternative to TestTooManyEntries Add test EndOfCenValidation as non-manual alternative to TestTooManyEntries Mar 11, 2023
@eirbjo eirbjo marked this pull request as draft March 11, 2023 07:35
@eirbjo eirbjo changed the title Add test EndOfCenValidation as non-manual alternative to TestTooManyEntries Add a fast, non-manual alternative to test/jdk/java/util/zip/ZipFile/TestTooManyEntries Mar 11, 2023
@eirbjo
Copy link
Contributor Author

eirbjo commented Mar 11, 2023

/issue 8304013

@openjdk openjdk bot changed the title Add a fast, non-manual alternative to test/jdk/java/util/zip/ZipFile/TestTooManyEntries 8304013: Add a fast, non-manual alternative to test/jdk/java/util/zip/ZipFile/TestTooManyEntries Mar 11, 2023
@eirbjo
Copy link
Contributor Author

eirbjo commented Mar 24, 2023

Thanks for all your help and careful reviews getting this ball across the net, Lance! Feels good to have one less manual test.

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 24, 2023
@openjdk
Copy link

openjdk bot commented Mar 24, 2023

@eirbjo
Your change (at version e348201) is now ready to be sponsored by a Committer.

import static org.testng.Assert.*;

/**
* This test augments {@link TestTooManyEntries}. It creates sparse ZIPs where the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the "the" stutter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the" stutter has been removed 👍


/**
* Validate that an end of central directory record with a
* CEN size which exceeds the position of the EOC record is rejected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EOC is not a standard zip file technical term. Did you mean "END header" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If by "standard" you mean "APPNOTE.TXT", then that uses "end of central directory record" which is a bit long.

The Java implementation seems to use END, like in ZipFile.ENDHDR, "END header", etc.

It is probably better to be consistent than correct here, so I've changed the test to use "END header" consistently when referring to the "end of central directory record".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what I would suggest is adding a comment somewhere regarding end of central directory record and note the reference to ENDHDR or END Header

I don't think you want to spend too much time here just the wording reference the end of central directory record was clear and where you were abbreviating, then I would reference ENDHDR

I would be looking at the APP.NOTE when looking at this test, not the ZipFile constant verbiage, but that is just me :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither APPNOTE nor JDK doc zip terminology is great, but a test is not the place to invent something better.

(But I'm surprised no one seems to be using "EOCD")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be “EOCDR”. (Hides behind the bike shed..)

/**
* Validate that an end of central directory record with a CEN offset which
* is larger than the EOC position minus the CEN size is rejected
* @throws IOException
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think @throws javadoc comments in tests are useful; especially checked ones

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recently added a @throws by review comment from Lance. I tend to agree that is not very useful, so I'm removing both ocurrences as per your recommendation.

If Lance wants it back, he can add a review comment saying so. I don't particularly care which professor gets the last word in this matter :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add them back before pushing with @throws IOException if an error occurs

Yes I understand they do not add much value but it will reduce noise from IDEs and if/when we add more checks for missing javadoc tags, it will save us from having to revisit this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've added @throws IOException if an error occurs to all methods throwing IOException.

For the record, let me state my personal (rather strong) opinion:

I think this a pretty crazy level of boilerplate noise for a test and that it affects readability negatively. My IDE (IntelliJ) does not complain about this at all. Any IDE which complains should be fixed or configured to not complain. Changing Javadoc to complain about this in a scope like this would be a poor decision.

I think the PR is ready to be sponsored after this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you but the option is not to use javadoc comments and use block comments :-)

I think we can agree to disagree on impacting readability ;-) this comes down to personal preferences

Again thank you for making the change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the PR is ready to be sponsored after this.

Please see my comment regarding the end of central directory record.

I would prefer to tweak that in a fashion similar to what I indicated as I thought your original version was clearer. I understand what Martin was indicating of the use of EOC and that could have been addressed by adding (EOC) or ENDHDR after end of central directory record to make it clearer

I will leave it up to you as to whether you want to make the change but it would be clearer as I think we took a small step backwards.

Either way you should not need a sponsor and should be good to integrated when you are ready

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you but the option is not to use javadoc comments and use block comments :-)

I learned something new today: Javadoc comments and block comments are not the same!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I changed the references to this ultimate ZIP structure to read like this:

'End of central directory record' (END header)

Example:

/**
     * Validate that an 'End of central directory record' (END header)
     * where the value of the CEN size field exceeds the position of
     * the END header is rejected.

This way we stick to the 'official' ZIP verbiage but also allow using the shorter 'END header'.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way you should not need a sponsor and should be good to integrated when you are ready

As a non-committer, I still think I need help from a committer to /sponsor this change, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this a pretty crazy level of boilerplate noise for a test and that it affects readability negatively.

I'm with Eirik here.

I've maintained jdk tests for 20 years and have never had to worry about nagging to add javadoc boilerplate.

doclint should restrict itself to checking what's there, not to check for completeness, in tests and private APIs.

And it looks like the jdk makefiles are in fact set up to work this way.

I see:
./make/Docs.gmk:102:JAVADOC_DISABLED_DOCLINT_WARNINGS := missing

Especially for test methods, the contract of a test is very simple: "do nothing", but possibly throw an informative exception. If the compiler requires a throws declaration, then I like throws Throwable which evokes this contract. A javadoc @throws declaration is the sort of useless boilerplate that makes people hate Java.

@eirbjo
Copy link
Contributor Author

eirbjo commented Mar 25, 2023

/label remove sponsor

@openjdk
Copy link

openjdk bot commented Mar 25, 2023

@eirbjo
The label sponsor is not a valid label.
These labels are valid:

  • serviceability
  • hotspot
  • hotspot-compiler
  • ide-support
  • kulla
  • i18n
  • shenandoah
  • jdk
  • javadoc
  • security
  • hotspot-runtime
  • jmx
  • build
  • nio
  • client
  • core-libs
  • compiler
  • net
  • hotspot-gc
  • hotspot-jfr

@openjdk openjdk bot removed the sponsor Pull request is ready to be sponsored label Mar 25, 2023
@eirbjo
Copy link
Contributor Author

eirbjo commented Mar 25, 2023

@LanceAndersen Would you like to do a final review after the changes suggested by Martin in his late review?

It is mostly comment changes, and some stray variable renames. I have run the test locally after the last changes, and it passes on my machine.

Copy link
Contributor

@LanceAndersen LanceAndersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the updates here Eirik. They look fine.

You are right, you do need a sponsor still and I will do so Monday morning so that Martin has a chance to provide any last input on your comment updates regarding the End of central directory record.

@eirbjo
Copy link
Contributor Author

eirbjo commented Mar 25, 2023

Thanks Lance! Allow me to underline just how much I appreciate that we can have differences of opinion while still being friends and making productive progress. Looking around the world, this is not always a given. I see this as a sign of a healty community.

@LanceAndersen
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented Mar 26, 2023

@LanceAndersen The PR has been updated since the change author (@eirbjo) issued the integrate command - the author must perform this command again.

@eirbjo
Copy link
Contributor Author

eirbjo commented Mar 26, 2023

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 26, 2023
@openjdk
Copy link

openjdk bot commented Mar 26, 2023

@eirbjo
Your change (at version a93248f) is now ready to be sponsored by a Committer.

@LanceAndersen
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented Mar 26, 2023

Going to push as commit 65e01da.
Since your change was applied there have been 28 commits pushed to the master branch:

  • 38e1714: 8304258: x86: Improve the code generation of VectorRearrange with int and float
  • 765a942: 8304136: Match allocation and free in sspi.cpp
  • 3f59b75: 8304898: Fix Copyright Headers for JLink Source Files
  • 501b606: 8298725: Add BitMap support for reverse iteration
  • 9764948: 8273986: JEditorPane HTML Demo - Accessibility issues
  • 5727610: 8304353: Add lib-test tier1 testing in GHA
  • d8ba227: 8304069: ClassFileParser has ad-hoc hashtables
  • 9a8a60f: 8304833: (fc) Remove dead code in sun.nio.ch.FileChannelImpl::implCloseChannel
  • f96aee7: 8291154: Create a non static nested class without enclosing class throws VerifyError
  • 4ec720d: 8297977: vmTestbase/nsk/stress/except/except012.java fails with unexpected Exception
  • ... and 18 more: https://git.openjdk.org/jdk/compare/63d4afbeb17df4eff0f65041926373ee62a8a33a...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 26, 2023
@openjdk openjdk bot closed this Mar 26, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Mar 26, 2023
@openjdk
Copy link

openjdk bot commented Mar 26, 2023

@LanceAndersen @eirbjo Pushed as commit 65e01da.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
3 participants