6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available #8235

bplb · 2022-04-14T01:40:50Z

Modify native multi-byte read-write code used by the java.io classes to limit the size of the allocated native buffer thereby decreasing off-heap memory footprint and increasing throughput.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/8235/head:pull/8235
$ git checkout pull/8235

Update a local copy of the PR:
$ git checkout pull/8235
$ git pull https://git.openjdk.org/jdk pull/8235/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8235

View PR using the GUI difftool:
$ git pr show -t 8235

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/8235.diff

… plenty available

bridgekeeper · 2022-04-14T01:42:12Z

👋 Welcome back bpb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bplb · 2022-04-14T01:45:35Z

Currently for java.io.FileInputStream.read(byte[],int,int) and java.io.FileOutputStream.write(byte[],int,int), for example, if the number of bytes respectively to be read or written exceeds 8192, an array of the total length of the read or write is allocated. For large reads or writes this could be significant. It is proposed to limit the maximum allocation size to 1 MB and perform the read or write by looping with this buffer. For reading or writing less than 1 MB, there is no effective change in the implementation.

This change both saves off-heap memory and increases throughput. An allocation of 1 MB is only 0.42% the size of the buffer in the JBS issue, 501 x 501 x 501 x 2 (= 251,503,002), so for this case the memory reduction is drastic. Reading throughput is almost doubled and writing throughput improved by about 50%. As measured on macOS, the throughput of the methods mentioned above before the change was:

Benchmark         Mode  Cnt   Score   Error  Units
ReadWrite.read   thrpt    5  10.108 ± 0.264  ops/s
ReadWrite.write  thrpt    5   7.188 ± 0.431  ops/s

and that after is:

Benchmark         Mode  Cnt   Score   Error  Units
ReadWrite.read   thrpt    5  20.112 ± 0.262  ops/s
ReadWrite.write  thrpt    5  10.644 ± 4.568  ops/s

openjdk · 2022-04-14T01:46:06Z

@bplb The following label will be automatically applied to this pull request:

core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2022-04-14T01:49:32Z

Webrevs

djelinski · 2022-04-14T05:57:54Z

The benchmark results are quite unexpected. Would we benefit from reducing the buffer size even further?

src/java.base/share/native/libjava/io_util.c

bplb · 2022-04-19T21:43:50Z

The benchmark results are quite unexpected. Would we benefit from reducing the buffer size even further?

I tested with smaller and smaller buffer sizes until the performance started to be affected which was about 64 KB. I have not changed this value in the patch just yet.

uschindler · 2022-04-28T20:13:48Z

By the way: FileOutputStream has exactly the same problem with write(byte[]). I see no test for it, but I assume this is now also fixed. That's a longstanding issue in Lucene (this is why we use a ChunkedOutputStream when writing files.

bplb · 2022-04-28T20:30:45Z

By the way: FileOutputStream has exactly the same problem with write(byte[]). I see no test for it, but I assume this is now also fixed. That's a longstanding issue in Lucene (this is why we use a ChunkedOutputStream when writing files.

Yes, write(byte[]) is also fixed: io_util.c#L164.

bplb · 2022-05-05T23:33:40Z

Further performance testing was conducted for the case where the native read and write functions used a fixed, stack-allocated buffer of size 8192. The loops were moved up into the Java code of FileInputStream, FileOutputStream and RandomAccessFile. Note that there was code duplication because RAF needs both read and write methods as well. The performance of writing with this approach was approximately half what it had been, so for writing the approach was abandoned.

Here are some updated performance measurements:

The performance measurements shown are for the following cases:

Master: unmodified code as it exists in the mainline
Java: fixed-size stack buffer in native read, read loops in Java, write as in the mainline but with malloc buffer size limit
Native: read loop in native read with malloc buffer size limit, write as in the mainline but with malloc buffer size limit

The horizontal axis represents a variety of lengths from 8192 to 1GB; the vertical axis is throughput (ops/s) on a log 10 scale. The native lines in the charts are for the code proposed to be integrated.

As can be seen, the performance of reading is quite similar up to larger lengths. The mainline version presumably starts to suffer the effect of large allocations. The native read loop performs the best throughout, being for lengths 10 MB and above from 50% to 3X faster than the mainline version. The native read loop is about 40% faster than the Java read loop for these larger lengths.

Due to the log scale of the charts, the reading performance detail cannot be seen exactly and so is given here for the larger lengths:

               Throughput of read(byte[]) (ops/s)
   Length      Master         Java        Native
   1048576    11341.39      6124.482    11371.091
  10485760      356.893      376.326      557.906
 251503002       10.036       14.27        19.869
 524288000        5.005        6.857        9.552
1000000000        1.675        3.527        4.997

The performance of writing is about the same for the Java and Native versions, as it should be since the implementations are the same. Any difference is likely due to measurement noise. The mainline version again suffers for larger lengths.

As the native write loop was already present in the mainline code, the principal complexity proposed to be added is the native read loop. Given the improved throughput and vastly reduced native memory allocation this seems to be justified.

src/java.base/share/native/libjava/io_util.c

bplb · 2022-06-15T00:25:27Z

This change would increase throughput at the expense of the additional complexity of a new loop in the native readBytes() function; a loop was already present in the native writeBytes() function. Putting the loop in Java was problematic and less improvement in performance was observed. I suggest at this point either refining the current proposal as needed to move it forward, or withdrawing the request and resolving the associated issue as Will not Fix.

bplb · 2022-06-24T20:36:58Z

The following benchmark results are for commit e6f3b45 with 5 warmup iterations of 5 seconds each, followed by 10 measurement iterations of 10 seconds each on a MacBookPro16,1.

           FileInputStream.read(byte[])

                   Score (ops/s)
(length)       Master              Branch
16384      5850.383 ± 220.209  5805.952 ± 187.870
32768      5781.410 ± 345.939  5807.587 ± 515.808
100000     5444.167 ± 409.680  5273.988 ± 159.652
500000     3748.413 ±  88.505  3803.577 ± 256.442
1000000    2647.356 ± 145.245  2700.523 ±  97.717
1048576    2759.337 ± 362.618  2631.779 ±  45.971
10485760    338.840 ±   2.706   392.126 ±   2.59
251503002    10.626 ±   0.024    20.165 ±   0.091
524288000     5.134 ±   0.024     9.826 ±   0.068
1000000000    1.806 ±   0.014     5.194 ±   0.015

           FileOutputStream.write(byte[])

                   Score (ops/s)
(length)       Master              Branch
16384      2736.489 ±  31.547  2599.203 ±  79.392
32768      2656.471 ±  72.597  2523.470 ± 158.460
100000     2476.514 ±  46.392  2225.597 ± 442.285
500000     1606.205 ± 487.787  1281.550 ± 413.797
1000000    1125.265 ±  13.969   812.184 ± 232.23
1048576    1098.608 ±  13.528   816.657 ± 218.112
10485760    136.867 ±   2.310   153.137 ±   2.93
251503002     7.266 ±   0.026    10.546 ±   0.125
524288000     3.450 ±   0.016     3.880 ±   1.355
1000000000    1.178 ±   0.007     2.637 ±   0.505

The proposed change is approximately at parity with the current code base for smaller lengths, but has higher throughput for larger lengths, at least for reading. In all cases where the length is greater than or equal to 65536 (64 KiB) or is a multiple of 8192 (8 KiB), less native memory is allocated.

src/java.base/share/classes/java/io/RandomAccessFile.java

…s to Java layer

bplb · 2022-08-26T22:17:09Z

Using the temporary direct buffer cache to provide intermediate native memory is not a good solution as direct memory may be limited. Hence the patch is reverted to its prior state, modified to move parameter checking up to the Java layer, and simplified to keep Blocker use to within the readBytes and writeBytes methods. This patch uses less memory and has generally higher throughput than the master branch. Sample benchmark measurements on macOS are as follows:

   RandomAccessFile::read(byte[])

                   ops / sec
    length      master       patch
     16384  468119.718  501174.185
     32768  328305.094  353844.578
    262144   48575.747   52739.050
   1048576   13176.250   13723.486
  10485760     378.469     596.246
 251503002      10.554      20.784
 524288000       5.095      10.014
1000000000       1.743       5.254

   RandomAccessFile::write(byte[])

                   ops / sec
    length      master       patch
     16384  328584.712  353691.078
     32768  252836.010  276133.188
    262144   46072.526   38558.946
   1048576   12908.426    9716.200
  10485760     194.396     498.384
 251503002       7.649      18.175
 524288000       3.643       8.524
1000000000       1.202       4.491

bridgekeeper · 2022-09-24T04:38:50Z

@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bplb · 2022-10-13T18:44:57Z

Moving to draft.

bridgekeeper · 2022-12-09T03:35:06Z

@bplb This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper · 2023-02-03T09:36:54Z

@bplb This pull request has been inactive for more than 16 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

6478546: FileInputStream.read() throws OutOfMemoryError when there is…

8bb1e14

… plenty available

openjdk bot added the rfr Pull request is ready for review label Apr 14, 2022

openjdk bot added the core-libs core-libs-dev@openjdk.org label Apr 14, 2022

AlanBateman reviewed Apr 14, 2022

View reviewed changes

src/java.base/share/native/libjava/io_util.c Show resolved Hide resolved

6478546: Decrease malloc'ed buffer maximum size to 64 kB

40d46f8

bplb added 2 commits April 28, 2022 18:23

Merge

22e9919

6478546: Clean up io_util.c

5c3a344

hjohn reviewed May 12, 2022

View reviewed changes

bplb added 3 commits May 18, 2022 14:58

Merge

c4a0a02

6478546: Add break in write loop on ExceptionOccurred

10f14bb

Merge

0052148

bplb added 2 commits June 23, 2022 08:53

Merge

43905e4

6478546: Use dynamically sized temporary direct buffers

e6f3b45

bplb added 3 commits June 24, 2022 16:40

Merge

17f7724

Merge

e866d65

6478546: Miscellaneous cleanup

9a13863

dfuch reviewed Aug 3, 2022

View reviewed changes

src/java.base/share/classes/java/io/RandomAccessFile.java Outdated Show resolved Hide resolved

bplb added 2 commits August 3, 2022 09:40

Merge

cf95a11

6478546: Eliminate a subtraction from RandomAccessFile.bufferSize()

9d7e4fb

bplb added 2 commits August 23, 2022 14:07

Merge

5641893

6478546: Revert use of temporary direct buffers; move parameter check…

2532c0b

…s to Java layer

bplb marked this pull request as draft October 13, 2022 18:45

openjdk bot removed the rfr Pull request is ready for review label Oct 13, 2022

bridgekeeper bot closed this Feb 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available #8235

6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available #8235

bplb commented Apr 14, 2022 •

edited by openjdk bot

bridgekeeper bot commented Apr 14, 2022

bplb commented Apr 14, 2022

openjdk bot commented Apr 14, 2022

mlbridge bot commented Apr 14, 2022 •

edited

djelinski commented Apr 14, 2022

bplb commented Apr 19, 2022

uschindler commented Apr 28, 2022

bplb commented Apr 28, 2022

bplb commented May 5, 2022

bplb commented Jun 15, 2022

bplb commented Jun 24, 2022

bplb commented Aug 26, 2022

bridgekeeper bot commented Sep 24, 2022

bplb commented Oct 13, 2022

bridgekeeper bot commented Dec 9, 2022

bridgekeeper bot commented Feb 3, 2023

6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available #8235

6478546: FileInputStream.read() throws OutOfMemoryError when there is plenty available #8235

Conversation

bplb commented Apr 14, 2022 • edited by openjdk bot

Progress

Issue

Reviewing

bridgekeeper bot commented Apr 14, 2022

bplb commented Apr 14, 2022

openjdk bot commented Apr 14, 2022

mlbridge bot commented Apr 14, 2022 • edited

Webrevs

djelinski commented Apr 14, 2022

bplb commented Apr 19, 2022

uschindler commented Apr 28, 2022

bplb commented Apr 28, 2022

bplb commented May 5, 2022

bplb commented Jun 15, 2022

bplb commented Jun 24, 2022

bplb commented Aug 26, 2022

bridgekeeper bot commented Sep 24, 2022

bplb commented Oct 13, 2022

bridgekeeper bot commented Dec 9, 2022

bridgekeeper bot commented Feb 3, 2023

bplb commented Apr 14, 2022 •

edited by openjdk bot

mlbridge bot commented Apr 14, 2022 •

edited