8334392: Switch RNG in NMT's treap #19756
Conversation
Ping @tstuefe, @gerard-ziemski, does this look OK?
👋 Welcome back jsjolen! A progress list of the required criteria for merging this PR into the target branch will be added to the body of your pull request.
@jdksjolen This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 62 new commits pushed to the target branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message, type /integrate in a new comment.
@jdksjolen To determine the appropriate audience for reviewing this pull request, one or more labels corresponding to different subsystems will normally be applied automatically. However, no automatic labelling rule matches the changes in this pull request. In order to have an "RFR" email sent to the correct mailing list, you will need to add one or more applicable labels manually using the /label pull request command. Applicable Labels
Preexisting: I don't think there is any reason for the seed to be randomly initialized. There is nothing here that is security-relevant. I would choose a 64-bit value and encode that as constexpr seed init value directly in the class, and remove the argument from the treap constructor.
src/hotspot/share/nmt/nmtTreap.hpp
Outdated
_prng_seed = (PrngMult * _prng_seed + PrngAdd) & PrngModMask;
uint32_t first_half = os::next_random(static_cast<uint32_t>(_prng_seed));
uint32_t second_half = os::next_random(static_cast<uint32_t>(_prng_seed >> 32));
_prng_seed = static_cast<uint64_t>(first_half) | (static_cast<uint64_t>(second_half) << 32);
Make first/second_half uint64_t and remove the ugly C++ casts?
src/hotspot/share/nmt/nmtTreap.hpp
Outdated
// For comparison, a RB-tree has a proven max depth of log_2(N + 1) * 2.
const int expected_maximum_depth = ceil((log(this->_node_count+1) / log(2)) * 2.5);
const int expected_maximum_depth = ceil((log(this->_node_count+1) / log(2)) * 3);
Why ln? why not log2? And then, why not one of the log2 methods in utilities/power_of_two.hpp?
Nice, I'll switch to those, I just wasn't aware of them.
LGTM. Thanks.
I switched the initial seed as mine gave rather poor results. I was also interested in the actual distribution (how often are we actually going to hit this poor depth?) and these were my results:
Maximum depth found: 55 (1 such node). Average depth found: 28.000000 over 1666004 leaves. Expected depth: 28.623. Maximum depth is 2.47 * log_2(n), average is 1.26 * log_2(n), expected is 1.28.

Leaf distribution (index is depth): [0,0,0,0,0,0,0,0,0,3,19,46,144,328,794,1763,3549,6403,11310,18458,28336,41395,56059,72937,90856,107183,121451,131594,136310,135633,129671,118893,103964,87528,71242,55959,42052,30688,22006,14605,9746,6087,3855,2350,1377,703,364,188,82,42,14,8,3,2,3,1]
I'm rather sure that this is very good. Note that all treaps that we construct with 5 000 000 nodes will have this distribution, as the initial seed and the RNG are fixed. That's a nice property of being in control of this stuff.
The sensitivity to the initial value of the seed bothers me. Did you try other random algorithms, CRC for example, to see whether they are equally sensitive?
It also bothers me that it's static. I think it should be dynamic, like you had it originally. Just because we can't see a security issue here right now does not mean that one does not exist.
It sounds nice to be able to set the seed statically and make sure it works well, but personally I would really prefer for it to be dynamic, and to use a different random algorithm if needed. A compromise (one that would unfortunately make this code more complex) would be to offer a runtime flag to control this behavior.
@jdksjolen I also think you can tone down that test. On my Mac it takes 1.7 seconds, which is too long, and I think testing with e.g. 500000 instead of 5 mio is perfectly fine. Or even just 100000. Let's save some trees. A guaranteed balanced tree, e.g. RB, would be cool. Well, maybe if I'm nice, I'll get one for Christmas, who knows.
What would be the point of such a switch? The only reason to pass in a seed via command line would be to reproduce problems when the RNG relies on random seeds. If it does not, there is no point, really.
The flag would not be taking an internal seed, but would be a boolean flag to set whether we use the static seed or generate one dynamically. We could also ask the security team to take a look at this and sign off; I would be OK with that as well. What I am saying is that I do not feel qualified to state that there is no security concern in using a static seed compared to creating one dynamically.
I have more questions the more I think about this.

Thomas uses one figure for the expected depth, Johan another. Please help me understand how we go from 6 to 28 or 22 as the expected depth? Also, why are we using that particular log base? For reference (assuming log==log10), the ideal distribution for an expected depth of, say, 6 would look quite different.

A treap leaf depth, in the worst case, could theoretically be O(n), correct? There are no guarantees like in a red-black tree?

Shouldn't verify_self() perhaps use the depth of the peak of the distribution and compare that to the expected depth, as opposed to using the maximum leaf depth and hoping that it is within some range, which we keep tweaking?
Hey now, that's just guaranteed worst-case
If we do keep a static seed, then we can reduce the test by quite a bit, since it'll always produce the same tree and we just need to make sure any future changes don't mess it up.
Okay, well. Let's say that the attacker knows the random number sequence and has full control of any treap operations being performed. The latter assumption is quite a high bar already, but we're trying to build a worst case here. Assuming everything else is secure (we can't help it if our treap is being used to attack malloc, that's on malloc), what the attacker can do is construct the exact shape of the treap through a meticulous series of insertions and deletions; for example, they can repeatedly insert and delete nodes until the priorities line up into a deep, degenerate tree.

That is a pretty weak and unlikely attack, imho. Besides, we have a static init seed for os::random as well.

That's just my 2 cents on this.
It's not expected, it's the best case. Do not use the base-10 log; it is the base-2 log which should be used. Not defining the base is because they don't use very rigorous notation, which is "a feature, not a bug" of big-O notation.

Remember: a binary tree has 2 children per node, so depth d can store 2^d nodes. Here are some questions that may reveal the answer to why we use base 2.
It could be, if you have a random number sequence of strictly incrementing/decrementing numbers. It's probabilistic, so the quality of the RNG and the initial seed matter. If you actually have a random and evenly distributed source, hitting that worst case is vanishingly unlikely.
The expected depth I described is a computed expected value from the distribution, as in the statistical concept. We could always pick something more obscene as the bound. With a static seed, the test will not be brittle.
Where do you get 1666004? The test adds 5 mio nodes.
6 makes no sense. 22 (point something) is the guaranteed height of a self-balancing binary tree with 5 mio nodes.
For calculating the height of binary trees, 2 is the only correct base. If the articles you read use big O notation, the log base does not matter, since you can translate any logarithm to one with a different base by multiplying with a constant, and constants don't matter in big O. Think of big O as a tool designed to give you a feeling for algorithmic complexity. "It's logarithmic" is usually good enough, since the difference between e.g. a treap and a self-balancing tree is tiny compared to the one between a treap and a linked list.
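The change-of-base argument above in one line: any logarithm is a constant multiple of the base-2 logarithm, and constants vanish inside big O:

```latex
\log_b n \;=\; \frac{\log_2 n}{\log_2 b}
\quad\Longrightarrow\quad
O(\log_b n) \;=\; O\!\left(\frac{1}{\log_2 b}\,\log_2 n\right) \;=\; O(\log_2 n)
```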
Yep. Maybe I get some bright-eyed intern to do this, or I do it myself if I find time. Treap is good enough for now.
Well, no. Tree shape depends on the tree-internal RNG and the input node values, and the latter are still random, as they should be to get good test coverage. (There is the issue that os::random uses a constant seed (hah!), but I plan on addressing this soon.)
Right, I was only considering the 5_000_000 inserts; those aren't random, those are just sequential.
Verifying less often and with a smaller number of nodes in the "extreme" stress test, test time is down to 219ms on one of my machines.
Still good.
If @gerard-ziemski is unhappy with the seed change, this should not hold up the PR. In that case I am fine with reverting to the random-inited seed we used earlier.
Thank you Thomas, Johan for your notes and explanations, they help.
What bothers me the most is how brittle the code looks if it's that sensitive to the initial RNG seed value. I am afraid that we will be revisiting this again.
I'm concerned that because our current code is implemented using recursive calls, if the complexity ever ends up anywhere near O(n) we might run out of stack and start overwriting memory.
An idea I saw in https://nrk.neocities.org/articles/simple-treap :
Have we considered replacing RNG with memory pointers of the regions we are storing? That would remove our current dependency on random number generator and quality of the seeds.
I guess I am just not all that happy with us using treap, recursion and the code being sensitive to the initial random number generator seed. I'm not sure the benefit outweighs the cost here.
I do like the idea of the guaranteed qualities of red-black trees, even if the implementation would be more complex.
Feel free to remove me as a reviewer, I do not want to hold up this improvement. The fix on its own does look fine.
Pointers are a terrible source of entropy, which is one reason why ihashes are not just object addresses. Memory locations are subject to a bunch of restrictions (alignment, user address space boundaries etc.) and are the result of typically quite regular allocation patterns.
We could do some hash function using the memory pointer, like we do in
I'm sure there are better ones?
Wouldn't we just hit guard pages, which would terminate the process? That's a stack depth of
As far as I know, you can't use a PRNG (which a hash function is very similar to) to generate entropy. Only an entropy source, that is, the initial seed, can generate entropy.
We simply need the random numbers to balance nodes representing memory locations; we are not talking about using them for cryptography. To me it feels like a good match: the same memory location would get assigned the same priority when allocated, freed, then allocated again at the same address, while Address Space Layout Randomization would ensure that the results are not repeatable across runs. But this is just an idea to simplify the code; if you don't like it, that's OK.
I don't understand you. You have concerns about the bad quality of the treap RNG, yet you argue for using memory locations as a replacement. But their entropy is a lot worse and depends on factors you can neither influence nor predict.

Pseudo-RNG entropy depends only on the algorithm, and it is predictable. It's math. Its only job is to give you reasonable entropy across the whole value range. The entropy of memory addresses depends on the logic of the underlying allocator, and having good entropy is really not the first design goal for memory managers. Even ASLR only gives you a few bits of entropy.

With memory addresses, you are at the mercy of the underlying algorithm. Use a buddy-style allocator? Great, all your addresses are now more likely to be power-of-two values. Use some form of arenas? Now you have a large concentration of pointers at the lower end of pages. Use something like bin lists? Okay, now your pointers are aligned to whatever the bin list factor is. Use mmap? Now your address is page-aligned and you lose 12-16 bits of entropy depending on platform. Malloc alignment loses you 4 bits of entropy. User space is only xxTB...4 PB, so you typically lose another 16-24 bits in the upper half of the address.

So you propose to replace an RNG that depends only on its algorithm with something that has demonstrably worse entropy and depends on a ton of factors out of our control. (One fun example: let's say someone changes the libc implementation under you, Valgrind- or ASAN-style. Now your malloc pointers look different and are more clustered.)
Likely a case of "works well enough, and not worth spending thoughts on it". ihash generation, otoh, which is hugely important for the performance of Java hashtables, specifically does not use object addresses, but a pseudo-RNG like we do here.
I don't like that our implementation is so sensitive to the quality of the seed and the RNG. I suggested we consider using memory locations instead after reading https://nullprogram.com/blog/2019/04/30/, but after sleeping on it, and in line with your arguments against it, I think that is not a solution for us here.

I'm really starting to think that we should have gone with red-black trees (or some other data structure with guaranteed qualities) instead of a treap, and not used a recursive implementation if we cannot guarantee a reasonable worst case. The fix here looks fine for what it is trying to achieve, so please go ahead and feel free to proceed.
LGTM
FWIW, I don't like the recursion much either, but it is only a very minor concern because:
Good. /integrate
Going to push as commit 652784c.
Your commit was automatically rebased without conflicts.
@jdksjolen Pushed as commit 652784c. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
Hi,

We've seen some behavior where the Treap's depth is slightly out of bounds (seen: 62 for 5 000 000 nodes) for what is expected. This PR does a couple of things to remedy that:

1. Bumps the expected maximum depth bound by 3, to log_2(N + 1)*2.5 + 3, as the previous bound may simply be too tight.
2. Uses os::next_random to fill out an unsigned 64-bit priority.

I ran the 5 000 000 node stress test 50 times, only seeing a maximum depth of 58. I also ran (only once) the test for some larger node amounts. These are my findings in a table:

Our Treap is currently implemented using two recursive methods: split and merge. They take up 80 and 48 bytes of stack space per frame, respectively. That means that a call depth of 69 would imply 5.4KiB of stack space for split and 3.3KiB of stack space for merge. Looking at our thread stack defaults, that is not a lot and leaves a lot of activity left on the stack for others :-).

All the best,
Johan
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19756/head:pull/19756
$ git checkout pull/19756
Update a local copy of the PR:
$ git checkout pull/19756
$ git pull https://git.openjdk.org/jdk.git pull/19756/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 19756
View PR using the GUI difftool:
$ git pr show -t 19756
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19756.diff