-
Notifications
You must be signed in to change notification settings - Fork 613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor bad smell ToArrayCallWithZeroLengthArrayArgument #1447
base: master
Are you sure you want to change the base?
refactor bad smell ToArrayCallWithZeroLengthArrayArgument #1447
Conversation
- ToArrayCallWithZeroLengthArrayArgument The performance of the empty array version is the same, and sometimes even better, compared to the pre-sized version. Also, passing a pre-sized array is dangerous for a concurrent or synchronized collection as a data race is possible between the <code>size</code> and <code>toArray</code> calls. This may result in extra <code>null</code>s at the end of the array if the collection was concurrently shrunk during the operation.</p> See https://shipilev.net/blog/2016/arrays-wisdom-ancients/ for more details.
The linked performance data has some serious issues. The largest collection size tested is 1000. The data shows a clear trend where the sized variant is getting faster and faster (relative to the zero), but they stop short of where it gets interesting. At 1M, I would bet it's entirely reversed. Also, the tests are run with a 5 year old JDK, with a GC that's no longer default. The benefits of the zero method (eliding zeroing) is something that may be optimized in newer JDKs. My recommendation: re-run the benchmarks with much larger sizes and newer JDK's before using it to change production code. |
Sorry, I didn't realize this was limited to sizes 4 or less. That's likely much harder to optimize in a meaningful way. The most relevant line from the benchmark is this:
So there is no measurable difference between the two(significant overlap of avg with error), so all this PR is doing is causing more garbage creation. |
While the performance argument may not be convincing, the point about toArray(emptyArray) being an atomic way of dealing with concurrent collections is compelling to me. |
The statement "passing a pre-sized array is dangerous for a concurrent or synchronized collection" seems to contain a misunderstanding of the contract of
Most of this is handled in public <T> T[] toArray(T[] a) {
// Estimate size of array; be prepared to see more or fewer elements
int size = size();
T[] r = a.length >= size ? a :
(T[])java.lang.reflect.Array
.newInstance(a.getClass().getComponentType(), size);
... // more resizing checks Please also note the specifics of this code: the very first thing it does is call |
I have this impression that on concurrent collections, the only method that can conveniently and atomically give you O(n) elements that were at least in the collection at some point is toArray(emptyArray). I'm looking around and not finding this documented anywhere. I don't see it on a Javadoc contract or in Effective Java or JCIP. But it's clear that java.util.concurrent.ConcurrentHashMap.CollectionView#toArray() was written this way. It returns a trimmed array with no nulls at the end. |
on |
I am seeing different behaviors for different sizes. public static void main(String[] args)
{
testWithPreSize(0);
testWithPreSize(2);
}
private static void testWithPreSize(int preSize)
{
ConcurrentHashMap<String, Integer> map = new java.util.concurrent.ConcurrentHashMap<String, Integer>();
map.put("One", 1);
Object[] sizeTwoArray = new Object[preSize];
Object[] result = map.entrySet().toArray(sizeTwoArray);
System.out.println("Result is " + Arrays.toString(result) + " for pre-sized to " + preSize);
} Prints:
I'm sure you know this though, so I'm trying to guess where our disconnect is. I think part of it might be, why would the size ever be greater than the number of entries (2 here). In |
Sorry, I misread the code: it only trims if the array sent in was too small. So it honors the contract of the function as specified in
The guarantees around the entries in the array are so weak (it may have values that are no longer in the map, and values that are missing) that I'm not sure "it can have nulls, but it'll allocate less" is a real issue. |
I agree that this isn't a real issue in practice. ConcurrentHashMap doesn't allow null keys nor null values, so if And
If anyone ever ran into this issue in practice, it would manifest as a NullPointerException inside ImmutableMapFactory.withAll(). If we want to fix that, I think the 0 size thing would work. If that's a performance problem, another option might be: Map.Entry<K, V>[] entries = map.entrySet().toArray(new Map.Entry[map.entrySet().size()]);
if (entries.length == 0 || entries[0] == null)
{
return this.empty();
}
if (entries.length == 1 || entries[1] == null)
{
return this.of(entries[0].getKey(), entries[0].getValue());
}
if (entries.length == 2 || entries[2] == null)
{
return this.of(
entries[0].getKey(), entries[0].getValue(),
entries[1].getKey(), entries[1].getValue());
}
if (entries.length == 3 || entries[3] == null)
{
return this.of(
entries[0].getKey(), entries[0].getValue(),
entries[1].getKey(), entries[1].getValue(),
entries[2].getKey(), entries[2].getValue());
}
if (entries.length == 4 || entries[4] == null)
{
return this.of(
entries[0].getKey(), entries[0].getValue(),
entries[1].getKey(), entries[1].getValue(),
entries[2].getKey(), entries[2].getValue(),
entries[3].getKey(), entries[3].getValue());
}
return new ImmutableUnifiedMap<>(map); |
That works. A bit less "or-y": int realLength = entries.length;
while (realLength > 0 && entries[realLength - 1] == null) realLength--; Then switch on realLength. |
Repairing Code Style Issues
ToArrayCallWithZeroLengthArrayArgument
The performance of the empty array version is the same, and sometimes even better, compared
to the pre-sized version. Also, passing a pre-sized array is dangerous for a concurrent or
synchronized collection as a data race is possible between the
size
andtoArray
calls. This may result in extra
null
s at the end of the array if the collection was concurrentlyshrunk during the operation.
See https://shipilev.net/blog/2016/arrays-wisdom-ancients/ for more details.
Repairing Code Style Issues