HI group purchase system development (system development)

HashMap stores its entries in a hash table. Collisions in a hash table can be resolved by open addressing or by separate chaining (the "chain address method"), and Java's HashMap uses separate chaining. In short, separate chaining combines an array with linked lists: each array slot (bucket) holds the head of a linked list. When a key is hashed, the hash determines an array index, and the entry is placed on the linked list at that index.
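
The array-plus-linked-list layout can be sketched in a few lines. This is a minimal illustration, not the real java.util.HashMap: the class name ChainedMapSketch, the fixed capacity, the use of Math.floorMod for the index, and prepending new nodes to the chain are all assumptions made to keep the example short.

```java
import java.util.Objects;

// Minimal separate-chaining sketch (hypothetical class, not java.util.HashMap).
final class ChainedMapSketch<K, V> {
    // One node of the singly linked list that hangs off each array slot.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;

    @SuppressWarnings("unchecked")
    ChainedMapSketch(int capacity) {
        table = (Node<K, V>[]) new Node[capacity];
    }

    // Assumption: floorMod keeps the index non-negative for any capacity.
    private int indexFor(Object key) {
        return Math.floorMod(Objects.hashCode(key), table.length);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Collision or empty slot: prepend the new node to this slot's chain.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

With a capacity of 4, the Integer keys 1 and 5 land in the same slot, so both end up on one chain and lookups walk that chain.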

2. Red-black tree optimization

1. Why does the conversion happen at a length of 8?

Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough nodes to warrant use (see TREEIFY_THRESHOLD). And when they become too small (due to removal or resizing) they are converted back to plain bins. In usages with well-distributed user hashCodes, tree bins are rarely used. Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution (http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default resizing threshold of 0.75, although with a large variance because of resizing granularity. Ignoring variance, the expected occurrences of list size k are (exp(-0.5) * pow(0.5, k) / factorial(k)). The first values are:

0: 0.60653066
1: 0.30326533
2: 0.07581633
3: 0.01263606
4: 0.00157952
5: 0.00015795
6: 0.00001316
7: 0.00000094
8: 0.00000006
more: less than 1 in ten million

Ideally, with random hash codes, the number of nodes in a bucket follows a Poisson distribution. The table above, quoted from the implementation notes in the HashMap source code, gives the probability of a bucket holding k nodes.
As the table shows, the probability of a bucket reaching length 8 is tiny, about 6 in 100 million, and longer buckets are rarer still. The JDK authors therefore chose 8 as the threshold on statistical grounds: with well-distributed hash codes, a bucket almost never grows long enough to be converted into a tree.
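
The frequency table can be reproduced directly from the formula in the quoted comment, exp(-0.5) * pow(0.5, k) / factorial(k). The sketch below is only a check of that arithmetic; the class and method names are made up for this example.

```java
// Recomputes the Poisson frequencies quoted from the HashMap source comment.
final class PoissonBins {
    // P(k) = exp(-lambda) * lambda^k / k!
    static double probability(double lambda, int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // lambda = 0.5 is the average parameter named in the comment.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, probability(0.5, k));
        }
    }
}
```

Running main prints one line per bucket size from 0 to 8, which can be compared against the table.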

2. If a linked list can be converted to a red-black tree, can a red-black tree be converted back to a linked list?

Since JDK 1.8, HashMap uses red-black trees: when the linked list in a bucket grows beyond 8 elements (TREEIFY_THRESHOLD), it is converted into a red-black tree; when the number of elements in the bucket drops to 6 or fewer (UNTREEIFY_THRESHOLD), the tree is converted back into a linked list. Treeification also requires the table to have at least 64 slots (MIN_TREEIFY_CAPACITY); below that size, the table is resized instead.

  • The average search length of a red-black tree is log2(n), so at length 8 it is log2(8) = 3. The average search length of a linked list is n/2, so at length 8 it is 8 / 2 = 4, and converting to a tree pays off. At length 6 or less, the list needs at most 6 / 2 = 3 steps on average; it is already fast, and building and maintaining a tree has a cost of its own, so the conversion is not worthwhile.

  • There is also a reason for choosing the pair 6 and 8:

    • The gap between them (the value 7) prevents frequent conversion between linked list and tree. Suppose a single threshold were used instead: a list longer than 8 becomes a tree, and a tree with 8 or fewer elements becomes a list again. If a HashMap keeps inserting and deleting elements so that the size of a bucket hovers around 8, the bucket would be converted back and forth constantly, and efficiency would be very low.
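
The effect of the gap can be illustrated with a toy model that only counts conversions for a sequence of bucket sizes. This is a simplification under stated assumptions, not how HashMap is implemented: the real class decides on conversion inside its insert, remove and resize paths, and the class and method names here are hypothetical.

```java
// Toy model: counts list <-> tree conversions for a sequence of bucket sizes.
final class ThresholdHysteresis {
    static final int TREEIFY_THRESHOLD = 8;
    static final int UNTREEIFY_THRESHOLD = 6;

    // A list becomes a tree when size > treeifyAbove;
    // a tree becomes a list when size <= untreeifyAtOrBelow.
    static int countConversions(int[] sizes, int treeifyAbove, int untreeifyAtOrBelow) {
        boolean tree = false;
        int conversions = 0;
        for (int size : sizes) {
            if (!tree && size > treeifyAbove) {
                tree = true;
                conversions++;
            } else if (tree && size <= untreeifyAtOrBelow) {
                tree = false;
                conversions++;
            }
        }
        return conversions;
    }

    public static void main(String[] args) {
        int[] hovering = {8, 9, 8, 9, 8, 9, 8, 9}; // bucket size hovers around 8
        // Single threshold at 8 versus the 8/6 pair.
        System.out.println(countConversions(hovering, 8, 8));
        System.out.println(countConversions(hovering, TREEIFY_THRESHOLD, UNTREEIFY_THRESHOLD));
    }
}
```

In this model the single-threshold variant converts on almost every step of the hovering sequence, while the 8/6 pair converts once and then stays a tree.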

3. When resizing happens, why the capacity doubles, and the resize process

1. When resizing happens

When the number of entries exceeds the threshold, that is, the current array length multiplied by the load factor (16 * 0.75 = 12 by default), the table is resized automatically.

  • Load factor
    • Too small: resize is triggered often, which costs performance
    • Too large: hash collisions become more likely, so linked lists grow longer and red-black trees grow taller
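
The trade-off can be made concrete by counting how many doublings a table goes through while entries are added. This is a simplified sketch that assumes threshold = capacity * loadFactor and a doubling on every resize; the class and method names are invented for the example, and the real HashMap has extra cases (for instance a maximum capacity) that are ignored here.

```java
// Counts doublings while inserting entries, assuming
// threshold = capacity * loadFactor and capacity doubling on each resize.
final class ResizeThreshold {
    static int resizeCount(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        int resizes = 0;
        for (int size = 1; size <= entries; size++) {
            // Resize as soon as the entry count exceeds the threshold.
            while (size > threshold) {
                capacity <<= 1;
                threshold = (int) (capacity * loadFactor);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(resizeCount(12, 16, 0.75f));   // still within 16 * 0.75
        System.out.println(resizeCount(13, 16, 0.75f));   // 13th entry exceeds 12
        System.out.println(resizeCount(1000, 16, 0.75f));
        System.out.println(resizeCount(1000, 16, 0.25f)); // smaller factor, more resizes
    }
}
```

A smaller load factor reaches each threshold sooner, so the same number of entries ends up in a larger, emptier table after more resizes.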

2. Why must the length of HashMap's underlying array be a power of 2?

// The bucket index is computed in two steps:
// 1. XOR operation to spread the hash code
static final int hash(Object key) {   // JDK 1.8
     int h;
     // h = key.hashCode() is the first step: get the hashCode value
     // h ^ (h >>> 16) is the second step: XOR the high 16 bits into the low 16 bits
     return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// 2. Bitwise AND with (array length - 1) to map the hash onto an array index;
//    n is the array length
index = hash & (n - 1);

After the hash value is obtained, it is combined with the array length minus one (length - 1) using a bitwise AND. If the array length is a power of 2, the binary form of length - 1 is ...00001111..., that is, all zeros in the high bits and all ones in the low bits. ANDing it with the hash value therefore always gives a result within the bounds of the array. For example, the default array size is 16, and 16 - 1 = 15 is 00000000 00000000 00000000 00001111 in binary. If the hash value of a key is 11010010 00000001 10010000 00100100, the AND keeps only the low four bits (0100, which is 4), and the result always falls in the range 0 to 15. If the length is not a power of 2, the low bits of length - 1 are not all ones, so some array slots can never be selected, which wastes space and increases collisions.
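
The masking argument can be checked with a short experiment. The sketch below reuses the h ^ (h >>> 16) step from the hash method quoted earlier; the class name, the helper reachableSlots, and the sample hash codes 0x10000 and 0x20000 are assumptions chosen for illustration.

```java
import java.util.TreeSet;

// Demonstrates hash & (n - 1) for power-of-2 and non-power-of-2 lengths.
final class IndexMaskDemo {
    // Same spreading step as the JDK 1.8 hash(Object) method quoted earlier.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for an array of length n.
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    // Collects every distinct index produced for the hashes 0 .. samples - 1.
    static TreeSet<Integer> reachableSlots(int n, int samples) {
        TreeSet<Integer> slots = new TreeSet<>();
        for (int h = 0; h < samples; h++) {
            slots.add(indexFor(h, n));
        }
        return slots;
    }

    public static void main(String[] args) {
        // The example hash value from the text; only its low four bits survive.
        int hash = 0b11010010_00000001_10010000_00100100;
        System.out.println(indexFor(hash, 16));

        // Length 16 reaches every slot; length 10 (mask 1001) reaches only a few.
        System.out.println(reachableSlots(16, 1000));
        System.out.println(reachableSlots(10, 1000));

        // Hash codes that differ only in the high 16 bits collide without spreading.
        System.out.println(indexFor(0x10000, 16) + " " + indexFor(0x20000, 16));
        System.out.println(indexFor(spread(0x10000), 16) + " " + indexFor(spread(0x20000), 16));
    }
}
```

With length 10 the mask is 9 (binary 1001), so only the indices 0, 1, 8 and 9 are ever produced and the other six slots stay empty. The last two lines show why the high bits are XORed in first: without that step, keys whose hash codes differ only above bit 15 would all map to the same slot in a small table.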

Tags: Java JDK

Posted by Jonob on Wed, 25 May 2022 18:02:25 +0300