Data Structure Notes: Table of Contents
1. Introduction and time complexity
2. Linear lists
3. Trees
4. Graphs
5. Searching
6. Sorting
6.1 basic concepts
6.1.1 objectives
Arrange records in order of their keywords
6.1.2 classification
Internal sort
Data elements are all in memory during sorting
External sort
During the sorting process, data is constantly moving between internal and external storage
6.1.3 stability
After sorting, elements with equal keywords keep their original relative order
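The definition above can be seen concretely with a small sketch. The struct, the function name `selectSortRec`, and the sample data are illustrative choices, not from the note; the algorithm is the note's own simple selection sort, which is listed later as unstable.

```c
// Records with equal keys; 'id' marks each record's original position.
typedef struct { int key; int id; } Rec;

// Simple selection sort on the key field (a known-unstable algorithm):
// its long-distance swap can jump one equal-key record over another.
void selectSortRec(Rec a[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        int minIndex = i;
        for (int j = i + 1; j < n; ++j)
            if (a[j].key < a[minIndex].key)
                minIndex = j;
        if (minIndex != i) {               // this swap moves a[i] far to the right
            Rec t = a[i]; a[i] = a[minIndex]; a[minIndex] = t;
        }
    }
}
```

On `{(2,id0), (2,id1), (1,id2)}`, the first pass swaps `(2,id0)` with `(1,id2)`, so the two key-2 records end up as `(2,id1), (2,id0)`: equal keys lose their original order, which is exactly what instability means.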
6.1.4 two operations
compare
Determine the relative order of two keywords
move
Move elements to put the sequence in order
6.1.5 one pass of sorting
Process every not-yet-sorted element once
6.2 internal sorting
6.2.1 summary
stability
Stable sorting algorithms:
 Binary insertion sort
 Direct insertion sort
 Bubble sort
 Merge sort
 Radix sort
Unstable sorting algorithms:
 Shell sort
 Quick sort
 Simple selection sort
 Heap sort
Suitable for linked storage:
 Direct insertion sort
 Bubble sort
 Simple selection sort
 Merge sort
 Radix sort
No ordered subsequence is formed during the process
 Quick sort
Time complexity
Time complexity is independent of the initial state:
 Simple selection sort: $O(n^2)$
 Heap sort: $O(n\log n)$
 Merge sort: $O(n\log n)$
 Radix sort: $O(d(n+r))$
Number of comparisons independent of the initial state
Selection sort
Number of comparisons related to the initial state
Insertion sort

Best case: already ordered
Direct insertion: $O(n)$
Binary insertion: $O(n\log n)$ comparisons

Worst case: reverse order
Swap sort

Bubble sort
Best case (already ordered): $O(n)$
Worst case (reverse order): $O(n^2)$

Quick sort
Best case (balanced pivot partitions): $O(n\log n)$
Worst case (already nearly ordered, or reverse order): $O(n^2)$
The number of bubble sort passes depends on the initial state
Average time complexity $O(n\log n)$:
 Quick sort
 Heap sort
 Merge sort
Space complexity
$O(1)$
 Direct insertion sort
 Binary insertion sort
 Shell sort
 Bubble sort
 Simple selection sort
 Heap sort
$O(\log n)$
 Quick sort, best case (recursion stack)
$O(n)$
 Quick sort, worst case
 Merge sort
 Radix sort: $O(r)$ for the buckets
Applicable scenarios:
Problem scale
Selecting the smallest k elements
 k passes of bubble sort
 k passes of simple selection sort
 k root extractions in heap sort
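The selection-sort variant of the list above can be sketched by simply cutting the outer loop short. The function name `selectSmallestK` is an illustrative choice, not from the note.

```c
// Run only k passes of simple selection sort: after pass i, A[i] holds
// the (i+1)-th smallest element, so k passes leave the k smallest
// elements, in order, at the front of the array.
void selectSmallestK(int A[], int n, int k) {
    for (int i = 0; i < k && i < n - 1; ++i) {
        int minIndex = i;
        for (int j = i + 1; j < n; ++j)
            if (A[j] < A[minIndex])
                minIndex = j;              // remember the smallest remaining element
        if (minIndex != i) {
            int t = A[i]; A[i] = A[minIndex]; A[minIndex] = t;
        }
    }
}
```

After `selectSmallestK(A, n, 3)`, `A[0..2]` hold the three smallest values in ascending order; the rest of the array is left partially shuffled.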
Form a partially ordered subsequence during the process
 Direct insertion sort
 Bubble sort
 Quick sort
 Simple selection sort
 Heap sort
6.2.2 the five categories of sorting
Insert sort
idea
Before the final pass, elements may not yet be in their final positions
Direct insert sort
idea
implementation

```c
// Sort in ascending order
void Direct_InsertSort(ElemType A[], int n) {
    int i, j;
    for (i = 2; i <= n; ++i) {
        if (A[i] < A[i-1]) {           // A[i] smaller than its predecessor: search backward for its slot
            A[0] = A[i];               // A[0] is the sentinel holding the element to insert
            for (j = i - 1; A[0] < A[j]; --j)
                A[j+1] = A[j];         // shift larger elements one position right
            A[j+1] = A[0];             // insert into the freed slot
        }
    }
}
```
characteristic
Half insert
Direct insertion, but the linear scan for the insertion position is replaced by binary search
implementation

```c
void Binary_InsertSort(ElemType A[], int n) {
    int i, j, low, high, mid;
    for (i = 2; i <= n; ++i) {         // insert A[2]..A[n] into the ordered prefix in turn
        A[0] = A[i];                   // sentinel holds the element to insert
        low = 1; high = i - 1;
        while (low <= high) {          // binary search for the insertion position
            mid = (low + high) / 2;
            if (A[mid] > A[0]) high = mid - 1;
            else               low = mid + 1;
        }
        // high + 1 is the insertion position
        for (j = i - 1; j >= high + 1; --j)
            A[j+1] = A[j];             // shift the tail one position right
        A[high + 1] = A[0];
    }
}
```
characteristic
Shell Sort
Motivation: direct insertion sort works well on roughly ordered sequences, so the data can be preprocessed before a final direct insertion pass
thinking
 Take one element every gap positions to form subsequences
 The gap (step size) is the interval between two consecutive picks
characteristic
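The note describes Shell sort but gives no code, so here is a minimal sketch. The function name `shellSort` and the halving gap sequence `n/2, n/4, ..., 1` are illustrative assumptions; many other gap sequences exist.

```c
// Shell sort sketch: gap-insertion sort with the gap halving each round.
void shellSort(int A[], int n) {
    for (int gap = n / 2; gap >= 1; gap /= 2) {      // shrink the step size
        for (int i = gap; i < n; ++i) {              // insertion sort within each gap-subsequence
            int tmp = A[i];
            int j;
            for (j = i - gap; j >= 0 && A[j] > tmp; j -= gap)
                A[j + gap] = A[j];                   // shift larger elements one gap to the right
            A[j + gap] = tmp;
        }
    }
}
```

Each round leaves the array "more ordered", so the final gap-1 round is an ordinary direct insertion sort on nearly sorted data, which is where the preprocessing pays off.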
Selection sort
Idea: select the extreme value (minimum or maximum) from the unsorted part as the next endpoint of the ordered subsequence
Simple selection sort
idea
implementation

```c
void SelectSort(ElemType A[], int n) {
    int i, j, minIndex;
    for (i = 0; i < n - 1; ++i) {
        minIndex = i;                    // assume the minimum is at position i
        for (j = i + 1; j < n; ++j)
            if (A[j] < A[minIndex])
                minIndex = j;            // remember the index of the smaller element
        if (minIndex != i)
            swap(A[minIndex], A[i]);     // put the minimum at the front of the unsorted part
    }
}
```
characteristic
Heap sort
thinking
storage structure
step

Build the initial heap
Adjust from the last non-leaf node, $\lfloor \frac{n}{2} \rfloor$, down to node $1$

Take the root and exchange with the last element of the heap

Readjust the heap and repeat the above steps
operation
delete
 Compare from top to bottom
 Call HeapAdjust once: starting from the root, compare the two children first, then swap according to the result
 Time complexity: $O(h) = O(\log n)$
insert

Compare from bottom to top

Comparison range: the new node's ancestors along the path to the root
Insert $K_{n+1}$ at the end as a leaf node, then compare $K_{n+1}$ with its new parent repeatedly until $K_{n+1}$ satisfies the heap property
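The bottom-up insertion described above can be sketched as a sift-up. The function name `heapInsert` is an illustrative choice; the array is a max-heap stored from index 1, matching the note's convention.

```c
// Insert 'key' into a max-heap A[1..*n]: place it as the last leaf,
// then compare it with its parent until the heap property holds.
void heapInsert(int A[], int *n, int key) {
    A[++(*n)] = key;                   // append K_{n+1} as the last leaf
    int i = *n;
    while (i > 1 && A[i] > A[i / 2]) { // compare with the parent, bottom-up
        int t = A[i]; A[i] = A[i / 2]; A[i / 2] = t;
        i /= 2;                        // continue from the parent's position
    }
}
```

Because only the ancestors are visited, the cost is the path length, again $O(h) = O(\log n)$.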
implementation

```c
void BuildMaxHeap(ElemType A[], int n) {
    for (int i = n/2; i >= 1; --i)     // adjust every non-leaf node, bottom-up
        HeapAdjust(A, i, n);
}

void HeapAdjust(ElemType A[], int k, int n) {
    A[0] = A[k];                       // A[0] temporarily holds the subtree root
    for (int i = 2*k; i <= n; i *= 2) {// sift down layer by layer along the larger child
        if (i < n && A[i] < A[i+1])    // right child larger than left child
            i++;                       // take the right child's index
        if (A[0] >= A[i])
            break;                     // heap property already satisfied
        else {
            A[k] = A[i];               // move the larger child up
            k = i;                     // continue sifting from the child's position
        }
    }
    A[k] = A[0];
}

void HeapSort(ElemType A[], int n) {
    BuildMaxHeap(A, n);
    for (int i = n; i > 1; --i) {
        swap(A[i], A[1]);              // move the current maximum to the end
        HeapAdjust(A, 1, i - 1);       // restore the heap on the remaining i-1 elements
    }
}
```
characteristic
Swap sort
Bubble sort
idea
Each pass places one maximum or minimum element in its final position
implementation

```c
void BubbleSort(int arr[], int n) {
    for (int i = 0; i < n; ++i) {
        bool flag = false;                    // records whether any swap happened this pass
        for (int j = 0; j < n - i - 1; ++j) { // one bubbling pass
            if (arr[j] > arr[j+1]) {          // move the larger element backward
                swap(arr[j], arr[j+1]);
                flag = true;
            }
        }
        if (!flag)                            // no swap in a pass means the array is sorted
            return;
    }
}
```
characteristic
Quick sort
idea
An improvement on bubble sort: divide and conquer
Repeatedly move elements across the pivot until the pivot settles into its final, balanced position
Overall process
One pass sorting
implementation

```c
int Partition(ElemType A[], int low, int high) {
    ElemType pivot = A[low];                  // take the first element as the pivot
    while (low < high) {
        while (low < high && A[high] >= pivot) --high;
        A[low] = A[high];                     // move the smaller element to the left end
        while (low < high && A[low] <= pivot) ++low;
        A[high] = A[low];                     // move the larger element to the right end
    }
    A[low] = pivot;                           // the pivot lands in its final position
    return low;
}

void QuickSort(ElemType A[], int low, int high) {
    if (low < high) {
        int pivotpos = Partition(A, low, high);
        QuickSort(A, low, pivotpos - 1);
        QuickSort(A, pivotpos + 1, high);
    }
}
```
characteristic
Merge sort
idea
decompose
Split the list of n elements into two sublists of about $\frac{n}{2}$ elements each, and sort the two sublists separately
merge
Merge two sorted sub tables to get the result
implementation

```c
void Merge(ElemType A[], int low, int mid, int high) {
    // A[low..mid] and A[mid+1..high] are each already sorted
    ElemType *B = (ElemType *)malloc(sizeof(ElemType) * (high + 1));
    int i, j, k;
    for (i = low; i <= high; ++i)
        B[i] = A[i];                       // copy both runs into the auxiliary array
    for (i = low, j = mid + 1, k = low; i <= mid && j <= high; ++k) {
        if (B[i] <= B[j])                  // <= keeps the sort stable
            A[k] = B[i++];
        else
            A[k] = B[j++];
    }
    while (i <= mid)  A[k++] = B[i++];     // copy the remainder of the first run
    while (j <= high) A[k++] = B[j++];     // copy the remainder of the second run
    free(B);
}

void MergeSort(ElemType A[], int low, int high) {
    if (low < high) {
        int mid = (low + high) / 2;
        MergeSort(A, low, mid);
        MergeSort(A, mid + 1, high);
        Merge(A, low, mid, high);
    }
}
```
characteristic
Cardinality sort
Not based on keyword comparisons
idea
Each digit (group) of the keyword is sorted with the same logic
Sorting the most significant digit first seems natural, but then ties in it would have to be resolved separately within each group; in practice the least significant digit is sorted first and the most significant one last (LSD)
Example: radix sorting 010, 102, 110, 212
If the most significant digit went first, a later pass on a less significant digit would destroy the order just established
Sorting the least significant digit first and the most significant one last preserves each pass's result, because every pass is stable
characteristic
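The LSD idea above can be sketched for non-negative integers in base 10. The function name `radixSort` is an illustrative choice; each digit pass is a stable counting sort, which is what preserves the order established by earlier passes.

```c
#include <string.h>

// LSD radix sort for non-negative ints, base 10: one stable counting
// pass per digit, least significant digit first.
void radixSort(int A[], int n) {
    int max = 0;
    for (int i = 0; i < n; ++i)
        if (A[i] > max) max = A[i];      // max decides how many digit passes are needed
    int B[n];                            // output buffer for each pass
    for (int exp = 1; max / exp > 0; exp *= 10) {
        int count[10];
        memset(count, 0, sizeof(count));
        for (int i = 0; i < n; ++i)
            count[(A[i] / exp) % 10]++;  // count records per digit value
        for (int d = 1; d < 10; ++d)
            count[d] += count[d - 1];    // prefix sums give final positions
        for (int i = n - 1; i >= 0; --i) // backward scan keeps the pass stable
            B[--count[(A[i] / exp) % 10]] = A[i];
        memcpy(A, B, n * sizeof(int));   // feed this pass's result into the next
    }
}
```

With d digit passes over n records and radix r = 10 buckets, the cost matches the $O(d(n+r))$ figure from the complexity summary.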
Bucket sort (counting sort)
Applicable to: sorting integers within a known, limited range
```c
#include <stdio.h>
#include <string.h>

void countSort(int nums[], int n);

int main(void) {
    int nums[] = {2, 3, 1, 4, 3};
    countSort(nums, 5);
    return 0;
}

void countSort(int nums[], int n) {
    // 1. Determine the bucket range:
    // if the maximum is 4, buckets 0..4 are needed, i.e. 5 positions
    int max = 0;
    for (int i = 0; i < n; ++i) {
        if (max < nums[i])
            max = nums[i];
    }
    int bucket[max + 1];
    memset(bucket, 0, sizeof(bucket));   // a VLA cannot take an initializer, so zero it explicitly
    // 2. Count occurrences of each value
    for (int i = 0; i < n; ++i)
        bucket[nums[i]]++;
    for (int i = 0; i <= max; ++i) printf("%d ", i);
    printf(" (value)\n");
    for (int i = 0; i <= max; ++i) printf("%d ", bucket[i]);
    printf(" (count)\n");
}
```
6.3 external sorting
6.3.1 storage space
k-way merge sort keeps one output buffer and k input buffers in memory; r is the number of initial merge runs
6.3.2 process
1. Generate the initial merge runs
Read blocks of the file into the k input buffers
Internally sort the records in each input buffer to form the ordered initial merge runs
2. Perform $S = \lceil \log_k r \rceil$ passes of $k$-way merging
Using the merge-sort idea, select the smallest head record among the k runs and place it in the output buffer
 When an input buffer becomes empty, immediately read in the next block of that run
 The next round of comparison proceeds only when no buffer is empty
When the output buffer is full, write it out to external memory
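The selection step of the process above can be sketched in memory, ignoring buffering and I/O. The function name `kwayMerge` and its signature are illustrative assumptions; each selection scans the k run heads, which is the naive $k-1$ comparisons per record analyzed in the next subsection.

```c
// In-memory sketch of one k-way merge pass: repeatedly pick the smallest
// head record among k sorted runs and append it to the output.
// 'runs' holds k sorted arrays, lens[] their lengths; out must have room
// for the total record count.
void kwayMerge(const int *runs[], const int lens[], int k, int out[]) {
    int pos[k];                        // read position within each run
    for (int i = 0; i < k; ++i) pos[i] = 0;
    int total = 0;
    for (int i = 0; i < k; ++i) total += lens[i];
    for (int t = 0; t < total; ++t) {
        int minRun = -1;
        for (int i = 0; i < k; ++i) {  // scan the k run heads
            if (pos[i] >= lens[i]) continue;   // this run is exhausted
            if (minRun < 0 || runs[i][pos[i]] < runs[minRun][pos[minRun]])
                minRun = i;
        }
        out[t] = runs[minRun][pos[minRun]++];  // emit the smallest head record
    }
}
```

In the real external sort, `pos[i]` reaching the end of a buffer triggers reading the next block of run i, and `out` filling up triggers a write to external memory.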
6.3.3 time cost
$t_{\text{total}} = t_{\text{external I/O}} + t_{\text{internal sort}} + t_{\text{internal merge}}$

External reads/writes: every record is read and written once per merge pass

Internal sorting: the number of initial runs can be optimized, but the number of read/write I/Os per pass stays unchanged
In-run sorting: constructing the initial merge runs takes time

Internal merging: selecting the minimum among the k runs and moving it to the output buffer
6.3.4 optimization
Increase the number of merge ways k
cost
Correspondingly more input buffers are required
Each time, the minimum element must be selected from the k merge runs, which takes $k-1$ keyword comparisons, so the total number of comparisons is
$$S(n-1)(k-1) = \lceil \log_k r \rceil (n-1)(k-1) = \frac{\lceil \log_2 r \rceil (n-1)(k-1)}{\lceil \log_2 k \rceil}$$
which grows with $k$: naively increasing the number of merge ways increases the internal comparison cost.
Internal merge optimization
Loser tree: reduces the number of keyword comparisons
A k-way loser tree has k leaves
 Tree height = number of comparisons = $\lceil \log_2 k \rceil$
Total comparisons
 $S(n-1)\lceil \log_2 k \rceil = \lceil \log_k r \rceil (n-1) \lceil \log_2 k \rceil = \lceil \log_2 r \rceil (n-1)$, independent of $k$
kway balanced merge
Each pass merges up to k runs into 1
If a pass starts with m runs, it produces $\lceil \frac{m}{k} \rceil$ runs
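The pass count $S = \lceil \log_k r \rceil$ follows directly from the rule above, and can be checked with a tiny sketch. The function name `mergePasses` is an illustrative choice.

```c
// Count merge passes: each pass turns m runs into ceil(m/k) runs,
// so starting from r initial runs the number of passes until a single
// run remains equals ceil(log_k r).
int mergePasses(int r, int k) {
    int passes = 0;
    int m = r;
    while (m > 1) {
        m = (m + k - 1) / k;   // ceil(m/k) runs survive each pass
        passes++;
    }
    return passes;
}
```

For example, 16 runs with 2-way merging take 4 passes (16, 8, 4, 2, 1), while 10 runs with 5-way merging take only 2 (10, 2, 1): raising k cuts passes, and with it the external I/O cost.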