If you really can't understand it, memorize it by rote. If you don't want to memorize it, treasure it
Chapter I bubble sorting
Bubble Sort is also a simple and intuitive sorting algorithm. It repeatedly visits the sequence to be sorted, compares two elements at a time, and exchanges them if they are in the wrong order. The work of visiting the sequence is repeated until there is no need to exchange, that is, the sequence has been sorted. The name of this algorithm comes from the fact that smaller elements will slowly "float" to the top of the sequence through exchange.
[1]Bubble_Sort.c
/* * @Author: your name * @Date: 2021-09-03 14:09:13 * @LastEditTime: 2021-09-03 15:35:44 * @LastEditors: Please set LastEditors * @Description: In User Settings Edit * @FilePath: \Desktop\bb.c */ #include <stdio.h> #include <time.h> #include <stdlib.h> #define SIZE 10000 //Bubble sorting from small to large void bubble_sort(int *arr, int len)//100 { int sort_count, pos; int tmp; //How many times to sort: sort_count: several data have been arranged for counting. len-1: the last one does not need to be sorted for(sort_count=0; sort_count<len-1; sort_count++) { //pos: where to start the comparison, len sort_ Count-1: total data length - the number of data already arranged - 1 (this - 1 is because pos + 1 will be accessed in the following operations of pos) for(pos=0; pos<len-sort_count-1; pos++) { if(arr[pos] > arr[pos+1])//Previous data { //Exchange data tmp = arr[pos]; arr[pos] = arr[pos+1]; arr[pos+1] = tmp; } } } } int main(void) { int *array = malloc(SIZE*sizeof(int)); int i; srand(time(NULL));//Random number seed for(i=0; i<SIZE; i++) { array[i] = rand()%SIZE;//Start seed } bubble_sort(array, SIZE);//Sorting function for(i=0; i<SIZE; i++)//Print { printf("%d ", array[i]); } printf("\n"); return 0; }
Chapter II quick sort
Quick sort uses the Divide and conquer strategy to divide a list into two smaller and larger subsequences, and then sort the two subsequences recursively.
The steps are:
Select benchmark value: select an element from the sequence, which is called "pivot";
Split: reorder the sequence. All elements smaller than the benchmark value are placed in front of the benchmark, and all elements larger than the benchmark value are placed behind the benchmark (the number equal to the benchmark value can be on either side). After this division, the sorting of benchmark values has been completed;
Recursively sort subsequence: recursively sort subsequences that are less than the reference value element and subsequences that are greater than the reference value element. The judgment condition for recursion to the bottom is that the size of the sequence is zero or one. At this time, the sequence is obviously ordered.
There are several specific methods to select the benchmark value, which has a decisive impact on the time performance of sorting.
[1]quick_sort.c
/* * @Author: your name * @Date: 2021-09-03 15:21:41 * @LastEditTime: 2021-09-03 15:34:54 * @LastEditors: Please set LastEditors * @Description: In User Settings Edit * @FilePath: \Desktop\quick.c */ /********************************************** * File Name: quick.c * Created @ 2016-08-03 21:20 * Author : Gec * E-mail : 2034294993@qq.com **********************************************/ #include <stdio.h> #include <time.h> #include <stdlib.h> #define SIZE 100000 void quick(int arr[], int left, int right)//left=0 right=100000-1 { int i; if(left < right) { int tmp = arr[left];//The first value is given to the temporary variable to make room for the first position int low = left;//Low order 0 int high = right;//High 99999 //Quick sort alternate comparison while(low < high)//The loop is based on the fact that the left subscript is less than the right subscript { //Compare from right to left, and judge whether the left subscript has a value less than the right subscript in real time, and whether the right value is less than the exchange value in the middle while(low < high && arr[high] >= tmp) { high--;//Until the condition is not satisfied, the value arr[high] is either = = TMP or < TMP }//When you jump out of the loop, it is the first place: the following numbers have been compared with the first one one by one //Prove that the first one is really the smallest. It should be recorded at this time, arr[low] = arr[high];//Put the right value into the free space //Compare from left to right, and judge whether the left subscript is greater than or equal to the right subscript in real time, and whether the right value is less than the exchange value in the middle while(low < high && arr[low] < tmp)//Note that the high value has changed at this time { low++; } arr[high] = arr[low]; } //The last remaining position is the empty one, and then it's sorted arr[low] = tmp; quick(arr,left,low-1);//Left quick(arr,low+1,right);//right } } int main(int argc, char **argv) { int *array = malloc(SIZE*sizeof(int)); int i; srand(time(NULL));//seed for(i=0; i<SIZE; i++) { array[i] = rand()%SIZE;//Generate random number } quick(array, 0, SIZE-1);// 0 -- 8 are subscripts for(i=0; i<SIZE; i++)//Print { printf("%d ", array[i]); } printf("\n"); return 0; }
Chapter III big data sorting
Here we need to use the method of file io
[1]calloc
calloc Function and malloc The functions are similar in that they allocate memory from the heap. The function declaration is as follows: void *calloc(int n,int size); Parameter definition: size: How big is each piece n: Number of blocks applied Note: the last applied space size is: n and size Multiply
[2]malloc
malloc The function can obtain the memory space of the specified byte from the heap, and its function declaration is as follows: void * malloc(int n); Parameter definition: n: Requested space size (single type size)*Total number)
[3]readlloc
Generally, the size of malloc application is insufficient and needs to be expanded
Prototype: external void * realloc (void mem_address, unsigned int newsize);
Syntax: pointer name = (data type) realloc (pointer name to change memory size, new size).
//If the new size is smaller than the original size, the end of the original data may be lost (overwritten by other data using memory, etc.)
Header file: #include < stdlib h> Some compilers require #include < malloc h>
[4] File IO operation
[4.1]fprintf
[4.2]fscanf
[4.3]fnprintf
// Open a file that contains millions of data levels FILE *src = fopen("numbers.txt", "r");//Open in read-only mode if(src == NULL) { perror("fail to open file"); exit(0);//End program }
int fscanf ( FILE *fp, char * format, ... );
int fprintf ( FILE *fp, char * format, ... );
Input file operation: src Is a file descriptor %u:Is the format taken from the file, obviously unsigned int type &data[i]:Array address Return value: failed EOF=-1 fscanf(src, "%u", &data[i])
Prototype: int snprintf(char *str, size_t size, const char *format,...);
Description: The functions snprintf() write at most size bytes (including the terminating null byte ('\ 0')) to str. that is, snprintf writes up to size characters into STR according to format format, where size characters already include terminators.
[4.4] rename and remove functions
rename The command modifies file names in batch by string replacement. Format: rename from to file from : Indicates the character to be replaced or processed, which is generally a part of the file name, to : take from Content, replace with to Content of the. file : Documents to be processed.
remove(f1);//Delete f1 file descriptor
/* * @Author: your name * @Date: 2021-09-03 15:43:04 * @LastEditTime: 2021-09-03 15:45:06 * @LastEditors: Please set LastEditors * @Description: In User Settings Edit * @FilePath: \Desktop\big_data_sort.c */ #include <stdio.h> #include <stdbool.h> //Swap two numbers void swap(int *a, int *b)//When the parameter is a pointer, you will keep the value of all operations on a and B, and there is no return form { int tmp; tmp = *a; *a = *b; *b = tmp; } // Merge all files from begin to end and merge them into the file begin=1 void merge(int begin, int end)//end=20 { if(end-begin <= 0) return; // Merge all files from begin+1 to end and merge them into the file begin+1 merge(begin+1, end);//Recursive self // Merge begin and begin+1 char f1[10], f2[10]; bzero(f1, 10); bzero(f2, 10); //Set begin Txt files saved in fx snprintf(f1, 10, "%d.txt", begin); snprintf(f2, 10, "%d.txt", begin+1); //printf("merging% s and% s...\n", f1, f2); FILE *fp1 = fopen(f1, "r"); FILE *fp2 = fopen(f2, "r"); unsigned n1, n2; if(fscanf(fp1, "%u", &n1) == EOF) { fclose(fp1); fclose(fp2); remove(f1);//delete rename(f2, f1);//Rename f2 to f1 return; } if(fscanf(fp2, "%u", &n2) == EOF) { fclose(fp1); fclose(fp2); remove(f2); return; } // Temporarily store the merged data, and finally change its name to begin FILE *tmp = fopen("tmp.txt", "w+");//Open file in read-write mode int fp1_done = false; int fp2_done = false; while(1) { if(n1 < n2)//Comparison of two file descriptors { fprintf(tmp, "%u\n", n1);//tmp file descriptor = n1(fp1) file content if(fscanf(fp1, "%u", &n1) == EOF)// { fprintf(tmp, "%u\n", n2); fp1_done = true; break; } } else { fprintf(tmp, "%u\n", n2); if(fscanf(fp2, "%u", &n2) == EOF) { fprintf(tmp, "%u\n", n1); fp2_done = true; break; } } } // Put the remaining data into tmp if(fp1_done) { while(fscanf(fp2, "%u", &n2) != EOF) { fprintf(tmp, "%u\n", n2); } } else { while(fscanf(fp1, "%u", &n1) != EOF) { fprintf(tmp, "%u\n", n1); } } // Delete the original file and rename tmp to begin fclose(fp1); fclose(fp2); if(remove(f1) == -1) { printf("Delete file%s fail:%s\n", f1, strerror(errno)); exit(0); } if(remove(f2) == -1) { printf("Delete file%s fail:%s\n", f2, strerror(errno)); exit(0); } fclose(tmp); rename("tmp.txt", f1); } int main(int argc, char **argv) { // Open a file that contains millions of data levels FILE *src = fopen("numbers.txt", "r");//Open in read-only mode if(src == NULL) { perror("fail to open file"); exit(0);//End program } // 1. Divide the original data file into N ordered sub files bool done = false; char file[20]; int N = 0; int wanted = 10*10000; // It is assumed that only 100000 data can be read at a time int infact = wanted;//Crazy iron: 100000 Volts while(1) { // Attempted to read wanted data from file unsigned *data = calloc(wanted, sizeof(unsigned));//Apply for 100000 pieces, 4 bytes each for(int i=0; i<wanted; i++) { if(fscanf(src, "%u", &data[i]) == EOF)//From the file numbers Txt takes out the unsigned data and stores it in the data array { done = true; infact = i; break; } } // Sort the read infact data quickSort(data, infact);//Quick sort // Create temporary files to store some data bzero(file, 20); //File array name, 20 array size, format, stored data example: file [0] = 0 txt snprintf(file, 20, "%d.txt", ++N);//Used to load file names, up to 20 txt file name FILE *fp = fopen(file, "w");//Write only mode // Write the ordered part of the data into a temporary file and save it for(int i=0; i<infact; i++) { fprintf(fp, "%u\n", data[i]); } free(data); fclose(fp); if(done) break; } fclose(src); // 2. Merge N ordered sub files into one ordered file merge(1, N); return 0; }
Creation is really not easy, but what does it matter if you can help people