Three ways to store strings


   in the data structure, the string should be stored in a separate storage structure, which is called string storage structure. The string here refers to the string. No matter which programming language you learn, the string is always the most operated. The most commonly used storage structure is undoubtedly the use of fixed length array storage. However, this storage structure needs to allocate space in advance. When we don't know the string length, too much allocated memory is undoubtedly a waste. Therefore, it is particularly important to choose a reasonable way to store strings. The following will introduce three storage methods in turn.

Fixed length sequential storage

   the fixed length sequential storage structure of string can be understood as using "fixed length sequential storage structure" to store string, so its underlying implementation can only use static array.
   when using the fixed length sequential storage structure to store strings, it is necessary to apply for enough memory space in advance in combination with the length of the target string.
   for example, the fixed length sequential storage structure is used to store "feizhufeifei". Through visual inspection, it is known that the length of this string is 12 (excluding the terminator '\ 0'). Therefore, the length of the array space we apply for is at least 12, which is expressed in C language:

char str[18] = "feizhufeifei";

  the following is the specific C language implementation

int main()
    char str[15]="feizhufeifei";
    return 0;

  this storage method should be mastered by beginners. The second storage method is described below.

Dynamic array storage

  first, we should clarify two concepts: heap and stack.
   the heap is managed by our programmers. When the process calls malloc and other functions to allocate memory, the newly allocated memory is dynamically allocated to the heap. When free and other functions are used to release memory, the released memory is removed from the heap.  
   stack, also known as stack, is a variable temporarily created by the user to store the program, that is, the variable defined in our function {}, but does not include the variable declared by static. Static means to store the variable in the data segment. In addition, when the function is called, its parameters will also be pushed into the process stack that initiates the call, and after the call is completed, the return value of the function will also be stored back on the stack. Due to the first in and last out characteristics of the stack, the stack is particularly convenient for saving and restoring the call site. In this sense, we can regard the stack as a memory area for registering and exchanging temporary data.
   when we call malloc, we will divide a space on the heap for us to use. The specific code is as follows:

//Create a dynamic array str, and apply for 10 heap storage spaces of char type size by using malloc.
char * str = (char*)malloc(10*sizeof(char));

   the advantage of dynamic array is that its length is variable and it can be allocated dynamically according to needs. What if I don't want to apply for new variables, but want to expand the space of str? At this point, the realloc function works.

//By using this line of code, the dynamic array with 10 char storage space can be expanded to store 20 char data.
str = (char*)realloc(str, 20*sizeof(char));

   the following is an example of merging two strings to better understand the dynamic allocation process.

 * @Description: Dynamic heap allocation memory for string
 * @Version:   V1.0
 * @Autor: Carlos
 * @Date: 2020-05-25 
 * @LastEditors: Carlos
 * @LastEditTime: 2020-05-25 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//Print test statements
#define DEBUG 0
#define DBG_PRINTF(fmt, args...)  \
    printf("<<File:%s  Line:%d  Function:%s>> ", __FILE__, __LINE__, __FUNCTION__);\
    printf(fmt, ##args);\
# else

# define DBG_PRINTF(fmt, args...)
int main()
    char *s1 = NULL;
    char *s2 = NULL;
    s1 = (char *)malloc(5*sizeof(char *));
    s2 = (char *)malloc(7*sizeof(char *));
    int length1 = strlen(s1);
    int length2 = strlen(s2);
    //Try to store the merged string in s1. If s1 space is not enough, use realloc to apply dynamically
        s1 =(char*) realloc(s1,(length1 + length2+1) * sizeof(char));
     //Merge two strings into s1
    for(int i = length1; i < length1 + length2;i++)
         s1[i] = s2[i - length1];
     //Add \ 0 to the end of the string to avoid errors
    s1[length1 + length2] = '\0';
    printf("s1+s2:%s", s1);
    //The dynamic array should be released immediately after use
    return 0;

Block chain storage

   block chain storage is to use a linked list to store strings. This paper uses the linked list structure without head node (that is, the first head node of the linked list also stores data).
We know that the "single" in the single linked list only emphasizes that each node of the linked list can only have one pointer, and does not limit the specific number of data stored in the data field. Therefore, when designing the structure of linked list nodes, each node can store multiple data.
   for example, we want to use a linked list to store feizhu strings. The linked list structure is as follows:

  we can also store four characters per linked list, so the last node will not be full. At this point, we can fill it with # or other symbols.

  how to determine the number of data stored in each node in the linked list?
  the number of data stored in each node of the linked list can refer to the following factors:
   length of string and size of storage space: if the string contains a large amount of data and the storage space applied for in the linked list is limited, let each node store more data as much as possible to improve the utilization of space (for each additional node, apply for more space in the pointer field); On the contrary, if the string is not very long or the storage space is enough, it needs to be considered comprehensively in combination with other factors;
   functions realized by the program: if a large number of insertion or deletion operations need to be performed on the stored string in the actual scene, the number of data stored in each node should be reduced as much as possible; On the contrary, it needs to be combined with other factors.
  the following is the specific code implementation.

 * @Description: Block linked list storage of string (linked list without head node)
 * @Version: V1.0
 * @Autor: Carlos
 * @Date: 2020-05-25 
 * @LastEditors: Carlos
 * @LastEditTime: 2020-05-25 
//Set the number of nodes in the global linked list to store data
#define linkNum 3
typedef struct link {
    //The data field can store linkNum data
    char a[linkNum]; 
    //Represents the pointer field and points to the direct successor element
    struct link * next; 
 * @Description: Traverse the linked list and print
 * @Param: Link * head Structure head node pointer
 * @Return: nothing
 * @Author: Carlos
void PrintLink(Link * head)
     Link * p = head;
    while (p)
    for (int i = 0; (i < linkNum) &&(p->a[i]!='#'); i++) 
        printf("%c", p->a[i]);
     p = p->next;
 * @Description: Initialize linked list
 * @Param: Link * head Structure head node pointer. char * str string to operate on
 * @Return: Link *Structure pointer
 * @Author: Carlos
Link * InitLink(Link * head, char * str)
    int length = strlen(str);
    //The number of nodes required is rounded up
    int nodenum = length/linkNum;
    Link *p = head;
    int j;
     //Store the data in the array of each node
    for(int i = 0;i<=nodenum;i++)
        for( j = 0;j < linkNum;j++)  
            if (i*linkNum + j < length)
                 p->a[j] = str[i*linkNum+j];
            //Use # to fill the space of the node array that is not full
                p->a[j] = '#';
        //Link two nodes
        if (i*linkNum + j < length)
            Link* q = (Link*)malloc(sizeof(Link));
            q->next = NULL;
            p->next = q;
            p = q;
    return head;

int main()
    Link* head = (Link*)malloc(sizeof(Link));
    return 0;

  for those who don't understand the linked list, please refer to this blog Summary of operations such as addition, deletion, modification, query and inversion of the most complete single linked list in history and five sorting algorithms (C language)
   the codes in the text have been tested. If you have any comments or suggestions, please contact me. Welcome to study and exchange!
   if you think it's good, please point a praise before you go, thank you!

In case of typographical disorder, you can visit my CSDN through the following link.

**CSDN:[CSDN searches for "embedded and Linux things"]

Tags: C

Posted by ashutosh.titan on Fri, 29 Apr 2022 01:46:52 +0300