Hongmeng kernel source code analysis (memory assembly) | What assembly code does the memory implementation involve? | v11.02

Hongmeng kernel source code with Chinese comments [CSDN repository | Gitee repository | Github repository | Coding repository]: Chinese comments are added to the HarmonyOS source code line by line, and the design details are explained, to help you read the HarmonyOS kernel source code quickly and master how the whole Hongmeng kernel operates. The four code repositories are synchronized every day.

Hongmeng source code analysis series [CSDN | OSCHINA]: the documents are organized by HarmonyOS architecture layer, and the series first tries to deconstruct the kernel by telling stories set in everyday life.

This chapter explains the assembly part of the memory source code. For details, see: /kernel/base/vm -- kernel_liteos_a\arch\arm\arm

ARM-CP15 coprocessor

The ARM processor uses the registers of coprocessor 15 (CP15) to control the cache, TCM and memory management. CP15 registers can only be accessed with the MCR (Move to Coprocessor from ARM Register) and MRC (Move to ARM Register from Coprocessor) instructions. CP15 contains 16 32-bit registers, numbered c0 to c15; one number can cover several physical registers, which are distinguished by CRm and opcode_2. This article focuses on the c7, c2 and c13 registers.

First, take apart a piece of assembly code

You cannot really read kernel source code without touching assembly, but don't be afraid; it is not that terrible. Taken from shallow to deep, the kernel is actually a lot of fun. See arm.h, which is full of code like this.

#define DSB __asm__ volatile("dsb" ::: "memory")
#define ISB __asm__ volatile("isb" ::: "memory")
#define DMB __asm__ volatile("dmb" ::: "memory")

STATIC INLINE VOID OsArmWriteBpiallis(UINT32 val)
{
    __asm__ volatile("mcr p15, 0, %0, c7,c1,6" ::"r"(val));
    __asm__ volatile("isb" ::: "memory");
}

| Instruction | Description | Syntax format |
| --- | --- | --- |
| mcr | Write the data in an ARM processor register to a CP15 register | mcr{<cond>} p15, <opcode_1>, <rd>, <crn>, <crm>, {<opcode_2>} |
| mrc | Read the data in a CP15 register into an ARM processor register | mrc{<cond>} p15, <opcode_1>, <rd>, <crn>, <crm>, {<opcode_2>} |

cond: condition code for instruction execution. When cond is omitted, the instruction is executed unconditionally.
opcode_1: coprocessor-specific opcode. It is 0 for the CP15 accesses shown in this article.
Rd: the ARM register. For mcr it is the source register whose value is transferred to the coprocessor register; for mrc it is the destination register that receives the value of the coprocessor register. Hand-written examples usually use R0.
CRn: the primary coprocessor register, the target of mcr or the source of mrc, numbered c0 to c15.
CRm: additional coprocessor register operand. If no additional information is needed, set CRm to c0; otherwise the result is unpredictable.
opcode_2: optional coprocessor-specific opcode, used to distinguish different physical registers that share the same number. When no additional information is required, specify 0.
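To make the operand fields concrete, here is a minimal sketch that packs them into the 32-bit machine word of an unconditional ARM-state mcr/mrc instruction. The helper name EncodeCoprocMove is hypothetical and does not exist in the kernel; the bit positions follow the ARMv7-A A1 encoding, and the condition field is fixed to "always" as an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the kernel.
 * Bit layout of an ARM-state MCR/MRC instruction (ARMv7-A, A1 encoding):
 *   cond[31:28] 1110[27:24] opc1[23:21] L[20] CRn[19:16]
 *   Rt[15:12] coproc[11:8] opc2[7:5] 1[4] CRm[3:0]
 * L = 0 selects mcr (ARM register -> coprocessor),
 * L = 1 selects mrc (coprocessor -> ARM register). */
static uint32_t EncodeCoprocMove(int isRead, uint32_t coproc, uint32_t opc1,
                                 uint32_t rt, uint32_t crn, uint32_t crm,
                                 uint32_t opc2)
{
    /* Every field must fit in its bit range. */
    assert(coproc < 16 && opc1 < 8 && rt < 16);
    assert(crn < 16 && crm < 16 && opc2 < 8);

    return (0xEu << 28)                 /* cond = AL (unconditional) */
         | (0xEu << 24)                 /* fixed bits 1110 */
         | (opc1 << 21)
         | ((isRead ? 1u : 0u) << 20)   /* L bit */
         | (crn << 16)
         | (rt << 12)
         | (coproc << 8)
         | (opc2 << 5)
         | (1u << 4)                    /* fixed bit */
         | crm;
}
```

Calling EncodeCoprocMove(0, 15, 0, 0, 7, 1, 6) describes "mcr p15, 0, r0, c7, c1, 6", the instruction used by OsArmWriteBpiallis when the compiler happens to pick r0 for the operand. The point is only that opcode_1, CRn, CRm and opcode_2 are plain bit fields of one instruction word.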

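The literal meaning of this assembly instruction is: write the value of an ARM general-purpose register (the %0 operand, which holds val passed in from outside) to register c7 of CP15. The c1 and 6 operands select which c7 operation is performed; as the function name suggests, this one is BPIALLIS, which invalidates all branch predictor entries in the Inner Shareable domain.

For example, OsArmWriteBpiallis(0) does four things:

1. The value 0 is placed in a general-purpose register of the ARM core, that is, of the CPU. "r"(val) is an input operand constraint: it tells the GCC compiler to load val into a general-purpose register of its own choosing (often R0, but that is not guaranteed) and to substitute that register for %0. The compiler is told in advance what the assembly needs, all very civilized. The compiler is in fact very powerful; it is not just a tool that translates code.

2. volatile tells the compiler not to optimize this assembly away, and to emit the instructions intact.

3. "isb" ::: "memory" does two jobs. The "memory" clobber tells the compiler that memory may have been changed, so it must not keep memory values cached in registers across this statement or reorder memory accesses around it. It does not touch the hardware cache. The isb instruction itself flushes the processor pipeline, so the instructions that follow are fetched only after the CP15 write has taken effect.

4. The value of that general-purpose register is written into c7, a register of the CP15 coprocessor. What is c7 responsible for? Check the table below.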

What registers does CP15 have

| Register number | Basic function | Role in MMU | Role in PU |
| --- | --- | --- | --- |
| 0 | ID code (read only) | ID code and cache type | |
| 1 | Control bits (read/write) | Various control bits | |
| 2 | Memory protection and control | Translation table base address | Cacheability control bits |
| 3 | Memory protection and control | Domain access control bits | Bufferability control bits |
| 4 | Memory protection and control | Reserved | Reserved |
| 5 | Memory protection and control | Memory fault status | Access control bits |
| 6 | Memory protection and control | Memory fault address | Protection region control |
| 7 | Cache and write buffer | Cache and write buffer control | |
| 8 | Memory protection and control | TLB control | Reserved |
| 9 | Cache and write buffer | Cache lockdown | |
| 10 | Memory protection and control | TLB lockdown | Reserved |
| 11 | Reserved | | |
| 12 | Reserved | | |
| 13 | Process identifier | Process identifier | |
| 14 | Reserved | | |
| 15 | Varies by design | Varies by design | Varies by design |

So c7 handles cache and write buffer control. The other registers will be discussed below; for now, a general impression is enough.

Where does the MMU get page table information? The answer is: TTB

TTB register (Translation table base)

Referring to the table above, the TTB register is the c2 register of the CP15 coprocessor. It holds the base address of the page table, that is, the base address of the first-level mapping descriptor table. Hongmeng provides read and write functions around TTB. In short, the kernel keeps modifying and reading this register from the outside, while the MMU only reads its value directly in hardware. That is how the MMU obtains a different page table for each process and translates that process's virtual addresses to physical addresses. Remember? The page table of each process is independent!

Under what circumstances is the value modified? Changing the page table means the MMU is switching context. Just look at the code.

MMU context

The write to TTB is made by this one function only, so LOS_ArchMmuContextSwitch is without doubt the key function.

typedef struct ArchMmu {
    LosMux              mtx;            /**< arch mmu page table entry modification mutex lock */
    VADDR_T             *virtTtb;       /**< translation table base virtual addr */
    PADDR_T             physTtb;        /**< translation table base phys addr */
    UINT32              asid;           /**< TLB asid */
    LOS_DL_LIST         ptList;         /**< page table vm page list */
} LosArchMmu;

// mmu context switching
VOID LOS_ArchMmuContextSwitch(LosArchMmu *archMmu)
{
    UINT32 ttbr;
    UINT32 ttbcr = OsArmReadTtbcr();//Read the TTB control register (TTBCR)
    if (archMmu) {
        ttbr = MMU_TTBRx_FLAGS | (archMmu->physTtb);//Process TTB physical address value
        /* enable TTBR0 */
        ttbcr &= ~MMU_DESCRIPTOR_TTBCR_PD0;//Enable TTBR0
    } else {
        ttbr = 0;
        /* disable TTBR0 */
        ttbcr |= MMU_DESCRIPTOR_TTBCR_PD0;
    }

    /* from armv7a arm B3.10.4, we should do synchronization changes of ASID and TTBR. */
    OsArmWriteContextidr(LOS_GetKVmSpace()->archMmu.asid);//Here, first switch the asid to the ID of kernel space
    ISB;
    OsArmWriteTtbr0(ttbr);//Write the process page table base address to TTBR0
    ISB;
    OsArmWriteTtbcr(ttbcr);//Write TTB status bit
    ISB;
    if (archMmu) {
        OsArmWriteContextidr(archMmu->asid);//Write the process's ASID to the c13 register (CONTEXTIDR)
        ISB;
    }
}
// c13 asid (Address Space ID): process identifier
STATIC INLINE VOID OsArmWriteContextidr(UINT32 val)
{
    __asm__ volatile("mcr p15, 0, %0, c13,c0,1" ::"r"(val));
    __asm__ volatile("isb" ::: "memory");
}
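The order of register writes in LOS_ArchMmuContextSwitch is easier to follow on a model that runs anywhere. The sketch below replaces the CP15 registers with a plain struct; SimMmuRegs, SimMmuContextSwitch and SIM_TTBCR_PD0 are hypothetical names invented for this illustration, and placing PD0 at bit 4 of TTBCR is an assumption based on the ARMv7-A short-descriptor format. The barriers are left out because there is no real hardware here.

```c
#include <stdint.h>

/* Assumption: TTBCR.PD0 is bit 4; when set, table walks through TTBR0 are disabled. */
#define SIM_TTBCR_PD0 (1u << 4)

/* Hypothetical stand-in for the three CP15 registers the switch touches. */
typedef struct {
    uint32_t ttbr0;       /* page table base address plus attribute flags */
    uint32_t ttbcr;       /* translation table base control register */
    uint32_t contextidr;  /* holds the current ASID */
} SimMmuRegs;

/* Mirrors the order of writes in LOS_ArchMmuContextSwitch.
 * hasSpace == 0 models the archMmu == NULL branch. */
static void SimMmuContextSwitch(SimMmuRegs *regs, int hasSpace,
                                uint32_t physTtb, uint32_t ttbrFlags,
                                uint32_t asid, uint32_t kernelAsid)
{
    uint32_t ttbr;
    uint32_t ttbcr = regs->ttbcr;

    if (hasSpace) {
        ttbr = ttbrFlags | physTtb;   /* new page table base with flags */
        ttbcr &= ~SIM_TTBCR_PD0;      /* enable walks through TTBR0 */
    } else {
        ttbr = 0;
        ttbcr |= SIM_TTBCR_PD0;       /* disable walks through TTBR0 */
    }

    regs->contextidr = kernelAsid;    /* 1. park on the kernel ASID first */
    regs->ttbr0 = ttbr;               /* 2. then change the page table base */
    regs->ttbcr = ttbcr;              /* 3. then update the control bits */
    if (hasSpace) {
        regs->contextidr = asid;      /* 4. finally install the new ASID */
    }
}
```

Parking on the kernel ASID before touching TTBR0 means that at no moment does the old process's ASID sit next to the new process's page table, which is presumably what the "synchronization changes of ASID and TTBR" comment in the kernel code refers to.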

Now look at the places that call LOS_ArchMmuContextSwitch; the call graph in the figure makes them clear at a glance.

There are four places that switch the MMU context

First: when the scheduling algorithm selects a process whose space differs from the current one, the mapping page table naturally changes too, so the MMU context must be switched. Look at the code directly. There is not much of it, so it is all posted here, with comments. If you don't remember the scheduling algorithm, see Hongmeng kernel source code analysis (scheduling mechanism) in this series, which describes it in detail.

//Scheduling algorithm - process switching
STATIC VOID OsSchedSwitchProcess(LosProcessCB *runProcess, LosProcessCB *newProcess)
{
    if (runProcess == newProcess) {
        return;
    }

#if (LOSCFG_KERNEL_SMP == YES)
    runProcess->processStatus = OS_PROCESS_RUNTASK_COUNT_DEC(runProcess->processStatus);
    newProcess->processStatus = OS_PROCESS_RUNTASK_COUNT_ADD(newProcess->processStatus);

    LOS_ASSERT(!(OS_PROCESS_GET_RUNTASK_COUNT(newProcess->processStatus) > LOSCFG_KERNEL_CORE_NUM));
    if (OS_PROCESS_GET_RUNTASK_COUNT(runProcess->processStatus) == 0) {//Gets the number of tasks in the current process
#endif
        runProcess->processStatus &= ~OS_PROCESS_STATUS_RUNNING;
        if ((runProcess->threadNumber > 1) && !(runProcess->processStatus & OS_PROCESS_STATUS_READY)) {
            runProcess->processStatus |= OS_PROCESS_STATUS_PEND;
        }
#if (LOSCFG_KERNEL_SMP == YES)
    }
#endif
    LOS_ASSERT(!(newProcess->processStatus & OS_PROCESS_STATUS_PEND));//Asserts that the process is not in a blocked state
    newProcess->processStatus |= OS_PROCESS_STATUS_RUNNING;//Set process status to running status

    if (OsProcessIsUserMode(newProcess)) {//Switching process mmu context in user mode
        LOS_ArchMmuContextSwitch(&newProcess->vmSpace->archMmu);//New process -> virtual space -> MMU part
    }

#ifdef LOSCFG_KERNEL_CPUP
    OsProcessCycleEndStart(newProcess->processID, OS_PROCESS_GET_RUNTASK_COUNT(runProcess->processStatus) + 1);
#endif /* LOSCFG_KERNEL_CPUP */

    OsCurrProcessSet(newProcess);//Set process to g_runProcess

    if ((newProcess->timeSlice == 0) && (newProcess->policy == LOS_SCHED_RR)) {//Allocate time slices for out of time slices or initial processes
        newProcess->timeSlice = OS_PROCESS_SCHED_RR_INTERVAL;//Reassign time slice, default 20ms
    }
}

One more remark. The series has mentioned two kinds of context switch: MMU context switching caused by process switching, and CPU context switching caused by task switching. Do you still remember them?

Second: the MMU context is switched when an ELF file is loaded and a new process is born. The details will be covered in Hongmeng kernel source code analysis (startup loading); keep an eye on updates to the series.

The remaining two happen when a virtual space is reclaimed and when it is refreshed; read the code for those yourself.

How can the MMU quickly find the physical address from the virtual address? The answer is: TLB. Note that TTB and TLB are different things: TTB is a register and TLB is a cache. Don't confuse them.

TLB(translation lookaside buffer)

The TLB is a hardware cache. Because the page table is generally large and stored in memory, once a processor has an MMU it needs two memory accesses to read an instruction or a piece of data: first it queries the page table to obtain the physical address, and then it accesses that physical address to read the instruction or data. The TLB was introduced to reduce the performance loss caused by the MMU. Its name, translation lookaside buffer, is sometimes rendered as "address translation backup buffer" or "fast table". In short, the TLB is a cache of the page table: it stores the page table entries most likely to be accessed right now, and its content is a copy of some page table entries. The page table in memory is queried only when the TLB cannot complete the address translation, which reduces the performance loss caused by page table lookups. See the figure for details.

According to the picture, the steps are as follows.

1. The page table base address in the figure is the value of the TTB register described above. The whole page table is very large (how large will be discussed later), so it can only be stored in memory. TTB stores only its start position.

2. The virtual address is the logical address of the program, that is, the address fed to the CPU. It must be converted into a physical address by the MMU before the real instructions and data can be fetched.

3. The TLB is a mini version of the page table. The MMU first looks for the physical page in the TLB, and only if that fails does it look in the page table; what it finds in the page table is then put into the TLB. Note that this step is very important. Every process has its own page table but there is only one TLB, so without some way of telling entries apart, a process could silently hit a translation cached for another process and end up on the wrong physical page frame. Besides the single shared TLB, one more thing is therefore needed: a unique identifier for the process at the mapping level, the asid.
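The first step of the walk, going from the TTB value and a virtual address to one page table entry, is just arithmetic. The sketch below assumes the ARMv7-A short-descriptor format with TTBCR.N = 0 and a 1 MiB section mapping; the function names are hypothetical and the sample addresses in the note are made up for illustration.

```c
#include <stdint.h>

/* Address of the first-level descriptor for a virtual address.
 * Assumption: short-descriptor format with TTBCR.N = 0, so the table
 * is 16 KiB aligned and indexed by the top 12 bits of the address. */
static uint32_t L1DescriptorAddr(uint32_t ttbr0, uint32_t va)
{
    uint32_t tableBase = ttbr0 & 0xFFFFC000u;  /* drop the attribute bits */
    uint32_t index = va >> 20;                 /* one entry per 1 MiB */
    return tableBase | (index << 2);           /* 4 bytes per descriptor */
}

/* Physical address produced by a 1 MiB section descriptor:
 * the section base comes from the descriptor, the offset from the VA. */
static uint32_t SectionPhysAddr(uint32_t sectionDesc, uint32_t va)
{
    return (sectionDesc & 0xFFF00000u) | (va & 0x000FFFFFu);
}
```

With a TTBR0 value of 0x80004059 and the virtual address 0x40123456, the descriptor is fetched from 0x80005004; if that descriptor's section base is 0x81200000, the access lands on 0x81223456. Under the same assumptions the first-level table has 4096 entries of 4 bytes each, which is 16 KiB per process before any second-level tables are counted.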

ASID register

The ASID (Address Space ID) is the process identifier and lives in the c13 register of the CP15 coprocessor. The ASID uniquely identifies a process and provides address space protection for it. When the TLB tries to resolve a virtual page number, it checks that the ASID of the currently running process matches the ASID associated with the virtual page. If they do not match, the lookup is treated as a TLB miss. Besides address space protection, the ASID lets the TLB hold entries for several processes at the same time. If the TLB did not support separate ASIDs, it would have to be flushed every time a page table is selected (for example, on a context switch) to ensure that the next process does not use a wrong address translation.

A bit in each page table entry indicates whether the entry is global (nG = 0, all processes can access it) or non-global (nG = 1, only the owning process can access it). A global entry is not tagged with an ASID in the TLB. A non-global entry is tagged with the ASID, and when the MMU looks it up in the TLB it must check that the tag matches the ASID of the current process. Only a match proves that the current process is entitled to use the entry.
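The matching rule for global and non-global entries can be written down as a small software model. This is only a sketch of the rule described above, not how the hardware is built: SimTlbEntry and SimTlbLookup are hypothetical names, and a real TLB compares all entries in parallel instead of looping over them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one TLB entry. */
typedef struct {
    bool     valid;
    bool     nonGlobal;  /* nG bit: true means the entry belongs to one ASID */
    uint8_t  asid;       /* tag, only meaningful when nonGlobal is true */
    uint32_t vpn;        /* virtual page number */
    uint32_t pfn;        /* physical frame number */
} SimTlbEntry;

/* Returns true on a hit and stores the frame number in *pfn.
 * A non-global entry hits only if its ASID tag equals currentAsid;
 * a global entry hits for every process. */
static bool SimTlbLookup(const SimTlbEntry *tlb, size_t count, uint32_t vpn,
                         uint8_t currentAsid, uint32_t *pfn)
{
    for (size_t i = 0; i < count; i++) {
        const SimTlbEntry *entry = &tlb[i];
        if (!entry->valid || entry->vpn != vpn) {
            continue;
        }
        if (entry->nonGlobal && entry->asid != currentAsid) {
            continue;  /* same virtual page, but another process: treat as a miss */
        }
        *pfn = entry->pfn;
        return true;
    }
    return false;  /* miss: the MMU would now walk the page table */
}
```

Two processes can therefore keep entries for the same virtual page in the TLB at once, each seeing only its own frame, while kernel mappings marked global are shared by all of them.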

Did you see? Flushing the TLB on every MMU context switch would guarantee that the TLB holds only the mappings of the new process, but it would be far too inefficient! Process switches happen on a scale of seconds or fractions of a second, while virtual-to-physical address translation is vastly more frequent, so flushing every time is unrealistic. In reality, the TLB still holds records of physical memory used by many other processes, and of course those processes still have the right to use that memory. So when an application allocates (new) 10 MB of memory and believes it owns it, at the kernel level the memory does not belong to it yet, or others are still using it. Only when the application actually uses 1 MB does that 1 MB of physical memory really become its own. Moreover, after the process is switched out in favour of other processes, the 1 MB it used may no longer be in physical memory at all; it may have been swapped out to disk. See? Those who only care about application development can say it is none of their business and that a rough feel is enough. But anyone who wants to know the kernel must understand that this is happening every minute.

The last function is left for you to read. How are ASIDs allocated?

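/* allocate and free asid */
status_t OsAllocAsid(UINT32 *asid)
{
    UINT32 flags;
    LOS_SpinLockSave(&g_cpuAsidLock, &flags);//Take the ASID spin lock
    UINT32 firstZeroBit = LOS_BitmapFfz(g_asidPool, 1UL << MMU_ARM_ASID_BITS);//Find the first zero (free) bit in the ASID bitmap
    if (firstZeroBit >= 0 && firstZeroBit < (1UL << MMU_ARM_ASID_BITS)) {//A free ASID exists (the >= 0 test is always true for an unsigned value)
        LOS_BitmapSetNBits(g_asidPool, firstZeroBit, 1);//Mark this ASID as in use
        *asid = firstZeroBit;//Hand the ASID back to the caller
        LOS_SpinUnlockRestore(&g_cpuAsidLock, flags);
        return LOS_OK;
    }

    LOS_SpinUnlockRestore(&g_cpuAsidLock, flags);
    return firstZeroBit;//No free ASID: the out-of-range bit index is returned as the status
}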
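For readers who want to try the idea outside the kernel, here is a minimal portable sketch of the same bitmap approach. It does not use the kernel's LOS_BitmapFfz or its spin lock; the Sim* names are hypothetical, the pool is a plain array, and the 8-bit ASID width is an assumption standing in for MMU_ARM_ASID_BITS.

```c
#include <stdint.h>

/* Assumption: 8-bit ASIDs, so 256 identifiers in total. */
#define SIM_ASID_BITS  8u
#define SIM_ASID_COUNT (1u << SIM_ASID_BITS)

/* One bit per ASID; a set bit means the ASID is in use. */
static uint32_t g_simAsidPool[SIM_ASID_COUNT / 32];

/* Finds the lowest free ASID, marks it used and stores it in *asid.
 * Returns 0 on success, -1 when every ASID is taken.
 * No locking here: a real kernel must serialize access to the pool. */
static int SimAllocAsid(uint32_t *asid)
{
    for (uint32_t bit = 0; bit < SIM_ASID_COUNT; bit++) {
        uint32_t mask = 1u << (bit % 32);
        if ((g_simAsidPool[bit / 32] & mask) == 0) {
            g_simAsidPool[bit / 32] |= mask;
            *asid = bit;
            return 0;
        }
    }
    return -1;
}

/* Returns an ASID to the pool so that a later allocation can reuse it. */
static void SimFreeAsid(uint32_t asid)
{
    if (asid < SIM_ASID_COUNT) {
        g_simAsidPool[asid / 32] &= ~(1u << (asid % 32));
    }
}
```

Because the search always starts from bit 0, a freed ASID is handed out again as soon as it is the lowest free one. The sketch also returns a separate error code when the pool is exhausted, a deliberate difference from the kernel function above.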


Posted by dmccabe on Tue, 10 May 2022 03:42:04 +0300