[Operating system] Building a mental model of the Linux kernel VFS virtual file system (super_block, inode, dentry, file)



The virtual file system (VFS, also known as the virtual filesystem switch) is the software layer in the kernel that provides the file system interface to user-space programs. It also allows different file system implementations to coexist in the kernel.

"Everything is a file" is one of the basic philosophies of Linux. This covers not only ordinary files but also directories, character devices, block devices, sockets and so on. The basis for this behavior is the Linux virtual file system mechanism.

Remember system calls? The VFS exposes a single set of system calls for all file systems: it hides the differences between file system types from the upper layer while leaving a common set of APIs in place.
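A minimal user-space sketch of that idea: the helper below (read_first_bytes is a hypothetical name, not a kernel or libc function) uses the same open/read/close sequence regardless of what kind of object the path refers to.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Read up to n bytes from any path through the same three system calls.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_first_bytes(const char *path, void *buf, size_t n)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);   /* the VFS resolves the path to a concrete file system */
    if (fd < 0)
        return -1;

    ssize_t got = read(fd, buf, n);  /* dispatched to that file system's or driver's read method */
    close(fd);
    return got;
}
```

The caller does not change whether the path names a regular file on disk or a character device such as /dev/zero; only the code behind the VFS differs.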

VFS design

To implement the VFS, Linux adopts an object-oriented design and abstracts four main object types:

  • Superblock object: represents a mounted file system. (struct super_block)
  • Inode object: represents a specific file. (struct inode)
  • Directory entry object: represents one component of a path. (struct dentry)
  • File object: represents a file opened by a process. (struct file)
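Object orientation in C is usually expressed with tables of function pointers. The sketch below is a user-space toy, not kernel code: file_model, file_ops_model and the two toy "file systems" are hypothetical names invented for illustration, loosely in the spirit of struct file and struct file_operations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct file_model;                       /* hypothetical stand-in for struct file */

/* Table of methods, in the spirit of struct file_operations. */
struct file_ops_model {
    size_t (*read)(struct file_model *f, char *buf, size_t n);
};

struct file_model {
    const struct file_ops_model *f_op;   /* method table chosen by the file system */
    const char *data;                    /* backing data for the toy file systems */
};

/* Toy file system 1: returns the stored bytes. */
static size_t plain_read(struct file_model *f, char *buf, size_t n)
{
    size_t len = strlen(f->data);
    if (len > n)
        len = n;
    memcpy(buf, f->data, len);
    return len;
}

/* Toy file system 2: always fills the buffer with zero bytes. */
static size_t zero_read(struct file_model *f, char *buf, size_t n)
{
    (void)f;
    memset(buf, 0, n);
    return n;
}

static const struct file_ops_model plain_ops = { .read = plain_read };
static const struct file_ops_model zero_ops  = { .read = zero_read };

/* The generic layer never needs to know which file system it is calling. */
size_t vfs_read_model(struct file_model *f, char *buf, size_t n)
{
    assert(f != NULL && f->f_op != NULL && f->f_op->read != NULL);
    return f->f_op->read(f, buf, n);
}
```

A file object set up with plain_ops and one set up with zero_ops are both read through vfs_read_model; swapping the table is all it takes to change behavior.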

Points of confusion

Linux treats a directory as a file object. It is just another form of file, one that contains one or more directory entries. A directory entry is a separate abstract object that mainly holds a file name and an inode number. Directories can be nested layer by layer to form file paths, so each component of a path is itself a directory entry.
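The name plus inode number pairing can be observed from user space. A small sketch using the standard opendir/readdir interface (list_dir_entries is a hypothetical helper name):

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Print every directory entry in `path` as "inode-number name".
 * Returns the number of entries seen, or -1 if the directory cannot be opened. */
int list_dir_entries(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        /* Each entry carries exactly the two things described above. */
        printf("%lu %s\n", (unsigned long)de->d_ino, de->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Note that struct dirent is the user-space view of a directory entry; it is related to, but not the same thing as, the kernel's struct dentry.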

struct super_block (represents a mounted file system)


A superblock represents one mounted file system. Each file system type, such as ext3 or ext4, has a corresponding super_block structure. A machine can have multiple hard disks, and a hard disk can have multiple partitions; each partition has its own file system type. The superblock also maintains various information related to the file system. All super_block structures are organized in a linked list.

struct super_block {
	struct list_head	s_list;		//Links this superblock into the global super_blocks list
	dev_t			s_dev;		//Identifier of the block device holding the file system
	unsigned char		s_blocksize_bits;	//Block size expressed as a power of two
	unsigned long		s_blocksize;	//Block size of the file system, in bytes
	loff_t			s_maxbytes;	//Maximum file size supported by the file system
	struct file_system_type	*s_type;	//File system type, such as ext3 or ext4
	const struct super_operations	*s_op;	//Superblock operation functions
	const struct dquot_operations	*dq_op;	//File system quota operations
	const struct quotactl_ops	*s_qcop;	//Disk quota control operations
	const struct export_operations *s_export_op;
	unsigned long		s_flags;	//Mount flags of the file system
	unsigned long		s_iflags;	/* internal SB_I_* flags */
	unsigned long		s_magic;	//Magic number of this file system type
	struct dentry		*s_root;	//dentry of the file system's root directory
	struct block_device	*s_bdev;	//Corresponding block device
	struct backing_dev_info *s_bdi;	//Backing device info (BDI) for this superblock
	struct mtd_info		*s_mtd;
	//Links this superblock into the fs_supers list of its file_system_type
	struct hlist_node	s_instances;
	char			s_id[32];	/* Informational name */
	uuid_t			s_uuid;		/* UUID; can be viewed with tune2fs -l */

	void 			*s_fs_info;	//Points to the file-system-specific superblock structure, such as ext4_sb_info
	const struct dentry_operations *s_d_op;	//Default dentry operations for this superblock
	struct shrinker s_shrink;	//Per-superblock shrinker, used for memory reclamation
	/* AIO completions deferred from interrupt context */
	struct workqueue_struct *s_dio_done_wq;
	//List of unused dentries belonging to this superblock
	struct list_lru		s_dentry_lru ____cacheline_aligned_in_smp;
	//List of unused inodes belonging to this superblock
	struct list_lru		s_inode_lru ____cacheline_aligned_in_smp;
	/* s_inode_list_lock protects s_inodes */
	spinlock_t		s_inode_list_lock ____cacheline_aligned_in_smp;
	struct list_head	s_inodes;	//All inodes belonging to this superblock

	spinlock_t		s_inode_wblist_lock;
	struct list_head	s_inodes_wb;	//Inodes of this superblock that are under writeback
	/* ... remaining fields omitted ... */
};

PS: the information in the superblock is human readable! You can cat a certain file to view it. I forget which one... I saw it in a video.

const struct dentry_operations *s_d_op stores the file system's directory entry operations (a table of function pointers).
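Some superblock-level values can also be queried from user space. The sketch below uses the Linux statfs(2) call, assuming a glibc system; print_fs_info is a hypothetical helper name, and the mapping to s_magic and s_blocksize in the comments is an approximation, since each file system decides what it reports.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/vfs.h>

/* Print a few superblock-derived values for the file system containing `path`.
 * Returns 0 on success, -1 on error. */
int print_fs_info(const char *path)
{
    assert(path != NULL);

    struct statfs st;
    if (statfs(path, &st) != 0)
        return -1;

    printf("magic number : 0x%lx\n", (unsigned long)st.f_type);  /* compare s_magic */
    printf("block size   : %ld\n", (long)st.f_bsize);            /* compare s_blocksize */
    printf("max name len : %ld\n", (long)st.f_namelen);
    return 0;
}
```

On an ext4 mount, for instance, the magic number printed is 0xef53.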

struct file_system_type

A superblock is an actual mounted file system, whereas file_system_type describes the type of a file system. The types are also organized in a linked list.

Each mounted file system links its own information into super_blocks, a global linked list.

From file_system_type to super_block

The kernel does this in two steps:

First, each file system must call the register_filesystem function to attach its own file_system_type to the global variable file_systems.

Then the kern_mount function is called to link the file system's superblock, together with its table of file operation functions, into super_blocks. Each file system type must implement its own routine for reading the superblock (get_sb in older kernels; the mount callback in the structure shown here).

struct file_system_type {
	const char *name; //File system name, such as ext4 or xfs
	struct dentry *(*mount) (struct file_system_type *, int,
		       const char *, void *); //Mount function for this file system type
	void (*kill_sb) (struct super_block *); //Called to tear down a superblock of this type
	struct module *owner;
	struct file_system_type * next; //Links all file system types on the system together
	struct hlist_head fs_supers; //Head of the list of superblocks of this file system type
	/* ... remaining fields omitted ... */
};
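The first registration step can be pictured with a user-space model of the type list. This is a sketch, not the kernel source: fs_type_model, file_systems_model, find_fs_model and register_fs_model are hypothetical names, and rejecting a duplicate name with -EBUSY is modeled on how the kernel is understood to behave.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct file_system_type. */
struct fs_type_model {
    const char *name;
    struct fs_type_model *next;      /* links all registered types together */
};

static struct fs_type_model *file_systems_model;   /* head of the global list */

/* Return the address of the link that points to `name`, or of the
 * terminating NULL link if the name is not registered. */
static struct fs_type_model **find_fs_model(const char *name)
{
    struct fs_type_model **p;
    for (p = &file_systems_model; *p != NULL; p = &(*p)->next)
        if (strcmp((*p)->name, name) == 0)
            break;
    return p;
}

/* Append `fs` to the list; refuse a duplicate name with -EBUSY. */
int register_fs_model(struct fs_type_model *fs)
{
    assert(fs != NULL && fs->name != NULL);

    struct fs_type_model **p = find_fs_model(fs->name);
    if (*p != NULL)
        return -EBUSY;
    fs->next = NULL;
    *p = fs;
    return 0;
}
```

Walking the list through a pointer to the link, rather than a pointer to the node, lets the same loop serve both lookup and append without a special case for the empty list.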

struct inode (a specific object stored on the physical device)

When a file is created, an inode is assigned to it. An inode corresponds to exactly one actual file, and a file has exactly one inode. The maximum number of inodes is therefore the maximum number of files.

The static information in the inode structure is taken from the file system on the physical device and filled in by a function supplied by that file system. The struct inode itself exists only in memory and is accessed through the inode cache.

/* Layout as found in older (2.4-era) kernels */
struct inode {
    struct list_head i_hash;
    struct list_head i_list;
    struct list_head i_dentry;
    struct list_head i_dirty_buffers;
    unsigned long i_ino; /* Each inode has a number. Given the superblock and this number, the inode can be located easily. */
    atomic_t i_count; /* Many kernel structures keep a reference count so that a structure in use is not released by accident; i_count is this inode's reference count. */
    kdev_t i_dev; /* Device number of the device holding the inode */
    umode_t i_mode; /* File type and permissions */
    nlink_t i_nlink; /* Number of hard links */
    uid_t i_uid; /* Owner's user id */
    gid_t i_gid; /* Owner's group id */
    kdev_t i_rdev; /* If the inode represents a device, the device number of that device */
    off_t i_size; /* Size of the file represented by the inode */
    time_t i_atime; /* Last access time */
    time_t i_mtime; /* Last modification time */
    time_t i_ctime; /* Last status change time */
    unsigned long i_blksize; /* Block size used for I/O */
    unsigned long i_blocks; /* Number of blocks used; one block is 512 bytes */
    unsigned long i_version; /* Version number */
    unsigned short i_bytes;
    struct semaphore i_sem;
    struct rw_semaphore i_truncate_sem;
    struct semaphore i_zombie;
    struct inode_operations *i_op;
    struct file_operations *i_fop; /* former ->i_op->default_file_ops */
    struct super_block *i_sb; /* Superblock of the file system the inode belongs to */
    wait_queue_head_t i_wait;
    struct file_lock *i_flock; /* Used for file locks */
    struct address_space *i_mapping;
    struct address_space i_data;
    struct dquot *i_dquot [MAXQUOTAS];
    /* These three should probably be a union */
    struct pipe_inode_info *i_pipe;
    struct block_device *i_bdev;
    struct char_device *i_cdev;
    unsigned long i_dnotify_mask; /* Directory notify events */
    struct dnotify_struct *i_dnotify; /* for directory notifications */
    unsigned long i_state; /* Current state of the inode: an OR combination of I_DIRTY, I_LOCK and I_FREEING */
    unsigned int i_flags; /* Flags of this inode */
    unsigned char i_sock; /* Records whether this inode is a socket */
    atomic_t i_writecount;
    unsigned int i_attr_flags; /* Attribute flags of this inode */
    __u32 i_generation;
    union {
        struct minix_inode_info minix_i;
        struct ext2_inode_info ext2_i;
        struct ext3_inode_info ext3_i;
        struct hpfs_inode_info hpfs_i;
        struct ntfs_inode_info ntfs_i;
        struct msdos_inode_info msdos_i;
        struct umsdos_inode_info umsdos_i;
        struct iso_inode_info isofs_i;
        struct sysv_inode_info sysv_i;
        struct affs_inode_info affs_i;
        struct ufs_inode_info ufs_i;
        struct efs_inode_info efs_i;
        struct romfs_inode_info romfs_i;
        struct shmem_inode_info shmem_i;
        struct coda_inode_info coda_i;
        struct smb_inode_info smbfs_i;
        struct hfs_inode_info hfs_i;
        struct adfs_inode_info adfs_i;
        struct qnx4_inode_info qnx4_i;
        struct reiserfs_inode_info reiserfs_i;
        struct bfs_inode_info bfs_i;
        struct udf_inode_info udf_i;
        struct ncp_inode_info ncpfs_i;
        struct proc_inode_info proc_i;
        struct socket socket_i;
        struct usbdev_inode_info usbdev_i;
        struct jffs2_inode_info jffs2_i;
        void *generic_ip;
    } u;
};

Although each file has a corresponding inode, the system builds the in-memory inode data structure only when it is needed. The in-memory inode structures are linked into lists, and the inode for a given file can be found by traversing them.
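The on-demand behavior can be sketched as a small cache keyed by superblock and inode number. This is a user-space model under stated assumptions: inode_model, iget_model and the 64-bucket hash table are hypothetical, and hashing into buckets (rather than walking one long list) is an illustrative choice here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define INODE_HASH_SIZE 64

/* Hypothetical, simplified in-memory inode. */
struct inode_model {
    unsigned long i_ino;             /* inode number within its file system */
    const void *i_sb;                /* identifies the owning superblock */
    int i_count;                     /* reference count */
    struct inode_model *hash_next;   /* chain within one hash bucket */
};

static struct inode_model *inode_hash[INODE_HASH_SIZE];

static unsigned int hash_ino(const void *sb, unsigned long ino)
{
    return (unsigned int)(((uintptr_t)sb ^ ino) % INODE_HASH_SIZE);
}

/* Return the cached inode for (sb, ino), creating it on first use. */
struct inode_model *iget_model(const void *sb, unsigned long ino)
{
    assert(sb != NULL);

    unsigned int h = hash_ino(sb, ino);
    struct inode_model *in;

    for (in = inode_hash[h]; in != NULL; in = in->hash_next) {
        if (in->i_sb == sb && in->i_ino == ino) {
            in->i_count++;           /* cache hit: take another reference */
            return in;
        }
    }

    in = calloc(1, sizeof(*in));     /* cache miss: a real file system would read the disk here */
    if (in == NULL)
        return NULL;
    in->i_ino = ino;
    in->i_sb = sb;
    in->i_count = 1;
    in->hash_next = inode_hash[h];
    inode_hash[h] = in;
    return in;
}
```

Asking twice for the same (superblock, number) pair yields the same structure with a reference count of 2, while the same number under a different superblock yields a separate inode.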

struct dentry (an entity in main memory)

Why introduce dentry

The directory entry structure is needed because a file is represented by exactly one inode object, yet through hard links the same file can be reached under several different file names. Directory entries sit in between, mapping each name to the inode.
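The hard link case is easy to check from user space. In this sketch, hard_link_shares_inode is a hypothetical helper and the scratch file names are made up for the demonstration; it assumes the given directory is writable and its file system supports hard links.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file in `dir`, add a second name for it with link(),
 * and check that both names resolve to the same inode.
 * Returns 1 if they share an inode, 0 if not, -1 on error. */
int hard_link_shares_inode(const char *dir)
{
    assert(dir != NULL);

    char first[4096];
    char second[sizeof first + 8];
    int n = snprintf(first, sizeof first, "%s/vfs-demo-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof first)
        return -1;

    int fd = mkstemp(first);          /* creates the file, and with it one inode */
    if (fd < 0)
        return -1;
    close(fd);
    snprintf(second, sizeof second, "%s.link", first);

    int result = -1;
    struct stat a, b;
    if (link(first, second) == 0) {   /* a second directory entry, no new inode */
        if (stat(first, &a) == 0 && stat(second, &b) == 0)
            result = (a.st_dev == b.st_dev && a.st_ino == b.st_ino &&
                      a.st_nlink == 2);
        unlink(second);
    }
    unlink(first);
    return result;
}
```

Two names, one inode, and a link count of 2: the names live in directory entries, while everything else about the file lives in the inode.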

struct dentry {
	atomic_t d_count;				/* Reference counter of the directory entry object */
	unsigned int d_flags;			/* Directory entry cache flags */
	spinlock_t d_lock;				/* Spinlock protecting the directory entry object */
	struct inode *d_inode;			/* The inode associated with the file name */
	struct hlist_node d_hash;		/* Links into a hash table bucket list */
	struct dentry *d_parent;		/* Directory entry object of the parent directory */
	struct qstr d_name;				/* File name */
	struct list_head d_lru;			/* Links into the list of unused directory entries */
	union {
		struct list_head d_child;	/* Links into the list of directory entries within the same parent directory */
	 	struct rcu_head d_rcu;		/* Used by RCU when reclaiming the directory entry object */
	} d_u;
	struct list_head d_subdirs;		/* For a directory, head of the list of its child directory entries */
	struct list_head d_alias;		/* Links into the list of directory entries associated with the same inode (aliases) */
	unsigned long d_time;			/* Used by the d_revalidate method */
	struct dentry_operations *d_op;	/* Directory entry methods */
	struct super_block *d_sb;		/* Superblock object of the file */
	void *d_fsdata;					/* File-system-dependent data */
	struct dcookie_struct *d_cookie;/* Pointer to the data structure used by kernel profilers */
	int d_mounted;					/* For a directory, counts the file systems mounted on this directory entry */
	unsigned char d_iname[DNAME_INLINE_LEN_MIN];	/* Space for short file names */
};

Three states of directory entries

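1. In use: d_count > 0, and d_inode points to a valid inode.

2. Unused: d_count == 0; the entry still points to a valid inode but nobody is currently using it.

3. Negative: d_inode == NULL; the directory entry object has no corresponding valid inode.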
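The three states can be written as one small classification function. This is a sketch over a hypothetical two-field dentry_model, and it assumes the NULL-inode check takes precedence over the reference count.

```c
#include <assert.h>
#include <stddef.h>

enum dentry_state { DENTRY_IN_USE, DENTRY_UNUSED, DENTRY_NEGATIVE };

/* Hypothetical, simplified dentry: only the two fields the rule needs. */
struct dentry_model {
    int d_count;          /* reference count */
    const void *d_inode;  /* associated inode, or NULL */
};

enum dentry_state dentry_state_of(const struct dentry_model *d)
{
    assert(d != NULL && d->d_count >= 0);

    if (d->d_inode == NULL)
        return DENTRY_NEGATIVE;   /* no valid inode behind the name */
    if (d->d_count > 0)
        return DENTRY_IN_USE;     /* valid inode and at least one user */
    return DENTRY_UNUSED;         /* valid inode, nobody using it right now */
}
```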

struct file (open file object)


The file structure represents an open file. Every open file in the system has an associated struct file in kernel space. It is created by the kernel when the file is opened and is passed to every function that operates on the file. Once all instances of the file have been closed, the kernel releases the data structure. In kernel and driver source code, a pointer to struct file is usually named file or filp.

struct file {
  union {
    struct list_head fu_list;   //Links the file object into a list (see linux/list.h)
    struct rcu_head fu_rcuhead; //RCU (read-copy-update) is a synchronization mechanism introduced in the Linux 2.6 kernel
  } f_u;
  struct path f_path; //Contains the dentry and mnt members, which identify the file's path
  #define f_dentry f_path.dentry //Member of f_path: the dentry of the current file
  #define f_vfsmnt f_path.mnt //Member of f_path: the mount of the file system containing the current file
  const struct file_operations *f_op; //The operation functions associated with the file
  atomic_t f_count; //Reference count of the file object
  unsigned int f_flags; //The flags specified at open time
  mode_t f_mode; //Read / write access mode (FMODE_READ, FMODE_WRITE)
  loff_t f_pos; //Current file offset for this open file
  struct fown_struct f_owner; //Used to notify a process of I/O events through signals
  unsigned int f_uid, f_gid; //File owner id, owner group id
  struct file_ra_state f_ra; //Defined in include/linux/fs.h; file readahead state
  unsigned long f_version; //Version number, incremented after each use
  void *f_security;
  /* needed for tty driver, and maybe others */
  void *private_data; //Points to data allocated by the driver or file system
  #ifdef CONFIG_EPOLL
  /* Used by fs/eventpoll.c to link all the hooks to this file */
  struct list_head f_ep_links;
  spinlock_t f_ep_lock;
  #endif /* #ifdef CONFIG_EPOLL */
  struct address_space *f_mapping;
};
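Because the file offset lives in the open file object, two separate open() calls on the same path get independent offsets, while a descriptor made with dup() shares one. A user-space sketch (compare_offsets is a hypothetical helper; it assumes the given directory is writable):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a scratch file, read 3 bytes through the first descriptor, and
 * report the offsets seen through a dup()ed descriptor and through a
 * separately opened one. Returns 0 on success, -1 on error. */
int compare_offsets(const char *dir, off_t *dup_off, off_t *reopen_off)
{
    assert(dir != NULL && dup_off != NULL && reopen_off != NULL);

    char path[4096];
    int n = snprintf(path, sizeof path, "%s/vfs-file-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    int fd1 = mkstemp(path);                 /* first open: first open file object */
    if (fd1 < 0)
        return -1;

    int rc = -1, fd2 = -1, fd3 = -1;
    char buf[3];
    if (write(fd1, "abcdef", 6) == 6 && lseek(fd1, 0, SEEK_SET) == 0) {
        fd2 = dup(fd1);                      /* new descriptor, same open file object */
        fd3 = open(path, O_RDONLY);          /* second open: a second open file object */
        if (fd2 >= 0 && fd3 >= 0 && read(fd1, buf, sizeof buf) == 3) {
            *dup_off = lseek(fd2, 0, SEEK_CUR);
            *reopen_off = lseek(fd3, 0, SEEK_CUR);
            rc = 0;
        }
    }
    if (fd2 >= 0)
        close(fd2);
    if (fd3 >= 0)
        close(fd3);
    close(fd1);
    unlink(path);
    return rc;
}
```

After the 3-byte read, the dup()ed descriptor reports offset 3 and the separately opened one reports offset 0.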

dentry and inode

An inode (think of an ext2 inode) corresponds to a specific object on the physical disk. A dentry is an in-memory entity whose d_inode member points to the corresponding inode. In other words, one inode can be linked to multiple dentries at runtime, while d_count records how many references a given dentry currently has.

Depending on the values of d_count and d_inode, a dentry is in one of the following three states:

1. In use: d_count > 0

2. Unused: d_count == 0

3. Negative: d_inode == NULL; the directory entry object has no corresponding valid inode

file_system_type --> file_systems --> super_block diagram

Relationship diagram of each important data structure in VFS


Posted by summoner on Sat, 30 Apr 2022 09:15:48 +0300