awk is a programming language for text processing and pattern matching. Together with sed and grep, it is commonly known as one of the "Three Musketeers" of Linux. Learning awk gives you another option for processing text on the Linux command line. This article focuses on how to use it: after reading, you should have a rough idea of how awk works and be able to try simple commands yourself.
Terminology first
awk treats a text file as a small text database made up of records and fields. By default each line is one record, that is, the record separator is a newline (\n). The record separator can be changed through the built-in variable RS.
Each record is in turn divided into fields. The default field separator is whitespace, that is, spaces or tabs.
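As a rough sketch of how RS changes what counts as a record, the snippet below feeds awk a made-up string whose records are separated by semicolons instead of newlines. The input text and the variable name out are illustrative assumptions, not part of test.txt.

```shell
# Hypothetical input: three records separated by ";" rather than newlines.
# RS is set in a BEGIN block so it takes effect before the first record is read.
out=$(printf 'red apple;green pear;yellow banana' |
  awk 'BEGIN { RS = ";" } { print NR ": " $2 }')
printf '%s\n' "$out"
# prints:
# 1: apple
# 2: pear
# 3: banana
```

Within each record the fields are still split on whitespace, so $2 is the second word of each semicolon-separated chunk.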
1. Basic usage
Like the other Linux commands we use every day, awk follows a fixed format:
    # usage format: awk 'action' file
    # E.g.:
    [root@iamshuaidi ~]# awk '{print $0}' test.txt
    my first language:Java
    second languange:python
    third language:C
Here print prints its arguments, $0 stands for the entire record, and test.txt is the input file. So
    awk '{print $0}' test.txt
prints every record (line) of test.txt.
We just said that records are composed of fields, and the default delimiter for fields is space or tab. Below we print the first field of each record, as follows:
    # print the first field of each line
    [root@iamshuaidi ~]# awk '{print $1}' test.txt
    my
    second
    third
$0 means the whole record, while $1, $2, $3, ... mean the first field of the record, the second field, the third field, and so on.
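Several fields can be printed at once. The following is a minimal sketch that recreates the article's test.txt in a temporary file (the temporary path and the variable names sample and swapped are assumptions made for illustration) and prints the second field before the first.

```shell
# Recreate the article's test.txt in a temporary file (the path is an assumption).
sample=$(mktemp)
printf '%s\n' 'my first language:Java' 'second languange:python' 'third language:C' > "$sample"

# A comma between print arguments separates them with a space in the output.
swapped=$(awk '{ print $2, $1 }' "$sample")
printf '%s\n' "$swapped"
# prints:
# first my
# languange:python second
# language:C third

rm -f "$sample"
```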
We said that the default field delimiter is a space or a tab. Being only a default, it can be overridden: we can explicitly specify the delimiter ourselves. Let's use ":" as our delimiter.
    # print the second field
    [root@iamshuaidi ~]# awk -F ':' '{print $2}' test.txt
    Java
    python
    C
Above we used the option -F to specify the delimiter. Whenever you want a field delimiter other than the default, pass it with -F.
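Two related variations, sketched under the assumption that the article's test.txt is recreated in a temporary file (the path and the variable names are illustrative): the FS variable can be set in a BEGIN block instead of passing -F, and a separator longer than one character is treated as a regular expression.

```shell
# Recreate the article's test.txt in a temporary file (the path is an assumption).
sample=$(mktemp)
printf '%s\n' 'my first language:Java' 'second languange:python' 'third language:C' > "$sample"

# Setting FS in a BEGIN block has the same effect as -F ':'.
by_colon=$(awk 'BEGIN { FS = ":" } { print $2 }' "$sample")
printf '%s\n' "$by_colon"
# prints:
# Java
# python
# C

# '[ :]' is a regular expression: a field ends at either a space or a colon.
by_either=$(awk -F '[ :]' '{ print $1, $2 }' "$sample")
printf '%s\n' "$by_either"
# prints:
# my first
# second languange
# third language

rm -f "$sample"
```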
2. Conditional restrictions
When printing text, we can restrict the output with a condition. The format is as follows:
    awk options 'condition {action}' file
For example, we set the delimiter to ":" and select only the records whose second field is "Java".
    # Print the second field of records whose second field is "Java"
    [root@iamshuaidi ~]# awk -F ':' '$2 == "Java" {print $2}' test.txt
    Java
Print the second field of odd-numbered lines:
    # print the second field of odd-numbered records
    [root@iamshuaidi ~]# awk -F ':' 'NR % 2 == 1 {print $2}' test.txt
    Java
    C
Here NR is a built-in variable that holds the number of the record currently being processed. With the default record separator, that is the current line number.
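Two further forms of the condition, as a hedged sketch on a temporary copy of the article's test.txt (the path and variable names are assumptions): a condition with no action prints the whole matching record, and conditions can be combined with && and ||.

```shell
# Recreate the article's test.txt in a temporary file (the path is an assumption).
sample=$(mktemp)
printf '%s\n' 'my first language:Java' 'second languange:python' 'third language:C' > "$sample"

# No action given: awk prints the entire record that satisfies the condition.
second=$(awk 'NR == 2' "$sample")
printf '%s\n' "$second"
# prints: second languange:python

# Two conditions joined with &&: skip the first record and skip "C".
picked=$(awk -F ':' 'NR > 1 && $2 != "C" { print $2 }' "$sample")
printf '%s\n' "$picked"
# prints: python

rm -f "$sample"
```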
3. Conditional Statements
As in ordinary programming languages, awk provides if, else, while and other control-flow statements.
For example, print the second field of every record from the second one onward:
    [root@iamshuaidi ~]# awk '{if(NR > 1) print $2}' test.txt
    languange:python
    language:C
Note that no -F is given here, so fields are split on whitespace, and that the if statement is written inside the braces "{}".
Let's look at another example:
    # If the first field is less than "s", print the first field, otherwise print the second field
    [root@iamshuaidi ~]# awk '{if($1 < "s") print $1; else print $2}' test.txt
    my
    languange:python
    language:C
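The command above prints the first field if it is less than "s", and the second field otherwise. The comparison is a string comparison: "my" sorts before "s", so the first record prints its first field, while "second" and "third" sort after "s", so those records print their second field.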
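The while statement has no example above, so here is a minimal sketch on a temporary copy of the article's test.txt (the path and variable names are assumptions). It walks through the fields of the first record using NF, the built-in variable that holds the number of fields in the current record.

```shell
# Recreate the article's test.txt in a temporary file (the path is an assumption).
sample=$(mktemp)
printf '%s\n' 'my first language:Java' 'second languange:python' 'third language:C' > "$sample"

# For the first record only, loop over every field and print its index and value.
fields=$(awk 'NR == 1 { i = 1; while (i <= NF) { print i ": " $i; i++ } }' "$sample")
printf '%s\n' "$fields"
# prints:
# 1: my
# 2: first
# 3: language:Java

rm -f "$sample"
```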
4. Functions
awk provides some built-in functions for us to use. The commonly used functions are as follows:
tolower(): converts characters to lowercase.
toupper(): converts characters to uppercase.
length(): returns the length of a string.
substr(): returns a substring.
sqrt(): returns the square root.
rand(): returns a random number.
For example, suppose we want to convert the printed field to uppercase:
    # Convert the first field to uppercase and print it
    [root@iamshuaidi ~]# awk '{print toupper($1)}' test.txt
    MY
    SECOND
    THIRD
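The string functions length() and substr() from the list above can be sketched the same way. The temporary file path and variable names are assumptions. Note that substr() counts positions starting from 1.

```shell
# Recreate the article's test.txt in a temporary file (the path is an assumption).
sample=$(mktemp)
printf '%s\n' 'my first language:Java' 'second languange:python' 'third language:C' > "$sample"

# For each record: the first field, its length, and its first two characters.
info=$(awk '{ print $1, length($1), substr($1, 1, 2) }' "$sample")
printf '%s\n' "$info"
# prints:
# my 2 my
# second 6 se
# third 5 th

rm -f "$sample"
```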
5. Variables
Just now we said that NR is a built-in variable that indicates which record is currently being processed. The commonly used built-in variables are as follows:
NR: the number of the record currently being processed.
NF: the number of fields in the current record.
FILENAME: the name of the current file.
FS: field separator, default is space and tab.
RS: record separator, used to split the input into records, default is newline.
OFS: output field separator, placed between fields when printing, default is a single space.
ORS: output record separator, placed after each record when printing, default is newline.
For example, if we want to print the last field of each record, we can use the variable NF.
    [root@iamshuaidi ~]# awk '{print $NF}' test.txt
    language:Java
    languange:python
    language:C
By the way, the NR variable just now is also very useful, for example:
    # Prefix each line with its line number, which makes the output easier to read
    [root@iamshuaidi ~]# awk '{print NR ". " $0}' test.txt
    1. my first language:Java
    2. second languange:python
    3. third language:C
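The output separator OFS from the variable list can be tried in the same way. This is a rough sketch on a temporary copy of the article's test.txt (the path, the variable names, and the " | " separator are assumptions chosen for illustration).

```shell
# Recreate the article's test.txt in a temporary file (the path is an assumption).
sample=$(mktemp)
printf '%s\n' 'my first language:Java' 'second languange:python' 'third language:C' > "$sample"

# OFS is inserted wherever print arguments are separated by a comma.
table=$(awk 'BEGIN { OFS = " | " } { print NR, NF, $1 }' "$sample")
printf '%s\n' "$table"
# prints:
# 1 | 3 | my
# 2 | 2 | second
# 3 | 2 | third

rm -f "$sample"
```

The first record has three whitespace-separated fields and the other two have two, which is why the NF column reads 3, 2, 2.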
That is basically the end of this article. It is an introductory piece that glosses over many details and only briefly shows how to use awk. For more specific usage, look up the relevant features according to what you want to accomplish.