Linux study notes

Posted May 28, 2020 (3 min read)

sed

Full name: stream editor

Single quotes & double quotes & backquotes

Single quotes

Characters inside single quotes are ordinary characters with no special meaning, and that includes the escape character (backslash). Everything is output exactly as written.
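A small sketch of this rule. The variable name is made up for the illustration, and printf is used instead of echo because some shells let echo interpret \n on its own.

```shell
# Hypothetical variable, used only for this illustration.
name="world"

# Inside single quotes, $name, the backslash and the backticks all stay literal.
single='$name \n `date`'

# Prints the text exactly as typed between the single quotes.
printf '%s\n' "$single"
```

Nothing is expanded and no command is run; the shell hands the text over untouched.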

Double quotes

Inside double quotes, three characters keep a special meaning: $, ` and \. $ references the value of a variable, and a pair of backticks (`...`) runs the enclosed command and substitutes its output. The backslash escapes these two characters, the double quote itself, and the backslash itself.

To print these special characters literally you need to escape them; otherwise the command fails.

# Printing a backtick directly fails: the shell keeps waiting for the closing backtick
[root@hadoop usr]# echo "`"
> ^C
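For contrast, here is a sketch of the escaped form, assuming a POSIX-style shell such as bash. The variable name and its value are made up for the example.

```shell
# Hypothetical variable for the demonstration.
user="alice"

# Inside double quotes, $ expands a variable and $(...) runs a command.
expanded="user=$user, upper=$(printf '%s' "$user" | tr 'a-z' 'A-Z')"

# A backslash turns $, ` and " (and \ itself) back into ordinary characters.
literal="\$user \` \" \\"

printf '%s\n' "$expanded"   # user=alice, upper=ALICE
printf '%s\n' "$literal"    # $user ` " \
```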


Backticks

A string enclosed in backticks is executed as a command. Backticks are equivalent to $(); the latter is recommended because backticks are easily confused with single quotes.
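A short sketch comparing the two forms. The path is only an example string; basename and dirname work on the text and do not need the file to exist.

```shell
# Both forms run the command and substitute its output.
with_backticks=`basename /usr/local/bin`
with_dollar=$(basename /usr/local/bin)

# $(...) nests without extra escaping; nested backticks need \` on the inner level.
nested=$(basename "$(dirname /usr/local/bin/tool)")
nested_bt=`basename \`dirname /usr/local/bin/tool\``

printf '%s\n' "$with_backticks" "$with_dollar" "$nested" "$nested_bt"
```

All four values come out as bin, but the nested backtick version is noticeably harder to read, which is another reason to prefer $().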

Summary

  • When the string to match or the replacement contains single quotes, wrap the sed expression in double quotes, because a single quote cannot appear inside a single-quoted string.

  • When you want to match or replace double quotes, you can use single quotes, or use double quotes and escape the inner ones with \.

  • In a sed command the quotes are parsed by the shell; sed only receives the result of that parsing. Different shells may parse differently, so the safest approach is to write the sed script to a file and run it with sed -f, which avoids shell quoting altogether.

    # Wrong: inside double quotes the shell expands $s as a variable, so sed reports an error.
    sed "1,3s/my/your/g; 3,$s/This/That/g" my.txt

    # Correct: single quotes prevent any shell parsing
    sed '1,3s/my/your/g; 3,$s/This/That/g' my.txt
    # Correct: escape the $ symbol
    sed "1,3s/my/your/g; 3,\$s/This/That/g" my.txt
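The sed -f approach from the last point could look roughly like this. The file name rules.sed and the three sample lines are assumptions for the sketch; the notes do not show the real contents of my.txt.

```shell
# Scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# Hypothetical sample input standing in for my.txt.
printf '%s\n' 'This is my cat' 'This is my dog' 'This is my fish' > "$workdir/my.txt"

# The quoted heredoc delimiter ('EOF') stops the shell from expanding $s.
cat > "$workdir/rules.sed" <<'EOF'
1,3s/my/your/g
3,$s/This/That/g
EOF

# sed reads the script from the file, so no shell quoting is involved.
result=$(sed -f "$workdir/rules.sed" "$workdir/my.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With this input, lines 1 to 3 get "your" and only line 3 onward gets "That".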

Matching lines

Replace only the first s on each line:

$ sed 's/s/S/1' my.txt

Replace only the second s on each line:

$ sed 's/s/S/2' my.txt

Replace the third and all subsequent s on the first line only:

$ sed '1s/s/S/3g' my.txt
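To see what the occurrence flags do, here is a sketch on a made-up input line instead of my.txt. Combining a number with g (as in 3g) is, as far as I know, a GNU sed extension, so the last command assumes GNU sed.

```shell
# Hypothetical sample line with six occurrences of "s".
line='she sells sea shells'

# Replace only the first s.
first=$(printf '%s\n' "$line" | sed 's/s/S/1')

# Replace only the second s.
second=$(printf '%s\n' "$line" | sed 's/s/S/2')

# GNU sed: replace the third s and every later one.
third_on=$(printf '%s\n' "$line" | sed 's/s/S/3g')

printf '%s\n' "$first"      # She sells sea shells
printf '%s\n' "$second"     # she Sells sea shells
printf '%s\n' "$third_on"   # she sellS Sea ShellS
```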

Special symbols and commands

1. &

& stands for the text matched by the pattern and can be used in the replacement.
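A minimal sketch of & in a replacement; the input sentence is invented for the example.

```shell
# Wrap every run of digits in angle brackets; & is whatever the pattern matched.
marked=$(printf '%s\n' 'order 42 ships in 3 days' | sed 's/[0-9][0-9]*/<&>/g')

printf '%s\n' "$marked"   # order <42> ships in <3> days
```

To insert a literal ampersand in the replacement, escape it as \&.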

2. Parentheses

Example of matching with parentheses (the string matched by a regular expression enclosed in parentheses can be used as a variable; sed refers to these as \1, \2, ...):

$ sed 's/This is my \([^,&]*\),.*is \(.*\)/\1:\2/g' my.txt
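The notes do not show what my.txt contains, so the sketch below runs the same kind of expression on one hypothetical line to show what ends up in \1 and \2.

```shell
# Hypothetical line standing in for the contents of my.txt.
line="This is my cat, my cat's name is betty"

# \1 captures the text up to the first comma, \2 captures everything after the last "is ".
pair=$(printf '%s\n' "$line" | sed 's/This is my \([^,&]*\),.*is \(.*\)/\1:\2/g')

printf '%s\n' "$pair"   # cat:betty
```

Because .* is greedy, it swallows as much as it can, so the "is " before the second group is the last one on the line.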

3. BRE & ERE

Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE)

The backslash escapes special characters in a string. I was puzzled when using parentheses: the examples found online all escape the parentheses with \, but shouldn't that match the literal parentheses? In practice, unescaped parentheses are matched literally, and they only act as grouping once they are escaped.

For example:

## Unescaped parentheses match literal parentheses
$ echo "()" | sed 's/()/a/g'
a

## Escaped parentheses form a group that \1 refers to
$ echo "abc" | sed 's/\(b\)/\1\1/g'
abbc

In the regular expressions I had learned before, parentheses only need escaping when you want to match the parentheses themselves. This is the difference between Basic Regular Expressions and Extended Regular Expressions.

Among the Linux text-processing commands, grep and sed use basic regular expressions by default, while egrep and awk use extended regular expressions. grep and sed can also switch to extended regular expressions with the -E and -r options respectively.

Basic regular expressions

The metacharacters of basic regular expressions are:

1. ^
2. $
3. .
4. *
5. []
6. \<
7. \>
8. \(\)
9. \?
10. \+

Many blogs classify ? and + as extended-only metacharacters, and some say the same about |. With GNU grep and sed, however, all of them are also available in basic mode when escaped (\?, \+ and \|); this is a GNU extension rather than part of POSIX BRE.

Testing on Linux shows that the escaped forms of ? and + work as expected:

$ echo "ABC" | sed "s/A\+BC/DDD/"
DDD

$ echo "ABC" | sed "s/A\?BC/DDD/"
DDD

Extended regular expressions

Extended regular expressions simply drop the backslash from some commonly used metacharacters and add a few new ones.

1. +
2. ?
3. ()

This explains why parentheses in sed have to be escaped: sed uses basic regular expressions by default. To use parentheses without backslashes, add the -r option so that sed uses extended regular expressions. The same applies to grep, where you can use egrep or grep -E.
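The two modes side by side, as a sketch assuming GNU sed and GNU grep. It uses -E for sed, which GNU sed accepts as an equivalent of -r; the input strings are invented.

```shell
# BRE (default): grouping needs \( and \).
bre=$(printf '%s\n' 'abc' | sed 's/\(b\)/\1\1/')

# ERE via -E (GNU sed also accepts -r): the same group without backslashes.
ere=$(printf '%s\n' 'abc' | sed -E 's/(b)/\1\1/')

# In ERE the roles flip: a backslash makes the parentheses literal again.
literal=$(printf '%s\n' 'f(x)' | sed -E 's/\(x\)/[x]/')

# grep follows the same split: -E enables the extended syntax.
count=$(printf '%s\n' 'ab' 'abab' 'ba' | grep -c -E '^(ab)+$')

printf '%s\n' "$bre" "$ere" "$literal" "$count"
```

The first two commands both give abbc, the third gives f[x], and grep counts the 2 lines made up entirely of repeated "ab".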