Linux's Three Musketeers: grep

柳三千

Last updated: June 11, 2025

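Preface

grep (Global Regular Expression Print) is one of the most commonly used text search tools on Linux and Unix-like systems. It searches text for lines that match a given pattern and prints the matching lines.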

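Use cases

  • Log file analysis: find specific error messages or event records in system log files. For example, grep "ERROR" /var/log/syslog finds the lines in the system log that contain "ERROR".
  • Code search: find a specific function name, variable name, or keyword in source files. For example, grep -r "function_name" /path/to/code finds the lines under the given code directory that contain "function_name".
  • Data filtering: pick out the records that meet a condition from a large amount of text. For example, grep -E "[0-9]{3}-[0-9]{2}-[0-9]{4}" data.txt selects the lines in data.txt that contain a string in the U.S. Social Security number format (such as 123-45-6789). The -E option is required here: without it, grep treats the braces as literal characters.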
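
The interval syntax in the data filtering example behaves differently in basic and extended regular expressions, which is easy to trip over. The following is a minimal sketch of that difference; the sample lines and the temporary file are made up for illustration and are not part of the article's data.

```shell
# Hypothetical sample data in a temporary file (not taken from the article).
sample=$(mktemp)
printf '%s\n' 'id: 123-45-6789' 'phone: 555-0100' 'text: 0{3}-0{2}-0{4}' > "$sample"

# ERE (-E): {3} means "exactly three of the preceding item".
ere_hits=$(grep -E '[0-9]{3}-[0-9]{2}-[0-9]{4}' "$sample")

# BRE (default): the same interval must be written with escaped braces.
bre_hits=$(grep '[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}' "$sample")

# BRE with unescaped braces: they match literal { and } characters instead.
literal_hits=$(grep '[0-9]{3}-[0-9]{2}-[0-9]{4}' "$sample")

printf 'ERE:               %s\n' "$ere_hits"
printf 'BRE, escaped:      %s\n' "$bre_hits"
printf 'BRE, unescaped:    %s\n' "$literal_hits"

rm -f "$sample"
```

The first two commands select the Social Security style line, while the third selects only the line that literally contains braces.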

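Basic syntax

grep [options] pattern [file...]
  • pattern: the string or regular expression to search for.
  • file: the file or files to search. If no file is given, grep reads from standard input.

Example: find the lines in test.txt that contain "apple":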

grep "apple" test.txt
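
When no file is given, grep works as a filter in a pipeline. The sketch below shows that, along with grep's exit status (0 when something matched, 1 when nothing did), which makes it usable in an if statement. The input lines are invented for the example.

```shell
# With no file argument, grep filters whatever arrives on standard input.
stdin_hits=$(printf '%s\n' 'apple pie' 'banana split' 'pineapple' | grep 'apple')
echo "$stdin_hits"

# -q suppresses output; only the exit status is used.
if printf '%s\n' 'banana split' | grep -q 'apple'; then
  status=found
else
  status=missing
fi
echo "$status"
```

Note that "pineapple" is selected as well, because grep matches substrings unless told otherwise.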

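Common options

  • **-i**: ignore case when matching. For example, grep -i "apple" test.txt finds lines that contain "apple", "Apple", "APPLE", and so on.
  • **-v**: invert the match and print the lines that do not contain the pattern. For example, grep -v "apple" test.txt prints the lines of test.txt that do not contain "apple".
  • **-r/-R**: search recursively through all files in a directory and its subdirectories. For example, grep -r "apple" /path/to/directory. In GNU grep, -R also follows every symbolic link it encounters, while -r follows only the ones named on the command line.
  • **-n**: show the line number of each matching line. For example, grep -n "apple" test.txt
  • **-c**: print only the number of matching lines, not the lines themselves. For example, grep -c "apple" test.txt
  • **-C**: print the given number of context lines before and after each matching line. For example, grep -C 2 "apple" test.txt prints two lines on each side of every match.
  • **-w**: match whole words only. For example, grep -w "apple" test.txt matches "apple" only as a complete word and does not match "applet".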
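
These options can be combined in a single invocation. The following sketch runs a few combinations against a small, made-up file so the effect of each one is visible; the file contents are an assumption for illustration only.

```shell
# Hypothetical five-line sample file.
sample=$(mktemp)
printf '%s\n' 'line one' 'Apple tart' 'applet viewer' 'green apple' 'last line' > "$sample"

# -i and -n together: case-insensitive matches with line numbers.
numbered=$(grep -in 'apple' "$sample")

# -i and -w together: "applet" is skipped because it is not the whole word.
whole_words=$(grep -iw 'apple' "$sample")

# -C 1: one line of context on each side of the match.
context=$(grep -C 1 'green' "$sample")

# -i and -c together: only the number of matching lines.
count=$(grep -ic 'apple' "$sample")

printf '%s\n' "$numbered" '--' "$whole_words" '--' "$context" '--' "$count"
rm -f "$sample"
```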

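Regular expression support

grep uses basic regular expressions (BRE) by default and switches to extended regular expressions (ERE) with the -E option. The most common metacharacters are:

  • **.**: matches any single character. For example, grep "a.e" test.txt matches "abe", "ace", and so on.
  • **\***: matches the preceding character zero or more times. For example, grep "ab*c" test.txt matches "ac", "abc", "abbbc", and so on.
  • **^**: matches the start of a line. For example, grep "^apple" test.txt matches lines that begin with "apple".
  • **$**: matches the end of a line. For example, grep "apple$" test.txt matches lines that end with "apple".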
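
The metacharacters above work the same way in BRE and ERE. ERE additionally provides +, ? and | without backslashes. The sketch below tries both kinds on an invented word list; the words are assumptions chosen to make each pattern's effect obvious.

```shell
# Hypothetical word list, one entry per line.
words=$(mktemp)
printf '%s\n' 'ac' 'abc' 'abbbc' 'color' 'colour' 'apple pie' 'pie apple' > "$words"

# BRE: . is any single character, * is zero or more of the previous character.
dot_hits=$(grep 'a.c' "$words")
star_count=$(grep -c 'ab*c' "$words")

# ERE: + is one or more, ? is zero or one, | is alternation.
plus_hits=$(grep -E '^ab+c$' "$words")
optional_hits=$(grep -E 'colou?r' "$words")
edge_hits=$(grep -E '^apple|apple$' "$words")

printf '%s\n' "$dot_hits" '--' "$star_count" '--' "$plus_hits" '--' "$optional_hits" '--' "$edge_hits"
rm -f "$words"
```

Here 'ab*c' counts "ac", "abc" and "abbbc", while '^ab+c$' drops "ac" because + requires at least one "b".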

Practice file

[ldx@VM-20-5-opencloudos ~]$ cat system_log.txt 
[2024-01-01 10:00:00] INFO: System started successfully.
[2024-01-01 10:05:00] WARN: Disk space is low, only 10GB left.
[2024-01-01 10:10:00] ERROR: Database connection failed.
[2024-01-01 10:15:00] INFO: User "John" logged in.
[2024-01-01 10:20:00] DEBUG: Query executed: SELECT * FROM users;
[2024-01-01 10:25:00] INFO: Service "WebServer" started.
[2024-01-01 10:30:00] WARN: High CPU usage detected, 90%.
[2024-01-01 10:35:00] ERROR: File not found: /var/www/html/index.html.
[2024-01-01 10:40:00] INFO: User "Alice" logged out.
[2024-01-01 10:45:00] DEBUG: Memory allocation: 512MB.
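
To follow along, the practice file can be recreated with a here-document. This is one possible way to do it; working in a scratch directory created by mktemp is an assumption made here so that no existing file is overwritten.

```shell
# Work in a fresh scratch directory (an assumption; any writable directory works).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The quoted 'EOF' keeps the shell from expanding anything inside the text.
cat > system_log.txt <<'EOF'
[2024-01-01 10:00:00] INFO: System started successfully.
[2024-01-01 10:05:00] WARN: Disk space is low, only 10GB left.
[2024-01-01 10:10:00] ERROR: Database connection failed.
[2024-01-01 10:15:00] INFO: User "John" logged in.
[2024-01-01 10:20:00] DEBUG: Query executed: SELECT * FROM users;
[2024-01-01 10:25:00] INFO: Service "WebServer" started.
[2024-01-01 10:30:00] WARN: High CPU usage detected, 90%.
[2024-01-01 10:35:00] ERROR: File not found: /var/www/html/index.html.
[2024-01-01 10:40:00] INFO: User "Alice" logged out.
[2024-01-01 10:45:00] DEBUG: Memory allocation: 512MB.
EOF

# An empty pattern matches every line, so -c gives the total line count.
total_lines=$(grep -c '' system_log.txt)
echo "$total_lines"
```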

Hands-on exercises

  1. Find all log lines at the INFO level

    # Print only the names of the files that contain a match
    [ldx@VM-20-5-opencloudos ~]$ grep -l INFO ./*
    ./system_log.txt
    
    # Print the matching lines and their content
    [ldx@VM-20-5-opencloudos ~]$ grep INFO ./*  
    ./system_log.txt:[2024-01-01 10:00:00] INFO: System started successfully.
    ./system_log.txt:[2024-01-01 10:15:00] INFO: User "John" logged in.
    ./system_log.txt:[2024-01-01 10:25:00] INFO: Service "WebServer" started.
    ./system_log.txt:[2024-01-01 10:40:00] INFO: User "Alice" logged out.
    
  2. Find all log lines that are not at the DEBUG level

    [ldx@VM-20-5-opencloudos ~]$ grep -v 'DEBUG' system_log.txt  
    [2024-01-01 10:00:00] INFO: System started successfully.
    [2024-01-01 10:05:00] WARN: Disk space is low, only 10GB left.
    [2024-01-01 10:10:00] ERROR: Database connection failed.
    [2024-01-01 10:15:00] INFO: User "John" logged in.
    [2024-01-01 10:25:00] INFO: Service "WebServer" started.
    [2024-01-01 10:30:00] WARN: High CPU usage detected, 90%.
    [2024-01-01 10:35:00] ERROR: File not found: /var/www/html/index.html.
    [2024-01-01 10:40:00] INFO: User "Alice" logged out.
    
  3. Find the log lines at the ERROR level and show their line numbers

    [ldx@VM-20-5-opencloudos ~]$ grep -n "ERROR" system_log.txt  
    3:[2024-01-01 10:10:00] ERROR: Database connection failed.
    8:[2024-01-01 10:35:00] ERROR: File not found: /var/www/html/index.html.
    
  4. Recursively find all lines under the current directory that contain User

    [ldx@VM-20-5-opencloudos ~]$ grep -r "User" ./*  
    ./system_log.txt:[2024-01-01 10:15:00] INFO: User "John" logged in.
    ./system_log.txt:[2024-01-01 10:40:00] INFO: User "Alice" logged out.
    
  5. Find the lines in which Service appears as a whole word

    [ldx@VM-20-5-opencloudos ~]$ grep -w "Service" system_log.txt  
    [2024-01-01 10:25:00] INFO: Service "WebServer" started.
    
  6. Count the log lines at the WARN level

    [ldx@VM-20-5-opencloudos ~]$ grep -c "WARN" system_log.txt  
    2
    
  7. Find the lines that start with a date (in the form [YYYY-MM-DD)

    [ldx@VM-20-5-opencloudos ~]$ grep -E '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}' system_log.txt
    [2024-01-01 10:00:00] INFO: System started successfully.
    [2024-01-01 10:05:00] WARN: Disk space is low, only 10GB left.
    [2024-01-01 10:10:00] ERROR: Database connection failed.
    [2024-01-01 10:15:00] INFO: User "John" logged in.
    [2024-01-01 10:20:00] DEBUG: Query executed: SELECT * FROM users;
    [2024-01-01 10:25:00] INFO: Service "WebServer" started.
    [2024-01-01 10:30:00] WARN: High CPU usage detected, 90%.
    [2024-01-01 10:35:00] ERROR: File not found: /var/www/html/index.html.
    [2024-01-01 10:40:00] INFO: User "Alice" logged out.
    [2024-01-01 10:45:00] DEBUG: Memory allocation: 512MB.
    
  8. Find the lines that contain two consecutive digits (extended regular expression)

    [ldx@VM-20-5-opencloudos ~]$ grep -E "[0-9]{2}" system_log.txt  
    [2024-01-01 10:00:00] INFO: System started successfully.
    [2024-01-01 10:05:00] WARN: Disk space is low, only 10GB left.
    [2024-01-01 10:10:00] ERROR: Database connection failed.
    [2024-01-01 10:15:00] INFO: User "John" logged in.
    [2024-01-01 10:20:00] DEBUG: Query executed: SELECT * FROM users;
    [2024-01-01 10:25:00] INFO: Service "WebServer" started.
    [2024-01-01 10:30:00] WARN: High CPU usage detected, 90%.
    [2024-01-01 10:35:00] ERROR: File not found: /var/www/html/index.html.
    [2024-01-01 10:40:00] INFO: User "Alice" logged out.
    [2024-01-01 10:45:00] DEBUG: Memory allocation: 512MB.
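
The exercises can be pushed a little further. The sketch below works on a shortened copy of the practice log and tries four variations: several levels in one search, a count per level, extracting only the matched text with -o, and a pattern that targets measurements rather than any two digits. The temporary file and variable names are assumptions for illustration. -o is available in GNU and BSD grep but is not required by POSIX.

```shell
# A shortened copy of the practice log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
[2024-01-01 10:05:00] WARN: Disk space is low, only 10GB left.
[2024-01-01 10:10:00] ERROR: Database connection failed.
[2024-01-01 10:15:00] INFO: User "John" logged in.
[2024-01-01 10:30:00] WARN: High CPU usage detected, 90%.
[2024-01-01 10:45:00] DEBUG: Memory allocation: 512MB.
EOF

# Exercise 3 variation: two levels at once, with line numbers.
# Including "] " and ":" ties the match to the level field, not the message.
problems=$(grep -nE '] (WARN|ERROR):' "$log")

# Exercise 6 variation: one count per level.
summary=""
for level in INFO WARN ERROR DEBUG; do
  count=$(grep -c "] $level:" "$log")
  summary="${summary:+$summary }$level=$count"
done

# Exercise 7 variation: -o prints only the matched part of each line.
dates=$(grep -oE '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}' "$log" | tr -d '[' | sort -u)

# Exercise 8 variation: [0-9]{2} matches every line because of the timestamp,
# so look for a number followed by a unit instead.
measured=$(grep -E '[0-9]+(GB|MB|%)' "$log")

printf '%s\n' "$problems" '--' "$summary" '--' "$dates" '--' "$measured"
rm -f "$log"
```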
    
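Copyright notice: unless otherwise stated, all articles are original works of 柳三千运维录. If you repost or copy this article, please credit the source with a hyperlink.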
