filebeat 
A log collection client.
Installation
Binary installation
shell
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.17.0-darwin-aarch64.tar.gz
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.17.0-darwin-aarch64.tar.gz.sha512
shasum -a 512 -c filebeat-8.17.0-darwin-aarch64.tar.gz.sha512
tar -xzf filebeat-8.17.0-darwin-aarch64.tar.gz
xattr -d -r com.apple.quarantine filebeat-8.17.0-darwin-aarch64
cd filebeat-8.17.0-darwin-aarch64
Test the configuration syntax: ./filebeat test config -c filebeat.yml
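The download commands above hardcode the macOS ARM build. As a rough sketch, the tarball name could instead be derived from the local platform. The helper name `filebeat_artifact` is hypothetical, and the Linux ARM suffix (`linux-arm64`) is an assumption about Elastic's artifact naming that should be checked against the downloads page before use.

```shell
# Hypothetical helper: print the Filebeat tarball name for an OS/arch pair.
# The naming scheme is inferred from the darwin-aarch64 file used above.
filebeat_artifact() {
  version=$1
  os=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')  # Darwin -> darwin
  arch=$3
  case "$os/$arch" in
    darwin/arm64)  arch=aarch64 ;;  # uname reports arm64; the artifact uses aarch64
    linux/aarch64) arch=arm64 ;;    # assumption: Linux ARM artifacts use arm64
  esac
  printf 'filebeat-%s-%s-%s.tar.gz\n' "$version" "$os" "$arch"
}

# Usage: tarball name for the current machine.
filebeat_artifact 8.17.0 "$(uname -s)" "$(uname -m)"
```

The result can be appended to https://artifacts.elastic.co/downloads/beats/filebeat/ to build the download URL, with `.sha512` added for the checksum file.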
hello world
shell
cat << EOF > dev-filebeat.yml
filebeat.inputs:
  - type: stdin
    encoding: utf-8
output.console:
  pretty: true
logging.level: warning
EOF
Read sample data from stdin, parse it, and write the result to stdout: echo hello | ./filebeat -c dev-filebeat.yml -e
Container installation
shell
docker pull docker.elastic.co/beats/filebeat:8.17.0
export PATH_DATA=$HOME/v/data/filebeat
mkdir -p $PATH_DATA
export PATH_LOG=$HOME/v/log/filebeat
mkdir -p $PATH_LOG
export PATH_ETC=$HOME/v/etc/filebeat
mkdir -p $PATH_ETC
# Adjust as needed
export PATH_LOGS=$HOME/v/log
mkdir -p $PATH_LOGS
# Download the sample configuration
curl -L -o $PATH_ETC/filebeat.docker.yml https://raw.githubusercontent.com/elastic/beats/8.17/deploy/docker/filebeat.docker.yml
# Append your own settings to the sample configuration as needed
cat << EOF >> $PATH_ETC/filebeat.docker.yml
...
EOF
docker run \
--restart unless-stopped \
-d \
--name=filebeat \
-m 128MB \
-v $PATH_DATA:/usr/share/filebeat/data \
-v $PATH_LOG:/usr/share/filebeat/logs \
-v $PATH_ETC/filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro \
-v $PATH_LOGS:/path/to/v/log:ro \
docker.elastic.co/beats/filebeat:8.17.0 \
filebeat -e --strict.perms=false
Reference configuration file: filebeat.config.yml
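The repeated export and mkdir pairs in the container steps above could be collapsed into one function. This is only a sketch: the function name `prepare_filebeat_dirs` and its optional base-directory argument are hypothetical, and the default of `$HOME/v` simply mirrors the paths used above.

```shell
# Sketch: export the PATH_* variables used by the docker run command
# and create the matching host directories.
prepare_filebeat_dirs() {
  base=${1:-$HOME/v}              # assumption: same layout as the paths above
  PATH_DATA=$base/data/filebeat
  PATH_LOG=$base/log/filebeat
  PATH_ETC=$base/etc/filebeat
  PATH_LOGS=$base/log             # directory holding the logs to collect
  export PATH_DATA PATH_LOG PATH_ETC PATH_LOGS
  for dir in "$PATH_DATA" "$PATH_LOG" "$PATH_ETC" "$PATH_LOGS"; do
    mkdir -p "$dir" || return 1
  done
}

# Usage (not executed here): prepare_filebeat_dirs, then docker run as above.
```

Call the function in the current shell rather than in a subshell, so the exported variables remain visible to the later `docker run` command.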
Core concepts
- component
- input
- harvester
Parsing Spring logs
See the LOG_PATTERN property in the logback configuration /path/to/<spring_project_root>/src/main/resources/logback-spring.xml
xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/boot/logging/logback/defaults.xml" />
    <property
            name="LOG_PATH"
            value="/v/log/spring" />
    <property
            name="LOG_PATTERN"
            value="%d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} %-5level %class{20}:%method:%line - [%thread] traceId:%X{traceId} - %msg%n" />
    <appender
            name="CONSOLE"
            class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <charset>utf8</charset>
        </encoder>
    </appender>
    <!-- File appender -->
    <appender
            name="FILE"
            class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG_PATH}/app.log</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${LOG_PATH}/app.%d{yyyy-MM-dd}.log</fileNamePattern>
            <maxHistory>30</maxHistory>
        </rollingPolicy>
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <charset>utf8</charset>
        </encoder>
    </appender>
    <root level="DEBUG">
        <appender-ref ref="CONSOLE" />
        <appender-ref ref="FILE" />
    </root>
    <logger
            name="org.springframework"
            level="WARN" />
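</configuration>
Modify or create filebeat.yml. Key points:
- Handle multi-line Spring stack traces: each event starts at a line that begins with an RFC 3339 timestamp (parsers.multiline)
- After splitting into events, break each event into fields according to the logback pattern (processors.dissect)
- Extract appname from the full log path (processors.dissect)
- Validate dissect tokenizers online: https://dissect-tester.jorgelbg.me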
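The first and third key points can be tried out without Filebeat. The sketch below only emulates the intended behavior with awk and shell parameter expansion and is not how Filebeat implements it. The function names are hypothetical, and the `/v/log/<appname>/<file>` layout is taken from the configuration in this section.

```shell
# Emulate multiline with negate: true and match: after.
# A line that starts with a timestamp opens a new event; any other line
# is appended to the current event. Joined lines are separated by a
# literal "\n" marker so that each event prints on a single row.
group_events() {
  awk '
    /^\[?[0-9][0-9][0-9][0-9]-[01][0-9]-[0-3][0-9](T| )/ {
      if (have) print event
      event = $0; have = 1
      next
    }
    { event = have ? event "\\n" $0 : $0; have = 1 }
    END { if (have) print event }
  '
}

# Emulate the path tokenizer "/v/log/%{appname}/%{log_filename}".
appname_from_path() {
  case "$1" in
    /v/log/*/*) ;;               # the fixed prefix and one more slash are required
    *) return 1 ;;
  esac
  rest=${1#/v/log/}              # strip the fixed prefix
  printf '%s\n' "${rest%%/*}"    # keep the text before the next slash
}
```

For example, `group_events < /v/log/spring/app.log | wc -l` counts events rather than physical lines, and `appname_from_path /v/log/spring/app.log` prints the directory name that becomes `appname`.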
yaml
logging.level: warning
filebeat.inputs:
  - type: filestream
    id: applog
    paths:
      - /v/log/**/*.log
    parsers:
      - multiline:
          type: pattern
          # match logs: spring/logback, Go, ElasticSearch etc.
          pattern: '^\[?\d{4}-[01]\d-[0-3]\d(T| )'
          negate: true
          match: after
setup.template.settings:
  index.number_of_shards: 1
processors:
  - dissect:
      tokenizer: "%{timestamp} %{level} %{class}:%{method}:%{line|integer} - [%{thread}] traceId:%{traceId} - %{content}"
      field: "message"
      target_prefix: ""
      overwrite_keys: true
  - dissect:
      tokenizer: "/v/log/%{appname}/%{log_filename}"
      field: "log.file.path"
      target_prefix: ""
      overwrite_keys: true
  - timestamp:
      field: timestamp
      layouts:
        - "2006-01-02T15:04:05.999-07:00"
      test:
        - "2024-12-27T22:46:06.684+08:00"
Parsing both Go and Java/Spring logs
yaml
processors:
  - add_fields:
      target: ""
      fields:
        thread: ""
        class: ""
  - dissect:
      description: "extract the appname field from full log path"
      tokenizer: "/v/log/%{appname}/%{log_filename}"
      field: "log.file.path"
      target_prefix: ""
      overwrite_keys: true
      ignore_missing: true
      ignore_failure: true
  - dissect:
      description: "extract fields for Java/Spring"
      tokenizer: "%{timestamp} %{level} %{class}:%{method}:%{line|integer} - [%{thread}] traceId:%{traceId} - %{content}"
      field: "message"
      target_prefix: ""
      overwrite_keys: true
      ignore_missing: true
      ignore_failure: true
      when:
        regexp:
          message: " traceId:"
  - dissect:
      description: "extract fields for Go"
      when:
        equals:
          thread: ""
          class: ""
      tokenizer: "%{timestamp}\t%{level}\t%{sourcefile}:%{line}\t%{content}"
      field: "message"
      target_prefix: ""
      overwrite_keys: true
      ignore_missing: true
      ignore_failure: true
Fixing the parsed hostname in a Docker environment
yaml
# fix hostname
- add_kubernetes_metadata: ~
- drop_fields:
    fields: ["host.name"]
    ignore_missing: true
- copy_fields:
    fields:
      - from: kubernetes.node.name
        to: host.name
    fail_on_error: false
    ignore_missing: true
Source: https://github.com/elastic/beats/issues/13589#issuecomment-688741290
Adding the host IP address
yaml
# add host ip address
- add_host_metadata:
    ip_fields: ["source.ip", "host.ip"]
    host_type: "source"
    netinfo.enabled: true
Using the Elasticsearch output
yaml
output.elasticsearch:
  hosts: ["localhost:9201", "localhost:9202", "localhost:9203"]
See also
https://www.elastic.co/guide/en/beats/filebeat/8.17/filebeat-overview.html
Filebeat vs Fluent Bit
| Feature | Filebeat (elastic/beats) | Fluent Bit |
|---|---|---|
| Language | Go | C |
| At-least-once delivery (no data loss) | yes | |
| Hot reload | yes | yes |
| Tolerates input files that do not exist | yes | |
| Parse JSON in regular expression | yes | |
| Highlights | Easy to use | Uses system memory (heap) for high performance; MessagePack; supports many inputs, filters, and outputs |
| Sizes | Version 8.16.1 Docker image 354 MB | macOS binary 151 MB |
