Loki
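Loki is an open-source log aggregation system developed by Grafana Labs for collecting and querying log data; its first release was in 2018. Like Prometheus, Loki organizes log data with labels. The difference is that Prometheus has the concepts of metric name and series, whereas Loki has neither and instead introduces the concept of a stream.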
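To make the stream concept concrete: in Loki, each unique combination of label names and values identifies one stream. The sketch below only illustrates that grouping rule; streamKey and countStreams are hypothetical helper names, not part of any Loki API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// streamKey renders a label set in a canonical, sorted form such as
// {file="main.go", level="WARN"}. Entries with the same key share a stream.
func streamKey(labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)
	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%q", name, labels[name]))
	}
	return "{" + strings.Join(pairs, ", ") + "}"
}

// countStreams returns how many distinct label sets appear in entries.
func countStreams(entries []map[string]string) int {
	seen := make(map[string]struct{})
	for _, labels := range entries {
		seen[streamKey(labels)] = struct{}{}
	}
	return len(seen)
}

func main() {
	entries := []map[string]string{
		{"file": "main.go", "level": "WARN"},
		{"level": "WARN", "file": "main.go"}, // same labels, same stream
		{"file": "main.go", "level": "ERROR"},
	}
	fmt.Println(streamKey(entries[0])) // {file="main.go", level="WARN"}
	fmt.Println(countStreams(entries)) // 2
}
```

Every additional label value creates another stream, which is worth keeping in mind when deciding which fields to turn into labels.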
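Loki is usually used together with its companion components:
- loki: log storage
- promtail: log collection agent
- (optional) logcli: command-line query tool
- grafana: visual queries and dashboards
- loki-canary: tool that audits Loki's log-capturing performance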
Installation
Test environment
- macOS arm64
- loki 3.3.1
Download and start loki
mkdir -p loki-3.3.1
cd loki-3.3.1
wget https://github.com/grafana/loki/releases/download/v3.3.1/loki-darwin-arm64.zip
unzip loki-darwin-arm64.zip
wget https://raw.githubusercontent.com/grafana/loki/main/cmd/loki/loki-local-config.yaml
./loki-darwin-arm64 -config.file=loki-local-config.yaml
Collecting application logs
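Example: use loki + promtail + grafana to collect, store, display, and query the logs of a Go application
Test environment
- macOS arm64
- loki 3.3.1
Install loki by following the steps above.
Edit the main loki configuration file loki-local-config.yaml: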
cat << EOF > loki-local-config.yaml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  log_level: warn
  grpc_server_max_concurrent_streams: 1000
common:
  instance_addr: 127.0.0.1
  path_prefix: data
  storage:
    filesystem:
      chunks_directory: data/chunks
      rules_directory: data/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100
schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
pattern_ingester:
  enabled: true
  metric_aggregation:
    loki_address: localhost:3100
ruler:
  alertmanager_url: http://localhost:9093
frontend:
  encoding: protobuf
# Do not send usage reports to Grafana Labs.
analytics:
  reporting_enabled: false
limits_config:
  volume_enabled: true
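EOF
Main changes:
- Set the log level to warn
- Store data on the local disk, under data/ in the current directory
- Disable sending anonymous usage statistics to Grafana Labs
- Enable the volume_enabled query feature for grafana
Restart the loki process.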
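Before wiring up promtail, it can help to confirm that the restarted loki accepts writes. The following is a minimal sketch, not part of the original setup, that posts one line to the same push endpoint promtail is configured with below, using the JSON body format of Loki's push API. The job="smoke-test" label and the buildPushBody helper are made up for illustration, and no tenant header is sent because auth_enabled is false.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strconv"
	"time"
)

// stream is one entry of the push payload: a label set plus
// [timestamp in nanoseconds, log line] pairs.
type stream struct {
	Stream map[string]string `json:"stream"`
	Values [][2]string       `json:"values"`
}

type pushRequest struct {
	Streams []stream `json:"streams"`
}

// buildPushBody encodes a single log line as a JSON push payload.
func buildPushBody(labels map[string]string, tsNano int64, line string) ([]byte, error) {
	req := pushRequest{Streams: []stream{{
		Stream: labels,
		Values: [][2]string{{strconv.FormatInt(tsNano, 10), line}},
	}}}
	return json.Marshal(req)
}

func main() {
	// Hypothetical label, chosen only for this smoke test.
	labels := map[string]string{"job": "smoke-test"}
	body, err := buildPushBody(labels, time.Now().UnixNano(), "hello loki")
	if err != nil {
		fmt.Fprintln(os.Stderr, "encode failed:", err)
		return
	}
	resp, err := http.Post("http://localhost:3100/loki/api/v1/push", "application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Fprintln(os.Stderr, "push failed:", err)
		return
	}
	defer resp.Body.Close()
	// A 2xx status means loki accepted the entry.
	fmt.Println(resp.Status)
}
```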
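Download the promtail log collection agent
mkdir -p promtail-3.3.1
cd promtail-3.3.1
wget https://github.com/grafana/loki/releases/download/v3.3.1/promtail-darwin-arm64.zip
unzip promtail-darwin-arm64.zip
Create the main promtail configuration file config.yaml (the heredoc delimiter is quoted so that the shell leaves the backslashes in the regex untouched)
cat << 'EOF' > config.yaml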
server:
  http_listen_address: 127.0.0.1
  http_listen_port: 9080
positions:
  filename: data/positions.yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: hello
    pipeline_stages:
      - regex:
          expression: "^(?P<timestamp>.{19}) (?P<file>[^:]+):(?P<lineno>\\d+): (?P<level>(DEBUG|INFO|WARN|ERROR))\\s+(?P<content>.*)$"
      - labels:
          file:
          level:
          lineno:
    static_configs:
      - targets:
          - localhost
        labels:
          __path__: /tmp/hello/*.log
EOF
Start promtail
./promtail-darwin-arm64 --config.file=config.yaml --server.enable-runtime-reload
Startup flags:
- --config.file: path to the configuration file
- --server.enable-runtime-reload: allow reloading the configuration through the HTTP endpoint
curl -XPOST localhost:9080/reload
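The hello application used in the next step is not listed in this note. A minimal sketch that would produce lines in the same format, assuming Go's standard log package with date, time, and short file name, a log file at /tmp/hello/hello.log (the path that appears in the logcli output further down), and a one-second interval chosen arbitrarily:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"math/rand"
	"os"
	"path/filepath"
	"time"
)

// levels mirrors the alternatives accepted by the promtail regex.
var levels = []string{"DEBUG", "INFO", "WARN", "ERROR"}

const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomWord returns n random ASCII letters to use as the log content.
func randomWord(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

// newLogger writes lines shaped like
// "2024/12/09 23:47:44 main.go:32: WARN KBywPUvEVr".
func newLogger(w io.Writer) *log.Logger {
	return log.New(w, "", log.LstdFlags|log.Lshortfile)
}

func main() {
	dir := "/tmp/hello" // matches __path__ in the promtail config
	if err := os.MkdirAll(dir, 0o755); err != nil {
		fmt.Fprintln(os.Stderr, "mkdir failed:", err)
		os.Exit(1)
	}
	f, err := os.OpenFile(filepath.Join(dir, "hello.log"), os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		fmt.Fprintln(os.Stderr, "open failed:", err)
		os.Exit(1)
	}
	defer f.Close()

	logger := newLogger(f)
	// Runs until interrupted (Ctrl-C), emitting one line per second.
	for {
		logger.Printf("%s %s", levels[rand.Intn(len(levels))], randomWord(10))
		time.Sleep(time.Second)
	}
}
```

Saved as hello/main.go, it can be started with go run hello/main.go.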
Write and start the log-producing hello application, hello/main.go. Its log output looks like:
2024/12/09 23:47:44 main.go:32: WARN KBywPUvEVr
2024/12/09 23:47:46 main.go:32: WARN bcsGjqaVyv
2024/12/09 23:47:47 main.go:36: ERROR ubzcSrVtaw
2024/12/09 23:47:48 main.go:32: WARN CiOQzzKQWy
2024/12/09 23:47:50 main.go:36: ERROR fiRzkHYyPq
Debug promtail by printing the parsed results to stdout (the binary, log file, and config paths in the command are placeholders):
tail -n 10 ~/path/to/my/main.log | ./promtail --config.file=./promtail-local-config.yaml --stdin --dry-run --inspect --client.url http://127.0.0.1:3100/loki/api/v1/push
see also: https://grafana.com/docs/loki/latest/send-data/promtail/troubleshooting/
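Since promtail's regex stage uses Go's RE2 syntax, the expression from config.yaml can also be checked offline against a sample line before running promtail. A small sketch, where parseLine is a hypothetical helper and the pattern is the one from the config (written in a raw string, so the backslashes are single):

```go
package main

import (
	"fmt"
	"regexp"
)

// lineRe is the same expression as in config.yaml.
var lineRe = regexp.MustCompile(`^(?P<timestamp>.{19}) (?P<file>[^:]+):(?P<lineno>\d+): (?P<level>(DEBUG|INFO|WARN|ERROR))\s+(?P<content>.*)$`)

// parseLine returns the named capture groups of a log line,
// or false when the line does not match the expression.
func parseLine(line string) (map[string]string, bool) {
	m := lineRe.FindStringSubmatch(line)
	if m == nil {
		return nil, false
	}
	fields := make(map[string]string)
	for i, name := range lineRe.SubexpNames() {
		if name != "" { // skip the whole match and unnamed groups
			fields[name] = m[i]
		}
	}
	return fields, true
}

func main() {
	fields, ok := parseLine("2024/12/09 23:47:47 main.go:36: ERROR ubzcSrVtaw")
	fmt.Println(ok, fields["file"], fields["lineno"], fields["level"], fields["content"])
	// prints: true main.go 36 ERROR ubzcSrVtaw
}
```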
Download and use the logcli command-line log query tool
wget https://github.com/grafana/loki/releases/download/v3.3.1/logcli-darwin-arm64.zip
unzip logcli-darwin-arm64.zip
Query logs whose level is ERROR and whose content contains dFF
./logcli-darwin-arm64 query '{level="ERROR"} |= `dFF`'
The output looks like:
http://localhost:3100/loki/api/v1/query_range?direction=BACKWARD&end=1733762019581322000&limit=30&query=%7Blevel%3D%22ERROR%22%7D+%7C%3D+%60dFF%60&start=1733758419581322000
Common labels: {detected_level="ERROR", file="main.go", filename="/tmp/hello/hello.log", level="ERROR", lineno="36", service_name="unknown_service"}
2024-12-10T00:32:02+08:00 {} 2024/12/10 00:32:02 main.go:36: ERROR dFFGkFDtax
http://localhost:3100/loki/api/v1/query_range?direction=BACKWARD&end=1733761922173275001&limit=30&query=%7Blevel%3D%22ERROR%22%7D+%7C%3D+%60dFF%60&start=1733758419581322000
Common labels: {detected_level="ERROR", file="main.go", filename="/tmp/hello/hello.log", level="ERROR", lineno="36", service_name="unknown_service"}
Configure Grafana for visual queries
- Add a data source: loki
- In Home - Explore, select loki
- Or, go to Home - Explore - Logs
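Both logcli and Grafana read from Loki's HTTP API. The request URL that logcli printed above can be reproduced by hand, which is handy for scripting. A rough sketch, assuming the same query_range endpoint and parameters that appear in that output; buildQueryURL is a hypothetical helper and the response is printed as raw JSON without decoding:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
	"time"
)

// buildQueryURL assembles a query_range request URL. Timestamps are Unix
// nanoseconds; url.Values.Encode sorts the parameters by name.
func buildQueryURL(base, logql string, startNs, endNs int64, limit int) string {
	params := url.Values{}
	params.Set("query", logql)
	params.Set("start", strconv.FormatInt(startNs, 10))
	params.Set("end", strconv.FormatInt(endNs, 10))
	params.Set("limit", strconv.Itoa(limit))
	params.Set("direction", "BACKWARD")
	return base + "/loki/api/v1/query_range?" + params.Encode()
}

func main() {
	end := time.Now()
	start := end.Add(-time.Hour) // same one-hour window as in the logcli output
	u := buildQueryURL("http://localhost:3100", "{level=\"ERROR\"} |= `dFF`", start.UnixNano(), end.UnixNano(), 30)
	fmt.Println(u)

	resp, err := http.Get(u)
	if err != nil {
		fmt.Fprintln(os.Stderr, "query failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
	io.Copy(os.Stdout, resp.Body) // raw JSON response
}
```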
FAQs
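promtail reports error 429
level=warn ts=2024-12-09T11:07:38.00215Z caller=client.go:419 component=client host=localhost:3100 msg="error sending batch, will retry" status=429 tenant= error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded for user fake (limit: 4194304 bytes/sec) while attempting to ingest '5164' lines totaling '1044135' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
The 4194304 bytes/sec in the message appears to correspond to the default limits_config.ingestion_rate_mb of 4. Reducing the log volume, or raising ingestion_rate_mb and ingestion_burst_size_mb under limits_config, should avoid the error.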
Setting the data retention period
https://grafana.com/docs/loki/latest/operations/storage/retention/
Scaling the cluster
https://grafana.com/docs/loki/latest/operations/scalability/
Cannot use Cloudflare R2 storage
Test env
- loki-3.3.1
- date 2024-12-25
level=error ts=2024-12-25T10:15:47.195718Z caller=table_manager.go:143 index-store=tsdb-2024-12-25 msg="failed to upload table" table=index_20082 err="XAmzContentSHA256Mismatch: The provided 'x-amz-content-sha256' header does not match what was computed.\n\tstatus code: 400, request id: , host id: "
see also
- https://gocloud.dev/howto/blob/#s3-compatible
- https://github.com/grafana/loki/issues/9224
- https://github.com/grafana/loki/issues/11887#issuecomment-2447916912
- https://discourse.gohugo.io/t/deploy-to-s3-compatible-bucket-cloudflare-r2-xamzcontentsha256mismatch/50588/2
See also
- 2024-03-30 Notes on using loki, a log system with low storage cost, horizontal scaling, and multi-tenancy support
