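I'm using Loki and promtail in Docker Swarm to collect logs from the containers on 3 hosts. Promtail is deployed in global mode. After deploying the stack file, the logs of all running services show up in Grafana, but after some time (a few days) part of the container logs disappear. There were some network problems, and although all services restarted, not all of the container logs reappeared.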
docker-stack.yml
loki:
  image: grafana/loki:latest
  logging:
    driver: json-file
    options:
      tag: "docker/loki"
  volumes:
    - ./loki/loki-config.yaml:/etc/loki/loki-config.yaml
    - loki:/data/loki
  command: -config.file=/etc/loki/loki-config.yaml
  networks:
    - monitor-net
    - traefik
  deploy:
    placement:
      constraints:
        - node.role==manager
    labels:
      - "traefik.enable=true"
      - traefik.docker.network=default_traefik
      - traefik.http.routers.loki-http.rule=Host(`swarm.loki`)
      - traefik.http.routers.loki-http.entrypoints=http
      - traefik.http.routers.loki-http.middlewares=https-redirect
      - traefik.http.routers.loki-https.rule=Host(`swarm.loki`)
      - traefik.http.routers.loki-https.entrypoints=https
      - traefik.http.routers.loki-https.tls=true
      - traefik.http.routers.loki-https.tls.certresolver=le
      - traefik.http.services.loki.loadbalancer.server.port=3100
    restart_policy:
      condition: on-failure
promtail:
  image: grafana/promtail:latest
  volumes:
    - /var/log:/var/log
    - /var/lib/docker/containers:/var/lib/docker/containers
    - ./promtail:/etc/promtail-config/
  command: -config.file=/etc/promtail-config/promtail-config.yaml
  networks:
    - traefik
    - monitor-net
  logging:
    driver: json-file
    options:
      tag: "docker/promtail"
  deploy:
    mode: global
promtail-config.yaml
server:
  http_listen_port: 3100
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
client:
  url: http://loki:3100/api/prom/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - 192.168.56.103 # on each host its ip is written
        labels:
          job: varlogs
          __path__: /var/log/*log
  - job_name: containers
    static_configs:
      - targets:
          - 192.168.56.103
      - labels:
          job: containerlogs
          hostname: vm2
          __path__: /var/lib/docker/containers/*/*log
    pipeline_stages:
      - json:
          expressions:
            stream: stream
            attrs: attrs
            tag: attrs.tag
            hostname: hostname
      - labels:
          tag:
          hostname:
          stream:
loki-config.yaml
auth_enabled: false
server:
  http_listen_port: 3100
common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2022-02-05
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
limits_config:
  ingestion_rate_mb: 15
  ingestion_burst_size_mb: 20
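
Two details in the files above stand out and may or may not be related to the missing logs. The Loki config writes under /loki (path_prefix, chunks_directory), while the stack file mounts the named volume at /data/loki. The Promtail positions file is at /tmp/positions.yaml, which is not on any mounted volume. The fragment below is only a sketch of how those paths could be lined up with persistent volumes, not a confirmed fix. The top-level `services:` and `volumes:` keys, the `promtail-positions` volume, and the /var/lib/promtail path are assumptions introduced for illustration.

```yaml
# Sketch only: fragments to merge into the files above, not a confirmed fix.

# docker-stack.yml (fragment; top-level keys assumed)
services:
  loki:
    volumes:
      - ./loki/loki-config.yaml:/etc/loki/loki-config.yaml
      # Mount the named volume where loki-config.yaml actually writes
      # (path_prefix: /loki) rather than at /data/loki.
      - loki:/loki
  promtail:
    volumes:
      - /var/log:/var/log
      - /var/lib/docker/containers:/var/lib/docker/containers
      - ./promtail:/etc/promtail-config/
      # Hypothetical named volume so the positions file outlives the task.
      - promtail-positions:/var/lib/promtail

volumes:
  loki:
  promtail-positions:
---
# promtail-config.yaml (fragment)
positions:
  # Hypothetical path inside the volume mounted above.
  filename: /var/lib/promtail/positions.yaml
```

With mounts like these, the Loki chunks and index and the Promtail read offsets would both live on volumes that persist when a task is recreated on the same node. Whether that accounts for the behaviour described here is not established.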
What is the problem, and how can it be solved?
Thanks in advance.