pulse-docs/runbook/DOCKER-SWARM-RUNBOOK.md
Pulse Agent 81c6282ab0 docs(stack-proxy): Docker Swarm deploy runbook with Caddy, model extracted from the working git stack
- DOCKER-SWARM-RUNBOOK.md: pattern for 8 stacks, 20 containers
- Caddy model: labels + caddy.reverse_proxy + public network
- Restart: recorded in memory (port 80 did not work with a bind mount in Docker Swarm)
2026-05-20 15:51:24 -03:00


Docker Swarm Runbook — Pulse Agent

Updated: 2026-05-20 | Owner: Pulse Agent

📋 Stack inventory (8 active)

| Stack | Services | Status |
|---|---|---|
| bot | beebot | 🟢 |
| code | file (8dcode) | 🟢 |
| database | mongos-master, dbadmin | 🟡 degraded |
| design | penpot (7 containers) | 🟢 |
| dock | portainer, agent | 🟡 |
| git | gitea | 🟢 |
| pro | leantime, leantime-db | 🟡 |
| proxy | caddy (80/443) | 🟢 |
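
The status column above can be cross-checked against the live replica counts. The following is a minimal sketch, assuming bash and the `<name> <running>/<desired>` line shape printed by `docker service ls --format '{{.Name}} {{.Replicas}}'`. The function name `degraded_services` is hypothetical and not part of the original runbook.

```shell
#!/usr/bin/env bash
# Print services whose running replica count is below the desired count.
# Reads "<name> <running>/<desired>" lines on stdin (hypothetical helper).
degraded_services() {
  while read -r name replicas _; do
    # Ignore blank lines and anything whose second field is not "N/M".
    case "$replicas" in
      [0-9]*/[0-9]*) ;;
      *) continue ;;
    esac
    running=${replicas%%/*}   # text before the slash
    desired=${replicas##*/}   # text after the slash
    if [ "$running" -lt "$desired" ]; then
      printf '%s %s\n' "$name" "$replicas"
    fi
  done
}
```

Typical use would be `docker service ls --format '{{.Name}} {{.Replicas}}' | degraded_services`. An empty result means every service reports as many running tasks as it wants.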

🚨 Critical services and their risks

| Service | Risk | Recovery |
|---|---|---|
| bot_office | HIGH: OOM kill (exit 137), now UP but fragile | `docker service scale bot_office=2` |
| database_mongos-master | HIGH: 4 containers failed with exit(139) SIGSEGV | `docker service update --force database_mongos-master` |
| pro_leantime | HIGH: 4 containers unhealthy, exit(137) | `docker service update --force pro_leantime` |
| dock_portainer | MEDIUM: multiple Failed tasks | `docker service update --force dock_portainer` |
| proxy_caddy | MEDIUM: invalid mount path on old replicas | fix the mount in the compose file |
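
The exit codes in the table follow the usual shell convention that a code above 128 means "128 plus the signal number", so 137 is signal 9 (SIGKILL) and 139 is signal 11 (SIGSEGV). A small sketch of that arithmetic is below. The helper name `explain_exit_code` and its wording are illustrative assumptions, not part of the original runbook.

```shell
#!/usr/bin/env bash
# Translate a container exit code into a short hint (hypothetical helper).
explain_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))   # codes above 128 encode the terminating signal
    case "$sig" in
      9)  echo "SIGKILL (signal 9): killed, often by the OOM killer" ;;
      11) echo "SIGSEGV (signal 11): segmentation fault" ;;
      15) echo "SIGTERM (signal 15): asked to terminate" ;;
      *)  echo "terminated by signal $sig" ;;
    esac
  elif [ "$code" -eq 0 ]; then
    echo "clean exit"
  else
    echo "application error (exit $code)"
  fi
}
```

Exit 137 alone does not prove an OOM kill, since any SIGKILL produces it. Running `docker inspect --format '{{.State.OOMKilled}}' <container>` on the exited container is one way to confirm.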

🔧 Quick recovery commands

# Detailed status
docker stack ps --no-trunc --no-resolve <stack>

# Force recreation of the service's tasks
docker service update --force <stack>_<service>

# Scale up, then back down (forces a new replica)
docker service scale <stack>_<service>=2
docker service scale <stack>_<service>=1

# Clean up orphans (-r: skip docker rm when nothing matches)
docker ps -a -f 'status=exited' --format '{{.Names}}' | xargs -r docker rm -f
docker ps -a -f 'status=dead' --format '{{.Names}}' | xargs -r docker rm -f
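
The scale-up and scale-down pair above hard-codes 2 and 1, which only fits a service that normally runs one replica. The sketch below is one way to generalize it, assuming bash and a replicated (not global) service. The function name `bounce_service` is hypothetical.

```shell
#!/usr/bin/env bash
# Bounce a replicated service: scale up by one replica, then return to the
# original count, so Swarm schedules a fresh task (hypothetical helper).
bounce_service() {
  service=$1
  # Read the desired replica count from the service spec.
  current=$(docker service inspect \
    --format '{{.Spec.Mode.Replicated.Replicas}}' "$service") || return 1
  # Refuse to continue if the value is empty or not a plain number
  # (for example, when the service runs in global mode).
  case "$current" in
    ''|*[!0-9]*)
      echo "cannot read replica count for $service" >&2
      return 1 ;;
  esac
  docker service scale "$service=$((current + 1))" || return 1
  docker service scale "$service=$current"
}
```

For example, `bounce_service bot_office` on a one-replica service issues the same two `scale` calls as the manual commands. Swarm decides which task to remove on the way back down, so this does not guarantee that the oldest task is the one replaced; `docker service update --force` remains the more direct way to recreate every task.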

📊 Health check coverage

  • 3/19 containers have a health check defined
  • TODO: add health checks for bot_office, gitea, pro_leantime-db, and every service in the design stack
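
As a starting point for the TODO above, the sketch below lists services whose spec carries no health check and attaches one with `docker service update`. It assumes bash. The function names are hypothetical, and the interval, timeout, and retry values are placeholder assumptions rather than settings taken from this environment.

```shell
#!/usr/bin/env bash
# List services whose service spec defines no health check
# (hypothetical helper).
services_without_healthcheck() {
  docker service ls --format '{{.Name}}' | while read -r name; do
    # Prints "null" when the spec has no Healthcheck section.
    hc=$(docker service inspect \
      --format '{{json .Spec.TaskTemplate.ContainerSpec.Healthcheck}}' \
      "$name" </dev/null)
    if [ -z "$hc" ] || [ "$hc" = "null" ]; then
      echo "$name"
    fi
  done
}

# Attach a health check to an existing service (hypothetical helper).
# The probe must be a command that exists inside the image; the timing
# values are assumed defaults and should be tuned per service.
add_healthcheck() {
  service=$1
  probe=$2
  docker service update \
    --health-cmd "$probe" \
    --health-interval 30s \
    --health-timeout 5s \
    --health-retries 3 \
    "$service"
}
```

A health check declared with `HEALTHCHECK` in an image's Dockerfile does not appear in the service spec, so the list may include services that are already covered that way. Declaring the check in the stack's compose file keeps it across redeploys, whereas a change made with `docker service update` is overwritten by the next `docker stack deploy`.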