docs(pulse-docs): recovery commands + DOCKER-CHECKLIST + SESSION-CHECKLIST - self-administration docs

Pulse Agent
2026-05-20 11:00:57 -03:00
parent 46c4b3da95
commit 69da0a315f
3 changed files with 50 additions and 23 deletions
+6 -2
@@ -1,5 +1,9 @@
 # Docker Checklist - Pulse Agent
-- [ ] \`docker ps\`: verify ~19 containers are running
+## Before work sessions
+- [ ] \`docker ps\`: verify 19 containers are running after every change
 - [ ] \`docker ps -a -f status=exited --format '{{.Names}}'\`: clean up orphans
+- [ ] \`docker stack ps --no-trunc <stack>\`: tasks per stack
+## When a service fails
+- [ ] Identify the stack: \`docker stack ps <stack>\`
+- [ ] Apply recovery: \`docker service update --force <stack>_<service>\`
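The container-count item in this checklist could be automated along the lines of the sketch below. It is only an illustration: the function name `check_running_count` is hypothetical, the expected count of 19 comes from the checklist itself, and the helper reads container names (one per line) from standard input so it can be exercised without a Docker daemon.

```shell
# Hypothetical helper: compare the number of running containers to an expected count.
# Reads container names, one per line, from stdin.
check_running_count() {
  local expected="$1"
  local actual
  # grep -c counts non-empty lines; "|| true" keeps a zero count from being treated as a failure
  actual=$(grep -c . || true)
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual containers running"
  else
    echo "WARN: $actual containers running, expected $expected"
    return 1
  fi
}
```

A possible invocation is `docker ps --format '{{.Names}}' | check_running_count 19`; a non-zero exit status would then be the cue to move on to the service-failure items.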
+6 -6
@@ -1,13 +1,13 @@
 # Session Checklist - Pulse Agent Auto-Check
 ## Start
 - [ ] Read MEMORY.md
 - [ ] Read SESSION-STATE.md
 - [ ] Read LEARNINGS.md | ERRORS.md | PATTERN_COUNTER.md
-- [ ] docker ps: services
-- [ ] df -h: disk
-- [ ] uptime: load
+- [ ] \`docker ps\`: services
+- [ ] \`df -h\`: disk
+- [ ] \`uptime\`: load
 ## End
-- [ ] Update memory/<date>.md
-- [ ] Read .learnings/LEARNINGS.md
+- [ ] Update \`memory/<date>.md\`
+- [ ] Commit everything
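The disk item in the session checklist leaves the pass/fail judgment to the reader. One way to make it mechanical is sketched below, under assumptions that are not in the checklist: the helper name `disk_over_threshold` is hypothetical, the 90% default is an arbitrary choice, and the input is expected in the POSIX `df -P` layout (capacity in column 5, mount point in column 6) rather than `df -h`.

```shell
# Hypothetical helper: print mount points whose usage exceeds a threshold (percent).
# Expects `df -P` output on stdin; the 90% default is an assumption.
disk_over_threshold() {
  local limit="${1:-90}"
  # Skip the header line, strip the "%" sign, and compare numerically.
  awk -v limit="$limit" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 > limit) print $6 " " use "%"
  }'
}
```

It could be run as `df -P | disk_over_threshold 90`; empty output would mean no filesystem is above the limit. Mount points containing spaces are not handled by this sketch.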
+38 -15
@@ -1,20 +1,43 @@
 # Recovery Commands - Docker Swarm
-## Emergency
-\`\`\`bash
+_An ABC of commands for Pulse to use when something breaks._
+## Emergency: all services down
+```bash
 docker node ls # check node health
 docker stack rm <stack> && sleep 3 # remove the problematic stack
+docker swarm init # only if necessary
 docker stack deploy -c <stack>.yml <stack> # re-deploy
-\`\`\`
-## Specific service
-\`\`\`bash
-docker service ps <stack>_<service> # list tasks
+```
+## Specific service: force restart
+```bash
+docker service ps <stack>_<service> # list current tasks
 docker service update --force <stack>_<service> # force a new task
-\`\`\`
-## Manual health check
-\`\`\`bash
+```
+## Clean up orphaned containers
+```bash
+docker ps -a -f "status=exited" --format '{{.Names}}' | xargs -r docker rm -f
+docker ps -a -f "status=dead" --format '{{.Names}}' | xargs -r docker rm -f
+```
+## Swarm reset (extreme)
+```bash
+docker swarm leave --force && docker swarm init --advertise-addr <ip>
+```
+## Manual health check of a container
+```bash
+# General status
+docker inspect --format '{{json .State.Health}}' <container_id> | python3 -m json.tool
+# With a health check defined
 docker inspect --format '{{.State.Health.Status}}' <container_id>
-# → healthy | unhealthy | starting | <no-health>
-\`\`\`
+# → "healthy" | "unhealthy" | "starting"
+```
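The health check section lists the possible status values but not what to do with each. The sketch below shows one way a status string could be mapped to a next step. The function name `health_action`, the action labels, and the `none` label for containers without a health check are all assumptions made for illustration, not part of this commit.

```shell
# Hypothetical helper: turn a health status string into a suggested next step.
# "none" is an assumed label for containers that define no health check.
health_action() {
  case "$1" in
    healthy)   echo "ok" ;;
    starting)  echo "wait" ;;
    unhealthy) echo "restart" ;;
    none|"")   echo "no-healthcheck" ;;
    *)         echo "unknown"; return 1 ;;
  esac
}
```

The status would come from the `docker inspect` call in the section above. For containers with no health check, a template such as `{{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}` is a commonly used way to get a fallback value instead of a template error, though that behavior should be confirmed against the Docker version in use. A `restart` result would correspond to the `docker service update --force <stack>_<service>` step.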