# Deployment and Maintenance

This document explains how to deploy and maintain the SwapBits ecosystem in production.

## Deployment Architecture

## 1. Environments

### Environment Configuration
```typescript
// The three main environments
const Environments = {
  development: {
    url: 'http://localhost:3000',
    mongodb: 'mongodb://localhost:27017/swapbits-dev',
    redis: 'redis://localhost:6379',
    logLevel: 'debug',
    cors: '*'
  },
  staging: {
    url: 'https://staging-api.swapbits.com',
    mongodb: process.env.MONGODB_URI_STAGING,
    redis: process.env.REDIS_URI_STAGING,
    logLevel: 'info',
    cors: ['https://staging.swapbits.com']
  },
  production: {
    url: 'https://api.swapbits.com',
    mongodb: process.env.MONGODB_URI_PRODUCTION,
    redis: process.env.REDIS_URI_PRODUCTION,
    logLevel: 'warn',
    cors: ['https://swapbits.com', 'https://app.swapbits.com']
  }
};
```
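Since staging and production read their connection strings from environment variables, it helps to resolve the active block once at startup and fail fast when a required value is missing. A minimal sketch, assuming the `Environments` object above; `getConfig` is an illustrative helper name, not existing SwapBits code:

```typescript
// Minimal sketch: pick the config for the current NODE_ENV and fail fast
// if a required value (e.g. the MongoDB URI) was never provided.
// `getConfig` is an illustrative name, not an existing SwapBits helper.
type EnvName = keyof typeof Environments;

function getConfig() {
  const env = (process.env.NODE_ENV ?? 'development') as EnvName;
  const config = Environments[env];
  if (!config) {
    throw new Error(`Unknown NODE_ENV: ${env}`);
  }
  if (!config.mongodb || !config.redis) {
    // In staging/production these come from env vars; abort early if unset.
    throw new Error(`Missing MongoDB/Redis URI for environment "${env}"`);
  }
  return config;
}

const config = getConfig();
```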
## 2. Docker Setup

### Dockerfile for Services
```dockerfile
# docker/services/auth/Dockerfile
FROM node:18-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY tsconfig.json ./

# Install all dependencies (devDependencies are needed for the TypeScript build)
RUN npm ci

# Copy source code
COPY src ./src
COPY docker/packages ./packages

# Build TypeScript, then drop devDependencies from node_modules
RUN npm run build && npm prune --omit=dev

# --- Production Image ---
FROM node:18-alpine

WORKDIR /app

# Copy only what is needed
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001
USER nodejs

# Health check (runs the compiled script; a sketch of it follows below)
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD node dist/healthcheck.js || exit 1

# Expose port
EXPOSE 3000

# Start
CMD ["node", "dist/index.js"]
```
### Docker Compose (Development)
```yaml
# docker-compose.dev.yml
version: '3.8'

services:
  # MongoDB
  mongodb:
    image: mongo:7.0
    container_name: swapbits-mongodb
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    volumes:
      - mongodb_data:/data/db
    networks:
      - swapbits-network

  # Redis
  redis:
    image: redis:7-alpine
    container_name: swapbits-redis
    ports:
      - "6379:6379"
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis_data:/data
    networks:
      - swapbits-network

  # Auth Service
  auth-service:
    build:
      context: .
      dockerfile: docker/services/auth/Dockerfile
    container_name: auth-service
    ports:
      - "3001:3000"
    environment:
      - NODE_ENV=development
      - MONGODB_URI=mongodb://admin:${MONGO_PASSWORD}@mongodb:27017
      - REDIS_URI=redis://:${REDIS_PASSWORD}@redis:6379
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      - mongodb
      - redis
    networks:
      - swapbits-network
    volumes:
      - ./docker/services/auth/src:/app/src  # Hot reload

  # Wallet Service
  wallet-service:
    build:
      context: .
      dockerfile: docker/services/wallet-service/Dockerfile
    container_name: wallet-service
    ports:
      - "3002:3000"
    environment:
      - NODE_ENV=development
      - MONGODB_URI=mongodb://admin:${MONGO_PASSWORD}@mongodb:27017
      - REDIS_URI=redis://:${REDIS_PASSWORD}@redis:6379
      - AUTH_SERVICE_URL=http://auth-service:3000
    depends_on:
      - mongodb
      - redis
      - auth-service
    networks:
      - swapbits-network

networks:
  swapbits-network:
    driver: bridge

volumes:
  mongodb_data:
  redis_data:
```
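Services talk to each other over the `swapbits-network` bridge using container names as hostnames; for example, `wallet-service` reaches the auth service through `AUTH_SERVICE_URL`. A hypothetical sketch of such a call (the `/auth/verify` endpoint and payload shape are illustrative, not a documented SwapBits API):

```typescript
// Hypothetical sketch: wallet-service validating a JWT against auth-service
// over the compose network. The /auth/verify endpoint and request body are
// assumptions for illustration. Uses Node 18's global fetch.
const AUTH_SERVICE_URL = process.env.AUTH_SERVICE_URL ?? 'http://auth-service:3000';

async function verifyToken(token: string): Promise<boolean> {
  const res = await fetch(`${AUTH_SERVICE_URL}/auth/verify`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ token })
  });
  return res.ok; // 200 = valid token, anything else = reject
}
```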
## 3. CI/CD Pipeline

### GitHub Actions Workflow
```yaml
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]
  workflow_dispatch:

env:
  AWS_REGION: us-east-1
  ECR_REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com

jobs:
  # Job 1: Tests
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run unit tests
        run: npm test

      - name: Run integration tests
        run: npm run test:integration

  # Job 2: Build Docker Images
  build:
    needs: test
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service: [auth, user-service, wallet-service, bank-service, exchange-service]
    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and push Docker image
        env:
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build \
            -t $ECR_REGISTRY/swapbits-${{ matrix.service }}:$IMAGE_TAG \
            -t $ECR_REGISTRY/swapbits-${{ matrix.service }}:latest \
            -f docker/services/${{ matrix.service }}/Dockerfile .
          docker push $ECR_REGISTRY/swapbits-${{ matrix.service }}:$IMAGE_TAG
          docker push $ECR_REGISTRY/swapbits-${{ matrix.service }}:latest

  # Job 3: Deploy to Staging
  # (Assumes the runner has an SSH key for the deploy user configured,
  # e.g. loaded into ssh-agent from a repository secret.)
  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Deploy to Staging
        run: |
          # SSH to the staging server
          ssh deploy@staging.swapbits.com "cd /app && docker-compose pull && docker-compose up -d"

      - name: Run migrations
        run: |
          ssh deploy@staging.swapbits.com "cd /app/migrations && node runMigrations.js"

      - name: Health check
        run: |
          sleep 30
          curl -f https://staging-api.swapbits.com/health || exit 1

  # Job 4: Deploy to Production (manual approval via the protected "production" environment)
  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Deploy to Production
        run: |
          # Rolling update
          ssh deploy@prod.swapbits.com "cd /app && ./deploy.sh"

      - name: Run migrations
        run: |
          ssh deploy@prod.swapbits.com "cd /app/migrations && node runMigrations.js"

      - name: Health check
        run: |
          sleep 60
          curl -f https://api.swapbits.com/health || exit 1

      - name: Notify team
        if: success()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "[SUCCESS] Production deployment successful!"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
```
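Both deploy jobs run `node runMigrations.js` on the target host, but the runner itself is not shown here. A hypothetical sketch of what it might contain, assuming each migration module exports an `up(db)` function and applied migrations are tracked in a `_migrations` collection (all names are illustrative):

```typescript
// runMigrations — hypothetical sketch of the migration runner invoked by the
// deploy jobs. The migrations/ layout, the `up(db)` contract, and the
// `_migrations` bookkeeping collection are assumptions for illustration.
import { MongoClient } from 'mongodb';
import fs from 'fs';
import path from 'path';

async function run() {
  const client = await MongoClient.connect(process.env.MONGODB_URI!);
  const db = client.db();

  // Which migrations have already been applied?
  const applied = new Set(
    (await db.collection('_migrations').find().toArray()).map((m) => m.name)
  );

  // Apply pending migrations in filename order (e.g. 001-add-index.js)
  const files = fs.readdirSync(__dirname).filter((f) => /^\d+.*\.js$/.test(f)).sort();
  for (const file of files) {
    if (applied.has(file)) continue;
    const { up } = require(path.join(__dirname, file));
    console.log(`Applying ${file}...`);
    await up(db);
    await db.collection('_migrations').insertOne({ name: file, appliedAt: new Date() });
  }

  await client.close();
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});
```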
## 4. Zero-Downtime Deployment

### Rolling Update Strategy
```bash
#!/bin/bash
# deploy.sh - Zero-downtime deployment script
set -e

echo "Starting rolling deployment..."

# 1. Pull the new images
docker-compose pull

# 2. Scale up to 2 instances (if only 1 is running)
docker-compose up -d --scale auth-service=2

# 3. Wait for the new instance to become healthy
echo "Waiting for new instance to be healthy..."
sleep 30

# 4. Stop the old instance
# `docker ps` lists newest containers first, so the old instance is the last one.
OLD_CONTAINER=$(docker ps -q -f name=auth-service | tail -n1)
docker stop "$OLD_CONTAINER"

# 5. Remove the old instance
docker rm "$OLD_CONTAINER"

# 6. Scale back to normal
docker-compose up -d

echo "[SUCCESS] Deployment completed successfully!"
```
### Blue-Green Deployment
```bash
#!/bin/bash
# blue-green-deploy.sh
set -e

# 1. Deploy the "green" environment
docker-compose -f docker-compose.green.yml up -d

# 2. Health-check the green environment; abort if it never comes up
HEALTHY=false
for i in {1..30}; do
  if curl -sf http://green-lb:3000/health; then
    echo "Green environment is healthy"
    HEALTHY=true
    break
  fi
  sleep 2
done
if [ "$HEALTHY" != "true" ]; then
  echo "Green environment failed health checks, aborting"
  exit 1
fi

# 3. Switch the load balancer to green
aws elbv2 modify-listener \
  --listener-arn "$LISTENER_ARN" \
  --default-actions Type=forward,TargetGroupArn="$GREEN_TARGET_GROUP"

# 4. Wait for traffic to drain from blue
sleep 60

# 5. Stop the blue environment
docker-compose -f docker-compose.blue.yml down

# 6. Rename green -> blue for the next deploy
mv docker-compose.green.yml docker-compose.blue.yml
```
## 5. Monitoring and Alerts

### Prometheus + Grafana
```yaml
# docker-compose.monitoring.yml
version: '3.8'

services:
  # Prometheus
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - monitoring

  # Grafana
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
    depends_on:
      - prometheus
    networks:
      - monitoring

  # Node Exporter
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    ports:
      - "9100:9100"
    networks:
      - monitoring

networks:
  monitoring:
    driver: bridge

volumes:
  prometheus_data:
  grafana_data:
```
### Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Services
  - job_name: 'auth-service'
    static_configs:
      - targets: ['auth-service:3000']

  - job_name: 'wallet-service'
    static_configs:
      - targets: ['wallet-service:3000']

  # Infrastructure
  - job_name: 'mongodb'
    static_configs:
      - targets: ['mongodb-exporter:9216']

  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']

  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
```
### Exposed Metrics
```typescript
// Expose metrics in each service
import express from 'express';
import promClient from 'prom-client';

const app = express();

// Metrics registry
const register = new promClient.Registry();

// Default metrics (CPU, memory, event loop, etc.)
promClient.collectDefaultMetrics({ register });

// Custom metrics
const httpRequestDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register]
});

const activeConnections = new promClient.Gauge({
  name: 'websocket_active_connections',
  help: 'Number of active WebSocket connections',
  registers: [register]
});

const transactionsProcessed = new promClient.Counter({
  name: 'transactions_processed_total',
  help: 'Total number of transactions processed',
  labelNames: ['coin', 'type', 'status'],
  registers: [register]
});

// Metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
```
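The histogram above only yields data if requests are actually timed. A minimal sketch of an Express middleware that records request durations, reusing `app` and `httpRequestDuration` from the block above; note it must be registered before the routes it should measure:

```typescript
// Minimal sketch: time every request and record it in the histogram above.
// Using req.path as the route label is a simplification; a real app should
// use the matched route pattern to keep label cardinality low.
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer({ method: req.method });
  res.on('finish', () => {
    end({ route: req.path, status_code: String(res.statusCode) });
  });
  next();
});
```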
## 6. Centralized Logging

### ELK Stack (Elasticsearch, Logstash, Kibana)
```yaml
# docker-compose.logging.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.10.0
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data
    networks:
      - logging

  logstash:
    image: docker.elastic.co/logstash/logstash:8.10.0
    ports:
      - "5000:5000"
      - "9600:9600"
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch
    networks:
      - logging

  kibana:
    image: docker.elastic.co/kibana/kibana:8.10.0
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    depends_on:
      - elasticsearch
    networks:
      - logging

networks:
  logging:
    driver: bridge

volumes:
  elasticsearch_data:
```
### Winston Logger with ELK
```typescript
import winston from 'winston';
import { ElasticsearchTransport } from 'winston-elasticsearch';

const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: {
    service: process.env.SERVICE_NAME,
    environment: process.env.NODE_ENV
  },
  transports: [
    // Console
    new winston.transports.Console({
      format: winston.format.combine(
        winston.format.colorize(),
        winston.format.simple()
      )
    }),
    // Elasticsearch (winston-elasticsearch exposes the transport as a named export)
    new ElasticsearchTransport({
      level: 'info',
      clientOpts: {
        node: process.env.ELASTICSEARCH_URL
      },
      index: 'swapbits-logs'
    }),
    // File transport for errors
    new winston.transports.File({
      filename: 'error.log',
      level: 'error'
    })
  ]
});

export default logger;
```
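For reference, a short usage sketch: thanks to `defaultMeta` and the JSON format, every entry arrives in the `swapbits-logs` index already tagged with `service` and `environment`, so it can be filtered directly in Kibana (the field names below are illustrative):

```typescript
import logger from './logger';

// Structured fields become top-level properties of the JSON document that
// reaches Elasticsearch, next to the service/environment defaults.
// Field names here are illustrative.
logger.info('Withdrawal processed', {
  userId: 'u_123',
  coin: 'BTC',
  amountSats: 150000
});

// errors({ stack: true }) preserves the stack trace when an Error is logged
logger.error(new Error('upstream exchange timeout'));
```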
## 7. Automated Backups

### MongoDB Backup Script
```bash
#!/bin/bash
# backup-mongodb.sh
set -e

TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_DIR="/backups/mongodb"
S3_BUCKET="s3://swapbits-backups/mongodb"

echo "Starting MongoDB backup at $TIMESTAMP"

# 1. Create the backup
mongodump \
  --uri="$MONGODB_URI" \
  --out="$BACKUP_DIR/$TIMESTAMP" \
  --gzip

# 2. Compress
tar -czf "$BACKUP_DIR/backup-$TIMESTAMP.tar.gz" \
  -C "$BACKUP_DIR" "$TIMESTAMP"

# 3. Upload to S3
aws s3 cp \
  "$BACKUP_DIR/backup-$TIMESTAMP.tar.gz" \
  "$S3_BUCKET/backup-$TIMESTAMP.tar.gz"

# 4. Clean up old local backups (> 7 days)
find "$BACKUP_DIR" -type f -name "*.tar.gz" -mtime +7 -delete

# 5. Clean up old S3 backups (> 30 days)
# (An S3 lifecycle rule can do this automatically; this is the manual version.)
aws s3 ls "$S3_BUCKET/" | while read -r line; do
  FILENAME=$(echo "$line" | awk '{print $4}')
  BACKUP_DATE=$(echo "$FILENAME" | cut -d'-' -f2 | cut -d'.' -f1)
  DAYS_OLD=$(( ($(date +%s) - $(date -d "$BACKUP_DATE" +%s)) / 86400 ))
  if [ "$DAYS_OLD" -gt 30 ]; then
    aws s3 rm "$S3_BUCKET/$FILENAME"
    echo "Deleted old backup: $FILENAME"
  fi
done

echo "[SUCCESS] Backup completed successfully"
```
### Cron Jobs for Backups
```bash
# crontab -e

# Daily MongoDB backup at 2 AM
0 2 * * * /opt/scripts/backup-mongodb.sh >> /var/log/backups.log 2>&1

# Redis backup every 6 hours (cron entries must fit on a single line)
0 */6 * * * redis-cli --rdb /backups/redis/dump.rdb && aws s3 cp /backups/redis/dump.rdb s3://swapbits-backups/redis/

# Clean up old logs weekly
0 3 * * 0 find /var/log -name "*.log" -mtime +30 -delete
```
## 8. Disaster Recovery

### Recovery Plan

#### Level 1: Service Down (RTO: 15 min, RPO: 5 min)

**Symptoms:**

- Health checks fail
- 503 errors
- Service does not respond

**Actions:**

1. Check status in Grafana/Prometheus
2. Review logs in Kibana
3. Restart the affected service:

   ```bash
   docker-compose restart [service-name]
   ```

4. If the problem persists, roll back to the previous deployment:

   ```bash
   ./rollback.sh
   ```

#### Level 2: Corrupted Database (RTO: 1 hour, RPO: 24 hours)

**Symptoms:**

- MongoDB errors
- Inconsistent data
- Replicas out of sync

**Actions:**

1. Promote a secondary replica to primary
2. Restore from the latest backup:

   ```bash
   ./restore-mongodb.sh backup-20251020-020000.tar.gz
   ```

3. Verify data integrity
4. Re-sync the replicas

#### Level 3: Total Data Loss (RTO: 4 hours, RPO: 24 hours)

**Actions:**

1. Provision new infrastructure
2. Restore backups from S3
3. Run migrations
4. Verify all services
5. Switch DNS to the new environment

#### Level 4: Security Breach (RTO: immediate, RPO: N/A)

**Actions:**

1. IMMEDIATELY shut down the affected services
2. Rotate ALL credentials and secrets
3. Investigate the attack vector
4. Patch the vulnerability
5. Audit all recent access
6. Notify affected users (where applicable)
---

## 9. Maintenance Checklist

### Daily

- [ ] Review Grafana dashboards
- [ ] Verify health checks
- [ ] Review error logs in Kibana
- [ ] Check disk usage (< 80%)
- [ ] Review Prometheus alerts

### Weekly

- [ ] Review performance metrics
- [ ] Analyze slow MongoDB queries
- [ ] Verify automated backups
- [ ] Review and update security dependencies
- [ ] Review audit logs

### Monthly

- [ ] Update npm dependencies
- [ ] Review and optimize MongoDB indexes
- [ ] Clean up old logs and backups
- [ ] Review AWS costs
- [ ] Security audit
- [ ] Load testing

### Quarterly

- [ ] Disaster recovery drill
- [ ] Review and update documentation
- [ ] Capacity planning
- [ ] Upgrade Node.js/MongoDB versions
- [ ] Penetration testing
---

## Useful Commands

### Docker

```bash
# Tail the logs of one service
docker-compose logs -f auth-service

# Show resource usage per container
docker stats

# Restart all services
docker-compose restart

# Rebuild and restart
docker-compose up -d --build

# Clean up old images
docker system prune -a
```

### MongoDB

```bash
# Connect to MongoDB
mongosh $MONGODB_URI

# The following run inside the mongosh shell:

# Show statistics for the current database
db.stats()

# Create an index
db.collection.createIndex({ field: 1 })

# Show slow queries (requires the profiler to be enabled)
db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 })
```

### Redis

```bash
# Connect to Redis
redis-cli -h host -p 6379 -a password

# The following run inside redis-cli:

# List all keys (O(N) — avoid on production instances, prefer SCAN)
KEYS *

# Show memory usage
INFO memory

# Flush the entire cache (destructive!)
FLUSHALL
```

### PM2 (if you are not using Docker)

```bash
# Start all services
pm2 start ecosystem.config.js

# Reload with zero downtime
pm2 reload all

# Tail logs
pm2 logs

# Monitoring dashboard
pm2 monit
```
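`pm2 start ecosystem.config.js` expects a process file at the project root. A hypothetical sketch of what it might look like for these services; the paths, instance counts, and ports are illustrative, not taken from the repository:

```javascript
// ecosystem.config.js — hypothetical sketch; paths, names, and instance
// counts are illustrative, not taken from the SwapBits repository.
module.exports = {
  apps: [
    {
      name: 'auth-service',
      script: 'services/auth/dist/index.js',
      instances: 2,           // cluster mode lets `pm2 reload` avoid downtime
      exec_mode: 'cluster',
      env_production: {
        NODE_ENV: 'production',
        PORT: 3001
      }
    },
    {
      name: 'wallet-service',
      script: 'services/wallet-service/dist/index.js',
      instances: 2,
      exec_mode: 'cluster',
      env_production: {
        NODE_ENV: 'production',
        PORT: 3002
      }
    }
  ]
};
```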