Deployment and Maintenance

This document explains how to deploy and maintain the SwapBits ecosystem in production.


Deployment Architecture


1. Environments

Environment Configuration

// Three main environments
const Environments = {
  development: {
    url: 'http://localhost:3000',
    mongodb: 'mongodb://localhost:27017/swapbits-dev',
    redis: 'redis://localhost:6379',
    logLevel: 'debug',
    cors: '*'
  },

  staging: {
    url: 'https://staging-api.swapbits.com',
    mongodb: process.env.MONGODB_URI_STAGING,
    redis: process.env.REDIS_URI_STAGING,
    logLevel: 'info',
    cors: ['https://staging.swapbits.com']
  },

  production: {
    url: 'https://api.swapbits.com',
    mongodb: process.env.MONGODB_URI_PRODUCTION,
    redis: process.env.REDIS_URI_PRODUCTION,
    logLevel: 'warn',
    cors: ['https://swapbits.com', 'https://app.swapbits.com']
  }
};
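
A service can then select its active configuration from NODE_ENV at startup. A minimal sketch (the fail-fast check on unknown environment names is an addition, not part of the config above):

// Pick the configuration matching NODE_ENV, defaulting to development
const env = process.env.NODE_ENV || 'development';
const config = Environments[env];

// Fail fast on a typo'd or missing environment name
if (!config) {
  throw new Error(`Unknown environment: ${env}`);
}

export default config;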

2. Docker Setup

Dockerfile for Services

# docker/services/auth/Dockerfile
FROM node:18-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY tsconfig.json ./

# Install all dependencies (the TypeScript build needs devDependencies)
RUN npm ci

# Copy source code
COPY src ./src
COPY docker/packages ./packages

# Build TypeScript
RUN npm run build

# Drop devDependencies so only production modules are copied below
RUN npm prune --omit=dev

# --- Production Image ---
FROM node:18-alpine

WORKDIR /app

# Copy only what is needed
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
COPY healthcheck.js ./

# Non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001
USER nodejs

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
    CMD node healthcheck.js || exit 1

# Expose port
EXPOSE 3000

# Start
CMD ["node", "dist/index.js"]
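
The HEALTHCHECK above runs a healthcheck.js that is not defined elsewhere in this document. A minimal sketch, assuming the service exposes GET /health on port 3000:

// healthcheck.js - exit 0 if the service answers /health, 1 otherwise
const http = require('http');

const req = http.get(
  { host: 'localhost', port: 3000, path: '/health', timeout: 2000 },
  (res) => process.exit(res.statusCode === 200 ? 0 : 1)
);

req.on('timeout', () => { req.destroy(); process.exit(1); });
req.on('error', () => process.exit(1));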

Docker Compose (Development)

# docker-compose.dev.yml
version: '3.8'

services:
  # MongoDB
  mongodb:
    image: mongo:7.0
    container_name: swapbits-mongodb
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    volumes:
      - mongodb_data:/data/db
    networks:
      - swapbits-network

  # Redis
  redis:
    image: redis:7-alpine
    container_name: swapbits-redis
    ports:
      - "6379:6379"
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis_data:/data
    networks:
      - swapbits-network

  # Auth Service
  auth-service:
    build:
      context: .
      dockerfile: docker/services/auth/Dockerfile
    container_name: auth-service
    ports:
      - "3001:3000"
    environment:
      - NODE_ENV=development
      - MONGODB_URI=mongodb://admin:${MONGO_PASSWORD}@mongodb:27017
      - REDIS_URI=redis://:${REDIS_PASSWORD}@redis:6379
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      - mongodb
      - redis
    networks:
      - swapbits-network
    volumes:
      - ./docker/services/auth/src:/app/src # Hot reload

  # Wallet Service
  wallet-service:
    build:
      context: .
      dockerfile: docker/services/wallet-service/Dockerfile
    container_name: wallet-service
    ports:
      - "3002:3000"
    environment:
      - NODE_ENV=development
      - MONGODB_URI=mongodb://admin:${MONGO_PASSWORD}@mongodb:27017
      - REDIS_URI=redis://:${REDIS_PASSWORD}@redis:6379
      - AUTH_SERVICE_URL=http://auth-service:3000
    depends_on:
      - mongodb
      - redis
      - auth-service
    networks:
      - swapbits-network

networks:
  swapbits-network:
    driver: bridge

volumes:
  mongodb_data:
  redis_data:

3. CI/CD Pipeline

GitHub Actions Workflow

# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]
  workflow_dispatch:

env:
  AWS_REGION: us-east-1
  ECR_REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com

jobs:
  # Job 1: Tests
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run unit tests
        run: npm test

      - name: Run integration tests
        run: npm run test:integration

  # Job 2: Build Docker Images
  build:
    needs: test
    runs-on: ubuntu-latest

    strategy:
      matrix:
        service: [auth, user-service, wallet-service, bank-service, exchange-service]

    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and push Docker image
        env:
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build \
            -t $ECR_REGISTRY/swapbits-${{ matrix.service }}:$IMAGE_TAG \
            -t $ECR_REGISTRY/swapbits-${{ matrix.service }}:latest \
            -f docker/services/${{ matrix.service }}/Dockerfile .

          docker push $ECR_REGISTRY/swapbits-${{ matrix.service }}:$IMAGE_TAG
          docker push $ECR_REGISTRY/swapbits-${{ matrix.service }}:latest

  # Job 3: Deploy to Staging
  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    environment: staging

    steps:
      - name: Deploy to Staging
        run: |
          # SSH to staging server
          ssh deploy@staging.swapbits.com "cd /app && docker-compose pull && docker-compose up -d"

      - name: Run migrations
        run: |
          ssh deploy@staging.swapbits.com "cd /app/migrations && node runMigrations.js"

      - name: Health check
        run: |
          sleep 30
          curl -f https://staging-api.swapbits.com/health || exit 1

  # Job 4: Deploy to Production (Manual Approval)
  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production

    steps:
      - name: Deploy to Production
        run: |
          # Rolling update
          ssh deploy@prod.swapbits.com "cd /app && ./deploy.sh"

      - name: Run migrations
        run: |
          ssh deploy@prod.swapbits.com "cd /app/migrations && node runMigrations.js"

      - name: Health check
        run: |
          sleep 60
          curl -f https://api.swapbits.com/health || exit 1

      - name: Notify team
        if: success()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "[SUCCESS] Production deployment successful!"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
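
Both deploy jobs call a runMigrations.js that this document does not define. A hypothetical sketch of such a runner, assuming migration files live in ./migrations, export an async up(db), and that applied migrations are tracked in a migrations collection:

// runMigrations.js - hypothetical sketch of the migration runner used above
const { MongoClient } = require('mongodb');
const fs = require('fs');
const path = require('path');

async function run() {
  const client = await MongoClient.connect(process.env.MONGODB_URI);
  const db = client.db();

  // Names of migrations that have already been applied
  const applied = new Set(
    (await db.collection('migrations').find().toArray()).map((m) => m.name)
  );

  // Apply pending migrations in lexicographic (timestamp-prefix) order
  const files = fs.readdirSync(path.join(__dirname, 'migrations')).sort();
  for (const file of files) {
    if (applied.has(file)) continue;
    const { up } = require(path.join(__dirname, 'migrations', file));
    await up(db);
    await db.collection('migrations').insertOne({ name: file, appliedAt: new Date() });
    console.log(`Applied ${file}`);
  }

  await client.close();
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});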

4. Zero-Downtime Deployment

Rolling Update Strategy

#!/bin/bash
# deploy.sh - Zero-downtime deployment script

set -e

echo "Starting rolling deployment..."

# 1. Pull new images
docker-compose pull

# 2. Scale up to 2 instances (if only 1 is running)
docker-compose up -d --scale auth-service=2

# 3. Wait for the new instance to become healthy
echo "Waiting for new instance to be healthy..."
sleep 30

# 4. Stop the old instance
# (`docker ps` lists newest first, so the old container is last)
OLD_CONTAINER=$(docker ps -q -f name=auth-service | tail -n1)
docker stop "$OLD_CONTAINER"

# 5. Remove the old instance
docker rm "$OLD_CONTAINER"

# 6. Return to normal scale
docker-compose up -d

echo "[SUCCESS] Deployment completed successfully!"

Blue-Green Deployment

#!/bin/bash
# blue-green-deploy.sh

set -e

# 1. Deploy the "green" environment
docker-compose -f docker-compose.green.yml up -d

# 2. Health check the green environment; abort if it never becomes healthy
HEALTHY=false
for i in {1..30}; do
  if curl -sf http://green-lb:3000/health; then
    echo "Green environment is healthy"
    HEALTHY=true
    break
  fi
  sleep 2
done

if [ "$HEALTHY" != "true" ]; then
  echo "Green environment failed health checks; aborting"
  exit 1
fi

# 3. Switch the load balancer to green
aws elbv2 modify-listener \
  --listener-arn "$LISTENER_ARN" \
  --default-actions Type=forward,TargetGroupArn="$GREEN_TARGET_GROUP"

# 4. Wait for traffic to drain from blue
sleep 60

# 5. Stop the blue environment
docker-compose -f docker-compose.blue.yml down

# 6. Rename green -> blue for the next deploy
mv docker-compose.green.yml docker-compose.blue.yml

5. Monitoring and Alerts

Prometheus + Grafana

# docker-compose.monitoring.yml
version: '3.8'

services:
  # Prometheus
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - monitoring

  # Grafana
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
    depends_on:
      - prometheus
    networks:
      - monitoring

  # Node Exporter
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    ports:
      - "9100:9100"
    networks:
      - monitoring

networks:
  monitoring:
    driver: bridge

volumes:
  prometheus_data:
  grafana_data:

Prometheus Configuration

# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Services
  - job_name: 'auth-service'
    static_configs:
      - targets: ['auth-service:3000']

  - job_name: 'wallet-service'
    static_configs:
      - targets: ['wallet-service:3000']

  # Infrastructure
  - job_name: 'mongodb'
    static_configs:
      - targets: ['mongodb-exporter:9216']

  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']

  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

Exposed Metrics

// Expose metrics from each service (assumes an existing Express `app`)
import promClient from 'prom-client';

// Metrics registry
const register = new promClient.Registry();

// Default metrics (CPU, memory, etc.)
promClient.collectDefaultMetrics({ register });

// Custom metrics
const httpRequestDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register]
});

const activeConnections = new promClient.Gauge({
  name: 'websocket_active_connections',
  help: 'Number of active WebSocket connections',
  registers: [register]
});

const transactionsProcessed = new promClient.Counter({
  name: 'transactions_processed_total',
  help: 'Total number of transactions processed',
  labelNames: ['coin', 'type', 'status'],
  registers: [register]
});

// Metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
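
The httpRequestDuration histogram above only reports data if requests are actually observed. A minimal sketch of an Express timing middleware, assuming it is registered before the route handlers:

// Record every request's duration into http_request_duration_seconds
app.use((req, res, next) => {
  const stopTimer = httpRequestDuration.startTimer();
  res.on('finish', () => {
    stopTimer({
      method: req.method,
      route: req.route?.path ?? req.path, // falls back to the raw path
      status_code: String(res.statusCode)
    });
  });
  next();
});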

6. Centralized Logging

ELK Stack (Elasticsearch, Logstash, Kibana)

# docker-compose.logging.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.10.0
    environment:
      - discovery.type=single-node
      # Security must be disabled for Kibana to connect over plain HTTP as below
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data
    networks:
      - logging

  logstash:
    image: docker.elastic.co/logstash/logstash:8.10.0
    ports:
      - "5000:5000"
      - "9600:9600"
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch
    networks:
      - logging

  kibana:
    image: docker.elastic.co/kibana/kibana:8.10.0
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    depends_on:
      - elasticsearch
    networks:
      - logging

networks:
  logging:
    driver: bridge

volumes:
  elasticsearch_data:

Winston Logger with ELK

import winston from 'winston';
import { ElasticsearchTransport } from 'winston-elasticsearch';

const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: {
    service: process.env.SERVICE_NAME,
    environment: process.env.NODE_ENV
  },
  transports: [
    // Console
    new winston.transports.Console({
      format: winston.format.combine(
        winston.format.colorize(),
        winston.format.simple()
      )
    }),

    // Elasticsearch
    new ElasticsearchTransport({
      level: 'info',
      clientOpts: {
        node: process.env.ELASTICSEARCH_URL
      },
      index: 'swapbits-logs'
    }),

    // File for errors
    new winston.transports.File({
      filename: 'error.log',
      level: 'error'
    })
  ]
});

export default logger;
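
Typical usage from a service (the field names here are illustrative):

import logger from './logger';

// Structured entries reach Elasticsearch with the service/environment metadata attached
logger.info('Transaction processed', { txId: 'abc123', coin: 'BTC' });
logger.error('Withdrawal failed', { userId: 'u_42', reason: 'insufficient funds' });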

7. Automated Backups

MongoDB Backup Script

#!/bin/bash
# backup-mongodb.sh

TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_DIR="/backups/mongodb"
S3_BUCKET="s3://swapbits-backups/mongodb"

echo "Starting MongoDB backup at $TIMESTAMP"

# 1. Create the backup
mongodump \
  --uri="$MONGODB_URI" \
  --out="$BACKUP_DIR/$TIMESTAMP" \
  --gzip

# 2. Compress
tar -czf "$BACKUP_DIR/backup-$TIMESTAMP.tar.gz" \
  -C "$BACKUP_DIR" "$TIMESTAMP"

# 3. Upload to S3
aws s3 cp \
  "$BACKUP_DIR/backup-$TIMESTAMP.tar.gz" \
  "$S3_BUCKET/backup-$TIMESTAMP.tar.gz"

# 4. Clean up old local backups (> 7 days)
find "$BACKUP_DIR" -type f -name "*.tar.gz" -mtime +7 -delete

# 5. Clean up old S3 backups (> 30 days)
aws s3 ls "$S3_BUCKET/" | while read -r line; do
  FILENAME=$(echo "$line" | awk '{print $4}')
  BACKUP_DATE=$(echo "$FILENAME" | cut -d'-' -f2)
  DAYS_OLD=$(( ($(date +%s) - $(date -d "$BACKUP_DATE" +%s)) / 86400 ))

  if [ "$DAYS_OLD" -gt 30 ]; then
    aws s3 rm "$S3_BUCKET/$FILENAME"
    echo "Deleted old backup: $FILENAME"
  fi
done

echo "[SUCCESS] Backup completed successfully"

Cron Jobs for Backups

# crontab -e

# Daily MongoDB backup at 2 AM
0 2 * * * /opt/scripts/backup-mongodb.sh >> /var/log/backups.log 2>&1

# Redis backup every 6 hours (cron has no line continuation, so keep this on one line)
0 */6 * * * redis-cli --rdb /backups/redis/dump.rdb && aws s3 cp /backups/redis/dump.rdb s3://swapbits-backups/redis/

# Clean up old logs weekly
0 3 * * 0 find /var/log -name "*.log" -mtime +30 -delete

8. Disaster Recovery

Recovery Plan

Level 1: Service Down (RTO: 15 min, RPO: 5 min)

Symptoms:

  • Health check fails
  • 503 errors
  • Service does not respond

Actions:

  1. Check status in Grafana/Prometheus
  2. Review logs in Kibana
  3. Restart the affected service:
    docker-compose restart [service-name]
  4. If the problem persists, roll back to the previous deployment:
    ./rollback.sh

Level 2: Corrupted Database (RTO: 1 hour, RPO: 24 hours)

Symptoms:

  • MongoDB errors
  • Inconsistent data
  • Replicas out of sync

Actions:

  1. Promote a secondary replica to primary
  2. Restore from the latest backup:
    ./restore-mongodb.sh backup-20251020-020000.tar.gz
  3. Verify data integrity
  4. Re-sync the replicas

Level 3: Total Data Loss (RTO: 4 hours, RPO: 24 hours)

Actions:

  1. Provision new infrastructure
  2. Restore backups from S3
  3. Run migrations
  4. Verify all services
  5. Switch DNS to the new environment

Level 4: Security Compromise (RTO: Immediate, RPO: N/A)

Actions:

  1. IMMEDIATELY shut down affected services
  2. Rotate ALL credentials and secrets
  3. Investigate the attack vector
  4. Patch the vulnerability
  5. Audit all recent access
  6. Notify affected users (if applicable)

9. Maintenance Checklist

Daily

  • Review Grafana dashboards
  • Verify health checks
  • Review error logs in Kibana
  • Check disk usage (< 80%)
  • Review Prometheus alerts

Weekly

  • Review performance metrics
  • Analyze slow MongoDB queries
  • Verify automated backups
  • Review and update security-sensitive dependencies
  • Review audit logs

Monthly

  • Update npm dependencies
  • Review and optimize MongoDB indexes
  • Clean up old logs and backups
  • Review AWS costs
  • Security audit
  • Load testing

Quarterly

  • Disaster recovery drill
  • Review and update documentation
  • Capacity planning
  • Upgrade Node.js/MongoDB versions
  • Penetration testing

Useful Commands

Docker

# View logs for a service
docker-compose logs -f auth-service

# View resource usage
docker stats

# Restart all services
docker-compose restart

# Rebuild and restart
docker-compose up -d --build

# Clean up old images
docker system prune -a

MongoDB

# Connect to MongoDB
mongosh $MONGODB_URI

# View database statistics (size, collections, indexes)
db.stats()

# Create an index
db.collection.createIndex({ field: 1 })

# View slow queries (> 100 ms)
db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 })

Redis

# Connect to Redis
redis-cli -h host -p 6379 -a password

# List all keys (avoid in production: KEYS blocks the server; prefer SCAN)
KEYS *

# View memory usage
INFO memory

# Flush the cache (destructive: deletes every key)
FLUSHALL

PM2 (if not using Docker)

# Start all services
pm2 start ecosystem.config.js

# Reload with zero downtime
pm2 reload all

# View logs
pm2 logs

# Monitoring dashboard
pm2 monit