Production Deployment Guide: Malware Scanning System

Target Environment: Vercel (Platform App) + Railway (ClamAV) + Supabase (Database)

Last Updated: October 17, 2025

Prerequisites: Phases 1-6 completed, tested locally


Overview

This guide walks through deploying the complete malware scanning system to production. The architecture consists of:

  • Platform App (Vercel): Document upload API, scanning logic
  • Console App (Vercel): Staff quarantine management UI
  • ClamAV Daemon (Railway): Malware scanning engine
  • Supabase: Database and file storage (quarantine bucket)

Pre-Deployment Checklist

1. Database Migration

Status: ✅ Schema already deployed (Phase 2)

Verify that the production schema has these columns and tables:

-- Documents quarantine fields
SELECT column_name FROM information_schema.columns
WHERE table_name = 'documents'
AND column_name IN ('is_quarantined', 'quarantined_at', 'quarantine_reason');

-- Document scans table
SELECT table_name FROM information_schema.tables
WHERE table_name = 'document_scans';

-- Quarantine events table
SELECT table_name FROM information_schema.tables
WHERE table_name = 'quarantine_events';

If missing, apply migrations from /supabase/migrations/20251017_add_quarantine_fields.sql.

2. Supabase Storage Buckets

Quarantine Bucket:

-- Verify quarantine bucket exists
SELECT name FROM storage.buckets WHERE name = 'quarantine';

-- If missing, create it:
INSERT INTO storage.buckets (id, name, public)
VALUES ('quarantine', 'quarantine', false);

Row Level Security (RLS):

-- Verify RLS policies exist
SELECT policyname FROM pg_policies
WHERE tablename = 'objects'
AND schemaname = 'storage'
AND policyname LIKE '%quarantine%';

Apply policies from Phase 2 documentation if missing.

3. Environment Variables

Required for Platform App:

# ClamAV Configuration
CLAMAV_HOST=torvus-clamav.railway.internal # Railway internal URL
CLAMAV_PORT=3310
CLAMAV_TIMEOUT=5000 # 5 seconds
CLAMAV_MAX_DOWNLOAD_MB=100
CLAMAV_DOWNLOAD_TIMEOUT=30000 # 30 seconds

# Scanning Configuration
ENABLE_DOCUMENT_SCANNING=true
SCAN_EMIT_ON_FINALIZE=true
SCAN_MAX_FILE_MB=100

# Supabase (already configured)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
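
One way the Platform App might read these settings is sketched below. The `loadScanConfig` and `readInt` helpers and the returned field names are hypothetical and not taken from the codebase; the defaults simply mirror the values listed above.

```javascript
// Hypothetical helper: parse the scanner settings from an env-like object.
// Variable names and defaults mirror the list above; the functions
// themselves are illustrative, not the Platform App's actual code.
function readInt(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadScanConfig(env = process.env) {
  if (!env.CLAMAV_HOST) {
    throw new Error('CLAMAV_HOST is required');
  }
  return {
    host: env.CLAMAV_HOST,
    port: readInt(env, 'CLAMAV_PORT', 3310),
    timeoutMs: readInt(env, 'CLAMAV_TIMEOUT', 5000),
    maxDownloadMb: readInt(env, 'CLAMAV_MAX_DOWNLOAD_MB', 100),
    downloadTimeoutMs: readInt(env, 'CLAMAV_DOWNLOAD_TIMEOUT', 30000),
    scanningEnabled: env.ENABLE_DOCUMENT_SCANNING === 'true',
    emitOnFinalize: env.SCAN_EMIT_ON_FINALIZE === 'true',
    maxFileMb: readInt(env, 'SCAN_MAX_FILE_MB', 100),
  };
}
```

Failing fast on a missing host or a non-numeric port surfaces a misconfigured deployment at startup rather than on the first upload.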

Required for Console App:

# Same Supabase credentials as Platform
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

Step 1: Deploy ClamAV to Railway

1.1 Create Railway Service

  1. Log in to Railway: https://railway.app
  2. Select your project (or create new)
  3. Click "New Service" → "Docker Image"
  4. Use image: clamav/clamav:latest

1.2 Configure ClamAV Service

Environment Variables (none required - uses defaults)

Resources:

  • Memory: 4GB minimum (ClamAV signature database is ~400MB)
  • CPU: 1 vCPU minimum
  • Storage: 2GB for virus definitions

Networking:

  • Port: 3310 (exposed automatically)
  • Internal URL: torvus-clamav.railway.internal
  • Public URL: Not required (internal only)

1.3 Health Check

Railway will show "Deploying..." → "Active" when ready.

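Verify from local machine (this requires a publicly reachable endpoint for the service; the internal hostname torvus-clamav.railway.internal resolves only inside Railway's private network):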

# Test with nc (netcat)
nc -zv torvus-clamav.up.railway.app 3310

# Or test ping via Node.js
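node -e "
const net = require('net');
const client = net.connect(3310, 'torvus-clamav.up.railway.app', () => {
  client.write('PING\n'); // clamd answers a PING command with PONG
});
// Give up if nothing comes back within 5 seconds
client.setTimeout(5000, () => {
  console.error('Timed out waiting for clamd');
  client.destroy();
});
client.on('data', (data) => {
  console.log('Response:', data.toString().trim());
  client.end();
});
client.on('error', (err) => console.error('Connection failed:', err.message));
"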

Expected response: PONG

1.4 Update Signature Database

ClamAV automatically updates virus definitions on startup. Initial boot takes ~2-3 minutes.

Monitor logs in Railway:

Downloading daily.cvd [===========================] 100%
Downloading main.cvd [============================] 100%
Database updated successfully
ClamAV daemon ready

Step 2: Configure Vercel Environment Variables

2.1 Platform App Configuration

Navigate to Vercel Dashboard → Project: torvus-platform → Settings → Environment Variables

Add/Update:

CLAMAV_HOST = torvus-clamav.railway.internal
CLAMAV_PORT = 3310
CLAMAV_TIMEOUT = 5000
CLAMAV_MAX_DOWNLOAD_MB = 100
CLAMAV_DOWNLOAD_TIMEOUT = 30000
ENABLE_DOCUMENT_SCANNING = true
SCAN_EMIT_ON_FINALIZE = true
SCAN_MAX_FILE_MB = 100

Environment: Set for Production, Preview, and Development

2.2 Console App Configuration

Navigate to Vercel Dashboard → Project: torvus-console → Settings → Environment Variables

No additional variables needed (uses same Supabase credentials).


Step 3: Deploy Code to Vercel

3.1 Commit and Push

# Ensure all Phase 1-6 code is committed
git status

# Add any uncommitted files
git add .
git commit -m "feat: Complete malware scanning system (Phases 1-6)

- ClamAV client and Docker integration
- Database schema with quarantine tables
- Core scanning logic with retry mechanism
- ClamAV adapter for production workflow
- Console quarantine management UI
- Metrics dashboard and compliance reporting

All phases tested and documented."

# Push to main branch (triggers Vercel deployment)
git push origin main

3.2 Monitor Deployment

Vercel will automatically deploy both apps:

Platform App:

  • Build time: ~3-5 minutes
  • Check logs for ClamAV initialization

Console App:

  • Build time: ~2-4 minutes
  • New routes: /admin/quarantine, /admin/quarantine/metrics

3.3 Verify Build Success

Check Vercel deployment logs for:

✓ Building pages
✓ Compiling TypeScript
✓ Linting
✓ Collecting page data
✓ Generating static pages
✓ Finalizing build

Step 4: Verification & Testing

4.1 Health Check Endpoints

Test ClamAV Integration:

# From Platform API
curl https://api.torvussecurity.com/api/health

# Expected response:
{
  "status": "healthy",
  "scanner": {
    "engine": "clamav",
    "available": true,
    "version": "ClamAV 1.x.x/..."
  }
}
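
A post-deploy smoke test could check this response programmatically. The sketch below assumes the response shape shown above; `isScannerHealthy` is a hypothetical helper name, not part of the Platform API.

```javascript
// Hypothetical check: decide whether a parsed /api/health body reports a
// usable scanner. Field names follow the expected response shown above.
function isScannerHealthy(body) {
  if (!body || body.status !== 'healthy') return false;
  const scanner = body.scanner;
  return Boolean(
    scanner &&
    scanner.engine === 'clamav' &&
    scanner.available === true &&
    typeof scanner.version === 'string' &&
    scanner.version.startsWith('ClamAV')
  );
}
```

In practice the body would come from parsing the JSON returned by the health endpoint, and a `false` result would fail the deployment check.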

Test Console Quarantine Page:

# Navigate in browser (requires security_admin role)
https://console.torvussecurity.com/admin/quarantine

# Should display:
- Empty state: "No quarantined documents"
- Or: List of quarantined documents (if any exist)

4.2 End-to-End Scan Test

Upload Clean File:

# 1. Create test file
echo "This is a clean test document" > test-clean.txt

# 2. Upload to Platform (use your vault ID)
curl -X POST https://api.torvussecurity.com/api/vaults/{vaultId}/documents/initiate \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"filename":"test-clean.txt","mimeType":"text/plain","size":30}'

# 3. Upload file to storage (use signed URL from response)
curl -X PUT "{signedUrl}" \
--upload-file test-clean.txt

# 4. Finalize upload
curl -X POST https://api.torvussecurity.com/api/documents/finalize \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"documentId":"...","storageKey":"..."}'

# Expected: Scan completes with status: CLEAN

Upload EICAR Test Virus:

# 1. Create EICAR test file (safe test signature)
echo 'X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*' > eicar.txt

# 2. Follow same upload process as above

# Expected:
# - Scan detects virus: "Eicar-Test-Signature"
# - Document automatically quarantined
# - Appears in Console quarantine page

4.3 Console UI Testing

Quarantine Documents Page:

  1. Log in to Console as security_admin
  2. Navigate to Security → Quarantine → Documents
  3. Verify EICAR test file appears in list
  4. Check virus badge shows "Eicar-Test-Signature"
  5. Verify status badge shows "INFECTED"

Metrics Dashboard:

  1. Navigate to Security → Quarantine → Metrics
  2. Verify metrics cards show:
    • Total Quarantined: 1
    • Infected Documents: 1
    • Top Threats: "Eicar-Test-Signature"
  3. Check timeline shows recent scan

Compliance Report:

# Generate 30-day compliance report
curl "https://console.torvussecurity.com/api/admin/quarantine/compliance?startDate=2025-09-17T00:00:00Z&endDate=2025-10-17T23:59:59Z" \
  -H "Authorization: Bearer YOUR_CONSOLE_TOKEN"

# Expected: JSON report with summary, threats, vault breakdown
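
To avoid hand-editing dates, the report URL can be computed. This is a sketch under the assumption that the endpoint accepts ISO 8601 timestamps in `startDate` and `endDate`, as the example above suggests; `complianceReportUrl` is a hypothetical helper.

```javascript
// Hypothetical helper: build the compliance report URL for a window of
// `days` ending at `end`. The endpoint path and the startDate/endDate
// parameter names come from the curl example above.
function complianceReportUrl(baseUrl, end, days = 30) {
  const start = new Date(end.getTime() - days * 24 * 60 * 60 * 1000);
  const url = new URL('/api/admin/quarantine/compliance', baseUrl);
  url.searchParams.set('startDate', start.toISOString());
  url.searchParams.set('endDate', end.toISOString());
  return url.toString();
}
```

Building the query with `URLSearchParams` keeps the timestamps correctly percent-encoded, which avoids shell quoting problems around `&` and `?`.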

Step 5: Monitoring & Alerting

5.1 ClamAV Health Monitoring

Railway Monitoring:

  • Check ClamAV memory usage (should stay under 2GB)
  • Monitor CPU usage (spikes during scans are normal)
  • Check uptime (should be 99%+)

Add Alerts (Railway Dashboard):

  • Memory > 3.5GB → Alert
  • CPU > 90% for 5 minutes → Alert
  • Service down > 1 minute → Page

5.2 Scan Metrics

Supabase Queries:

-- Monitor scan success rate (last 24 hours)
SELECT
  status,
  COUNT(*) as count,
  ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) as percentage
FROM document_scans
WHERE scanned_at > NOW() - INTERVAL '24 hours'
GROUP BY status;

-- Expected: CLEAN > 95%, INFECTED < 5%, FAILED < 1%

Set Up Alerts:

  • Failed scans > 10/hour → Investigate ClamAV
  • Infections > 50/day → Security review
  • Quarantined documents > 100 → Review with security team
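
The thresholds above could be evaluated in a scheduled job. The sketch below is one possible shape: the metric field names (`failedScansLastHour`, `infectionsLastDay`, `quarantinedTotal`) and the function name are assumptions, while the limits and actions are taken from the list above.

```javascript
// Thresholds taken from the alert list above; the input shape and the
// function name are assumptions made for this sketch.
const ALERT_RULES = [
  { key: 'failedScansLastHour', limit: 10, action: 'Investigate ClamAV' },
  { key: 'infectionsLastDay', limit: 50, action: 'Security review' },
  { key: 'quarantinedTotal', limit: 100, action: 'Review with security team' },
];

// Return one message per rule whose metric is strictly above its limit.
function evaluateScanAlerts(metrics) {
  return ALERT_RULES
    .filter((rule) => (metrics[rule.key] ?? 0) > rule.limit)
    .map((rule) => `${rule.key}=${metrics[rule.key]} exceeds ${rule.limit}: ${rule.action}`);
}
```

Missing metrics are treated as zero, so a partial metrics object produces no spurious alerts.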

5.3 Performance Metrics

Track Scan Duration:

-- Average, P95, and P99 scan time (should be < 2 seconds for small files)
SELECT
  AVG(scan_duration_ms) as avg_ms,
  PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY scan_duration_ms) as p95_ms,
  PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY scan_duration_ms) as p99_ms
FROM document_scans
WHERE scanned_at > NOW() - INTERVAL '24 hours'
  AND status != 'FAILED';

Expected Performance:

  • Average: 300-800ms
  • P95: < 2000ms
  • P99: < 5000ms
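
If scan durations are collected in application code rather than queried from the database, the same figures can be computed as sketched below. `percentileCont` uses linear interpolation between neighboring values, the same definition as PostgreSQL's PERCENTILE_CONT; both function names are illustrative, and the target check treats only the upper bounds listed above as limits.

```javascript
// Linear-interpolation percentile, matching the definition used by
// PostgreSQL's PERCENTILE_CONT. Returns null for an empty input.
function percentileCont(values, fraction) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const position = fraction * (sorted.length - 1);
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower);
}

// Summarize durations and compare them with the targets listed above
// (average up to 800 ms, P95 below 2000 ms, P99 below 5000 ms).
function summarizeDurations(durationsMs) {
  if (durationsMs.length === 0) return null;
  const avgMs = durationsMs.reduce((sum, ms) => sum + ms, 0) / durationsMs.length;
  const p95Ms = percentileCont(durationsMs, 0.95);
  const p99Ms = percentileCont(durationsMs, 0.99);
  return {
    avgMs,
    p95Ms,
    p99Ms,
    withinTargets: avgMs <= 800 && p95Ms < 2000 && p99Ms < 5000,
  };
}
```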

Step 6: Rollback Plan

If issues occur after deployment:

6.1 Emergency Rollback

Disable Scanning (immediate):

# In Vercel, set environment variable:
ENABLE_DOCUMENT_SCANNING=false

# Redeploy (takes 2-3 minutes)
# This disables scanning but allows uploads to continue

Revert Vercel Deployment:

  1. Go to Vercel Dashboard → Deployments
  2. Find last known good deployment
  3. Click "..." → "Promote to Production"

6.2 Partial Rollback

Keep scanning, disable auto-quarantine:

# Update environment variable:
SCAN_EMIT_ON_FINALIZE=false

# Scans will still run but won't quarantine automatically
# Staff can manually review in Console
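
The two rollback levers can be summarized as a small decision function. This is a sketch of the behavior as this guide describes it, not the actual implementation: the mode names and `resolveScanMode` are invented for illustration.

```javascript
// Illustrative mapping of the two rollback flags to runtime behavior,
// following the descriptions in sections 6.1 and 6.2. The mode names
// are invented for this sketch.
function resolveScanMode(env = process.env) {
  // 6.1: scanning disabled entirely, uploads continue unscanned
  if (env.ENABLE_DOCUMENT_SCANNING !== 'true') return 'disabled';
  // 6.2: scans run, but nothing is quarantined automatically
  if (env.SCAN_EMIT_ON_FINALIZE !== 'true') return 'scan-only';
  return 'scan-and-quarantine';
}
```

Treating anything other than the literal string `true` as off means an unset or mistyped variable never enables scanning or quarantine by accident.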

6.3 Database Rollback

If database issues occur:

-- Release documents quarantined after the cutoff (replace the placeholder timestamp)
UPDATE documents
SET is_quarantined = false
WHERE quarantined_at > 'YYYY-MM-DD HH:MM:SS';

-- Delete scan records created after the cutoff (if needed)
DELETE FROM document_scans
WHERE created_at > 'YYYY-MM-DD HH:MM:SS';

Step 7: Post-Deployment Validation

7.1 First 24 Hours

Monitor:

  • ClamAV uptime: Should be 100%
  • Scan success rate: Should be > 95%
  • No FAILED scans (unless expected)
  • Vercel function logs: No errors related to ClamAV
  • Console quarantine page: Accessible to security_admin

Test Cases:

  • Upload 10 clean documents → All marked CLEAN
  • Upload 1 EICAR test → Marked INFECTED, quarantined
  • Check quarantine metrics dashboard → Shows correct counts
  • Generate compliance report → Returns valid data

7.2 First Week

Performance Review:

-- Weekly scan statistics
SELECT
  DATE(scanned_at) as date,
  COUNT(*) as total_scans,
  SUM(CASE WHEN status = 'CLEAN' THEN 1 ELSE 0 END) as clean,
  SUM(CASE WHEN status = 'INFECTED' THEN 1 ELSE 0 END) as infected,
  SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) as failed,
  AVG(scan_duration_ms) as avg_duration_ms
FROM document_scans
WHERE scanned_at > NOW() - INTERVAL '7 days'
GROUP BY DATE(scanned_at)
ORDER BY date DESC;

Security Review:

  • Review all quarantined documents
  • Investigate any false positives
  • Check for scanning blind spots
  • Validate virus signature coverage

Troubleshooting

Issue 1: ClamAV Connection Timeout

Symptoms:

ERROR: Failed to connect to ClamAV: ETIMEDOUT

Diagnosis:

# Check Railway service status
railway status

# Check internal DNS resolution
nslookup torvus-clamav.railway.internal

Fix:

  1. Verify ClamAV service is running in Railway
  2. Check CLAMAV_HOST environment variable
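  3. Increase CLAMAV_TIMEOUT to 10000 (10 seconds)
  4. Ensure Railway internal networking is enabled, and confirm the calling app can reach the host: *.railway.internal names resolve only inside Railway's private network, so a caller hosted outside Railway needs a publicly reachable TCP endpoint for the ClamAV service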
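
When triaging connection failures from function logs, the socket error code usually narrows down which of the checks above applies. The sketch below maps standard Node.js error codes to hints; `diagnoseClamavError` and the hint wording are illustrative.

```javascript
// Map common Node.js socket error codes to the checks listed above.
// The function name and hint wording are illustrative.
function diagnoseClamavError(err) {
  switch (err && err.code) {
    case 'ENOTFOUND':
      return 'Hostname did not resolve: check CLAMAV_HOST and DNS reachability';
    case 'ECONNREFUSED':
      return 'Connection refused: check that the ClamAV service is running and CLAMAV_PORT is correct';
    case 'ETIMEDOUT':
      return 'Connection timed out: check network reachability, then consider raising CLAMAV_TIMEOUT';
    default:
      return 'Unrecognized error: inspect the Railway service logs';
  }
}
```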

Issue 2: Scans Failing with "Download Timeout"

Symptoms:

ERROR: Download timeout after 30000ms

Diagnosis:

  • Large files (> 50MB) taking too long to download
  • Storage bucket under heavy load

Fix:

  1. Increase CLAMAV_DOWNLOAD_TIMEOUT to 60000 (60 seconds)
  2. Check Supabase storage bandwidth limits
  3. Monitor file sizes (consider size limits)
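
A quick calculation shows why large files hit this limit. Downloading a 100 MB file (the CLAMAV_MAX_DOWNLOAD_MB limit) within the 30-second timeout needs roughly 26.7 Mbps of sustained throughput; doubling the timeout to 60 seconds halves that to about 13.3 Mbps. The helper name below is hypothetical, and the arithmetic assumes 1 MB equals 8 megabits.

```javascript
// Worked arithmetic: sustained throughput (in megabits per second) needed
// to download `sizeMb` megabytes before `timeoutMs` elapses.
function requiredThroughputMbps(sizeMb, timeoutMs) {
  return (sizeMb * 8) / (timeoutMs / 1000);
}

// Example values from this guide:
//   requiredThroughputMbps(100, 30000) is about 26.7
//   requiredThroughputMbps(100, 60000) is about 13.3
```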

Issue 3: Memory Issues on Railway

Symptoms:

ERROR: ClamAV service restarting frequently
Memory usage: 4.2GB / 4GB

Fix:

  1. Upgrade Railway plan to 8GB memory
  2. Or: Reduce CLAMAV_MAX_DOWNLOAD_MB to 50
  3. Monitor concurrent scan load

Issue 4: False Positives

Symptoms:

  • Clean files marked as INFECTED
  • Virus signatures don't match known threats

Fix:

  1. Review quarantined file in Console
  2. Check ClamAV signature database version (clamscan --version prints the engine version, signature version, and signature date):
    railway run -- clamscan --version
  3. Update ClamAV signatures:
    railway run -- freshclam
  4. Restore false positive from quarantine (future feature)

Success Criteria

✅ Deployment Complete when:

  • ClamAV deployed to Railway and responding to pings
  • Platform app deployed with ClamAV environment variables
  • Console app deployed with quarantine UI
  • Test upload (clean file) scans successfully
  • Test upload (EICAR) detects virus and quarantines
  • Quarantine page accessible to security_admin
  • Metrics dashboard shows accurate data
  • No errors in Vercel function logs
  • Scan success rate > 95% over 24 hours

Maintenance

Weekly Tasks

  • Review quarantined documents (false positives?)
  • Check ClamAV memory usage trends
  • Monitor scan success rate
  • Review compliance reports

Monthly Tasks

  • Verify ClamAV signature database updates
  • Review virus detection trends
  • Performance optimization review
  • Security incident review

Quarterly Tasks

  • Full system audit
  • Load testing (simulate high upload volume)
  • Disaster recovery drill
  • Update documentation

Production Deployment Status: ✅ Ready for deployment

Estimated Deployment Time: 30-45 minutes

Rollback Time: < 5 minutes

Questions? Contact security team or check Phase 1-6 documentation.

