Production Deployment Guide: Malware Scanning System
Target Environment: Vercel (Platform App) + Railway (ClamAV) + Supabase (Database)
Last Updated: October 17, 2025
Prerequisites: Phases 1-6 completed, tested locally
Overview
This guide walks through deploying the complete malware scanning system to production. The architecture consists of:
- Platform App (Vercel): Document upload API, scanning logic
- Console App (Vercel): Staff quarantine management UI
- ClamAV Daemon (Railway): Malware scanning engine
- Supabase: Database and file storage (quarantine bucket)
Pre-Deployment Checklist
1. Database Migration
Status: ✅ Schema already deployed (Phase 2)
Verify that the production schema has these columns and tables:
-- Documents quarantine fields
SELECT column_name FROM information_schema.columns
WHERE table_name = 'documents'
AND column_name IN ('is_quarantined', 'quarantined_at', 'quarantine_reason');
-- Document scans table
SELECT table_name FROM information_schema.tables
WHERE table_name = 'document_scans';
-- Quarantine events table
SELECT table_name FROM information_schema.tables
WHERE table_name = 'quarantine_events';
If missing, apply migrations from /supabase/migrations/20251017_add_quarantine_fields.sql.
2. Supabase Storage Buckets
Quarantine Bucket:
-- Verify quarantine bucket exists
SELECT name FROM storage.buckets WHERE name = 'quarantine';
-- If missing, create it:
INSERT INTO storage.buckets (id, name, public)
VALUES ('quarantine', 'quarantine', false);
Row Level Security (RLS):
-- Verify RLS policies exist
SELECT policyname FROM pg_policies
WHERE tablename = 'objects'
AND schemaname = 'storage'
AND policyname LIKE '%quarantine%';
Apply policies from Phase 2 documentation if missing.
3. Environment Variables
Required for Platform App:
# ClamAV Configuration
CLAMAV_HOST=torvus-clamav.railway.internal # Railway internal URL
CLAMAV_PORT=3310
CLAMAV_TIMEOUT=5000 # 5 seconds
CLAMAV_MAX_DOWNLOAD_MB=100
CLAMAV_DOWNLOAD_TIMEOUT=30000 # 30 seconds
# Scanning Configuration
ENABLE_DOCUMENT_SCANNING=true
SCAN_EMIT_ON_FINALIZE=true
SCAN_MAX_FILE_MB=100
# Supabase (already configured)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
Required for Console App:
# Same Supabase credentials as Platform
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
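A malformed value (for example, an inline comment pasted into the Vercel dashboard along with the number) is easier to catch at startup than during a scan. The following is a minimal sketch of a startup check for the Platform variables listed above. The variable names come from this guide; the function name `loadScannerConfig` and the shape of the returned object are assumptions for illustration, not the platform's actual code.

```javascript
// Hypothetical helper: parse and validate the scanner settings from this guide.
// The function name and return shape are illustrative assumptions.
function loadScannerConfig(env) {
  // Read a required positive integer, failing fast on missing or malformed values.
  const int = (name) => {
    const raw = env[name];
    const value = Number(raw);
    if (raw === undefined || raw === '' || !Number.isInteger(value) || value <= 0) {
      throw new Error(`${name} must be a positive integer, got: ${JSON.stringify(raw)}`);
    }
    return value;
  };

  if (!env.CLAMAV_HOST) {
    throw new Error('CLAMAV_HOST is required');
  }

  return {
    host: env.CLAMAV_HOST,
    port: int('CLAMAV_PORT'),
    timeoutMs: int('CLAMAV_TIMEOUT'),
    maxDownloadMb: int('CLAMAV_MAX_DOWNLOAD_MB'),
    downloadTimeoutMs: int('CLAMAV_DOWNLOAD_TIMEOUT'),
    maxFileMb: int('SCAN_MAX_FILE_MB'),
  };
}
```

Calling `loadScannerConfig(process.env)` once at boot would surface a bad value such as `CLAMAV_TIMEOUT=5000 # 5 seconds` as a clear error instead of a silent `NaN` timeout.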
Step 1: Deploy ClamAV to Railway
1.1 Create Railway Service
- Log in to Railway: https://railway.app
- Select your project (or create new)
- Click "New Service" → "Docker Image"
- Use image:
clamav/clamav:latest
1.2 Configure ClamAV Service
Environment Variables (none required - uses defaults)
Resources:
- Memory: 4GB minimum (ClamAV signature database is ~400MB)
- CPU: 1 vCPU minimum
- Storage: 2GB for virus definitions
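Networking:
- Port: 3310 (exposed automatically)
- Internal URL: torvus-clamav.railway.internal
- Public URL: Not required (internal only)
Note: *.railway.internal hostnames resolve only inside Railway's private network. Before relying on the internal URL, confirm that the Platform App running on Vercel can actually reach it; if it cannot, the service needs a public TCP endpoint with access restricted accordingly.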
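1.3 Health Check
Railway will show "Deploying..." → "Active" when ready.
Verify from your local machine. This requires a publicly reachable TCP endpoint for the service, because the torvus-clamav.railway.internal hostname does not resolve outside Railway:
# Test with nc (netcat)
nc -zv torvus-clamav.up.railway.app 3310
# Or send a PING via Node.js
node -e "
const net = require('net');
const client = net.connect(3310, 'torvus-clamav.up.railway.app', () => {
  client.write('PING\n');
});
client.setTimeout(5000, () => {
  console.error('Timed out waiting for PONG');
  client.destroy();
  process.exitCode = 1;
});
client.on('data', (data) => {
  console.log('Response:', data.toString().trim());
  client.end();
});
client.on('error', (err) => {
  console.error('Connection failed:', err.message);
  process.exitCode = 1;
});
"
Expected response: PONG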
1.4 Update Signature Database
ClamAV automatically updates virus definitions on startup. Initial boot takes ~2-3 minutes.
Monitor logs in Railway:
Downloading daily.cvd [===========================] 100%
Downloading main.cvd [============================] 100%
Database updated successfully
ClamAV daemon ready
Step 2: Configure Vercel Environment Variables
2.1 Platform App Configuration
Navigate to Vercel Dashboard → Project: torvus-platform → Settings → Environment Variables
Add/Update:
CLAMAV_HOST = torvus-clamav.railway.internal
CLAMAV_PORT = 3310
CLAMAV_TIMEOUT = 5000
CLAMAV_MAX_DOWNLOAD_MB = 100
CLAMAV_DOWNLOAD_TIMEOUT = 30000
ENABLE_DOCUMENT_SCANNING = true
SCAN_EMIT_ON_FINALIZE = true
SCAN_MAX_FILE_MB = 100
Environment: Set for Production, Preview, and Development
2.2 Console App Configuration
Navigate to Vercel Dashboard → Project: torvus-console → Settings → Environment Variables
No additional variables needed (uses same Supabase credentials).
Step 3: Deploy Code to Vercel
3.1 Commit and Push
# Ensure all Phase 1-6 code is committed
git status
# Add any uncommitted files
git add .
git commit -m "feat: Complete malware scanning system (Phases 1-6)
- ClamAV client and Docker integration
- Database schema with quarantine tables
- Core scanning logic with retry mechanism
- ClamAV adapter for production workflow
- Console quarantine management UI
- Metrics dashboard and compliance reporting
All phases tested and documented."
# Push to main branch (triggers Vercel deployment)
git push origin main
3.2 Monitor Deployment
Vercel will automatically deploy both apps:
Platform App:
- Build time: ~3-5 minutes
- Check logs for ClamAV initialization
Console App:
- Build time: ~2-4 minutes
- New routes: /admin/quarantine, /admin/quarantine/metrics
3.3 Verify Build Success
Check Vercel deployment logs for:
✓ Building pages
✓ Compiling TypeScript
✓ Linting
✓ Collecting page data
✓ Generating static pages
✓ Finalizing build
Step 4: Verification & Testing
4.1 Health Check Endpoints
Test ClamAV Integration:
# From Platform API
curl https://api.torvussecurity.com/api/health
# Expected response:
{
"status": "healthy",
"scanner": {
"engine": "clamav",
"available": true,
"version": "ClamAV 1.x.x/..."
}
}
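The `version` string in the health response has the slash-separated shape that clamd returns for its `VERSION` command: engine version, signature database version, and signature build date. A small parser makes it possible to alert on stale signatures rather than only on availability. This is a sketch; the function name `parseClamdVersion` and the returned field names are assumptions, and the sample values in the usage note are illustrative, not taken from a real deployment.

```javascript
// Hypothetical helper: split a clamd VERSION reply into its parts.
// Returns null when the reply does not look like a ClamAV version string.
function parseClamdVersion(reply) {
  const [engine, dbVersion, ...dateParts] = reply.trim().split('/');
  const match = /^ClamAV (\S+)$/.exec(engine);
  if (!match) {
    return null;
  }
  return {
    engineVersion: match[1],
    // The database fields are absent if clamd has no signatures loaded.
    signatureVersion: dbVersion ? Number(dbVersion) : null,
    signatureDate: dateParts.length > 0 ? dateParts.join('/') : null,
  };
}
```

For example, `parseClamdVersion('ClamAV 1.4.1/27400/Fri Oct 17 08:30:00 2025')` yields an engine version of `1.4.1` and a signature version of `27400`, which a health check could compare against a freshness threshold.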
Test Console Quarantine Page:
# Navigate in browser (requires security_admin role)
https://console.torvussecurity.com/admin/quarantine
# Should display:
- Empty state: "No quarantined documents"
- Or: List of quarantined documents (if any exist)
4.2 End-to-End Scan Test
Upload Clean File:
# 1. Create test file
echo "This is a clean test document" > test-clean.txt
# 2. Upload to Platform (use your vault ID)
curl -X POST https://api.torvussecurity.com/api/vaults/{vaultId}/documents/initiate \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"filename":"test-clean.txt","mimeType":"text/plain","size":30}'
# 3. Upload file to storage (use signed URL from response)
curl -X PUT "{signedUrl}" \
--upload-file test-clean.txt
# 4. Finalize upload
curl -X POST https://api.torvussecurity.com/api/documents/finalize \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"documentId":"...","storageKey":"..."}'
# Expected: Scan completes with status: CLEAN
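The three curl steps above (initiate, PUT to the signed URL, finalize) can also be scripted for repeatable smoke tests. The sketch below mirrors the endpoints and request bodies from the curl commands. The response field names (`signedUrl`, `documentId`, `storageKey`) and the helper names are assumptions, so adjust them to the actual API responses.

```javascript
const API_BASE = 'https://api.torvussecurity.com';

// Build the "initiate" request; size is derived from the bytes actually sent.
function initiateRequest(vaultId, token, file) {
  return {
    url: `${API_BASE}/api/vaults/${encodeURIComponent(vaultId)}/documents/initiate`,
    options: {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ filename: file.name, mimeType: file.mimeType, size: file.bytes.length }),
    },
  };
}

// Build the "finalize" request from the identifiers returned by initiate.
function finalizeRequest(token, documentId, storageKey) {
  return {
    url: `${API_BASE}/api/documents/finalize`,
    options: {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ documentId, storageKey }),
    },
  };
}

// Run the full flow. fetchImpl defaults to the global fetch (Node 18+).
// Assumed response fields from initiate: signedUrl, documentId, storageKey.
async function uploadAndFinalize(vaultId, token, file, fetchImpl = fetch) {
  const init = initiateRequest(vaultId, token, file);
  const initRes = await fetchImpl(init.url, init.options);
  if (!initRes.ok) throw new Error(`initiate failed: ${initRes.status}`);
  const { signedUrl, documentId, storageKey } = await initRes.json();

  const putRes = await fetchImpl(signedUrl, { method: 'PUT', body: file.bytes });
  if (!putRes.ok) throw new Error(`storage upload failed: ${putRes.status}`);

  const fin = finalizeRequest(token, documentId, storageKey);
  const finRes = await fetchImpl(fin.url, fin.options);
  if (!finRes.ok) throw new Error(`finalize failed: ${finRes.status}`);
  return finRes.json();
}
```

Deriving `size` from the buffer avoids a mismatch with the file on disk: the `echo` command above appends a newline, which is why the clean test file is 30 bytes rather than 29.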
Upload EICAR Test Virus:
# 1. Create EICAR test file (safe test signature)
echo 'X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*' > eicar.txt
# 2. Follow same upload process as above
# Expected:
# - Scan detects virus: "Eicar-Test-Signature"
# - Document automatically quarantined
# - Appears in Console quarantine page
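For scripted tests it can be convenient to generate the EICAR payload in memory instead of writing it to disk. The sketch below assembles the standard 68-byte string from two halves, a common precaution so that the test source is less likely to be flagged by a local scanner. The helper and constant names are illustrative.

```javascript
// The EICAR test string, split so the full signature never appears as one literal.
const EICAR_HEAD = 'X5O!P%@AP[4\\PZX54(P^)7CC)7}$';
const EICAR_TAIL = 'EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*';

// Return the test payload as bytes, ready to PUT to the signed upload URL.
function eicarBuffer() {
  return Buffer.from(EICAR_HEAD + EICAR_TAIL, 'ascii');
}
```

The buffer is exactly 68 bytes. The `echo` variant above writes 69 bytes because of the trailing newline, so the `size` sent in the initiate request should match whichever form is uploaded.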
4.3 Console UI Testing
Quarantine Documents Page:
- Log in to Console as security_admin
- Navigate to Security → Quarantine → Documents
- Verify EICAR test file appears in list
- Check virus badge shows "Eicar-Test-Signature"
- Verify status badge shows "INFECTED"
Metrics Dashboard:
- Navigate to Security → Quarantine → Metrics
- Verify metrics cards show:
- Total Quarantined: 1
- Infected Documents: 1
- Top Threats: "Eicar-Test-Signature"
- Check timeline shows recent scan
Compliance Report:
# Generate 30-day compliance report
curl "https://console.torvussecurity.com/api/admin/quarantine/compliance?startDate=2025-09-17T00:00:00Z&endDate=2025-10-17T23:59:59Z" \
-H "Authorization: Bearer YOUR_CONSOLE_TOKEN"
# Expected: JSON report with summary, threats, vault breakdown
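When the report is fetched from a script, building the query string programmatically avoids quoting and encoding mistakes. The sketch below uses the endpoint path and the `startDate`/`endDate` parameters from the curl command; the function name and the default base URL argument are illustrative. Note that `toISOString()` includes milliseconds, and whether the endpoint accepts that form is an assumption to verify.

```javascript
// Build the compliance report URL for a given reporting window.
function complianceReportUrl(start, end, base = 'https://console.torvussecurity.com') {
  const url = new URL('/api/admin/quarantine/compliance', base);
  // URLSearchParams takes care of percent-encoding the timestamps.
  url.searchParams.set('startDate', start.toISOString());
  url.searchParams.set('endDate', end.toISOString());
  return url.toString();
}
```

A call such as `complianceReportUrl(new Date('2025-09-17T00:00:00Z'), new Date('2025-10-17T23:59:59Z'))` produces the same window as the curl example.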
Step 5: Monitoring & Alerting
5.1 ClamAV Health Monitoring
Railway Monitoring:
- Check ClamAV memory usage (should stay under 2GB)
- Monitor CPU usage (spikes during scans are normal)
- Check uptime (should be 99%+)
Add Alerts (Railway Dashboard):
- Memory > 3.5GB → Alert
- CPU > 90% for 5 minutes → Alert
- Service down > 1 minute → Page
5.2 Scan Metrics
Supabase Queries:
-- Monitor scan success rate (last 24 hours)
SELECT
status,
COUNT(*) as count,
ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) as percentage
FROM document_scans
WHERE scanned_at > NOW() - INTERVAL '24 hours'
GROUP BY status;
-- Expected: CLEAN > 95%, INFECTED < 5%, FAILED < 1%
Set Up Alerts:
- Failed scans > 10/hour → Investigate ClamAV
- Infections > 50/day → Security review
- Quarantined documents > 100 → Review with security team
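The alert rules above reduce to three threshold comparisons, which can run in a scheduled job against counts pulled from `document_scans` and `documents`. The following sketch encodes the thresholds from the list; the input field names (`failedLastHour`, `infectedLastDay`, `quarantinedTotal`) and the function name are hypothetical, and how the counts are collected is left out.

```javascript
// Thresholds taken from the alert list in this section.
const ALERT_THRESHOLDS = { failedPerHour: 10, infectedPerDay: 50, quarantinedTotal: 100 };

// Return the list of alert messages triggered by the given counts.
function evaluateAlerts(counts, thresholds = ALERT_THRESHOLDS) {
  const alerts = [];
  if (counts.failedLastHour > thresholds.failedPerHour) {
    alerts.push('Failed scans above threshold: investigate ClamAV');
  }
  if (counts.infectedLastDay > thresholds.infectedPerDay) {
    alerts.push('Infections above threshold: security review');
  }
  if (counts.quarantinedTotal > thresholds.quarantinedTotal) {
    alerts.push('Quarantine backlog above threshold: review with security team');
  }
  return alerts;
}
```

The comparisons are strict, matching the "greater than" wording of the rules, so exactly 100 quarantined documents does not raise an alert.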
5.3 Performance Metrics
Track Scan Duration:
-- Scan latency over the last 24 hours (average should be < 2 seconds for small files)
SELECT
AVG(scan_duration_ms) as avg_ms,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY scan_duration_ms) as p95_ms,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY scan_duration_ms) as p99_ms
FROM document_scans
WHERE scanned_at > NOW() - INTERVAL '24 hours'
AND status != 'FAILED';
Expected Performance:
- Average: 300-800ms
- P95: < 2000ms
- P99: < 5000ms
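If scan durations are also collected in application logs, the same percentiles can be computed client-side and compared with the targets above. The sketch below follows the linear-interpolation definition that PostgreSQL's `PERCENTILE_CONT` uses; the function names are illustrative, and the target values are the ones listed in this section.

```javascript
// Continuous percentile with linear interpolation between neighbouring values.
function percentileCont(values, fraction) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const position = fraction * (sorted.length - 1); // zero-based fractional index
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower);
}

// Compare a sample of scan durations (ms) with the P95 and P99 targets.
function meetsLatencyTargets(durationsMs) {
  const p95 = percentileCont(durationsMs, 0.95);
  const p99 = percentileCont(durationsMs, 0.99);
  return { p95, p99, ok: p95 !== null && p95 < 2000 && p99 < 5000 };
}
```

For the four samples `[100, 200, 300, 400]`, the median is 250 and the 95th percentile is 385 (up to floating-point rounding), since interpolation happens between the third and fourth values.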
Step 6: Rollback Plan
If issues occur after deployment:
6.1 Emergency Rollback
Disable Scanning (fastest option):
# In Vercel, set environment variable:
ENABLE_DOCUMENT_SCANNING=false
# Redeploy (takes 2-3 minutes); environment variable changes only apply to new deployments
# This disables scanning but allows uploads to continue unscanned
Revert Vercel Deployment:
- Go to Vercel Dashboard → Deployments
- Find last known good deployment
- Click "..." → "Promote to Production"
6.2 Partial Rollback
Keep scanning, disable auto-quarantine:
# Update environment variable:
SCAN_EMIT_ON_FINALIZE=false
# Scans will still run but won't quarantine automatically
# Staff can manually review in Console
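The two rollback switches combine into three operating modes: full scanning with auto-quarantine, scanning with manual review, and no scanning. The sketch below restates that decision table as code, based only on the behavior described in 6.1 and 6.2. The function name `scanPolicy` and the returned field names are hypothetical and do not reflect the platform's actual implementation.

```javascript
// Derive the effective scanning mode from the two flags described above.
function scanPolicy(env) {
  const isOn = (name) => String(env[name]).toLowerCase() === 'true';
  const scanningEnabled = isOn('ENABLE_DOCUMENT_SCANNING');
  return {
    // 6.1: with scanning disabled, uploads continue without being scanned.
    scanOnUpload: scanningEnabled,
    // 6.2: with SCAN_EMIT_ON_FINALIZE off, scans run but nothing is quarantined automatically.
    autoQuarantine: scanningEnabled && isOn('SCAN_EMIT_ON_FINALIZE'),
  };
}
```

Treating anything other than the literal string `true` as off means an unset variable falls back to the most conservative reading of that flag, which is a design choice of this sketch rather than documented behavior.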
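6.3 Database Rollback
If database issues occur:
-- Release documents quarantined after the given timestamp.
-- Caution: this also releases genuinely infected files quarantined in that window;
-- review the affected rows before running it.
UPDATE documents
SET is_quarantined = false
WHERE quarantined_at > 'YYYY-MM-DD HH:MM:SS';
-- Delete scan records (if needed)
DELETE FROM document_scans
WHERE created_at > 'YYYY-MM-DD HH:MM:SS';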
Step 7: Post-Deployment Validation
7.1 First 24 Hours
Monitor:
- ClamAV uptime: Should be 100%
- Scan success rate: Should be > 95%
- No FAILED scans (unless expected)
- Vercel function logs: No errors related to ClamAV
- Console quarantine page: Accessible to security_admin
Test Cases:
- Upload 10 clean documents → All marked CLEAN
- Upload 1 EICAR test → Marked INFECTED, quarantined
- Check quarantine metrics dashboard → Shows correct counts
- Generate compliance report → Returns valid data
7.2 First Week
Performance Review:
-- Weekly scan statistics
SELECT
DATE(scanned_at) as date,
COUNT(*) as total_scans,
SUM(CASE WHEN status = 'CLEAN' THEN 1 ELSE 0 END) as clean,
SUM(CASE WHEN status = 'INFECTED' THEN 1 ELSE 0 END) as infected,
SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) as failed,
AVG(scan_duration_ms) as avg_duration_ms
FROM document_scans
WHERE scanned_at > NOW() - INTERVAL '7 days'
GROUP BY DATE(scanned_at)
ORDER BY date DESC;
Security Review:
- Review all quarantined documents
- Investigate any false positives
- Check for scanning blind spots
- Validate virus signature coverage
Troubleshooting
Issue 1: ClamAV Connection Timeout
Symptoms:
ERROR: Failed to connect to ClamAV: ETIMEDOUT
Diagnosis:
# Check Railway service status
railway status
# Check internal DNS resolution (run from a service inside the Railway project; the name does not resolve elsewhere)
nslookup torvus-clamav.railway.internal
Fix:
- Verify ClamAV service is running in Railway
- Check the CLAMAV_HOST environment variable
- Increase CLAMAV_TIMEOUT to 10000ms
- Ensure Railway internal networking is enabled
Issue 2: Scans Failing with "Download Timeout"
Symptoms:
ERROR: Download timeout after 30000ms
Diagnosis:
- Large files (> 50MB) taking too long to download
- Storage bucket under heavy load
Fix:
- Increase CLAMAV_DOWNLOAD_TIMEOUT to 60000ms
- Check Supabase storage bandwidth limits
- Monitor file sizes (consider size limits)
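The size limit and the download timeout together imply a minimum sustained throughput from storage to the scanner. The arithmetic is simple enough to sketch; the function name is illustrative, and the inputs are the values of `CLAMAV_MAX_DOWNLOAD_MB` and `CLAMAV_DOWNLOAD_TIMEOUT` from this guide.

```javascript
// Minimum sustained throughput (MB/s) needed to download the largest allowed
// file before the download timeout expires.
function minThroughputMBps(maxFileMb, timeoutMs) {
  return maxFileMb / (timeoutMs / 1000);
}
```

With the defaults (100 MB, 30000 ms) the download needs about 3.3 MB/s, roughly 27 Mbit/s. Raising the timeout to 60000 ms halves that to about 1.7 MB/s, which shows why the timeout and the size limit are best tuned together.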
Issue 3: Memory Issues on Railway
Symptoms:
ERROR: ClamAV service restarting frequently
Memory usage: 4.2GB / 4GB
Fix:
- Upgrade Railway plan to 8GB memory
- Or: Reduce CLAMAV_MAX_DOWNLOAD_MB to 50MB
- Monitor concurrent scan load
Issue 4: False Positives
Symptoms:
- Clean files marked as INFECTED
- Virus signatures don't match known threats
Fix:
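- Review quarantined file in Console
- Check the ClamAV signature database location (this prints the directory, not the signature version): railway run -- clamconf | grep DatabaseDirectory
- Update ClamAV signatures: railway run -- freshclam
- Note: railway run executes the command on your local machine with the service's environment variables, so clamconf and freshclam need to be run in a shell inside the ClamAV container to affect the deployed service
- Restore false positive from quarantine (future feature)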
Success Criteria
✅ Deployment Complete when:
- ClamAV deployed to Railway and responding to pings
- Platform app deployed with ClamAV environment variables
- Console app deployed with quarantine UI
- Test upload (clean file) scans successfully
- Test upload (EICAR) detects virus and quarantines
- Quarantine page accessible to security_admin
- Metrics dashboard shows accurate data
- No errors in Vercel function logs
- Scan success rate > 95% over 24 hours
Maintenance
Weekly Tasks
- Review quarantined documents (false positives?)
- Check ClamAV memory usage trends
- Monitor scan success rate
- Review compliance reports
Monthly Tasks
- Verify ClamAV signature database updates
- Review virus detection trends
- Performance optimization review
- Security incident review
Quarterly Tasks
- Full system audit
- Load testing (simulate high upload volume)
- Disaster recovery drill
- Update documentation
Production Deployment Status: ✅ Ready for deployment
Estimated Deployment Time: 30-45 minutes
Rollback Time: < 5 minutes
Questions? Contact security team or check Phase 1-6 documentation.