Design Email Campaign System
A comprehensive system design for an Email Campaign System, covering architecture, scalability, and key features for managing and delivering email marketing campaigns at scale.
Table of Contents
- Problem Statement
- Requirements
- Capacity Estimation
- System APIs
- Database Design
- High-Level Design
- Component Design
- Email Delivery Pipeline
- Template Engine
- A/B Testing
- Analytics & Tracking
- Data Partitioning
- Caching Strategy
- Load Balancing
- Security & Compliance
- Monitoring & Alerting
Problem Statement
Design an Email Campaign System that enables businesses to:
- Create and manage email marketing campaigns
- Design and customize email templates
- Manage contact lists and audience segmentation
- Schedule and automate email campaigns
- Deliver millions of emails reliably
- Track campaign performance (opens, clicks, bounces, unsubscribes)
- Conduct A/B testing for campaign optimization
- Ensure compliance with email regulations (CAN-SPAM, GDPR)
- Handle high-volume email delivery with high deliverability rates
Key Challenges
- High Volume Delivery: Send millions of emails per day reliably
- Deliverability: Maintain high inbox placement rates and avoid spam filters
- Personalization: Support dynamic content and personalization at scale
- Real-time Tracking: Track opens, clicks, and bounces in real-time
- Compliance: Handle unsubscribes, bounces, and regulatory requirements
- Rate Limiting: Respect email service provider limits and avoid throttling
- Template Rendering: Efficiently render personalized templates for millions of recipients
Requirements
Functional Requirements
Campaign Management
- Create, edit, and delete email campaigns
- Schedule campaigns for future delivery
- Pause, resume, and cancel active campaigns
- Duplicate existing campaigns
- Support recurring campaigns (daily, weekly, monthly)
Template Management
- Create and edit email templates
- Support HTML and plain text formats
- Template versioning and history
- Responsive design templates
- Dynamic content and personalization variables
- Template preview and testing
Contact Management
- Import contacts (CSV, API)
- Create and manage contact lists/segments
- Contact deduplication
- Contact tags and custom fields
- Unsubscribe management
- Bounce and complaint handling
Audience Segmentation
- Segment contacts based on attributes (demographics, behavior, tags)
- Dynamic segmentation (real-time updates)
- Segment preview and size estimation
- Exclude segments from campaigns
Email Delivery
- Send emails via multiple email service providers (SMTP, AWS SES, SendGrid, Mailgun)
- Support transactional and marketing emails
- Handle email attachments
- Support email priority levels
- Retry failed deliveries with exponential backoff
Tracking & Analytics
- Track email opens (pixel tracking)
- Track link clicks (link wrapping)
- Track bounces (hard and soft bounces)
- Track unsubscribes
- Track spam complaints
- Real-time campaign statistics dashboard
- Historical analytics and reporting
A/B Testing
- Test subject lines, content, send times
- Automatic winner selection
- Statistical significance calculation
- Support multiple variants (A/B/C testing)
Automation
- Triggered emails (welcome, abandoned cart, etc.)
- Drip campaigns (multi-email sequences)
- Event-based automation
- Conditional logic in automation flows
Compliance
- Unsubscribe links in every email
- Preference center for subscribers
- Handle bounce management (suppress hard bounces)
- GDPR compliance (right to be forgotten, consent management)
- CAN-SPAM compliance
Non-Functional Requirements
- Scalability: Support 100 million contacts, 10 million emails per day
- Availability: 99.9% uptime
- Performance:
  - Campaign creation: < 2 seconds
  - Template rendering: < 100ms per email
  - Email delivery: < 5 seconds from send request to queued
  - Analytics queries: < 1 second
- Reliability: No email loss, handle failures gracefully
- Deliverability: Maintain > 95% inbox delivery rate
- Consistency: Strong consistency for critical operations (unsubscribes, bounces)
- Throughput: Process 100,000 emails per minute
Capacity Estimation
Traffic Estimates
- Total contacts: 100 million
- Active campaigns: 10,000 concurrent campaigns
- Emails per day: 10 million
- Peak sending rate: 100,000 emails per minute (1,667 emails/second)
- Read-to-write ratio: 10:1 (analytics reads vs. email sends)
- Template renders per day: 10 million
- Tracking events per day:
  - Opens: 2 million (20% open rate)
  - Clicks: 500,000 (5% click rate)
  - Bounces: 100,000 (1% bounce rate)
  - Unsubscribes: 10,000 (0.1% unsubscribe rate)
  - Total tracking events: ~2.6 million per day
Storage Estimates
Campaign Data:
- Campaign metadata: ~5 KB per campaign
- 10,000 active campaigns × 5 KB = 50 MB
- Historical campaigns (1 year): 100,000 campaigns × 5 KB = 500 MB
Email Templates:
- Average template size: ~50 KB (HTML + assets)
- 10,000 templates × 50 KB = 500 MB
- Template versions: 50,000 versions × 50 KB = 2.5 GB
- Total: ~3 GB
Contact Data:
- Contact record: ~1 KB (name, email, custom fields, tags)
- 100M contacts × 1 KB = 100 GB
- With 3x replication: 300 GB
Email Delivery Logs:
- Delivery log entry: ~500 bytes (campaign_id, contact_id, status, timestamp)
- 10M emails/day × 500 bytes = 5 GB/day
- 1 year retention: 5 GB × 365 = 1.8 TB
- With 3x replication: 5.4 TB
Tracking Events:
- Tracking event: ~200 bytes (event_type, email_id, timestamp, metadata)
- 2.6M events/day × 200 bytes = 520 MB/day
- 1 year retention: 520 MB × 365 = 190 GB
- With 3x replication: 570 GB
Total Storage: ~6.3 TB (with replication and 1 year retention)
Bandwidth Estimates
Outbound (Email Delivery):
- Average email size: 50 KB (including HTML and images)
- 10M emails/day × 50 KB = 500 GB/day
- Peak: 100K emails/min × 50 KB = 5 GB/min = 300 GB/hour
Inbound (Tracking Events):
- Tracking pixel requests: 2M opens/day × 1 KB = 2 GB/day
- Click tracking redirects: 500K clicks/day × 2 KB = 1 GB/day
- Total inbound: ~3 GB/day
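The estimates above can be recomputed in a few lines, which makes it easy to re-run them when an assumption changes. The sketch below uses decimal units (1 GB = 10^9 bytes) and the figures from this section; the variable names are illustrative only.

```python
# Back-of-the-envelope capacity check using the figures in this section.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units (an assumption)

emails_per_day = 10_000_000
# opens + clicks + bounces + unsubscribes
events_per_day = 2_000_000 + 500_000 + 100_000 + 10_000
replication = 3
retention_days = 365

contacts_bytes = 100_000_000 * 1 * KB * replication
delivery_logs_bytes = emails_per_day * 500 * retention_days * replication
tracking_bytes = events_per_day * 200 * retention_days * replication
templates_bytes = 3 * GB
total_bytes = contacts_bytes + delivery_logs_bytes + tracking_bytes + templates_bytes

peak_per_second = 100_000 / 60
outbound_per_day = emails_per_day * 50 * KB

print(f"Tracking events/day: {events_per_day:,}")
print(f"Total storage: {total_bytes / TB:.1f} TB")
print(f"Peak send rate: {peak_per_second:.0f} emails/second")
print(f"Outbound bandwidth: {outbound_per_day / GB:.0f} GB/day")
```

Delivery logs dominate the total, so retention period and log entry size are the two inputs worth revisiting first.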
System APIs
Campaign Management APIs
# Create a new campaign
POST /api/v1/campaigns
Request Body:
{
"name": "Summer Sale 2024",
"subject": "50% Off Summer Collection",
"template_id": "template_123",
"sender_email": "marketing@example.com",
"sender_name": "Example Store",
"contact_list_ids": ["list_1", "list_2"],
"schedule_time": "2024-07-01T10:00:00Z",
"ab_test_config": {
"enabled": true,
"variants": [
{"subject": "50% Off Summer", "weight": 50},
{"subject": "Summer Sale - Save 50%", "weight": 50}
],
"test_duration_hours": 2
}
}
Response: { "campaign_id": "campaign_456", "status": "scheduled" }
# Get campaign details
GET /api/v1/campaigns/{campaign_id}
Response: {
"campaign_id": "campaign_456",
"name": "Summer Sale 2024",
"status": "scheduled",
"total_recipients": 100000,
"sent_count": 0,
"delivered_count": 0,
"opened_count": 0,
"clicked_count": 0,
"bounced_count": 0,
"unsubscribed_count": 0
}
# Update campaign
PUT /api/v1/campaigns/{campaign_id}
Request Body: { "name": "Updated Campaign Name" }
# Delete campaign
DELETE /api/v1/campaigns/{campaign_id}
# Send campaign immediately
POST /api/v1/campaigns/{campaign_id}/send
# Pause campaign
POST /api/v1/campaigns/{campaign_id}/pause
# Resume campaign
POST /api/v1/campaigns/{campaign_id}/resume
Template Management APIs
# Create template
POST /api/v1/templates
Request Body: {
"name": "Welcome Email",
"subject": "Welcome to {{company_name}}!",
"html_content": "<html>...</html>",
"text_content": "Plain text version",
"variables": ["company_name", "user_name"]
}
Response: { "template_id": "template_123" }
# Get template
GET /api/v1/templates/{template_id}
# Update template
PUT /api/v1/templates/{template_id}
# Delete template
DELETE /api/v1/templates/{template_id}
# Preview template with sample data
POST /api/v1/templates/{template_id}/preview
Request Body: {
"variables": {
"company_name": "Example Corp",
"user_name": "John Doe"
}
}
Response: { "html": "...", "text": "..." }
Contact Management APIs
# Create contact
POST /api/v1/contacts
Request Body: {
"email": "user@example.com",
"first_name": "John",
"last_name": "Doe",
"tags": ["vip", "newsletter"],
"custom_fields": {"age": 30, "city": "NYC"}
}
Response: { "contact_id": "contact_789" }
# Bulk import contacts
POST /api/v1/contacts/import
Request Body: {
"list_id": "list_1",
"contacts": [
{"email": "user1@example.com", "first_name": "John"},
{"email": "user2@example.com", "first_name": "Jane"}
]
}
Response: { "import_id": "import_123", "status": "processing" }
# Get contact
GET /api/v1/contacts/{contact_id}
# Update contact
PUT /api/v1/contacts/{contact_id}
# Delete contact
DELETE /api/v1/contacts/{contact_id}
# Unsubscribe contact
POST /api/v1/contacts/{contact_id}/unsubscribe
Request Body: { "reason": "no_longer_interested" }
List/Segment Management APIs
# Create list
POST /api/v1/lists
Request Body: {
"name": "VIP Customers",
"description": "Customers with purchase > $1000"
}
Response: { "list_id": "list_1" }
# Create dynamic segment
POST /api/v1/segments
Request Body: {
"name": "Active Users",
"conditions": {
"and": [
{"field": "last_purchase_date", "operator": ">", "value": "2024-01-01"},
{"field": "tags", "operator": "contains", "value": "active"}
]
}
}
Response: { "segment_id": "segment_1" }
# Get list/segment contacts
GET /api/v1/lists/{list_id}/contacts?page=1&page_size=100
# Add contacts to list
POST /api/v1/lists/{list_id}/contacts
Request Body: { "contact_ids": ["contact_1", "contact_2"] }
# Remove contacts from list
DELETE /api/v1/lists/{list_id}/contacts
Request Body: { "contact_ids": ["contact_1"] }
Analytics APIs
# Get campaign analytics
GET /api/v1/campaigns/{campaign_id}/analytics
Response: {
"campaign_id": "campaign_456",
"sent": 100000,
"delivered": 98000,
"opened": 19600,
"clicked": 4900,
"bounced": 2000,
"unsubscribed": 100,
"open_rate": 20.0,
"click_rate": 5.0,
"bounce_rate": 2.0,
"unsubscribe_rate": 0.1,
"deliverability_rate": 98.0
}
# Get real-time campaign stats
GET /api/v1/campaigns/{campaign_id}/stats/realtime
# Get contact engagement history
GET /api/v1/contacts/{contact_id}/engagement
Response: {
"contact_id": "contact_789",
"total_emails_sent": 50,
"total_opens": 30,
"total_clicks": 10,
"last_opened": "2024-06-15T10:30:00Z",
"last_clicked": "2024-06-15T11:00:00Z"
}
Webhook APIs
# Register webhook
POST /api/v1/webhooks
Request Body: {
"url": "https://example.com/webhook",
"events": ["email.opened", "email.clicked", "email.bounced"]
}
Response: { "webhook_id": "webhook_123" }
# Unsubscribe endpoint (public)
POST /api/v1/public/unsubscribe
Request Body: {
"email": "user@example.com",
"campaign_id": "campaign_456",
"token": "unsubscribe_token"
}
Database Design
Core Tables
campaigns
CREATE TABLE campaigns (
campaign_id VARCHAR(50) PRIMARY KEY,
name VARCHAR(255) NOT NULL,
subject VARCHAR(500),
template_id VARCHAR(50) NOT NULL,
sender_email VARCHAR(255) NOT NULL,
sender_name VARCHAR(255),
status ENUM('draft', 'scheduled', 'sending', 'sent', 'paused', 'cancelled') NOT NULL,
schedule_time TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
created_by VARCHAR(50),
total_recipients INT DEFAULT 0,
sent_count INT DEFAULT 0,
delivered_count INT DEFAULT 0,
opened_count INT DEFAULT 0,
clicked_count INT DEFAULT 0,
bounced_count INT DEFAULT 0,
unsubscribed_count INT DEFAULT 0,
ab_test_config JSON,
INDEX idx_status (status),
INDEX idx_schedule_time (schedule_time),
INDEX idx_created_at (created_at)
);
templates
CREATE TABLE templates (
template_id VARCHAR(50) PRIMARY KEY,
name VARCHAR(255) NOT NULL,
subject VARCHAR(500),
html_content TEXT NOT NULL,
text_content TEXT,
variables JSON, -- List of available variables
version INT DEFAULT 1,
parent_template_id VARCHAR(50), -- For versioning
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
created_by VARCHAR(50),
is_active BOOLEAN DEFAULT TRUE,
INDEX idx_created_by (created_by),
INDEX idx_is_active (is_active)
);
contacts
CREATE TABLE contacts (
contact_id VARCHAR(50) PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
first_name VARCHAR(100),
last_name VARCHAR(100),
phone VARCHAR(20),
status ENUM('active', 'unsubscribed', 'bounced', 'complained') DEFAULT 'active',
unsubscribe_reason VARCHAR(255),
subscribed_at TIMESTAMP,
unsubscribed_at TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
custom_fields JSON, -- Flexible custom fields
INDEX idx_email (email),
INDEX idx_status (status),
INDEX idx_created_at (created_at)
);
contact_lists
CREATE TABLE contact_lists (
list_id VARCHAR(50) PRIMARY KEY,
name VARCHAR(255) NOT NULL,
description TEXT,
type ENUM('static', 'dynamic') DEFAULT 'static',
segment_conditions JSON, -- For dynamic lists
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
created_by VARCHAR(50),
contact_count INT DEFAULT 0,
INDEX idx_type (type)
);
list_contacts (Many-to-Many)
CREATE TABLE list_contacts (
list_id VARCHAR(50) NOT NULL,
contact_id VARCHAR(50) NOT NULL,
added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (list_id, contact_id),
INDEX idx_contact_id (contact_id),
FOREIGN KEY (list_id) REFERENCES contact_lists(list_id) ON DELETE CASCADE,
FOREIGN KEY (contact_id) REFERENCES contacts(contact_id) ON DELETE CASCADE
);
campaign_contacts (Email Delivery Queue)
CREATE TABLE campaign_contacts (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
campaign_id VARCHAR(50) NOT NULL,
contact_id VARCHAR(50) NOT NULL,
email VARCHAR(255) NOT NULL,
status ENUM('pending', 'queued', 'sent', 'delivered', 'failed', 'bounced') DEFAULT 'pending',
scheduled_at TIMESTAMP,
sent_at TIMESTAMP,
delivered_at TIMESTAMP,
error_message TEXT,
retry_count INT DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_campaign_status (campaign_id, status),
INDEX idx_scheduled_at (scheduled_at),
INDEX idx_contact_id (contact_id),
FOREIGN KEY (campaign_id) REFERENCES campaigns(campaign_id) ON DELETE CASCADE,
FOREIGN KEY (contact_id) REFERENCES contacts(contact_id) ON DELETE CASCADE
);
email_events (Tracking Events)
CREATE TABLE email_events (
event_id BIGINT AUTO_INCREMENT,
campaign_id VARCHAR(50) NOT NULL,
contact_id VARCHAR(50) NOT NULL,
email VARCHAR(255) NOT NULL,
event_type ENUM('sent', 'delivered', 'opened', 'clicked', 'bounced', 'unsubscribed', 'complained') NOT NULL,
event_data JSON, -- Additional event metadata (link URL for clicks, bounce reason, etc.)
ip_address VARCHAR(45),
user_agent TEXT,
occurred_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
-- MySQL requires every unique key on a partitioned table to include the partitioning column
PRIMARY KEY (event_id, occurred_at),
INDEX idx_campaign_event (campaign_id, event_type),
INDEX idx_contact_id (contact_id),
INDEX idx_occurred_at (occurred_at),
INDEX idx_email (email)
) PARTITION BY RANGE (UNIX_TIMESTAMP(occurred_at)) (
PARTITION p2024_01 VALUES LESS THAN (UNIX_TIMESTAMP('2024-02-01')),
PARTITION p2024_02 VALUES LESS THAN (UNIX_TIMESTAMP('2024-03-01')),
-- ... monthly partitions
);
tags
CREATE TABLE tags (
tag_id VARCHAR(50) PRIMARY KEY,
name VARCHAR(100) NOT NULL UNIQUE,
color VARCHAR(7), -- Hex color code
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
contact_tags (Many-to-Many)
CREATE TABLE contact_tags (
contact_id VARCHAR(50) NOT NULL,
tag_id VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (contact_id, tag_id),
INDEX idx_tag_id (tag_id),
FOREIGN KEY (contact_id) REFERENCES contacts(contact_id) ON DELETE CASCADE,
FOREIGN KEY (tag_id) REFERENCES tags(tag_id) ON DELETE CASCADE
);
email_providers
CREATE TABLE email_providers (
provider_id VARCHAR(50) PRIMARY KEY,
name VARCHAR(100) NOT NULL, -- 'aws_ses', 'sendgrid', 'mailgun', 'smtp'
type ENUM('smtp', 'api') NOT NULL,
config JSON NOT NULL, -- Provider-specific configuration
rate_limit_per_second INT,
daily_limit INT,
is_active BOOLEAN DEFAULT TRUE,
priority INT DEFAULT 0, -- For load balancing
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
ab_test_results
CREATE TABLE ab_test_results (
test_id VARCHAR(50) PRIMARY KEY,
campaign_id VARCHAR(50) NOT NULL,
variant_name VARCHAR(50) NOT NULL,
recipients INT DEFAULT 0,
sent INT DEFAULT 0,
delivered INT DEFAULT 0,
opened INT DEFAULT 0,
clicked INT DEFAULT 0,
open_rate DECIMAL(5,2),
click_rate DECIMAL(5,2),
is_winner BOOLEAN DEFAULT FALSE,
statistical_significance DECIMAL(5,2),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_campaign_id (campaign_id),
FOREIGN KEY (campaign_id) REFERENCES campaigns(campaign_id) ON DELETE CASCADE
);
High-Level Design
┌─────────────────────────────────────────────────────────────────┐
│                       Client Applications                       │
│            (Web Dashboard, Mobile App, API Clients)             │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                   API Gateway / Load Balancer                   │
│                 (Rate Limiting, Authentication)                 │
└────────────────────────────┬────────────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
        ▼                    ▼                    ▼
 ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
 │   Campaign   │     │   Template   │     │   Contact    │
 │   Service    │     │   Service    │     │   Service    │
 └──────┬───────┘     └──────┬───────┘     └──────┬───────┘
        │                    │                    │
        └────────────────────┼────────────────────┘
                             │
                             ▼
                ┌─────────────────────────┐
                │  Campaign Orchestrator  │
                │   (Scheduler, Queue)    │
                └────────────┬────────────┘
                             │
                             ▼
                ┌─────────────────────────┐
                │     Email Renderer      │
                │    (Template Engine)    │
                └────────────┬────────────┘
                             │
                             ▼
                ┌─────────────────────────┐
                │     Email Delivery      │
                │         Service         │
                └────────────┬────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
          ▼                  ▼                  ▼
   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
   │   AWS SES    │   │   SendGrid   │   │   Mailgun    │
   │   Provider   │   │   Provider   │   │   Provider   │
   └──────┬───────┘   └──────┬───────┘   └──────┬───────┘
          │                  │                  │
          └──────────────────┼──────────────────┘
                             │
                             ▼
                      ┌──────────────┐
                      │   Internet   │
                      │    (SMTP)    │
                      └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │  Recipients  │
                      │   (Opens,    │
                      │   Clicks)    │
                      └──────┬───────┘
                             │
                             ▼
                ┌─────────────────────────┐
                │    Tracking Service     │
                │   (Pixel, Link Wrap)    │
                └────────────┬────────────┘
                             │
                             ▼
                ┌─────────────────────────┐
                │    Analytics Service    │
                │  (Aggregation, Stats)   │
                └────────────┬────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
          ▼                  ▼                  ▼
   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
   │    MySQL     │   │    Redis     │   │  ClickHouse  │
   │  (Primary)   │   │   (Cache)    │   │ (Analytics)  │
   └──────────────┘   └──────────────┘   └──────────────┘
Key Components
- API Gateway: Handles authentication, rate limiting, request routing
- Campaign Service: Manages campaign CRUD operations
- Template Service: Handles template management and versioning
- Contact Service: Manages contacts, lists, and segmentation
- Campaign Orchestrator: Schedules campaigns and manages delivery queue
- Email Renderer: Renders templates with personalization
- Email Delivery Service: Handles actual email sending via providers
- Tracking Service: Tracks opens, clicks, bounces
- Analytics Service: Aggregates and provides analytics data
Component Design
1. Campaign Service
Responsibilities:
- Create, update, delete campaigns
- Validate campaign data
- Manage campaign status lifecycle
- Coordinate with other services
Key Operations:
class CampaignService:
    def create_campaign(self, campaign_data):
        # Validate template exists
        # Validate contact lists exist
        # Create campaign record
        # Schedule if schedule_time is set
        pass

    def send_campaign(self, campaign_id):
        # Get campaign details
        # Get recipient list
        # Create campaign_contacts entries
        # Trigger email delivery
        pass

    def get_campaign_analytics(self, campaign_id):
        # Aggregate stats from email_events
        # Return comprehensive analytics
        pass
2. Template Service
Responsibilities:
- Template CRUD operations
- Template versioning
- Variable extraction and validation
- Template preview
Key Operations:
class TemplateService:
    def create_template(self, template_data):
        # Validate HTML content
        # Extract variables from template
        # Store template
        pass

    def render_template(self, template_id, variables):
        # Load template
        # Replace variables
        # Return rendered HTML/text
        pass

    def validate_variables(self, template_id, variables):
        # Check all required variables are provided
        pass
3. Contact Service
Responsibilities:
- Contact CRUD operations
- List/segment management
- Contact deduplication
- Unsubscribe handling
Key Operations:
class ContactService:
    def create_contact(self, contact_data):
        # Validate email format
        # Check for duplicates
        # Create contact record
        pass

    def import_contacts(self, list_id, contacts):
        # Batch insert contacts
        # Handle duplicates
        # Add to list
        pass

    def get_segment_contacts(self, segment_id):
        # Evaluate segment conditions
        # Return matching contacts
        pass

    def unsubscribe_contact(self, contact_id, reason):
        # Update contact status
        # Log unsubscribe event
        # Remove from active lists
        pass
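To make "evaluate segment conditions" concrete, here is a minimal sketch that checks one contact against the condition tree format used by the segment API. Only "and", ">" and "contains" appear in this document; the "or" combinator, the other operators, and the function name matches are assumptions added for illustration. ISO-8601 date strings are compared as plain strings, which orders them correctly.

```python
from typing import Any, Callable, Dict

# Operators beyond ">" and "contains" are assumptions, not part of the documented API.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    ">": lambda actual, expected: actual > expected,
    "<": lambda actual, expected: actual < expected,
    "==": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
}


def matches(conditions: Dict[str, Any], contact: Dict[str, Any]) -> bool:
    """Return True if the contact satisfies a (possibly nested) condition tree."""
    if "and" in conditions:
        return all(matches(c, contact) for c in conditions["and"])
    if "or" in conditions:  # hypothetical combinator
        return any(matches(c, contact) for c in conditions["or"])
    actual = contact.get(conditions["field"])
    if actual is None:
        return False  # a missing field never matches
    return OPERATORS[conditions["operator"]](actual, conditions["value"])


segment = {
    "and": [
        {"field": "last_purchase_date", "operator": ">", "value": "2024-01-01"},
        {"field": "tags", "operator": "contains", "value": "active"},
    ]
}
active = {"last_purchase_date": "2024-03-15", "tags": ["active", "vip"]}
lapsed = {"last_purchase_date": "2023-11-02", "tags": ["active"]}
print(matches(segment, active), matches(segment, lapsed))
```

At 100 million contacts a real implementation would more likely translate the tree into a database query than evaluate it per contact in application code; the per-contact form is still useful for segment previews and tests.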
4. Campaign Orchestrator
Responsibilities:
- Schedule campaign execution
- Manage delivery queue
- Coordinate email rendering and delivery
- Handle campaign status updates
Architecture:
Scheduler → Queue Manager → Email Renderer → Delivery Service
Key Operations:
class CampaignOrchestrator:
    def schedule_campaign(self, campaign_id, schedule_time):
        # Add to scheduler
        pass

    def execute_campaign(self, campaign_id):
        # Get campaign and recipients
        # For each recipient:
        #   - Render email
        #   - Queue for delivery
        pass

    def process_delivery_queue(self):
        # Poll queue for pending emails
        # Batch process
        # Send via delivery service
        pass
5. Email Renderer
Responsibilities:
- Render templates with personalization
- Handle variable substitution
- Generate both HTML and text versions
- Cache rendered templates
Key Operations:
class EmailRenderer:
    def render_email(self, template_id, contact_id, campaign_id):
        # Load template
        # Load contact data
        # Load campaign data
        # Replace variables
        # Add tracking pixels and links
        # Return rendered email
        pass

    def add_tracking(self, html_content, campaign_id, contact_id):
        # Add open tracking pixel
        # Wrap links with click tracking
        pass
6. Email Delivery Service
Responsibilities:
- Send emails via multiple providers
- Handle provider rate limits
- Retry failed deliveries
- Load balance across providers
Key Operations:
class EmailDeliveryService:
    def send_email(self, email_data):
        # Select provider (round-robin or based on load)
        # Check provider rate limits
        # Send via provider
        # Handle errors and retries
        pass

    def handle_bounce(self, email, bounce_type, reason):
        # Update contact status
        # Log bounce event
        # Suppress hard bounces
        pass
7. Tracking Service
Responsibilities:
- Track email opens (pixel tracking)
- Track link clicks (link wrapping)
- Handle unsubscribe requests
- Log tracking events
Key Operations:
class TrackingService:
    def track_open(self, campaign_id, contact_id, request_data):
        # Verify tracking token
        # Log open event
        # Update campaign stats
        pass

    def track_click(self, campaign_id, contact_id, link_url, request_data):
        # Verify tracking token
        # Log click event
        # Redirect to original URL
        pass

    def handle_unsubscribe(self, token):
        # Verify token
        # Unsubscribe contact
        # Return confirmation page
        pass
8. Analytics Service
Responsibilities:
- Aggregate tracking events
- Calculate campaign metrics
- Generate reports
- Real-time dashboard updates
Key Operations:
class AnalyticsService:
    def get_campaign_stats(self, campaign_id):
        # Aggregate events from email_events table
        # Calculate rates
        # Return comprehensive stats
        pass

    def get_realtime_stats(self, campaign_id):
        # Query recent events
        # Return current stats
        pass

    def generate_report(self, campaign_id, date_range):
        # Aggregate events by time period
        # Generate detailed report
        pass
Email Delivery Pipeline
Delivery Flow
1. Campaign Scheduled/Triggered
↓
2. Campaign Orchestrator fetches recipients
↓
3. For each recipient:
a. Email Renderer renders personalized email
b. Add tracking pixels and wrap links
c. Queue email for delivery
↓
4. Delivery Service processes queue
↓
5. Select email provider (load balancing)
↓
6. Check provider rate limits
↓
7. Send email via provider API/SMTP
↓
8. Update delivery status
↓
9. Handle bounces/failures (retry logic)
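Step 6 of the flow above (checking provider rate limits) is commonly done with a token bucket per provider. The following is a minimal in-process sketch under that assumption; the TokenBucket class and its clock parameter are illustrative and not part of any provider SDK. The clock is injectable so the behaviour can be shown deterministically.

```python
import time


class TokenBucket:
    """Allow up to `rate` sends per second, with bursts of at most `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated_at = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated_at) * self.rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Demonstration with a fake clock so the result does not depend on timing.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=2, clock=lambda: fake_now[0])
results = [bucket.try_acquire() for _ in range(3)]  # third call exceeds the burst
fake_now[0] += 1.0                                  # one second later the bucket has refilled
results.append(bucket.try_acquire())
print(results)
```

In the real pipeline the rate would come from rate_limit_per_second in the email_providers table, and because many delivery workers share one provider quota, the bucket state would probably need to live in a shared store such as Redis rather than in process memory.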
Queue Management
Use a message queue (RabbitMQ or Kafka) for:
- Email delivery queue
- Tracking event processing
- Analytics aggregation jobs
Queue Structure:
email_delivery_queue:
- campaign_id
- contact_id
- rendered_email (HTML + text)
- priority
- retry_count
tracking_events_queue:
- event_type
- campaign_id
- contact_id
- event_data
- timestamp
Retry Logic
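import time

# email_provider, the error classes, and the mark_as_* helpers are defined elsewhere.
def send_with_retry(email_data, max_retries=3):
    """Attempt a send up to max_retries times, backing off exponentially between attempts."""
    for attempt in range(max_retries):
        try:
            result = email_provider.send(email_data)
            if result.success:
                return result
        except (RateLimitError, TransientError):
            pass  # Retryable: fall through to the backoff below
        except PermanentError:
            # Mark as bounced, don't retry
            mark_as_bounced(email_data)
            return None
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, ...
    # All retries failed
    mark_as_failed(email_data)
    return None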
Provider Load Balancing
Strategies:
- Round-Robin: Distribute evenly across providers
- Weighted: Based on provider capacity and reliability
- Least Loaded: Send to provider with lowest current load
- Provider Priority: Use primary provider, fallback to others
class ProviderLoadBalancer:
    def select_provider(self):
        # Get active providers
        # Check current load
        # Select based on strategy
        # Return provider
        pass

    def update_provider_stats(self, provider_id, success, latency):
        # Update provider metrics
        # Adjust weights if needed
        pass
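As one possible reading of the weighted strategy, the sketch below picks an active provider with probability proportional to a weight. The Provider dataclass, the weights, and the function signature are hypothetical; in practice the weights might be derived from the rate limits and health data described above.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    name: str
    weight: float          # relative share of traffic (hypothetical values below)
    is_active: bool = True


def select_provider(providers: List[Provider],
                    rng: Optional[random.Random] = None) -> Provider:
    """Pick an active provider with probability proportional to its weight."""
    rng = rng or random.Random()
    active = [p for p in providers if p.is_active and p.weight > 0]
    if not active:
        raise RuntimeError("no active email provider available")
    return rng.choices(active, weights=[p.weight for p in active], k=1)[0]


providers = [
    Provider("aws_ses", weight=6),
    Provider("sendgrid", weight=3),
    Provider("mailgun", weight=1, is_active=False),  # e.g. failed its health check
]
rng = random.Random(42)  # seeded only to make the demonstration repeatable
picks = [select_provider(providers, rng).name for _ in range(1000)]
print({name: picks.count(name) for name in ("aws_ses", "sendgrid", "mailgun")})
```

Inactive providers receive no traffic, and the remaining traffic splits roughly 2:1 between the two active ones. Lowering a weight when update_provider_stats sees rising errors would give a simple form of adaptive balancing.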
Template Engine
Variable System
Supported Variables:
- Contact variables: {{contact.first_name}}, {{contact.email}}
- Campaign variables: {{campaign.name}}, {{campaign.subject}}
- Custom fields: {{contact.custom_fields.city}}
- Conditional logic: {% if contact.tags contains 'vip' %}...{% endif %}
Template Rendering Process
def render_template(template, context):
    # 1. Parse template
    # 2. Extract variables
    # 3. Validate all variables are provided
    # 4. Replace variables
    # 5. Process conditionals
    # 6. Return rendered HTML/text
    pass
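Steps 1 to 4 of the rendering process can be sketched with a regular expression and a dotted-path lookup. This is a deliberately small illustration, not a full engine: it handles only {{...}} placeholders, it does not process {% if %} conditionals, and the function names lookup and render are made up for this example. Substituted values are HTML-escaped, in line with the goal of preventing XSS in templates.

```python
import html
import re
from typing import Any, Dict

# Matches {{contact.first_name}} as well as {{ contact.first_name }}.
VARIABLE = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(context: Dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as 'contact.custom_fields.city'."""
    value: Any = context
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            raise KeyError(f"missing template variable: {path}")
        value = value[part]
    return value


def render(template: str, context: Dict[str, Any]) -> str:
    """Replace every {{path}} placeholder, HTML-escaping the substituted value."""
    return VARIABLE.sub(lambda m: html.escape(str(lookup(context, m.group(1)))), template)


context = {
    "contact": {"first_name": "John", "custom_fields": {"city": "NYC"}},
    "campaign": {"name": "Summer Sale 2024"},
}
print(render("<p>Hi {{contact.first_name}} from {{ contact.custom_fields.city }}!</p>", context))
```

Raising on a missing variable corresponds to the validation step; a production system might instead fall back to a default value so that one incomplete contact record does not block a send.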
Caching Strategy
- Cache rendered templates per contact segment
- Invalidate cache when template or contact data changes
- Use Redis for template caching
def get_cached_template(template_id, contact_segment):
    cache_key = f"template:{template_id}:segment:{contact_segment}"
    cached = redis.get(cache_key)
    if cached:
        return cached
    # Render and cache
    rendered = render_template(template_id, contact_segment)
    redis.setex(cache_key, 3600, rendered)  # 1 hour TTL
    return rendered
A/B Testing
A/B Test Flow
1. Create campaign with A/B test config
↓
2. Split a test sample of recipients into variants (50/50, 33/33/33, etc.) and hold back the rest
↓
3. Send variants to different groups
↓
4. Track performance metrics for each variant
↓
5. After test duration, determine winner
↓
6. Send the winning variant to the remaining recipients
Winner Selection
Metrics for Winner:
- Open rate (primary)
- Click rate (secondary)
- Conversion rate (if tracked)
Statistical Significance:
- Use a chi-square test or a two-proportion z-test (open and click rates are proportions, so a t-test on means is a poorer fit)
- Minimum sample size per variant
- Confidence level: 95%
def determine_winner(variants, metric='open_rate'):
    # Calculate metric for each variant
    # Perform statistical test
    # Return winner if significant, else None
    pass
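For two variants, the significance check can be sketched with a two-proportion z-test, which for a 2x2 comparison gives the same answer as the chi-square test mentioned above. The function names, the tuple layout, and the min_sample default of 1000 are assumptions for illustration; only the 95% confidence level (alpha = 0.05) comes from this section.

```python
import math
from typing import Optional, Tuple


def two_proportion_p_value(success_a: int, total_a: int,
                           success_b: int, total_b: int) -> float:
    """Two-sided p-value for H0: both variants share the same underlying rate."""
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (success_a / total_a - success_b / total_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def pick_winner(a: Tuple[int, int], b: Tuple[int, int],
                alpha: float = 0.05, min_sample: int = 1000) -> Optional[str]:
    """a and b are (opens, delivered). Returns 'A', 'B', or None if inconclusive."""
    if min(a[1], b[1]) < min_sample:
        return None
    if two_proportion_p_value(a[0], a[1], b[0], b[1]) >= alpha:
        return None
    return "A" if a[0] / a[1] > b[0] / b[1] else "B"


print(pick_winner((1100, 5000), (1000, 5000)))  # 22% vs 20% open rate
print(pick_winner((1010, 5000), (1000, 5000)))  # 20.2% vs 20% open rate
```

With 5,000 recipients per variant, a two-point difference in open rate is significant at the 95% level while a 0.2-point difference is not, so the second call returns None. With more than two variants (A/B/C), pairwise tests would need a multiple-comparison correction, which this sketch does not attempt.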
Implementation
class ABTestManager:
    def split_recipients(self, recipients, variants):
        # Randomly assign recipients to variants
        # Ensure equal distribution
        pass

    def evaluate_test(self, campaign_id):
        # Get results for all variants
        # Calculate statistical significance
        # Determine winner
        # Send remaining emails with winner
        pass
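One way to implement split_recipients is to hash each recipient into a variant rather than draw a random number. This swaps pure random assignment for a deterministic hash: the split is still effectively uniform, but a retried or resumed campaign assigns every contact to the same variant again. The function name assign_variant is illustrative; the weights mirror the weight field in ab_test_config.

```python
import hashlib
from typing import Dict, List


def assign_variant(campaign_id: str, contact_id: str, weights: Dict[str, int]) -> str:
    """Map a contact to a variant; the same inputs always give the same variant."""
    digest = hashlib.sha256(f"{campaign_id}:{contact_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % sum(weights.values())
    for variant, weight in weights.items():
        if bucket < weight:
            return variant
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below the total weight")


weights = {"A": 50, "B": 50}
contacts = [f"contact_{i}" for i in range(10_000)]
groups: Dict[str, List[str]] = {"A": [], "B": []}
for contact_id in contacts:
    groups[assign_variant("campaign_456", contact_id, weights)].append(contact_id)
print({variant: len(members) for variant, members in groups.items()})
```

Including campaign_id in the hash input means a contact who lands in variant A for one campaign is not systematically placed in A for every other campaign.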
Analytics & Tracking
Open Tracking
Implementation:
- Add 1x1 transparent pixel to email HTML
- Pixel URL: https://track.example.com/open?c={campaign_id}&t={token}
- When pixel loads, log open event
- Return 1x1 transparent GIF
<img src="https://track.example.com/open?c=campaign_123&t=token_abc"
width="1" height="1" style="display:none" />
Click Tracking
Implementation:
- Wrap all links in email with tracking URL
- Tracking URL: https://track.example.com/click?c={campaign_id}&t={token}&url={encoded_url}
- When clicked, log click event
- Redirect to original URL (HTTP 302)
import urllib.parse

def wrap_link(original_url, campaign_id, contact_id):
    token = generate_tracking_token(campaign_id, contact_id)
    # safe="" also escapes "/", so the whole URL travels as a single query value
    encoded_url = urllib.parse.quote(original_url, safe="")
    return f"https://track.example.com/click?c={campaign_id}&t={token}&url={encoded_url}"
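The wrap_link function relies on a generate_tracking_token helper that this document does not define. One plausible sketch, assuming an HMAC-signed token, is shown below: the token carries the contact ID plus a signature, so the tracking service can both identify the contact and reject forged tokens. The token layout, SECRET_KEY, and verify_tracking_token are assumptions, and token expiration is left out for brevity.

```python
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: loaded from a secrets manager


def generate_tracking_token(campaign_id: str, contact_id: str) -> str:
    """Return 'contact_id.signature' so the tracker can identify and verify the contact."""
    message = f"{campaign_id}:{contact_id}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{contact_id}.{signature}"


def verify_tracking_token(campaign_id: str, token: str) -> Optional[str]:
    """Return the contact_id if the token is authentic for this campaign, else None."""
    contact_id, _, signature = token.rpartition(".")
    expected = generate_tracking_token(campaign_id, contact_id).rpartition(".")[2]
    # compare_digest runs in constant time, which avoids leaking the signature via timing.
    if contact_id and hmac.compare_digest(signature.encode(), expected.encode()):
        return contact_id
    return None


token = generate_tracking_token("campaign_456", "contact_789")
print(verify_tracking_token("campaign_456", token))  # authentic token
print(verify_tracking_token("campaign_999", token))  # token replayed on another campaign
```

Because the signature covers the campaign ID, a token lifted from one campaign's email cannot be replayed against another campaign. The same signing idea could also authenticate the redirect target, which would help keep the click endpoint from acting as an open redirect.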
Event Processing
Real-time Processing:
- Use message queue for tracking events
- Process events asynchronously
- Update campaign stats in real-time
Batch Processing:
- Aggregate events hourly/daily
- Update analytics tables
- Generate reports
Analytics Aggregation
Metrics Calculated:
- Open rate: (opens / delivered) × 100
- Click rate: (clicks / delivered) × 100
- Click-to-open rate: (clicks / opens) × 100
- Bounce rate: (bounces / sent) × 100
- Unsubscribe rate: (unsubscribes / delivered) × 100
- Deliverability rate: (delivered / sent) × 100
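The formulas above translate directly into code. The sketch below applies them to the counts from the analytics API example; the function names and the dictionary keys are illustrative, and rounding to one decimal place is an assumption.

```python
from typing import Dict


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when the denominator is zero."""
    return round(100 * numerator / denominator, 1) if denominator else 0.0


def campaign_rates(counts: Dict[str, int]) -> Dict[str, float]:
    """Apply the metric definitions above to raw event counts."""
    return {
        "open_rate": pct(counts["opened"], counts["delivered"]),
        "click_rate": pct(counts["clicked"], counts["delivered"]),
        "click_to_open_rate": pct(counts["clicked"], counts["opened"]),
        "bounce_rate": pct(counts["bounced"], counts["sent"]),
        "unsubscribe_rate": pct(counts["unsubscribed"], counts["delivered"]),
        "deliverability_rate": pct(counts["delivered"], counts["sent"]),
    }


counts = {"sent": 100000, "delivered": 98000, "opened": 19600,
          "clicked": 4900, "bounced": 2000, "unsubscribed": 100}
print(campaign_rates(counts))
```

Guarding against a zero denominator matters in practice: a campaign that has just started sending has no delivered emails yet, and the dashboard should show 0.0 rather than fail.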
Storage:
- Real-time stats: Redis (for fast access)
- Historical data: ClickHouse or TimescaleDB (for time-series analytics)
- Aggregated reports: MySQL (for dashboard queries)
Data Partitioning
Partitioning Strategy
email_events Table:
- Partition by occurred_at (monthly partitions)
- Enables efficient time-range queries
- Easy to archive old data
PARTITION BY RANGE (UNIX_TIMESTAMP(occurred_at)) (
PARTITION p2024_01 VALUES LESS THAN (UNIX_TIMESTAMP('2024-02-01')),
PARTITION p2024_02 VALUES LESS THAN (UNIX_TIMESTAMP('2024-03-01')),
...
);
campaign_contacts Table:
- Shard by campaign_id (hash-based sharding)
- Distribute load across multiple databases
- Each shard handles subset of campaigns
contacts Table:
- Shard by email hash (consistent hashing)
- Distribute contacts across shards
- Enables horizontal scaling
Sharding Strategy
Consistent Hashing:
- Use email hash to determine shard
- Add/remove shards without full rebalancing
- Replicate each shard for availability
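A minimal consistent-hash ring illustrates why shards can be added without full rebalancing. This is a sketch under assumptions: the HashRing class, the shard names, and the choice of 100 virtual nodes per shard are all illustrative, and replication is not modelled.

```python
import bisect
import hashlib
from typing import Dict, List


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, shards: List[str], vnodes: int = 100):
        self.vnodes = vnodes
        self._ring: Dict[int, str] = {}   # ring position -> shard name
        self._points: List[int] = []      # sorted ring positions
        for shard in shards:
            self.add_shard(shard)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_shard(self, shard: str) -> None:
        for i in range(self.vnodes):
            point = self._hash(f"{shard}#{i}")
            self._ring[point] = shard
            bisect.insort(self._points, point)

    def shard_for(self, email: str) -> str:
        # First ring position clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect(self._points, self._hash(email.lower())) % len(self._points)
        return self._ring[self._points[index]]


ring = HashRing(["shard_0", "shard_1", "shard_2"])
emails = [f"user{i}@example.com" for i in range(10_000)]
before = {e: ring.shard_for(e) for e in emails}
ring.add_shard("shard_3")
moved = sum(1 for e in emails if ring.shard_for(e) != before[e])
print(f"{moved / len(emails):.0%} of keys moved after adding a fourth shard")
```

Going from three to four shards should move roughly a quarter of the keys, and every moved key lands on the new shard. With plain modulo hashing (hash % shard_count), most keys would change shards instead.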
Caching Strategy
Cache Layers
1. Template Cache (Redis)
- Cache rendered templates
- Key: template:{template_id}:segment:{segment_hash}
- TTL: 1 hour
- Invalidate on template update
2. Contact Data Cache (Redis)
- Cache frequently accessed contact data
- Key: contact:{contact_id}
- TTL: 15 minutes
- Invalidate on contact update
3. Campaign Stats Cache (Redis)
- Cache campaign analytics
- Key: campaign:stats:{campaign_id}
- TTL: 5 minutes
- Update on new events
4. List/Segment Cache (Redis)
- Cache segment contact lists
- Key: segment:contacts:{segment_id}
- TTL: 30 minutes
- Invalidate on list update
5. Provider Stats Cache (Redis)
- Cache provider load and health
- Key: provider:stats:{provider_id}
- TTL: 1 minute
- Real-time updates
Cache Invalidation
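def invalidate_template_cache(template_id):
    # Find all cache keys for this template. SCAN iterates incrementally,
    # whereas KEYS blocks the Redis server while it walks the whole keyspace.
    pattern = f"template:{template_id}:*"
    keys = list(redis.scan_iter(match=pattern))
    if keys:  # DEL with no keys is an error
        redis.delete(*keys)

def invalidate_contact_cache(contact_id):
    redis.delete(f"contact:{contact_id}")
    # Also invalidate related segment caches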
Load Balancing
API Load Balancing
Strategy:
- Round-robin or least connections
- Health checks for service instances
- Auto-scaling based on load
Components:
- Application Load Balancer (AWS ALB) or NGINX
- Multiple service instances
- Health check endpoints
Database Load Balancing
Read Replicas:
- Primary database for writes
- Multiple read replicas for reads
- Automatic failover
Connection Pooling:
- Use a connection pooler or proxy (ProxySQL for MySQL; PgBouncer applies only if PostgreSQL is used)
- Distribute reads across replicas
- Handle connection limits
Security & Compliance
Security Measures
Authentication & Authorization
- API keys for programmatic access
- OAuth 2.0 for user authentication
- Role-based access control (RBAC)
Data Encryption
- Encrypt sensitive data at rest
- TLS/SSL for data in transit
- Encrypt email content if needed
Rate Limiting
- Per-user rate limits
- Per-IP rate limits
- Prevent abuse and spam
Input Validation
- Sanitize all user inputs
- Validate email addresses
- Prevent XSS in templates
Tracking Token Security
- Cryptographically secure tokens
- Token expiration
- Prevent token guessing
Compliance
CAN-SPAM Act (US)
- Include unsubscribe link in every email
- Honor unsubscribe requests within 10 business days
- Include physical mailing address
- Accurate "From" information
GDPR (EU)
- Obtain clear, documented consent before sending
- Right to access personal data
- Right to be forgotten (delete contact)
- Data portability
- Privacy by design
Unsubscribe Management
- One-click unsubscribe
- Preference center (unsubscribe from specific lists)
- Global unsubscribe option
- Suppress unsubscribed contacts automatically
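One-click unsubscribe is commonly exposed to mail clients through the `List-Unsubscribe` and `List-Unsubscribe-Post` headers described in RFC 8058. The sketch below sets them with Python's standard `email` package; the `build_message` helper and all addresses and URLs are hypothetical examples.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str,
                  unsubscribe_url: str, unsubscribe_mailto: str) -> EmailMessage:
    """Build a plain-text message carrying one-click unsubscribe headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # HTTPS endpoint first, mailto fallback second.
    msg["List-Unsubscribe"] = f"<{unsubscribe_url}>, <mailto:{unsubscribe_mailto}>"
    # Signals that a POST to the HTTPS URL unsubscribes without further interaction.
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    msg.set_content(body)
    return msg
```

The unsubscribe URL would typically embed a signed per-recipient token so the endpoint can identify the contact without requiring a login.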
Bounce Management
- Hard bounces: Remove immediately
- Soft bounces: Retry with backoff
- Suppress hard-bounced emails
- Monitor bounce rates
Spam Complaint Handling
- Track spam complaints
- Remove complainers immediately
- Monitor complaint rates
- Investigate high complaint rates
Implementation
class ComplianceManager:
    def check_compliance(self, campaign, contact):
        # Check if contact is unsubscribed
        # Check if contact has opted in
        # Check bounce status
        # Return compliance status
        pass

    def handle_unsubscribe(self, contact_id, list_id=None):
        # Update contact status
        # Remove from lists
        # Log unsubscribe event
        pass

    def handle_bounce(self, email, bounce_type):
        if bounce_type == 'hard':
            # Mark as hard bounce
            # Suppress contact
            pass
        else:
            # Retry with backoff
            pass
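To make the skeleton above more concrete, here is a hedged in-memory sketch of the same responsibilities. The class name, the soft-bounce threshold, and the base retry delay are assumptions, and opt-in checks and event logging are left out for brevity; a real implementation would persist this state in the database.

```python
from typing import Dict, Optional, Set, Tuple

MAX_SOFT_BOUNCES = 3      # assumption: suppress after more than this many soft bounces
BASE_RETRY_SECONDS = 60   # assumption: delay before the first retry


class InMemoryComplianceManager:
    def __init__(self) -> None:
        self.global_unsubscribes: Set[str] = set()
        self.list_unsubscribes: Set[Tuple[str, str]] = set()  # (email, list_id)
        self.suppressed: Set[str] = set()
        self.soft_bounce_counts: Dict[str, int] = {}

    def can_send(self, email: str, list_id: str) -> bool:
        """False if the address is suppressed or unsubscribed (globally or from this list)."""
        if email in self.suppressed or email in self.global_unsubscribes:
            return False
        return (email, list_id) not in self.list_unsubscribes

    def handle_unsubscribe(self, email: str, list_id: Optional[str] = None) -> None:
        # No list_id means a global unsubscribe.
        if list_id is None:
            self.global_unsubscribes.add(email)
        else:
            self.list_unsubscribes.add((email, list_id))

    def handle_bounce(self, email: str, bounce_type: str) -> Optional[int]:
        """Return the retry delay in seconds for a soft bounce, or None if suppressed."""
        if bounce_type == "hard":
            self.suppressed.add(email)
            return None
        count = self.soft_bounce_counts.get(email, 0) + 1
        self.soft_bounce_counts[email] = count
        if count > MAX_SOFT_BOUNCES:
            self.suppressed.add(email)  # repeated soft bounces are treated as permanent
            return None
        return BASE_RETRY_SECONDS * 2 ** (count - 1)  # exponential backoff
```

With these assumed constants, successive soft bounces yield retry delays of 60, 120, and 240 seconds before the address is suppressed.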
Monitoring & Alerting
Key Metrics
System Metrics
- API response times
- Error rates
- Queue depth
- Database connection pool usage
Campaign Metrics
- Emails sent per minute
- Delivery success rate
- Bounce rate
- Open rate
- Click rate
Provider Metrics
- Provider health
- Provider latency
- Provider error rates
- Rate limit usage
Business Metrics
- Active campaigns
- Total contacts
- Daily email volume
- Revenue (if applicable)
Monitoring Tools
- Application Monitoring: Datadog, New Relic, Prometheus
- Log Aggregation: ELK Stack, Splunk, CloudWatch Logs
- Error Tracking: Sentry, Rollbar
- Uptime Monitoring: Pingdom, UptimeRobot
Alerting Rules
- High Bounce Rate: > 5% bounce rate
- High Error Rate: > 1% delivery failures
- Queue Backup: Queue depth > 100,000
- Provider Down: Provider health check fails
- Low Deliverability: < 90% deliverability rate
- High Complaint Rate: > 0.1% spam complaints
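The alerting rules above can be expressed as a simple threshold check. This sketch assumes every rate is computed against the number of emails sent in the evaluation window and that deliverability means delivered divided by sent; the metric field names are hypothetical.

```python
from typing import Any, Dict, List


def evaluate_alerts(metrics: Dict[str, Any]) -> List[str]:
    """Return the names of the alert rules that fire for one window of metrics."""
    sent = metrics.get("sent", 0)

    def rate(field: str) -> float:
        # With nothing sent yet there is no rate to evaluate; treat it as 0.
        return metrics.get(field, 0) / sent if sent else 0.0

    checks = [
        ("High Bounce Rate", rate("bounced") > 0.05),
        ("High Error Rate", rate("failed") > 0.01),
        ("Queue Backup", metrics.get("queue_depth", 0) > 100_000),
        ("Provider Down", not metrics.get("provider_healthy", True)),
        ("Low Deliverability", sent > 0 and rate("delivered") < 0.90),
        ("High Complaint Rate", rate("complaints") > 0.001),
    ]
    return [name for name, fired in checks if fired]
```

In a real deployment these thresholds would usually live in the monitoring tool's rule configuration rather than in application code.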
Dashboards
Operational Dashboard:
- Real-time email sending rate
- Queue depth
- Provider status
- Error rates
Campaign Dashboard:
- Active campaigns
- Campaign performance
- Top performing campaigns
- Campaign trends
Analytics Dashboard:
- Overall email metrics
- Engagement trends
- Contact growth
- List performance
Scalability Considerations
Horizontal Scaling
- Stateless Services: All services are stateless, which enables horizontal scaling
- Database Sharding: Shard contacts and campaigns across databases
- Queue Workers: Scale email delivery workers based on queue depth
- Caching: Use distributed cache (Redis Cluster)
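Scaling workers on queue depth can be reduced to a small calculation: how many workers are needed to drain the current backlog within a target time. The function name, parameters, and bounds below are assumptions for illustration.

```python
import math


def desired_workers(queue_depth: int, per_worker_rate: float,
                    target_drain_seconds: float,
                    min_workers: int = 1, max_workers: int = 100) -> int:
    """Workers needed to drain the queue within the target time, clamped to bounds.

    per_worker_rate is the number of emails one worker sends per second.
    """
    if per_worker_rate <= 0 or target_drain_seconds <= 0:
        raise ValueError("per_worker_rate and target_drain_seconds must be positive")
    needed = math.ceil(queue_depth / (per_worker_rate * target_drain_seconds))
    return max(min_workers, min(max_workers, needed))
```

For example, with 100,000 queued emails, 50 emails per second per worker, and a 60-second drain target, one worker clears 3,000 emails in the window, so 34 workers are needed. The upper bound matters because provider rate limits, not worker count, are often the real ceiling.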
Vertical Scaling
- Database Optimization: Index optimization, query optimization
- Connection Pooling: Efficient connection management
- Batch Processing: Process emails in batches for efficiency
Performance Optimization
- Async Processing: Use message queues for async operations
- Batch Operations: Batch database writes
- Connection Pooling: Reuse database connections
- CDN for Assets: Serve email images via CDN
- Template Caching: Cache rendered templates
Trade-offs & Design Decisions
1. Eventual vs Strong Consistency
Decision: Eventual consistency for tracking events, strong consistency for critical operations
Rationale:
- Tracking events (opens, clicks) can tolerate slight delays
- Unsubscribes and bounces need strong consistency to prevent sending to invalid contacts
- Improves performance and scalability
2. Real-time vs Batch Analytics
Decision: Hybrid approach, with real-time stats for dashboards and batch processing for historical reports
Rationale:
- Real-time stats for active campaign monitoring
- Batch aggregation for historical analysis and reporting
- Balances performance and accuracy
3. Multiple Email Providers
Decision: Support multiple providers with load balancing
Rationale:
- Redundancy and failover
- Distribute load to avoid rate limits
- Provider-specific optimizations
- Cost optimization
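A sketch of how provider selection might combine failover with rate-limit awareness: skip unhealthy or exhausted providers, then prefer the one with the most remaining headroom. The `Provider` fields and the headroom heuristic are assumptions, and the provider names in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    name: str
    healthy: bool
    rate_limit_per_min: int
    sent_this_minute: int = 0

    @property
    def remaining(self) -> int:
        """Sends still allowed in the current minute."""
        return max(0, self.rate_limit_per_min - self.sent_this_minute)


def choose_provider(providers: List[Provider]) -> Optional[Provider]:
    """Pick the healthy provider with the largest fraction of its rate limit unused."""
    candidates = [p for p in providers if p.healthy and p.remaining > 0]
    if not candidates:
        return None  # caller should leave the message queued and retry later
    return max(candidates, key=lambda p: p.remaining / p.rate_limit_per_min)
```

Given `provider_a` at 90% of its limit and `provider_b` at 10%, the function returns `provider_b`; if `provider_b` is marked unhealthy, traffic fails over to `provider_a`. Cost or provider-specific routing could be folded into the same key function.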
4. Template Rendering Strategy
Decision: Render on-demand with caching
Rationale:
- Personalization requires per-contact rendering
- Cache common segments to improve performance
- Balance between personalization and performance
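The split between cached and per-contact work can be illustrated as follows: the template is loaded and parsed once, then rendered separately for each contact. This sketch uses the standard library's `string.Template` as a stand-in for the real template engine, and `TEMPLATES` is a hypothetical in-memory store.

```python
from functools import lru_cache
from string import Template
from typing import Dict

# Hypothetical template store; in the real system this would be the template database.
TEMPLATES: Dict[str, str] = {
    "welcome": "Hi $first_name, welcome to $company!",
}


@lru_cache(maxsize=256)
def load_template(template_id: str) -> Template:
    """Load and parse a template once; later calls reuse the cached object."""
    return Template(TEMPLATES[template_id])


def render(template_id: str, contact: Dict[str, str]) -> str:
    """Per-contact rendering; unknown placeholders are left in place, not raised."""
    return load_template(template_id).safe_substitute(contact)
```

Only the shared, contact-independent part is cached here; the personalized output is cheap to produce and is not worth caching per recipient.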
5. Tracking Implementation
Decision: Server-side tracking with pixel and link wrapping
Rationale:
- More reliable than client-side tracking
- Works across all email clients
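- Link wrapping still records clicks when a client blocks images; pixel-based opens are only recorded when images load, so open counts are approximate
- Privacy considerations: transparent pixels are invisible to recipients, so tracking should be disclosed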
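Link wrapping and pixel insertion can be sketched as two small HTML transformations. The tracking domain, URL paths, and function names below are hypothetical, and the regular expression is a simplification that only handles double-quoted absolute `href` values; an HTML parser would be more robust.

```python
import re
from urllib.parse import quote

TRACKING_BASE = "https://track.example.com"  # hypothetical tracking domain

# Matches double-quoted absolute http(s) links only (a deliberate simplification).
HREF_RE = re.compile(r'href="(https?://[^"]+)"')


def wrap_links(html: str, token: str) -> str:
    """Rewrite each link to pass through the click-tracking redirect endpoint."""
    def replace(match: "re.Match[str]") -> str:
        target = quote(match.group(1), safe="")  # percent-encode the destination URL
        return f'href="{TRACKING_BASE}/c/{token}?url={target}"'
    return HREF_RE.sub(replace, html)


def add_open_pixel(html: str, token: str) -> str:
    """Append a 1x1 tracking image, placed before </body> when that tag is present."""
    pixel = f'<img src="{TRACKING_BASE}/o/{token}.gif" width="1" height="1" alt="">'
    if "</body>" in html:
        return html.replace("</body>", pixel + "</body>", 1)
    return html + pixel
```

The redirect endpoint should validate the destination (for example, by signing it together with the token) so the tracking domain cannot be abused as an open redirect.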
Future Enhancements
- Advanced Segmentation: Machine learning-based segmentation
- Predictive Analytics: Predict best send times, content preferences
- Dynamic Content: Real-time content personalization
- Multi-channel: Extend to SMS, push notifications
- Advanced A/B Testing: Multi-variate testing, automatic optimization
- Email Automation: Complex workflow automation
- AI-powered Content: Generate email content using AI
- Advanced Analytics: Cohort analysis, customer journey tracking
Conclusion
This Email Campaign System design provides a reliable solution for managing and delivering email marketing campaigns at scale. Key highlights:
- Scalable Architecture: Handles millions of emails per day
- High Deliverability: Multiple providers, bounce management, compliance
- Real-time Tracking: Track opens, clicks, bounces in real-time
- Comprehensive Analytics: Detailed campaign performance metrics
- Compliance Ready: Built-in support for CAN-SPAM, GDPR
- Flexible & Extensible: Supports templates, segmentation, A/B testing
The system can be built incrementally, starting with core features (campaign creation, email delivery) and adding advanced features (A/B testing, automation) as needed.