AWS Database Best Practices: RDS, DynamoDB, and ElastiCache Guide
Master AWS database selection and architecture. Learn RDS engine selection, DynamoDB modeling, ElastiCache patterns, read replicas, multi-AZ deployments, and database migration strategies.
Technical TL;DR
Database = Application Heart. Wrong choice = re-architecture.
Key takeaways:
---
1. Database Selection Framework
1.1 Relational vs. NoSQL Decision
Start with relational (RDS). Only use NoSQL if specific requirements demand it.
| Requirement | RDS (Relational) | DynamoDB (NoSQL) |
|-------------|------------------|------------------|
| **Structured data, fixed schema** | ✅ Best Choice | ❌ Not designed for |
| **Complex joins, transactions** | ✅ ACID compliant | ❌ No joins |
| **Ad-hoc queries, BI tools** | ✅ SQL standard | ❌ No query language |
| **Single-digit ms latency at any scale** | ❌ Performance degrades | ✅ Consistent performance |
| **Simple access patterns (key-value)** | ⚠️ Overkill | ✅ Optimized for |
| **Auto-scaling writes** | ❌ Manual scaling | ✅ Unlimited scale |
| **Eventual consistency acceptable** | ⚠️ Can be configured | ✅ Default behavior |
1.2 RDS Engine Selection
Choose the right RDS engine based on requirements and team skills.
| Engine | Best For | Migration Complexity | Cost |
|--------|----------|---------------------|------|
| **Aurora MySQL** | Cloud-native, high availability | Low (MySQL compatible) | Medium |
| **Aurora PostgreSQL** | Cloud-native, Postgres features | Low (Postgres compatible) | Medium |
| **MySQL** | Existing MySQL workloads | None (drop-in) | Low |
| **PostgreSQL** | Existing Postgres workloads, advanced features | None (drop-in) | Low |
| **SQL Server** | Windows stack, .NET applications | Medium (version compatibility) | High (licensing) |
| **Oracle** | Legacy Oracle applications | High (complex migration) | Very High (licensing) |
| **MariaDB** | MySQL alternative, community-focused | Low (MySQL compatible) | Low |
Recommendation:
```yaml
New Projects → Aurora (MySQL or PostgreSQL)
Existing MySQL/Postgres → Migrate to Aurora when convenient
SQL Server/Oracle → Migrate to Aurora only if business justification
```
---
2. Amazon RDS Best Practices
2.1 Multi-AZ Deployments: HA Requirement
All production RDS instances must be Multi-AZ.
Multi-AZ Architecture:
```yaml
Primary DB: us-east-1a
↓ Synchronous replication
Standby DB: us-east-1b (different AZ)
Automatic Failover:
- DNS updates to point to standby
- Typical failover: 30-60 seconds
- No data loss (synchronous replication)
- Zero manual intervention required
```
Configuration:
```yaml
Production: Multi-AZ = YES (non-negotiable)
Development: Single-AZ acceptable
Staging: Multi-AZ recommended (production-like)
```
2.2 Read Replicas: Scale Reads, Not Writes
Read replicas offload read traffic from primary database.
Use Cases:
```yaml
☐ Analytics/reporting queries (no impact on production)
☐ Geographic distribution (low-latency reads in other regions)
☐ Burst traffic (scale reads temporarily)
☐ Backup verification (queries on replica, not primary)
```
Important Limitations:
```yaml
☐ Asynchronous replication (potential data lag: milliseconds to seconds)
☐ Write traffic still goes to primary
☐ No automatic failover to replica (use Multi-AZ for HA)
☐ Maximum 5 read replicas per database
```
2.3 RDS Instance Sizing
Right-size RDS instances. Databases are often 2-3x oversized.
Sizing Methodology:
```yaml
1. Monitor CloudWatch metrics (CPU, memory, connections)
2. Target: 70-80% CPU utilization at peak
3. Use AWS Compute Optimizer for ML-driven recommendations
4. Test smaller instance classes in non-production
5. Consider burstable instances (db.t3) for dev/test
```
Instance Classes:
```yaml
db.t3 (burstable): Dev/test, low-traffic prod
db.m5 (general purpose): Most production workloads
db.r5 (memory-optimized): In-memory caches, high connection counts
db.x1g (memory-optimized): Large datasets (2TB+ in memory)
```
2.4 RDS Storage: Provisioned IOPS for Databases
Databases require provisioned IOPS (io1) for consistent performance.
Storage Types:
```yaml
gp2/gp3: General purpose SSD
- gp3: Recommended baseline (3,000 IOPS included)
- Cost-effective for most workloads
io1: Provisioned IOPS SSD
- Required for production databases
- Consistent IOPS regardless of volume size
- Cost: $0.125/GB/month + IOPS fee
```
IOPS Provisioning:
```yaml
Database IOPS = (Transactions/sec × IOPS per transaction)
Example: 2,000 TPS × 3 IOPS/transaction = 6,000 IOPS
Add 30% headroom: 7,800 IOPS provisioned
```
2.5 RDS Encryption: Required
All production RDS instances must be encrypted.
Configuration:
```yaml
☐ Enable encryption at creation (cannot add later)
☐ Use customer-managed KMS keys (CMKs)
☐ Separate keys per environment (dev/staging/prod)
☐ Snapshots inherit encryption from instance
☐ Read replicas require encryption if primary is encrypted
```
2.6 RDS Parameter Groups: Tuning
Parameter groups control database configuration.
Required Customizations:
```yaml
MySQL/PostgreSQL:
- max_connections: Based on instance memory
- shared_buffers: 25% of instance memory (Postgres)
- innodb_buffer_pool_size: 50-70% of instance memory (MySQL)
- log_slow_queries: Enable for performance analysis
- backup_retention_period: 7-35 days (compliance)
SQL Server:
- max degree of parallelism: Based on CPU cores
- cost threshold for parallelism: 30-50
- max server memory: Leave 20% for OS
```
2.7 RDS Backups: Point-in-Time Recovery
Backups are enabled by default. Configure retention for compliance.
Backup Strategy:
```yaml
Backup Window: Choose low-traffic period (backups impact performance)
Retention Period:
- Production: 35 days (PCI/HIPAA require 30+)
- Staging: 7 days
- Development: 1-7 days
Snapshot Window: Align with backup window
Automated Backups: Enable (required for PITR)
Manual Snapshots: Long-term retention (quarterly/yearly)
```
Recovery Testing:
```yaml
Quarterly: Restore snapshot to new RDS instance
Verify: Data integrity, application connectivity
Document: Restore procedure for incident response
```
---
3. Amazon Aurora Best Practices
3.1 When to Use Aurora
Aurora is a cloud-native database. 5x better performance than MySQL.
Aurora Advantages:
```yaml
☐ 5x MySQL performance, 3x PostgreSQL performance
☐ Auto-storage (up to 128 TB, no provisioning)
☐ Read scaling: Up to 15 replicas (vs. 5 for RDS)
☐ Faster failover: Typically < 30 seconds
☐ Serverless option: Scale-to-zero (cost savings)
☐ Global Database: Cross-region replication
```
Migration Justification:
```yaml
Migrate to Aurora if:
- Current RDS requires higher performance
- Manual storage management is burdensome
- Read scaling beyond 5 replicas needed
- Serverless patterns desired
```
3.2 Aurora Serverless: Cost Optimization
Aurora Serverless automatically scales compute capacity.
Use Cases:
```yaml
✓ Infrequent applications (dev/test, low-traffic prod)
✓ Unpredictable workloads (sudden spikes)
✓ Development and testing environments
```
Not Recommended For:
```yaml
✗ Production databases requiring consistent performance
✗ High connection counts (limited to 1,000 connections)
✗ Databases requiring long-running queries
```
3.3 Aurora Global Database: Multi-Region
Aurora Global Database replicates to up to 5 secondary regions.
Architecture:
```yaml
Primary Region: us-east-1 (read/write)
↓ Replication (typically < 1 second lag)
Secondary Region: us-west-2, eu-west-1 (read-only)
Failover: Promote secondary to primary (RTO: minutes)
```
Use Cases:
```yaml
☐ Global applications (low-latency reads worldwide)
☐ Disaster recovery (regional failure resilience)
☐ Data residency (local data storage requirements)
```
---
4. DynamoDB Best Practices
4.1 When to Use DynamoDB
DynamoDB is for workloads requiring unlimited scale and single-digit ms latency.
Ideal Use Cases:
```yaml
☐ User profiles, preferences, sessions
☐ Shopping carts, order histories
☐ IoT sensor data (high write throughput)
☐ Real-time bidding, gaming leaderboards
☐ Message queues, job queues
☐ Metadata stores, configuration stores
```
Anti-Patterns:
```yaml
✗ Complex joins (use RDS)
✗ Ad-hoc queries (use Redshift or RDS)
✗ Large documents (> 400 KB per item)
✗ Frequent item updates (use conditional writes)
```
4.2 DynamoDB Data Modeling
DynamoDB is NoSQL. Model data based on access patterns, not normalization.
Design Principles:
```yaml
1. Identify access patterns first (queries, reads, writes)
2. Design primary keys (partition + sort key)
3. Use GSIs (Global Secondary Indexes) for alternative access
4. Denormalize data (avoid joins, duplicate data)
5. Use composite keys for hierarchical data
```
Example: E-commerce Orders
```yaml
Primary Key:
- PK: ORDER#<order-id> (single item lookup)
- SK: METADATA#<order-id>
GSI 1 (User Orders):
- PK: USER#<user-id>
- SK: ORDER#<order-date>
GSI 2 (Product Orders):
- PK: PRODUCT#<product-id>
- SK: ORDER#<order-date>
```
4.3 DynamoDB Capacity Planning
Choose On-Demand or Provisioned capacity based on traffic patterns.
| Mode | Use Case | Cost |
|------|----------|------|
| **On-Demand** | Unknown/unpredictable traffic | 2.5x provisioned cost |
| **Provisioned** | Predictable traffic, cost optimization | Lower cost |
Provisioning Guidelines:
```yaml
RCU (Read Capacity Units):
- 1 RCU = 1 strongly consistent read (4 KB)
- 1 RCU = 2 eventually consistent reads (4 KB)
WCU (Write Capacity Units):
- 1 WCU = 1 write (1 KB)
Example: 100 reads/sec, 50 writes/sec
→ 50 RCU (eventual consistency)
→ 50 WCU (1 KB items)
```
4.4 DynamoDB Accelerator (DAX)
DAX is an in-memory cache for DynamoDB. Up to 10x read performance improvement.
When to Use DAX:
```yaml
✓ Read-heavy workloads (90%+ reads)
✓ Repeatable reads (hot data)
✓ Microsecond latency required
```
Not Recommended For:
```yaml
✗ Write-heavy workloads (no benefit)
✗ Eventually consistent reads acceptable (DAX costs money)
```
---
5. ElastiCache Best Practices
5.1 Redis vs. Memcached
Choose the right caching engine for your use case.
| Feature | Redis | Memcached |
|---------|-------|-----------|
| **Data types** | Strings, lists, sets, sorted sets, hashes | Strings only |
| **Persistence** | AOF/RDB snapshots (data survives restart) | No persistence (cache only) |
| **Replication** | Yes (high availability) | No |
| **Multi-AZ** | Yes (with replication) | No |
| **Partitioning** | Cluster enabled (sharding) | Client-side sharding |
| **Use Case** | Caching + data structures + pub/sub | Simple caching |
Recommendation:
```yaml
General Purpose → Redis (recommended default)
Simple Caching Only → Memcached (lower cost)
Pub/Sub, Locking → Redis (required)
```
5.2 Redis Cluster Mode
Redis Cluster enables horizontal scaling.
Cluster Mode Configuration:
```yaml
Cluster Mode Enabled:
- Up to 500 nodes
- Up to 500 shards (primary + replicas)
- Automatic sharding (hash slots)
- Best for: Large datasets (> 100 GB), high throughput
Cluster Mode Disabled:
- Single shard (primary + replicas)
- Simpler operations
- Best for: Smaller datasets, simpler requirements
```
5.3 ElastiCache Security
ElastiCache in VPC: Private subnets only.
Security Configuration:
```yaml
☐ VPC: Private subnets only (no public access)
☐ Security Groups: Restrict to specific EC2/IPs
☐ Encryption at rest: Enabled (Redis 6.0+)
☐ Encryption in transit: TLS (AUTH token required)
☐ AUTH Token: Require password authentication
☐ Redis version: Latest (security patches)
```
---
6. Database Migration
6.1 Migration Strategies
| Migration Method | Downtime | Complexity | Use Case |
|------------------|----------|------------|----------|
| **Export/Import** | High (hours) | Low | Small databases (< 10 GB), dev/test |
| **AWS DMS** | Minimal | Medium | Production, near-zero downtime |
| **Native Replication** | Minimal | High | Complex migrations, custom logic |
| **Blue/Green Deployment** | Minimal | High | Critical applications, rollback required |
6.2 AWS DMS (Database Migration Service)
DMS is the recommended migration tool for most scenarios.
DMS Architecture:
```yaml
Source Database (on-prem or cloud)
↓ DMS Replication Instance
Target Database (RDS, Aurora, Redshift)
Continuous Replication:
- Change Data Capture (CDC)
- Near-zero downtime (minutes of cutover)
- Schema conversion tool (SCT) for engine changes
```
Migration Process:
```yaml
1. Assess source database (AWS SCT)
2. Create target RDS instance
3. Create DMS replication instance
4. Create migration task (full load + CDC)
5. Validate data (row counts, checksums)
6. Cutover: Stop app, wait for replication lag, switch DNS
7. Decommission source after validation period
```
---
7. Monitoring and Alerting
Required Metrics
```yaml
RDS:
- CPU/Memory utilization
- Database connections (used vs. max)
- Read/write latency
- Freeable memory
- Disk queue depth (IOPS bottleneck)
DynamoDB:
- Consumed read/write capacity (vs. provisioned)
- Throttled requests (provisioning insufficient)
- System errors (5xx)
- Latency (average, p95, p99)
ElastiCache:
- CPU utilization
- Memory fragmentation ratio
- Evictions (cache insufficient size)
- Replication lag (Redis replicas)
```
Alerting
```yaml
☐ RDS CPU > 80% for 5 minutes (scale up)
☐ RDS connections > 80% of max (connection pooling needed)
☐ RDS free storage < 10 GB (storage emergency)
☐ DynamoDB throttles > 100/min (capacity issue)
☐ ElastiCache evictions > 1000/min (cache size too small)
```
---
8. Cost Optimization Checklist
Immediate Actions (Week 1)
Short-Term (Month 1)
Long-Term (Quarter 1)
---
Summary: Database Excellence Pillars
1. **Right engine selection** (RDS for relational, DynamoDB for NoSQL)
2. **Multi-AZ for production** (automatic failover, HA)
3. **Read replicas for scaling** (offload read traffic)
4. **Encryption at rest** (required for compliance)
5. **Backups tested** (quarterly restore verification)
6. **Monitor performance** (CloudWatch metrics + alerting)
7. **Right-size instances** (avoid 2-3x oversizing)
8. **Use Aurora for cloud-native** (performance + auto-storage)
---
Need Help with Database Architecture?
Our database architects can design scalable, high-availability data architectures and migrate your databases with near-zero downtime.
<a href="/contact" className="text-aws-orange font-semibold hover:text-aws-light">
Schedule a Free Database Consultation →
</a>
Internal Linking Strategy:
---
*Last updated: January 5, 2025*
Related Articles
AWS AI/ML Best Practices: SageMaker and Bedrock Architecture Guide
Master AWS AI/ML services. Learn SageMaker training/deployment, Bedrock generative AI, foundation model selection, MLOps patterns, and cost optimization for machine learning workloads.
AWS Compute Best Practices: The Complete Well-Architected Guide
Master EC2, Lambda, and ECS/EKS compute architectures. Learn right-sizing, auto-scaling, serverless patterns, and cost optimization strategies aligned with AWS Well-Architected Framework.
AWS Networking Best Practices: VPC, CloudFront, and Connectivity Guide
Master AWS networking architecture. Learn VPC design, CloudFront CDN, Route 53 DNS, Direct Connect hybrid networking, and network security best practices aligned with AWS Well-Architected Framework.