AWS Database Best Practices: RDS, DynamoDB, and ElastiCache Guide

Master AWS database selection and architecture. Learn RDS engine selection, DynamoDB modeling, ElastiCache patterns, read replicas, multi-AZ deployments, and database migration strategies.

12 min read
By CloudBridgeHub

Technical TL;DR


The database is the heart of your application. The wrong choice means a costly re-architecture later.


Key takeaways:

  • **RDS for relational** (MySQL, PostgreSQL, SQL Server, Oracle, MariaDB, Aurora)
  • **DynamoDB for NoSQL** (single-digit ms latency, unlimited scale)
  • **ElastiCache for caching** (Redis, Memcached)
  • **Multi-AZ for HA** (automatic failover, synchronous replication)
  • **Read Replicas for scaling reads** (asynchronous, eventual consistency)
  • **Aurora for cloud-native** (5x MySQL performance, auto-storage)

    ---


    1. Database Selection Framework


    1.1 Relational vs. NoSQL Decision


    Start with relational (RDS). Only use NoSQL if specific requirements demand it.


    | Requirement | RDS (Relational) | DynamoDB (NoSQL) |

    |-------------|------------------|------------------|

    | **Structured data, fixed schema** | ✅ Best Choice | ❌ Not designed for |

    | **Complex joins, transactions** | ✅ ACID compliant | ❌ No joins |

    | **Ad-hoc queries, BI tools** | ✅ SQL standard | ❌ Limited querying (PartiQL, no joins) |

    | **Single-digit ms latency at any scale** | ❌ Performance degrades | ✅ Consistent performance |

    | **Simple access patterns (key-value)** | ⚠️ Overkill | ✅ Optimized for |

    | **Auto-scaling writes** | ❌ Manual scaling | ✅ Unlimited scale |

    | **Eventual consistency acceptable** | ⚠️ Can be configured | ✅ Default behavior |


    1.2 RDS Engine Selection


    Choose the right RDS engine based on requirements and team skills.


    | Engine | Best For | Migration Complexity | Cost |

    |--------|----------|---------------------|------|

    | **Aurora MySQL** | Cloud-native, high availability | Low (MySQL compatible) | Medium |

    | **Aurora PostgreSQL** | Cloud-native, Postgres features | Low (Postgres compatible) | Medium |

    | **MySQL** | Existing MySQL workloads | None (drop-in) | Low |

    | **PostgreSQL** | Existing Postgres workloads, advanced features | None (drop-in) | Low |

    | **SQL Server** | Windows stack, .NET applications | Medium (version compatibility) | High (licensing) |

    | **Oracle** | Legacy Oracle applications | High (complex migration) | Very High (licensing) |

    | **MariaDB** | MySQL alternative, community-focused | Low (MySQL compatible) | Low |


    Recommendation:

    ```yaml

    New Projects → Aurora (MySQL or PostgreSQL)

    Existing MySQL/Postgres → Migrate to Aurora when convenient

    SQL Server/Oracle → Migrate to Aurora only if business justification

    ```


    ---


    2. Amazon RDS Best Practices


    2.1 Multi-AZ Deployments: HA Requirement


    All production RDS instances must be Multi-AZ.


    Multi-AZ Architecture:

    ```yaml

    Primary DB: us-east-1a

    ↓ Synchronous replication

    Standby DB: us-east-1b (different AZ)


    Automatic Failover:

    - DNS updates to point to standby

    - Typical failover: 60-120 seconds

    - No data loss (synchronous replication)

    - Zero manual intervention required

    ```
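    During failover, RDS repoints the endpoint's DNS record, so in-flight connections drop and new connection attempts fail briefly. Application code should retry with backoff rather than crash. A minimal sketch, where `connect` is a hypothetical stand-in for your driver's connect call:

```python
import time

def connect_with_retry(connect, retries=5, base_delay=1.0):
    """Retry a DB connection with exponential backoff.

    During a Multi-AZ failover the RDS endpoint's DNS is repointed to the
    standby, so connections fail for roughly a minute or two; retrying with
    backoff usually rides out the failover. `connect` is any zero-argument
    callable that returns a connection or raises on failure.
    """
    for attempt in range(retries):
        try:
            return connect()
        except Exception:
            if attempt == retries - 1:
                raise                               # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

    The same wrapper works for reconnect-on-query paths; most drivers raise a connection-level exception when the DNS flip lands mid-session.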


    Configuration:

    ```yaml

    Production: Multi-AZ = YES (non-negotiable)

    Development: Single-AZ acceptable

    Staging: Multi-AZ recommended (production-like)

    ```


    2.2 Read Replicas: Scale Reads, Not Writes


    Read replicas offload read traffic from primary database.


    Use Cases:

    ```yaml

    ☐ Analytics/reporting queries (no impact on production)

    ☐ Geographic distribution (low-latency reads in other regions)

    ☐ Burst traffic (scale reads temporarily)

    ☐ Backup verification (queries on replica, not primary)

    ```


    Important Limitations:

    ```yaml

    ☐ Asynchronous replication (potential data lag: milliseconds to seconds)

    ☐ Write traffic still goes to primary

    ☐ No automatic failover to replica (use Multi-AZ for HA)

    ☐ Replica limits vary by engine (up to 15 for MySQL/MariaDB/PostgreSQL; 5 for Oracle and SQL Server)

    ```
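    Because replication is asynchronous, reads that must reflect a just-completed write should stay on the primary. A small routing sketch, assuming plain endpoint strings and a lag figure you would obtain from the `ReplicaLag` CloudWatch metric:

```python
def choose_endpoint(primary, replicas, is_write,
                    max_staleness_s=None, replica_lag_s=0.0):
    """Route a query to the primary or a read replica.

    Writes always go to the primary. Reads go to a replica unless the
    caller's staleness budget is tighter than the current replication lag
    (replicas are asynchronous, so recently written data may be missing).
    """
    if is_write or not replicas:
        return primary
    if max_staleness_s is not None and replica_lag_s > max_staleness_s:
        return primary  # read-your-writes: lag exceeds the caller's tolerance
    return replicas[0]  # naive pick; use round-robin or least-lag in practice
```

    Frameworks often expose this as a "read-only transaction" hint; the key design point is that the staleness decision belongs to the caller, not the router.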


    2.3 RDS Instance Sizing


    Right-size RDS instances. Databases are often 2-3x oversized.


    Sizing Methodology:

    ```yaml

    1. Monitor CloudWatch metrics (CPU, memory, connections)

    2. Target: 70-80% CPU utilization at peak

    3. Use AWS Compute Optimizer for ML-driven recommendations

    4. Test smaller instance classes in non-production

    5. Consider burstable instances (db.t3) for dev/test

    ```


    Instance Classes:

    ```yaml

    db.t3 (burstable): Dev/test, low-traffic prod

    db.m5 (general purpose): Most production workloads

    db.r5 (memory-optimized): In-memory caches, high connection counts

    db.x1e/db.x2g (memory-optimized): Very large datasets (2 TB+ in memory)

    ```


    2.4 RDS Storage: Provisioned IOPS for Databases


    Databases require provisioned IOPS (io1) for consistent performance.


    Storage Types:

    ```yaml

    gp2/gp3: General purpose SSD

    - gp3: Recommended baseline (3,000 IOPS included)

    - Cost-effective for most workloads


    io1/io2: Provisioned IOPS SSD

    - For I/O-intensive production databases

    - Consistent IOPS regardless of volume size

    - Cost: $0.125/GB/month + IOPS fee

    ```


    IOPS Provisioning:

    ```yaml

    Database IOPS = (Transactions/sec × IOPS per transaction)

    Example: 2,000 TPS × 3 IOPS/transaction = 6,000 IOPS


    Add 30% headroom: 7,800 IOPS provisioned

    ```
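    The arithmetic above is easy to capture in a small helper (the function name is illustrative; headroom defaults to the 30% used in the example):

```python
import math

def provisioned_iops(tps, iops_per_txn, headroom=0.30):
    """Estimate IOPS to provision: base load plus headroom.

    Mirrors the worked example above: 2,000 TPS x 3 IOPS/txn = 6,000 IOPS,
    plus 30% headroom = 7,800 provisioned IOPS.
    """
    base = tps * iops_per_txn
    return math.ceil(base * (1 + headroom))

# provisioned_iops(2000, 3) → 7800
```

    Measure IOPS-per-transaction empirically (CloudWatch `ReadIOPS` + `WriteIOPS` divided by transaction rate) rather than guessing.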


    2.5 RDS Encryption: Required


    All production RDS instances must be encrypted.


    Configuration:

    ```yaml

    ☐ Enable encryption at creation (cannot add later)

    ☐ Use customer-managed KMS keys (CMKs)

    ☐ Separate keys per environment (dev/staging/prod)

    ☐ Snapshots inherit encryption from instance

    ☐ Read replicas require encryption if primary is encrypted

    ```


    2.6 RDS Parameter Groups: Tuning


    Parameter groups control database configuration.


    Required Customizations:

    ```yaml

    MySQL/PostgreSQL:

    - max_connections: Based on instance memory

    - shared_buffers: 25% of instance memory (Postgres)

    - innodb_buffer_pool_size: 50-70% of instance memory (MySQL)

    - slow_query_log (MySQL) / log_min_duration_statement (Postgres): Enable for slow-query analysis

    - backup_retention_period: 7-35 days for compliance (set at the instance level, not in the parameter group)


    SQL Server:

    - max degree of parallelism: Based on CPU cores

    - cost threshold for parallelism: 30-50

    - max server memory: Leave 20% for OS

    ```
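    The memory rules of thumb translate to concrete parameter values. A sketch assuming the 25% (Postgres) and 50-70% (MySQL; 60% used here) guidance above; the function and dict keys are illustrative:

```python
def memory_params(instance_memory_gb, engine):
    """Derive memory-related parameter values from instance RAM.

    Postgres: shared_buffers ~25% of RAM.
    MySQL: innodb_buffer_pool_size ~60% of RAM (middle of the 50-70% range).
    Returns byte values suitable for a parameter group.
    """
    ram = int(instance_memory_gb * 1024**3)
    if engine == "postgres":
        return {"shared_buffers": ram // 4}
    if engine == "mysql":
        return {"innodb_buffer_pool_size": int(ram * 0.6)}
    raise ValueError(f"unsupported engine: {engine}")
```

    RDS lets parameter values reference `{DBInstanceClassMemory}` directly, which achieves the same scaling without hardcoding per-class numbers.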


    2.7 RDS Backups: Point-in-Time Recovery


    Backups are enabled by default. Configure retention for compliance.


    Backup Strategy:

    ```yaml

    Backup Window: Choose low-traffic period (backups impact performance)

    Retention Period:

    - Production: 35 days (compliance regimes such as PCI and HIPAA commonly expect 30+)

    - Staging: 7 days

    - Development: 1-7 days

    Snapshot Window: Align with backup window


    Automated Backups: Enable (required for PITR)

    Manual Snapshots: Long-term retention (quarterly/yearly)

    ```


    Recovery Testing:

    ```yaml

    Quarterly: Restore snapshot to new RDS instance

    Verify: Data integrity, application connectivity

    Document: Restore procedure for incident response

    ```


    ---


    3. Amazon Aurora Best Practices


    3.1 When to Use Aurora


    Aurora is AWS's cloud-native relational database, delivering up to 5x the throughput of standard MySQL.


    Aurora Advantages:

    ```yaml

    ☐ Up to 5x MySQL throughput, up to 3x PostgreSQL

    ☐ Auto-storage (up to 128 TB, no provisioning)

    ☐ Read scaling: Up to 15 replicas per cluster

    ☐ Faster failover: Typically < 30 seconds

    ☐ Serverless option: Scale-to-zero (cost savings)

    ☐ Global Database: Cross-region replication

    ```


    Migration Justification:

    ```yaml

    Migrate to Aurora if:

    - Current RDS requires higher performance

    - Manual storage management is burdensome

    - Read scaling beyond 5 replicas needed

    - Serverless patterns desired

    ```


    3.2 Aurora Serverless: Cost Optimization


    Aurora Serverless automatically scales compute capacity.


    Use Cases:

    ```yaml

    ✓ Infrequent applications (dev/test, low-traffic prod)

    ✓ Unpredictable workloads (sudden spikes)

    ✓ Development and testing environments

    ```


    Not Recommended For:

    ```yaml

    ✗ Production databases requiring consistent performance

    ✗ High connection counts (connection limits scale with capacity units)

    ✗ Databases requiring long-running queries

    ```


    3.3 Aurora Global Database: Multi-Region


    Aurora Global Database replicates to up to 5 secondary regions.


    Architecture:

    ```yaml

    Primary Region: us-east-1 (read/write)

    ↓ Replication (typically < 1 second lag)

    Secondary Region: us-west-2, eu-west-1 (read-only)


    Failover: Promote secondary to primary (RTO: minutes)

    ```


    Use Cases:

    ```yaml

    ☐ Global applications (low-latency reads worldwide)

    ☐ Disaster recovery (regional failure resilience)

    ☐ Data residency (local data storage requirements)

    ```


    ---


    4. DynamoDB Best Practices


    4.1 When to Use DynamoDB


    DynamoDB is for workloads requiring unlimited scale and single-digit ms latency.


    Ideal Use Cases:

    ```yaml

    ☐ User profiles, preferences, sessions

    ☐ Shopping carts, order histories

    ☐ IoT sensor data (high write throughput)

    ☐ Real-time bidding, gaming leaderboards

    ☐ Message queues, job queues

    ☐ Metadata stores, configuration stores

    ```


    Anti-Patterns:

    ```yaml

    ✗ Complex joins (use RDS)

    ✗ Ad-hoc queries (use Redshift or RDS)

    ✗ Large documents (> 400 KB per item)

    ✗ Frequent concurrent updates to the same item (mitigate with conditional writes)

    ```


    4.2 DynamoDB Data Modeling


    DynamoDB is NoSQL. Model data based on access patterns, not normalization.


    Design Principles:

    ```yaml

    1. Identify access patterns first (queries, reads, writes)

    2. Design primary keys (partition + sort key)

    3. Use GSIs (Global Secondary Indexes) for alternative access

    4. Denormalize data (avoid joins, duplicate data)

    5. Use composite keys for hierarchical data

    ```


    Example: E-commerce Orders

    ```yaml

    Primary Key:

    - PK: ORDER#<order-id> (single item lookup)

    - SK: METADATA#<order-id>


    GSI 1 (User Orders):

    - PK: USER#<user-id>

    - SK: ORDER#<order-date>


    GSI 2 (Product Orders):

    - PK: PRODUCT#<product-id>

    - SK: ORDER#<order-date>

    ```
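    An item written to this table carries both the base-table keys and the GSI attributes, since DynamoDB populates GSIs from the item's own attributes. A sketch of the denormalized item shape (attribute names like `GSI1PK` are a common single-table convention, not something DynamoDB mandates):

```python
def order_item(order_id, user_id, product_id, order_date, total):
    """Build a denormalized order item matching the key schema above.

    The base table key targets single-order lookups; the GSI attributes
    support "orders by user" and "orders by product" queries without joins.
    """
    return {
        "PK": f"ORDER#{order_id}",
        "SK": f"METADATA#{order_id}",
        "GSI1PK": f"USER#{user_id}",        # GSI 1: a user's orders by date
        "GSI1SK": f"ORDER#{order_date}",
        "GSI2PK": f"PRODUCT#{product_id}",  # GSI 2: a product's orders by date
        "GSI2SK": f"ORDER#{order_date}",
        "total": total,
    }
```

    Building keys through one helper keeps the prefix conventions (`ORDER#`, `USER#`, `PRODUCT#`) consistent across every writer in the codebase.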


    4.3 DynamoDB Capacity Planning


    Choose On-Demand or Provisioned capacity based on traffic patterns.


    | Mode | Use Case | Cost |

    |------|----------|------|

    | **On-Demand** | Unknown/unpredictable traffic | Higher per-request cost |

    | **Provisioned** | Predictable traffic, cost optimization | Lower cost |


    Provisioning Guidelines:

    ```yaml

    RCU (Read Capacity Units):

    - 1 RCU = 1 strongly consistent read/sec (up to 4 KB)

    - 1 RCU = 2 eventually consistent reads/sec (up to 4 KB)


    WCU (Write Capacity Units):

    - 1 WCU = 1 write/sec (up to 1 KB)


    Example: 100 reads/sec, 50 writes/sec

    → 50 RCU (eventual consistency)

    → 50 WCU (1 KB items)

    ```
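    The capacity arithmetic rounds item sizes up to the next 4 KB (reads) and 1 KB (writes). A small calculator capturing the rules above (function name assumed):

```python
import math

def capacity_units(reads_per_s, writes_per_s, item_kb, strongly_consistent=False):
    """Convert a workload into provisioned RCU/WCU.

    Reads: 1 RCU = one 4 KB strongly consistent read/sec, or two eventually
    consistent reads/sec. Writes: 1 WCU = one 1 KB write/sec. Item sizes
    round up to the next 4 KB / 1 KB boundary.
    """
    read_units_per_item = math.ceil(item_kb / 4)
    if strongly_consistent:
        rcu = reads_per_s * read_units_per_item
    else:
        rcu = math.ceil(reads_per_s * read_units_per_item / 2)
    wcu = writes_per_s * math.ceil(item_kb / 1)
    return rcu, wcu

# capacity_units(100, 50, item_kb=1) → (50, 50), matching the example above
```

    Note how item size dominates: a 6 KB item costs 2 read units and 6 write units per operation, so keeping items small is the cheapest optimization available.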


    4.4 DynamoDB Accelerator (DAX)


    DAX is an in-memory cache for DynamoDB. Up to 10x read performance improvement.


    When to Use DAX:

    ```yaml

    ✓ Read-heavy workloads (90%+ reads)

    ✓ Repeatable reads (hot data)

    ✓ Microsecond latency required

    ```


    Not Recommended For:

    ```yaml

    ✗ Write-heavy workloads (no benefit)

    ✗ Strongly consistent reads (DAX passes them through to DynamoDB uncached)

    ```


    ---


    5. ElastiCache Best Practices


    5.1 Redis vs. Memcached


    Choose the right caching engine for your use case.


    | Feature | Redis | Memcached |

    |---------|-------|-----------|

    | **Data types** | Strings, lists, sets, sorted sets, hashes | Strings only |

    | **Persistence** | AOF/RDB snapshots (data survives restart) | No persistence (cache only) |

    | **Replication** | Yes (high availability) | No |

    | **Multi-AZ** | Yes (with replication) | No |

    | **Partitioning** | Cluster enabled (sharding) | Client-side sharding |

    | **Use Case** | Caching + data structures + pub/sub | Simple caching |


    Recommendation:

    ```yaml

    General Purpose → Redis (recommended default)

    Simple Caching Only → Memcached (lower cost)

    Pub/Sub, Locking → Redis (required)

    ```
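    Whichever engine you choose, the standard access pattern is cache-aside: check the cache, fall back to the database on a miss, then populate the cache with a TTL. A minimal sketch where a TTL'd dict stands in for Redis; with ElastiCache you would swap in a redis-py client (`get`/`setex`) against the cluster endpoint:

```python
import time

class CacheAside:
    """Minimal cache-aside (lazy-loading) pattern sketch.

    A plain dict with expiry timestamps stands in for Redis. `loader` is
    any callable that fetches the value from the database on a miss.
    """
    def __init__(self, ttl_s=300):
        self.ttl_s = ttl_s
        self._store = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        entry = self._store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                      # cache hit
        value = loader(key)                      # cache miss: hit the DB
        self._store[key] = (value, time.time() + self.ttl_s)
        return value
```

    The TTL bounds staleness; pair it with explicit invalidation on writes if reads must see updates sooner than the TTL allows.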


    5.2 Redis Cluster Mode


    Redis Cluster enables horizontal scaling.


    Cluster Mode Configuration:

    ```yaml

    Cluster Mode Enabled:

    - Up to 500 nodes per cluster

    - Up to 500 shards (fewer when replicas are attached; the 500-node total still applies)

    - Automatic sharding (hash slots)

    - Best for: Large datasets (> 100 GB), high throughput


    Cluster Mode Disabled:

    - Single shard (primary + replicas)

    - Simpler operations

    - Best for: Smaller datasets, simpler requirements

    ```


    5.3 ElastiCache Security


    ElastiCache in VPC: Private subnets only.


    Security Configuration:

    ```yaml

    ☐ VPC: Private subnets only (no public access)

    ☐ Security Groups: Restrict to specific EC2/IPs

    ☐ Encryption at rest: Enable (supported since Redis 3.2.6)

    ☐ Encryption in transit: TLS (required before AUTH tokens can be used)

    ☐ AUTH Token: Require password authentication

    ☐ Redis version: Latest (security patches)

    ```


    ---


    6. Database Migration


    6.1 Migration Strategies


    | Migration Method | Downtime | Complexity | Use Case |

    |------------------|----------|------------|----------|

    | **Export/Import** | High (hours) | Low | Small databases (< 10 GB), dev/test |

    | **AWS DMS** | Minimal | Medium | Production, near-zero downtime |

    | **Native Replication** | Minimal | High | Complex migrations, custom logic |

    | **Blue/Green Deployment** | Minimal | High | Critical applications, rollback required |


    6.2 AWS DMS (Database Migration Service)


    DMS is the recommended migration tool for most scenarios.


    DMS Architecture:

    ```yaml

    Source Database (on-prem or cloud)

    ↓ DMS Replication Instance

    Target Database (RDS, Aurora, Redshift)


    Continuous Replication:

    - Change Data Capture (CDC)

    - Near-zero downtime (minutes of cutover)

    - Schema conversion tool (SCT) for engine changes

    ```


    Migration Process:

    ```yaml

    1. Assess source database (AWS SCT)

    2. Create target RDS instance

    3. Create DMS replication instance

    4. Create migration task (full load + CDC)

    5. Validate data (row counts, checksums)

    6. Cutover: Stop app, wait for replication lag, switch DNS

    7. Decommission source after validation period

    ```
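    Step 5 (validation) is easy to script once you have per-table row counts from both sides, e.g. via `SELECT COUNT(*)` on each. A sketch of the comparison (DMS also offers built-in table validation for supported engines):

```python
def validate_migration(source_counts, target_counts):
    """Compare per-table row counts between source and target databases.

    Inputs are {table_name: row_count} dicts. Returns (table, source, target)
    tuples for every table whose counts differ or that exists on only one
    side. Row counts catch gross load failures; add per-table checksums for
    content-level validation.
    """
    mismatches = []
    for table in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(table)
        tgt = target_counts.get(table)
        if src != tgt:
            mismatches.append((table, src, tgt))
    return mismatches
```

    Run this once after the full load and again at cutover, after replication lag reaches zero.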


    ---


    7. Monitoring and Alerting


    Required Metrics

    ```yaml

    RDS:

    - CPU/Memory utilization

    - Database connections (used vs. max)

    - Read/write latency

    - Freeable memory

    - Disk queue depth (IOPS bottleneck)


    DynamoDB:

    - Consumed read/write capacity (vs. provisioned)

    - Throttled requests (provisioning insufficient)

    - System errors (5xx)

    - Latency (average, p95, p99)


    ElastiCache:

    - CPU utilization

    - Memory fragmentation ratio

    - Evictions (cache insufficient size)

    - Replication lag (Redis replicas)

    ```


    Alerting

    ```yaml

    ☐ RDS CPU > 80% for 5 minutes (scale up)

    ☐ RDS connections > 80% of max (connection pooling needed)

    ☐ RDS free storage < 10 GB (storage emergency)

    ☐ DynamoDB throttles > 100/min (capacity issue)

    ☐ ElastiCache evictions > 1000/min (cache size too small)

    ```
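    These thresholds can be encoded as a simple rule check. The metric key names below are hypothetical; in production you would create CloudWatch alarms on the corresponding metrics (CPUUtilization, DatabaseConnections, FreeStorageSpace, ThrottledRequests, Evictions) with 5-minute evaluation periods instead of polling:

```python
def evaluate_alerts(metrics):
    """Check a metrics snapshot against the alert thresholds listed above.

    Returns the names of firing alerts. Missing metrics default to values
    that cannot fire, so partial snapshots are safe to evaluate.
    """
    rules = [
        ("rds_cpu_high", metrics.get("rds_cpu_pct", 0) > 80),
        ("rds_connections_high",
         metrics.get("rds_connections", 0)
         > 0.8 * metrics.get("rds_max_connections", float("inf"))),
        ("rds_storage_low",
         metrics.get("rds_free_storage_gb", float("inf")) < 10),
        ("dynamodb_throttles", metrics.get("ddb_throttles_per_min", 0) > 100),
        ("elasticache_evictions",
         metrics.get("cache_evictions_per_min", 0) > 1000),
    ]
    return [name for name, firing in rules if firing]
```

    Keeping the thresholds in one table like this also gives you a single place to review them during capacity planning.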


    ---


    8. Cost Optimization Checklist


    Immediate Actions (Week 1)

  • [ ] Review RDS instance sizing (AWS Compute Optimizer)
  • [ ] Enable RDS reserved instances for steady-state workloads
  • [ ] Audit unused RDS snapshots (delete)
  • [ ] Review DynamoDB provisioned capacity (downsize if possible)

    Short-Term (Month 1)

  • [ ] Migrate to Aurora Serverless for dev/test
  • [ ] Implement ElastiCache for read-heavy workloads
  • [ ] Set up automated backups (AWS Backup)
  • [ ] Configure DynamoDB on-demand for spiky workloads

    Long-Term (Quarter 1)

  • [ ] Archive cold data to S3 (lower storage costs)
  • [ ] Implement read replicas for geographic distribution
  • [ ] Review and optimize expensive queries
  • [ ] Set up budget alerts via Cost Explorer

    ---


    Summary: Database Excellence Pillars


    1. **Right engine selection** (RDS for relational, DynamoDB for NoSQL)

    2. **Multi-AZ for production** (automatic failover, HA)

    3. **Read replicas for scaling** (offload read traffic)

    4. **Encryption at rest** (required for compliance)

    5. **Backups tested** (quarterly restore verification)

    6. **Monitor performance** (CloudWatch metrics + alerting)

    7. **Right-size instances** (avoid 2-3x oversizing)

    8. **Use Aurora for cloud-native** (performance + auto-storage)


    ---


    Need Help with Database Architecture?


    Our database architects can design scalable, high-availability data architectures and migrate your databases with near-zero downtime.


    <a href="/contact" className="text-aws-orange font-semibold hover:text-aws-light">

    Schedule a Free Database Consultation →

    </a>


    Internal Linking Strategy:

  • For **compute**, see [Database-Driven EC2 Sizing](/blog/aws-compute-best-practices)
  • For **security**, refer to [Database Encryption + IAM](/blog/aws-security-best-practices)
  • For **storage**, explore [EBS for Database Volumes](/blog/aws-storage-best-practices)

    ---


    *Last updated: January 5, 2025*

