The 20 worst AWS annoyances and misdesigns

A list of the twenty worst AWS annoyances. Some are misdesigns, some are poorly documented limitations. All are gotchas that can cost developers hours, if not days.

Johanna · February 1, 2025 · 2 min read

Here are twenty significant AWS "gotchas" that can cause unexpected problems in production:

  1. IAM Permissions Propagation Delay - IAM is eventually consistent. New or changed permissions usually take effect within seconds but can occasionally take longer, causing unexpected "access denied" errors right after you have updated a policy.

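  2. S3 Strong Consistency Misconception - Before December 2020, S3 was eventually consistent for overwrites, deletes, and listings. Legacy code may still carry workarounds (retry loops, external consistency layers) that are no longer needed and now only add latency and failure modes.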

  3. CloudFormation Drift Detection Limitations - Drift detection covers only supported resource types, and only properties that are explicitly set in the template, so a clean drift report can give false confidence about infrastructure state.

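  4. Lambda Cold Starts - Any invocation that lands on a freshly initialized execution environment (not only the very first one) can be significantly slower, especially for JVM-based runtimes, which can cause timeouts in latency-sensitive callers.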

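  5. API Gateway 29-second Timeout - API Gateway integrations time out after 29 seconds by default. Since June 2024 the limit can be raised for Regional and private REST APIs through a quota increase (possibly at the cost of a lower account-level throttle), but HTTP APIs are still capped at 30 seconds, forcing asynchronous workarounds for longer-running operations.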

  6. CloudWatch Logs Delayed Delivery - Log events are not guaranteed to appear immediately. Delivery can lag by seconds and, at times, minutes, which makes real-time debugging unreliable.

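  7. RDS Connection Limits - The maximum number of connections is derived from instance memory through the engine's max_connections parameter, so smaller instance classes run out of connections sooner, but the AWS console doesn't make this clear.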

  8. ECS Task Definition Immutability - A registered task definition revision cannot be edited. Even a one-line change means registering a new revision and updating the service to use it.

  9. Default VPC Deletion Consequences - If you accidentally delete your default VPC, many services that rely on it will fail to launch, and the error messages rarely point to the missing VPC as the cause.

  10. CloudFront Cache Invalidation Quotas - Only the first 1,000 invalidation paths per month are free. Every additional path is charged, which encourages poor cache practices.

  11. CDK Resource Naming - CDK derives logical IDs and generated resource names from the construct path, so moving or renaming a construct during a refactor can change them and trigger unexpected resource replacements during deployment.

  12. DynamoDB Hot Keys - Partition key design can lead to throttling if traffic concentrates on specific partitions, despite having sufficient overall capacity.

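  13. S3 Object Ownership Model - On buckets that still have ACLs enabled, objects uploaded by other accounts can be inaccessible to the bucket owner. Buckets created since April 2023 default to "bucket owner enforced" (ACLs disabled), but older buckets keep their previous setting.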

  14. Lambda Environment Variable Size Limits - Limited to 4 KB total across all environment variables, which is easy to exceed with large configuration values.

  15. ALB Slow Rule Evaluation - Application Load Balancers evaluate listener rules sequentially in priority order and stop at the first match, which can cause performance issues and surprising routing when there are many complex rules.

  16. Step Functions Express vs. Standard - Two different execution models with different guarantees, pricing, and execution limits that aren't clear from the interface.

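  17. Route 53 Health Check Limitations - Route 53 health checkers run outside your VPC, so they cannot directly health check internal resources without additional infrastructure, such as a health check based on a CloudWatch alarm.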

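  18. SQS FIFO Queue Lambda Limitations - When Lambda processes an SQS FIFO queue, its concurrency is bounded by the number of active message groups, because messages within a group must be handled in order. With only a few distinct message group IDs, this causes unexpected bottlenecks.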

  19. RDS Snapshot Restoration Parameter Reset - When restoring from a snapshot, the new instance gets the default parameter group unless you explicitly specify your custom one, so custom settings can be silently lost.

  20. SES Sandbox Limitations - New accounts are placed in the "SES sandbox," which limits sending to verified addresses (and to 200 messages per 24 hours), without clear warnings in the console.
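As an illustration of item 1, one common way to live with IAM's eventual consistency is to retry with exponential backoff and jitter instead of failing on the first "access denied". The sketch below is a minimal, library-agnostic version: `AccessDenied` and `call_with_backoff` are hypothetical names made up for this example, and in real code you would retry on whatever exception your AWS client actually raises.

```python
import random
import time


class AccessDenied(Exception):
    """Hypothetical stand-in for the 'access denied' error your AWS client raises."""


def call_with_backoff(operation, retry_on=(AccessDenied,), max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call operation(), retrying on the given exceptions with exponential backoff.

    The delay ceiling doubles on each attempt (capped at max_delay), and the
    actual sleep is a random value below that ceiling ("full jitter").
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                # The permission never became visible: surface the real error.
                raise
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    attempts = []

    def flaky_call():
        # Simulates a policy that only becomes visible on the third try.
        attempts.append(1)
        if len(attempts) < 3:
            raise AccessDenied("policy not visible yet")
        return "ok"

    print(call_with_backoff(flaky_call, base_delay=0.1))
```

Keep `max_attempts` small: a policy that is genuinely wrong will never start working, and the final re-raise makes sure that case still fails loudly.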

These gotchas are easy to miss in the main AWS documentation and often become apparent only after you run into them in production.
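Some of these can be designed around up front. For item 12, a frequently used mitigation is write sharding: appending a computed suffix to a hot partition key so that traffic spreads over several key values. The sketch below shows the idea in plain Python. The function names, the `#` separator, and the default of 10 shards are assumptions chosen for illustration, not anything prescribed by DynamoDB.

```python
import hashlib


def sharded_partition_key(base_key: str, item_id: str, shard_count: int = 10) -> str:
    """Map one logical (hot) key onto one of shard_count partition key values.

    The shard is derived from a hash of item_id, so the same item always
    lands on the same shard and can be looked up again later.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> list[str]:
    """Every partition key value a reader must query to see all items for base_key."""
    return [f"{base_key}#{shard}" for shard in range(shard_count)]


if __name__ == "__main__":
    # Writes for one busy day spread over several partition key values.
    for order_id in ("order-1", "order-2", "order-3"):
        print(order_id, "->", sharded_partition_key("2025-02-01", order_id))
    print(all_shard_keys("2025-02-01", shard_count=3))
```

The trade-off is on the read side: fetching everything for one logical key now takes one query per shard, so the shard count should stay as small as the write load allows.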