Cloud Architecture · 5 min read
Three Architecture Decisions That Are Inflating Your Cloud Bill
Cloud cost overruns are rarely a billing problem. They're an architecture problem. Here are the three most common structural causes — and how to fix them.

When cloud spend grows faster than usage, the instinct is to treat it as a billing problem — review the invoice, identify the spike, create a ticket. That approach finds symptoms. It doesn’t find the cause.
Cloud cost overruns at scale almost always trace back to architectural decisions: how resources are provisioned, how environments are managed, and how services are placed relative to each other. Three structural patterns account for the majority of waste we find in cloud architecture assessments. They compound over time, and they don’t get better without deliberate architectural intervention.
1. Missing Resource Lifecycle Management
The most persistent source of cloud waste isn’t a single oversized instance — it’s the accumulated cost of resources that persist because no one owns decommissioning them.
Development and testing environments are the most common case. A team provisions an environment for a feature or performance test, the work completes, and the environment continues running because removing it isn’t in anyone’s sprint. At the instance level the cost seems manageable. Across dozens of projects and years of accumulation, it becomes a material line item.
The architectural gap here is the absence of lifecycle policy. Environments provisioned through Infrastructure as Code should have corresponding teardown automation. Resources should carry ownership tags enforced at the policy level, not as a convention. Cloud providers surface this through native tooling — AWS Trusted Advisor, GCP Recommender, Azure Advisor — but these tools identify existing waste; they don’t prevent it from accumulating.
The fix requires a policy decision, not just a cleanup exercise: any resource that can be created automatically should be destroyable automatically, and that automation should be part of the same IaC that provisions it. Tagging policies that enforce environment, owner, and expiry attributes — with hard enforcement, not soft recommendations — close the gap structurally rather than requiring periodic audits to catch what’s accumulated.
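As a sketch of what hard enforcement looks like in practice, here is a minimal Python check of the kind a provisioning pipeline or scheduled audit could run. The required tag keys and the resource-record shape are illustrative assumptions, not any provider's API:

```python
from datetime import date

# Hypothetical required-tag policy, enforced rather than recommended.
REQUIRED_TAGS = {"environment", "owner", "expiry"}

def policy_violations(resource, today=None):
    """List the tag-policy violations for one resource record.

    `resource` is a plain dict like {"id": ..., "tags": {...}};
    the shape is illustrative, not a provider API. `today` is
    injectable so audits are reproducible.
    """
    today = today or date.today()
    tags = resource.get("tags", {})
    # Missing keys are violations regardless of the resource's state.
    violations = [f"missing tag: {k}" for k in sorted(REQUIRED_TAGS - tags.keys())]
    expiry = tags.get("expiry")
    if expiry is not None:
        try:
            # An expiry in the past means the teardown automation should fire.
            if date.fromisoformat(expiry) < today:
                violations.append(f"expired on {expiry}")
        except ValueError:
            violations.append(f"unparseable expiry: {expiry}")
    return violations
```

A gate like this, run at provision time and on a schedule, turns the tagging policy into a structural control instead of a periodic cleanup chore.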
2. Provisioning for Peak Without Auto-Scaling Architecture
The second pattern is overprovisioning: instances and services sized for a peak load that occurs only a fraction of the time. This isn't an engineering failure. It's the predictable outcome when teams provision resources manually and personally carry the risk of under-provisioning, in an environment where scaling up is a manual intervention.
The architectural solution is not smaller instances — it’s auto-scaling groups that make manual rightsizing unnecessary. Horizontal scaling for stateless services, scheduled scaling for predictable load patterns, and target tracking policies for variable workloads eliminate the need to choose between peak-capacity waste and under-provisioning risk.
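The proportional rule behind target tracking is simple enough to sketch. The function below mirrors the scale-the-fleet-in-proportion-to-the-metric idea; real auto scalers layer cooldowns and instance warm-up on top, which this deliberately ignores:

```python
import math

def target_tracking_desired(current_capacity, metric_value, target_value,
                            min_size, max_size):
    """Desired capacity under a target-tracking policy.

    Scale the fleet proportionally so the tracked metric (e.g. average
    CPU) returns to its target, then clamp to the group's bounds.
    Simplified sketch: no cooldowns, no warm-up, no scale-in protection.
    """
    desired = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, desired))
```

With a fleet of 4 at 80% CPU against a 50% target, the rule asks for 7 instances; at 20% it contracts back toward the group minimum. The point of the pattern is that this decision runs continuously, so nobody has to size for peak up front.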
Where auto-scaling isn’t directly applicable — databases, memory-intensive services, reserved workloads — systematic rightsizing reviews based on actual utilization data should be scheduled, not reactive. Cloud providers publish utilization metrics at the instance level. Comparing p95 utilization to provisioned capacity across the fleet typically surfaces 30-40% of provisioned resources that could be downsized or consolidated without performance impact.
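A scheduled rightsizing review can start from exactly that comparison. A minimal sketch, assuming utilization samples have already been exported from the provider's metrics API; the fleet shape and the 40% threshold are illustrative choices, not recommendations:

```python
import math

def p95(samples):
    """Nearest-rank 95th percentile of a utilization series (0-100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]

def downsize_candidates(fleet, threshold=40.0):
    """Instances whose p95 utilization sits under `threshold` percent.

    `fleet` maps an instance id to its utilization samples; the shape
    is illustrative. This is the starting report for a scheduled
    rightsizing review, not an automatic action.
    """
    return [iid for iid, samples in fleet.items() if p95(samples) < threshold]
```

Using p95 rather than the mean matters: an instance that averages 15% but regularly bursts to 90% is not a downsizing candidate, and the percentile view keeps it off the list.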
Reserved and committed-use pricing compounds the savings from rightsizing but requires confidence in utilization levels. Organizations that haven’t addressed auto-scaling architecture first often find that their reserved capacity commitments don’t match their actual workload patterns after a year of growth.
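The utilization-confidence point reduces to simple arithmetic. In this sketch the rates are placeholders, not published pricing: a commitment is paid for every hour of the year, on-demand only for the hours actually used, so the break-even is just the ratio of the two rates:

```python
HOURS_PER_YEAR = 8760

def annual_costs(on_demand_hourly, effective_reserved_hourly, utilization):
    """One year of on-demand (paid per hour used) vs. a one-year
    commitment (paid for every hour). Rates are placeholder figures;
    ignores partial-hour billing and upfront-payment discounts."""
    on_demand = on_demand_hourly * HOURS_PER_YEAR * utilization
    reserved = effective_reserved_hourly * HOURS_PER_YEAR
    return on_demand, reserved

def breakeven_utilization(on_demand_hourly, effective_reserved_hourly):
    """Utilization fraction above which the commitment is cheaper."""
    return effective_reserved_hourly / on_demand_hourly
```

At an effective reserved rate of 60% of on-demand, the commitment only pays off above 60% utilization, which is exactly the figure an organization without auto-scaling discipline cannot state with confidence.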
3. Cross-Region Data Transfer and the Canadian Cost Structure
The third pattern is less obvious but frequently significant, particularly for Canadian organizations: unnecessary cross-region data transfer and egress costs driven by service placement decisions.
Cloud architecture decisions about where to locate services have direct cost implications. Services that exchange large volumes of data — application servers and databases, analytics pipelines and data stores, microservices with high inter-service communication — incur data transfer charges when placed in different regions or availability zones. These charges are small per gigabyte but accumulate proportionally to data volume and call frequency. In high-throughput architectures, inter-service data transfer can represent a significant fraction of total cloud spend.
For Canadian organizations, this is compounded by currency structure. Public cloud pricing is denominated in USD. Canadian companies paying in CAD absorb exchange rate fluctuation as a direct cost multiplier that doesn’t appear in the architecture diagram but shows up in every invoice. Architectural decisions that minimize data egress — service co-location in a single region, direct VPC peering over internet routing, caching at service boundaries to reduce redundant data retrieval — reduce the volume of spend exposed to that multiplier.
A systematic review of data transfer costs in a cloud architecture assessment typically identifies egress patterns that weren’t intentional design decisions — they emerged from services being deployed wherever was convenient at the time. Consolidating data-intensive services into a single region and implementing VPC endpoints for AWS service communication (eliminating public internet egress for S3, DynamoDB, and other managed services) are two of the highest-leverage changes available.
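The currency multiplier makes the egress arithmetic worth writing down. A back-of-envelope sketch; the volume, per-GB rate, and exchange rate below are all placeholder figures, not published pricing:

```python
def monthly_transfer_cost_cad(gb_per_month, usd_per_gb, usd_to_cad):
    """Metered transfer is billed per GB in USD; a CAD payer sees the
    bill through the exchange rate. All inputs are placeholders."""
    return gb_per_month * usd_per_gb * usd_to_cad

# Co-locating two chatty services removes the metered hop entirely,
# so the saving is the full line item, amplified by the FX multiplier.
```

At an illustrative 50 TB/month between regions, a per-GB charge that looks negligible becomes a four-figure monthly line item in CAD, and it recurs every month until the placement decision changes.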
The Common Thread
All three patterns share a root cause: architectural decisions made in isolation, without visibility into their aggregate cost implications over time. Idle resources accumulate because no one sees the fleet-level cost of orphaned environments. Overprovisioning persists because the risk of under-provisioning is visible and the waste is not. Data transfer costs grow because service placement decisions are made for operational convenience, not cost visibility.
Addressing these patterns requires a structured review of the architecture, not a month-over-month analysis of line items. The architecture decisions that are generating cost were made once; they continue generating cost until they’re changed.
ERMI Labs offers a Cloud Modernization Assessment — a structured review of your cloud architecture, cost structure, and operational practices, delivered as a prioritized remediation roadmap. Schedule a discovery call to discuss what we typically find.
ERMI Labs Architecture Team
Principal architects with 20+ years of experience in distributed systems, cloud infrastructure, and data platforms.



