Cloud Migration and Cost Optimization: The Top Initiative of 2025

In 2025, 84% of companies worldwide name cloud cost management as their top priority. The reason is simple: in five years, the cloud has evolved from a “cheap server room” into one of the fastest-growing items in the budget, while companies admit that, on average, 27% of their cloud spend goes to waste. Using the examples of GitLab and a global pharmaceutical corporation, we will show how FinOps teams, with the help of multi-cloud and ML analytics, turn chaotic cost-cutting into systematic ROI control. We will also examine where the most expensive “black holes” are hiding and why discounts rarely save the budget. In addition, we will look at why Gartner and Flexera are already forecasting a shift toward value-based cloud, in which the cloud is evaluated not by price but by return. The game is changing: those who manage the cloud as a strategic asset, rather than simply renting resources, will win.

IT Transformation: Physical Infrastructure, Cloud, or Hybrid Model?

The shift to the cloud has long since become the standard. But a fully “cloud-only” model, where a company’s entire IT infrastructure runs exclusively in the cloud, is still more common among startups and small businesses, for whom it is the fastest and most affordable path. Large and mid-sized companies, on the other hand, often choose hybrid models: part of their data and services remain on their own physical servers, while another part runs on public or private clouds. In practice, this often results in hybrid or multi-cloud strategies, where companies combine different environments and providers to achieve a balance of cost, control, and flexibility.

Many companies are adopting a cloud-first strategy, in which cloud solutions are considered the priority option (but not the only one). At the same time, the question “What is more beneficial for the business: cloud, traditional on-premises servers (the classic model), or a combination of the two?” remains relevant and is revisited periodically.

Why are companies moving away from a “pure” cloud? Why are hybrid models becoming the new standard? And how do businesses choose between on-premises infrastructure, the cloud, a hybrid approach, or multi-cloud?

Let’s take a closer look at what these strategies mean and examine the advantages and disadvantages of each architecture.

Traditional Servers (On-Premises): Why Businesses Still Need Them

On-premises (or simply “on-prem”) refers to a company’s own servers and data center, entirely under its control.

Why is this convenient? Any server can be customized for specific tasks, and a separate secure environment can be created (for example, for a bank or government organization). In large companies with their own IT teams, issues can be resolved internally without waiting for external contractors. For small businesses, however, it’s not always so straightforward: one or two servers may actually be serviced more slowly than with a provider, where everything is already streamlined and automated.

But control comes at a price. As IT infrastructure became more complex, the costs became more significant: regular hardware upgrades (servers and disks become outdated every few years), maintaining reserves, a team of experienced system administrators, electricity bills, cooling systems, and security. For mid-sized businesses, unless required by their industry, this often becomes a dead-end situation: the budget is consumed by infrastructure expenses, leaving almost nothing for IT development.

That’s why a logical question often arises: Is it worth keeping your own server just for the sake of “control” if the cloud exists, where you rent precisely what you need, scale in minutes, and don’t have to worry about hardware or physical infrastructure?

From our team’s experience: most services, from small bots to simple projects, are much easier and cheaper to run in the cloud. This enables rapid scaling, reduces the need for server maintenance resources, and allows for a focus on product development.

The bottom line is that on-premises solutions are gradually losing ground. Gartner research supports this: cloud computing continues to evolve from a technological breakthrough into something that, by 2028, will be a business necessity.

Cloud-only: a Path to Efficiency or New Risks

If “on-prem” means control, then “cloud-only” means speed.

In 2020, for the first time, according to CIO Dive, company spending on the cloud surpassed investments in their own data centers. Let’s look at the secret behind this trend. Cloud-only solutions make business operations more flexible, as resources can be instantly scaled up to meet a surge in demand, a pilot project can be launched, or new technologies can be quickly implemented: from artificial intelligence and big data analytics to managing an entire network of “smart” devices. Everything that not long ago required months of procurement and approvals is now available as a service, by subscription, in just a few clicks. It is precisely because of this that the cloud has become a catalyst for digital transformation and a driver of growth for companies of any size.

However, the wider the adoption of cloud-only solutions (where the cloud is considered the only option), the more obvious their pitfalls become.

Financial control remains the main pain point for CIOs: according to a study by Azul (via CIO Dive), 83% of IT directors spend more on the cloud than they had planned, and almost half exceed their budget by more than a quarter. The primary reason is the lack of awareness about the cost of cloud services and the resources they consume.

A particularly large share of overspending is linked to AI projects: developers often fail to consider the cost of the services they are using.

Financial control is not the only challenge. The risks of dependence on a single provider (vendor lock-in) and of reverse migration (cloud repatriation) remain.

According to a Citrix report, “Research finds IT leaders are choosing hybrid cloud strategies due to flexibility, cost-effectiveness, and security,” some clients are moving key IT workloads back from the cloud to local servers. This phenomenon is known as cloud repatriation.

A striking example is GEICO, one of the largest insurance companies in the United States (part of Berkshire Hathaway). According to Rebecca Weekly, Vice President of Infrastructure, “we have a lot of data, and it turns out that storage in the cloud is one of the most expensive things you can do in the cloud, followed by AI in the cloud.” Previously, an attempt at a full move to the cloud resulted in bills increasing 2.5 times and reduced reliability: due to a “lift-and-shift” approach (mechanical transfer of applications to the cloud without adapting them to its architecture), the company became entirely dependent on cloud providers without a consistent strategy. As a result, GEICO moved part of its workloads back to on-premises environments. It adopted a hybrid architecture, featuring a large-scale deployment of a private cloud on OpenStack, extensive use of Kubernetes for managing containerized services, and an emphasis on local storage systems.

This story clearly shows that the reverse transfer of workloads is usually connected to specific reasons:

  • reallocation of internal finances,
  • the need for control over resources and data,
  • security requirements.

However, Gartner refutes the myth of mass repatriation, calling it a narrative promoted by traditional solution providers as a means to retain their market share in the enterprise segment. In practice, “flight from the cloud” is rare; more often, it is a result of individual implementation mistakes or an architectural revision.

Where is the middle ground? It all comes down to a company’s goals and its level of maturity. As Gartner notes in The Top 10 Cloud Myths, moving to the cloud is not the finish line but the starting point. Extracting real value requires process changes, architectural adjustments, and constant management of performance and costs.

This idea is also confirmed by Flexera research: in 2025, CIOs name cloud cost management as their top priority. At the same time, however, there is no universal roadmap for businesses; each company must find its own. This is why it is essential to understand where budgets are allocated and which mistakes most frequently lead to overspending. These points will be discussed in the following sections.

However, first, let’s discuss an alternative that is gaining momentum: hybrid infrastructure, which combines the best features of cloud and on-premises solutions.

Hybrid and Multi-Cloud Strategy: Cloud-First as the Dominant Trend

A hybrid cloud is an integrated environment that combines:

  • local (on-premises) resources,
  • public cloud services,
  • private cloud platforms.

All components are connected by a unified management system, enabling the seamless movement of workloads between environments as needed. For example, a bank may store customer data in a private cloud, use the public cloud for real-time analytics, and process critical transactions on its own servers.

Why has the hybrid model become essential? At least three reasons can be identified:

  • Regulatory balance: compliance with regulations such as GDPR while maintaining flexibility.
  • Financial optimization: “hot” workloads run in the cloud, while stable ones stay on-premises.
  • Business continuity: risk is distributed across different platforms.
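
The financial-optimization argument can be made concrete with a rough break-even calculation. The sketch below is only an illustration: the hourly cloud rate and the monthly on-premises cost are hypothetical numbers, not figures from any provider or from the studies cited in this article, and the function names are invented for the example.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_cloud_cost(hourly_rate: float, utilization: float) -> float:
    """Pay-as-you-go cost when the workload runs only `utilization` of the time."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return hourly_rate * HOURS_PER_MONTH * utilization


def break_even_utilization(hourly_rate: float, on_prem_monthly: float) -> float:
    """Utilization above which the fixed on-prem cost becomes cheaper (capped at 1.0)."""
    return min(1.0, on_prem_monthly / (hourly_rate * HOURS_PER_MONTH))


def cheaper_placement(hourly_rate: float, on_prem_monthly: float, utilization: float) -> str:
    """Compare the two monthly costs for a given utilization."""
    cloud = monthly_cloud_cost(hourly_rate, utilization)
    return "cloud" if cloud < on_prem_monthly else "on-prem"


# Hypothetical prices: $0.40/hour on demand vs. $180/month fully loaded on-prem.
RATE, ON_PREM = 0.40, 180.0
print(f"break-even utilization: {break_even_utilization(RATE, ON_PREM):.0%}")
print("bursty workload (20% busy):", cheaper_placement(RATE, ON_PREM, 0.20))
print("steady workload (95% busy):", cheaper_placement(RATE, ON_PREM, 0.95))
```

With these assumed prices, the bursty workload is cheaper in the cloud and the steady one on-premises. A real comparison would also need to account for staffing, hardware refresh cycles, and provider discounts.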

These logical arguments are strongly supported by analytics. According to Statista, the hybrid solutions market is projected to grow from $85 billion in 2021 to $262 billion by 2027.

There is also multi-cloud, which involves the simultaneous use of two or more public cloud providers. It allows companies to leverage the strengths of different stacks (analytics/AI/databases), meet requirements for geography and sovereignty, reduce dependence on a single vendor, and build DR/failover.

According to Flexera, 89% of companies are already working with multiple cloud providers (multi-cloud).

What does this give businesses? Hybrid and multi-cloud strategies enable the construction of infrastructure that is not limited by a single technology, but tailored to specific tasks. Each workload is placed where it will be most effective in terms of cost, speed, and security. This approach enhances business resilience to failures, accelerates the launch of new products to market, and provides the flexibility to respond quickly to changes—from sudden traffic spikes to new regulatory requirements.

However, even the most well-thought-out cloud strategy does not eliminate another serious threat, the uncontrolled growth of costs.

Why the Cloud Has Become a Budget Trap — and How to Build Control with FinOps and ML

At the early stage of cloud services, their primary advantage was considered to be the ability to avoid capital investments in infrastructure and rent resources flexibly for specific tasks.

However, as the market matured, the cloud increasingly evolved from a bonus for businesses into a significant expense item. Below, we explore why this happened, which pitfalls companies face, and, most importantly, how to bring costs under control with the help of FinOps and ML.

Mistakes in Planning or Human Factor

Sharp budget growth

Until recently, cloud expenses were considered a secondary budget item, and CIOs praised the savings of the pay-as-you-go model. Today, according to Flexera, almost every third medium or large organization spends more than $12 million per year on public clouds, and budgets are exceeded by 17% on average.

Two-thirds of IT directors (Publicis Sapient) admitted to IDC that they did not stay within budget, and more than half of CEOs are concerned about uncontrolled spending. In response, companies are expanding FinOps practices and increasingly involving external optimizers to regain control. But even these measures do not always prevent surprises: every forgotten virtual machine or service left running shows up on the bill at the end of the month. As a result, the cloud becomes one of the largest expense items for businesses.

If the problem were only growing consumption, it could be solved by limiting capacity. But the pricing models of cloud providers make everything more complicated.

Data Charges and Complex Pricing

Paying for computing power is just half the story. The real costs are often hidden in additional fees: outbound traffic, API calls, data queries, and transfers between zones. According to Wasabi (2025 Global Cloud Storage Index), on average, 49% of a cloud storage bill consists of charges for transfer and access, rather than the disk space itself.

It is no surprise that in 2024, 62% of companies exceeded their storage budgets, and 56% even froze projects because of unexpected bills for accessing their own data.

There are two reasons. The first is the complexity of pricing: providers’ price lists contain thousands of SKUs (stock-keeping units, that is, individually priced tariff positions or services), which are difficult to understand even for experienced IT directors. The second is the unpredictable growth of data: 42% of companies moved more to the cloud than they had planned, and 39% faced much higher operational fees. The result is that even with a low price per gigabyte, the bill can suddenly balloon due to “unaccounted details.”
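
To see how transfer and access fees can rival the price of capacity, it helps to split a storage bill into its components. The following is a minimal sketch: the `StoragePricing` values are hypothetical unit prices chosen for illustration, not any provider's actual price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePricing:
    """Hypothetical unit prices in USD; real price lists differ per provider and tier."""
    per_gb_month: float = 0.023      # stored capacity
    per_gb_egress: float = 0.09      # outbound traffic
    per_1k_requests: float = 0.0004  # API calls (reads, lists)


def storage_bill(stored_gb, egress_gb, requests, pricing=StoragePricing()):
    """Return the bill split into capacity and fee components."""
    capacity = stored_gb * pricing.per_gb_month
    egress = egress_gb * pricing.per_gb_egress
    api = requests / 1000 * pricing.per_1k_requests
    total = capacity + egress + api
    return {
        "capacity": capacity,
        "egress": egress,
        "api": api,
        "total": total,
        # Share of the bill that is not disk space at all.
        "fee_share": (egress + api) / total if total else 0.0,
    }


# 50 TB stored, 10 TB read back out, 200 million API requests in one month.
bill = storage_bill(stored_gb=50_000, egress_gb=10_000, requests=200_000_000)
for name, value in bill.items():
    print(f"{name:>9}: {value:,.2f}")
```

With these made-up inputs, egress and API fees come to a little under half of the total, even though the per-gigabyte storage price looks low.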

For business, this is a direct risk: without clear control, the cloud drains money faster than it delivers value. Increasingly, companies are realizing that one-time measures are insufficient: a new model of cost management is necessary.

Time to Change Strategy: FinOps 2.0 — Data Unification and Shared Responsibility

Modern FinOps is no longer just about controlling bills, but about a culture of collaboration between engineers, finance teams, and the business. The effectiveness of the cloud depends on how quickly and openly these teams exchange data and make decisions. Top-down support remains important, but increasingly, success is driven by horizontal connections: rapid expense reviews, shared access to metrics, and coordinated actions at the intersection of IT and finance.

According to the State of FinOps 2025 report, cost optimization and the fight against “cloud waste” remain top priorities. But new challenges are emerging: complex pricing, the expansion of FinOps responsibilities to SaaS, licenses, and private data centers. Here, technical expertise alone is insufficient; standardized data and cross-departmental collaboration are also required.

One of the key tools is the FOCUS specification, a unified format for cloud bills and usage metrics. It simplifies multi-cloud and hybrid scenarios where companies work with at least two providers. In practice, the implementation of FOCUS begins with a catalog of all resources, end-to-end tags for services, and automatic expense labeling. Increasingly, attention is shifting to unit economics: how much a specific transaction, session, or user costs, and where resources actually accelerate the business versus where they “burn out.”
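
The idea of a unified billing format and unit economics can be sketched in a few lines. This example assumes two invented raw export shapes and maps them onto a small subset of FOCUS-style column names (`ProviderName`, `ServiceName`, `BilledCost`, `Tags`). The provider names, raw field names, and transaction count are all hypothetical, and a real FOCUS dataset has many more columns.

```python
from collections import defaultdict

# Hypothetical raw export rows; real provider exports have many more fields
# and different names than the ones invented here.
RAW_PROVIDER_A = [
    {"service": "compute", "cost_usd": 1200.0, "labels": {"team": "checkout"}},
    {"service": "storage", "cost_usd": 300.0, "labels": {"team": "search"}},
]
RAW_PROVIDER_B = [
    {"product": "vm", "amount": 800.0, "tags": {"team": "checkout"}},
    {"product": "db", "amount": 500.0, "tags": {}},  # untagged resource
]


def to_focus_like(provider, row, service_key, cost_key, tags_key):
    """Map one raw row onto a small subset of FOCUS-style columns."""
    return {
        "ProviderName": provider,
        "ServiceName": row[service_key],
        "BilledCost": float(row[cost_key]),
        "Tags": dict(row.get(tags_key) or {}),
    }


def normalize():
    """Bring both hypothetical exports into one common row format."""
    rows = [to_focus_like("provider-a", r, "service", "cost_usd", "labels")
            for r in RAW_PROVIDER_A]
    rows += [to_focus_like("provider-b", r, "product", "amount", "tags")
             for r in RAW_PROVIDER_B]
    return rows


def cost_by_tag(rows, tag="team"):
    """Sum BilledCost per tag value; untagged spend stays visible."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Tags"].get(tag, "untagged")] += row["BilledCost"]
    return dict(totals)


def unit_cost(total_cost, units):
    """Cost per business unit, e.g. per transaction or per session."""
    return total_cost / units if units else float("nan")


by_team = cost_by_tag(normalize())
print(by_team)
print("cost per checkout transaction:", unit_cost(by_team["checkout"], 400_000))
```

Keeping an explicit "untagged" bucket is a deliberate choice here: it shows how much spend cannot yet be attributed, which is usually the first thing a tagging policy needs to shrink.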

Thus, FinOps 2.0 is no longer only about reducing costs, but about creating a single, transparent picture for all participants in the process. Here, financial and technical decisions are made jointly and in the same language.

The next stage involves automating management and transitioning from manual processes to ML-driven analytics, which identifies optimization points before they become seven-figure expenses on the bill.

New Workloads — Artificial Intelligence

The Price of Progress

It may seem that neural networks should reduce the cost of specific processes, but in practice, their implementation often results in higher expenses. According to the IBM Institute for Business Value (Gramener), average computing costs are expected to increase by 89% between 2023 and 2025, mainly due to AI-related expenses. Seventy percent of top managers cite generative AI as the primary factor driving up IT budgets: models require expensive GPU instances, regular retraining, storage of large datasets, and high network bandwidth.

AI is also changing the labor market. By the end of 2023, according to ResumeBuilder, 37% of companies that implemented AI had already replaced part of their workforce with it, and 44% planned further reductions. The reason is straightforward: AI-based automation is often viewed as a cost-effective way to reduce expenses in a volatile economy.

But more and more counter-analysis is emerging: experts warn that relying solely on replacing people is a mistake. Stanford University professor Erik Brynjolfsson notes that companies using AI to complement, rather than displace, employees gain greater benefits. In a study of call centers, operators supported by AI became, on average, 14% more productive, while service quality improved and staff turnover decreased.

Thus, the picture is contradictory: some companies are cutting staff, while others are strengthening it with the help of AI. But full replacement of people is still far off.

In 2025, researchers from Andon Labs proposed Vending-Bench—a simulation in which LLM agents had to independently manage a vending machine, including purchasing goods, setting prices, and maintaining accounts. Even top models, such as Claude 3.5 Sonnet, showed wide variation: some runs generated profit, while others ended in “bankruptcy” due to forgotten orders or infinite loops. Anthropic went further and launched a real experiment, Project Vend: Claude (“Claudius”) managed an actual automated store in the company’s office. The results were telling: the agent was able to find suppliers and interact with customers, but at the same time, it sold goods at a loss, invented non-existent partners, and handed out discounts left and right. The conclusion is obvious: even the most advanced LLMs are not yet ready for fully autonomous business management, and every company must budget for auditing, testing, and manual adjustments.

Nevertheless, AI is not only a new line of expenses but also a powerful tool for optimizing them. Increasingly, companies are leveraging AI/ML analytics for FinOps tasks, such as automatically detecting “black holes” in their cloud budgets and preventing overspending.

And while in the previous examples we saw how neural networks sometimes behave like inexperienced interns, we will now discuss how the same technologies, in the hands of experienced specialists, become the foundation of a mature cost management strategy.

Automation and ML Analytics: From Intuition to Data

AI tools are no longer just “experimental toys” but are becoming the foundation of mature FinOps, allowing problems to be anticipated before they appear on the bill.

Machine learning for cost optimization. According to Splunk and the latest FinOps reports, more and more companies are using ML algorithms to analyze patterns of resource consumption and detect anomalies. Such systems detect sudden traffic spikes in a timely manner, identify areas of inefficient utilization, and forecast future expenses based on trends and historical data. If a service suddenly starts consuming more CPU or disk resources, the algorithm not only alerts but also evaluates the most cost-effective action.
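
As a rough illustration of spend anomaly detection, the sketch below flags days whose cost deviates sharply from the preceding week. It is a simple statistical stand-in (a robust z-score based on the median and the median absolute deviation), not the ML models used by Splunk or any vendor product. The window size, threshold, and sample data are assumptions chosen for the example.

```python
from statistics import median


def find_cost_anomalies(daily_costs, window=7, threshold=3.5):
    """Flag days whose cost deviates strongly from the trailing window.

    Uses a robust z-score built from the median and the median absolute
    deviation (MAD) of the previous `window` days.
    """
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        med = median(history)
        mad = median(abs(x - med) for x in history)
        if mad == 0:
            # Flat history: fall back to a simple relative check (more than 50% off).
            is_anomaly = med > 0 and abs(daily_costs[i] - med) / med > 0.5
        else:
            score = 0.6745 * (daily_costs[i] - med) / mad
            is_anomaly = abs(score) > threshold
        if is_anomaly:
            anomalies.append((i, daily_costs[i]))
    return anomalies


# Hypothetical daily spend in USD; index 10 is a day with a forgotten GPU instance.
costs = [100, 102, 98, 101, 99, 103, 100, 97, 102, 101, 240, 104, 99]
print(find_cost_anomalies(costs))  # -> [(10, 240)]
```

The median-based score is used instead of mean and standard deviation so that a single spike does not distort the baseline for the following days.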

A special priority is right-sizing, which involves selecting virtual machine configurations according to actual workloads. ML platforms identify underutilized instances and suggest smaller configurations; as the business grows, they recommend scaling up without unnecessary costs.
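
A right-sizing recommendation can be approximated with a simple rule: take a high percentile of observed usage, add headroom, and pick the cheapest configuration that still fits. The sketch below assumes a hypothetical four-entry instance catalog and a 30% headroom; commercial tools use far richer signals.

```python
import math

# Hypothetical instance catalog: (name, vCPUs, memory in GiB, hourly price in USD).
CATALOG = [
    ("small", 2, 8, 0.10),
    ("medium", 4, 16, 0.20),
    ("large", 8, 32, 0.40),
    ("xlarge", 16, 64, 0.80),
]


def percentile(values, pct):
    """Nearest-rank percentile; sufficient for a sketch."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_size(cpu_used, mem_used_gib, headroom=0.30):
    """Pick the cheapest catalog entry that covers p95 usage plus headroom.

    cpu_used: samples of vCPUs actually in use
    mem_used_gib: samples of memory actually in use, in GiB
    """
    need_cpu = percentile(cpu_used, 95) * (1 + headroom)
    need_mem = percentile(mem_used_gib, 95) * (1 + headroom)
    fitting = [c for c in CATALOG if c[1] >= need_cpu and c[2] >= need_mem]
    if not fitting:
        return None  # nothing in the catalog is large enough
    return min(fitting, key=lambda c: c[3])[0]


# A "large" instance (8 vCPU, 32 GiB) that never uses more than 2.5 vCPUs.
cpu_samples = [1.2, 1.5, 1.1, 2.0, 2.4, 1.8, 1.3, 2.5, 1.6, 1.4]
mem_samples = [6.0, 6.5, 7.0, 6.2, 8.1, 7.4, 6.8, 9.0, 6.6, 6.1]
print(recommend_size(cpu_samples, mem_samples))  # -> medium
```

The headroom parameter is the main judgment call: too little risks throttling during peaks, too much brings back the overprovisioning that right-sizing is meant to remove.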

AI approaches from major cloud providers. At FinOps X 2025, industry leaders presented solutions that radically simplify cost control:

  • AWS — Q for Cost Optimization: an AI assistant that gathers and ranks recommendations, offers a step-by-step implementation plan, analyzes “drops” in the database (Aurora I/O Optimized Recommendations), and tracks spending spikes. The CUDOS Dashboard and built-in calculator enable modeling scenarios such as a “+15% load” and displaying the final amount on the bill.
  • Google Cloud — FinOps Hub 2.0 and Gemini Cloud Assist: detection of resources with utilization below 5%, prioritization of savings, short optimization reports, and automatic elimination of “wasteful” expenses (GKE, Cloud Run, Cloud SQL).
  • Microsoft — AI agents in GitHub Copilot and Azure AI Foundry Agent: automation of capacity optimization, application updates, selection of resource reservation strategies, and transition to more efficient solutions.

Hybrid and multi-cloud scenarios. The more providers there are, the more complex accounting becomes. Modern AI platforms aggregate data from different sources, bring it into the FOCUS standard, and provide end-to-end analytics—from the cloud to on-prem.

As a result, the cloud and AI, which were previously associated with the risks of overspending, are becoming tools of transparency, efficiency, and sustainable growth. But a complete rejection of human control has not yet happened—verification and adjustment remain mandatory.

Checklist: 7 Steps to Eliminate “Black Holes” in Cloud Spending

  1. Plan scenarios ahead
    Before starting a project, compare regions, instance types, and configurations. Model multiple budget scenarios to prevent early overspending.
  2. Turn off test environments
    Dev/test instances running 24/7 are a standard budget drain. Set up automatic shutdowns for nights, weekends, and holidays.
  3. Use commitments wisely
    For stable workloads, it is cost-effective to reserve resources (Reserved Instances, Savings Plans). Purchase in small batches and review needs regularly. (In AWS, for example, there is the Reserved Instance Marketplace, a platform where you can resell or purchase unused reservations, which helps manage costs flexibly.)
  4. Control egress and API
    Nearly half of a storage bill consists of data transfer and API fees (Wasabi data). Monitor and optimize these expenses.
  5. Right-size your resources
    Use recommendation services to avoid overpaying for extra cores, memory, and disks.
  6. Find and remove “zombie” resources
    Eliminate unused volumes, old snapshots, and forgotten services. Introduce mandatory tagging to simplify accounting.
  7. Enable monitoring with ML analytics
    Set up systems that not only display the bill but also identify anomalies, forecast expenses, and recommend automatic measures.
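
Steps 2 and 3 of the checklist come down to simple arithmetic, sketched below. All rates, instance counts, and the 35% discount are hypothetical. The break-even formula assumes a commitment that is billed for every hour of the term whether or not the resource runs.

```python
HOURS_PER_WEEK = 168


def scheduled_savings(hourly_rate, instances, on_hours_per_week):
    """Weekly savings from running dev/test only during working hours."""
    always_on = hourly_rate * instances * HOURS_PER_WEEK
    scheduled = hourly_rate * instances * on_hours_per_week
    return always_on - scheduled


def commitment_break_even(on_demand_rate, committed_rate):
    """Fraction of the term a resource must run for a commitment to pay off.

    Assumes the commitment is billed for every hour of the term,
    whether or not the resource is actually used.
    """
    return committed_rate / on_demand_rate


def commitment_saves_money(on_demand_rate, committed_rate, expected_utilization):
    """True if the expected utilization is above the break-even point."""
    return expected_utilization > commitment_break_even(on_demand_rate, committed_rate)


# Hypothetical: 20 dev instances at $0.20/h, needed 12 hours on 5 weekdays.
print("weekly savings:", scheduled_savings(0.20, 20, on_hours_per_week=60))

# Hypothetical 35% discount for a commitment ($0.13/h instead of $0.20/h).
print("break-even utilization:", commitment_break_even(0.20, 0.13))
print("steady service, 90% busy:", commitment_saves_money(0.20, 0.13, 0.90))
print("batch job, 40% busy:", commitment_saves_money(0.20, 0.13, 0.40))
```

The same break-even logic is one reason to buy commitments in small batches: each batch should cover only the portion of the workload that is expected to run well above the break-even utilization.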

By applying these steps, you can quickly localize key budget leaks and move from ad-hoc cuts to systematic ROI growth.
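
As one hedged illustration of localizing such leaks, step 6 of the checklist (finding “zombie” resources and enforcing tags) could be sketched as a pass over a resource inventory. The inventory records, field names, and the tagging policy below are invented for the example; in practice the data would come from a provider's inventory or billing export.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_TAGS = {"owner", "project"}  # hypothetical tagging policy


def find_zombies(resources, now, max_idle_days=30):
    """Return (resource id, reasons) for resources that look abandoned.

    A resource is reported if it is an unattached volume, has been idle
    for too long, or is missing any of the mandatory tags.
    """
    cutoff = now - timedelta(days=max_idle_days)
    report = []
    for res in resources:
        reasons = []
        if res["kind"] == "volume" and not res.get("attached_to"):
            reasons.append("unattached volume")
        if res["last_used"] < cutoff:
            reasons.append(f"idle for more than {max_idle_days} days")
        missing = REQUIRED_TAGS - set(res.get("tags", {}))
        if missing:
            reasons.append("missing tags: " + ", ".join(sorted(missing)))
        if reasons:
            report.append((res["id"], reasons))
    return report


# Hypothetical inventory with one healthy VM and two suspicious resources.
NOW = datetime(2025, 6, 1, tzinfo=timezone.utc)
inventory = [
    {"id": "vol-1", "kind": "volume", "attached_to": None,
     "last_used": NOW - timedelta(days=90), "tags": {"owner": "ann"}},
    {"id": "vm-7", "kind": "vm", "last_used": NOW - timedelta(days=1),
     "tags": {"owner": "bob", "project": "search"}},
    {"id": "snap-3", "kind": "snapshot", "last_used": NOW - timedelta(days=400),
     "tags": {}},
]
for res_id, reasons in find_zombies(inventory, NOW):
    print(res_id, "->", "; ".join(reasons))
```

A report like this is best treated as a list of candidates for review rather than an automatic deletion list, since old snapshots and idle volumes are sometimes kept deliberately.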

But theory is only half the story.

Next comes practice: how GitLab and one of the largest pharmaceutical corporations managed, through FinOps processes and ML analytics, to save tens of millions of dollars and turn the cloud from a cost item into a strategic asset.

How Leaders Do It: Real Cases and Numbers

GitLab: Unified FOCUS Data Standard and Improved Cost Allocation Accuracy

GitLab’s multi-cloud architecture (Google Cloud, AWS, Oracle) turned into chaos in expense accounting: different report formats, terminology, and pricing models. Budget forecasts varied, with only 30% accuracy.

The solution was the transition to the FinOps FOCUS standard. GitLab built its own data pipeline: all data from different clouds went into Snowflake, was automatically converted into the FOCUS format, and integrated into Tableau dashboards.

FOCUS provided a common language for engineers and finance teams, making it clear how much each function, customer category, or product module costs. The accuracy of cost allocation increased from 30% to 80%, making it possible to quickly adjust expenses and forecast budgets in real time.

Major Pharmaceutical Holding: $8 Million in the First Year and $19 Million in Total Savings

An international pharmaceutical company migrated dozens of applications to the cloud but postponed optimization—and quickly faced an explosive increase in bills.

It all began with two analysts and Excel. Within a year, a full-fledged FinOps center of expertise was formed. From the outset, thanks to strict tagging and the purchase of the right discount packages (Reserved Instances and Savings Plans), $8 million was saved.

Next came scaling: reports were moved to Cloudability, optimization was implemented using IBM Turbonomic, and the implementation of changes was automated through ServiceNow. The key barrier turned out not to be technology but people: more than 100 teams were motivated through gamification and public rankings, which led to cumulative savings of $19 million.

Both cases demonstrate that FinOps is effective only when it encompasses all levels—from data standardization and automation to a culture of shared responsibility. Technology provides the foundation, but sustainable ROI management emerges where engaged people reinforce tools.

Conclusion: Don’t Abandon the Cloud — Learn to Manage It

The rise of cloud technologies has made them the main catalyst of digital growth—but at the same time, one of the largest expense items. Faced with “bloated” bills, companies are often tempted to return to on-premises. However, recent years' experience proves that the problem is not that the cloud is too expensive, but rather that it cannot be effectively managed without transparency, discipline, and collaboration among all participants in the process.

We have examined why costs get out of control—from pricing and egress charges to “sleeping” instances and suboptimal commitments. We have demonstrated how a FinOps culture, data standardization through FOCUS, and ML analytics enable the transformation of chaotic spending into managed investments.

Today, the cloud is not just infrastructure but a strategic asset. Tomorrow, the key question will not be “how to save?” but “how to extract maximum value?” Value-based cloud requires planning, investment, and management—just like any other business asset.

The first step is to stop perceiving the cloud as a “black hole” in the budget. Instead:

  • Build a cross-functional team.
  • Standardize data.
  • Implement automation.
  • Train engineers to account for the cost of their decisions.

Only in this way will the cloud remain a source of flexibility and innovation—rather than a headache for the CFO.
