Business Reporter

AI’s total cost of ownership


Danila Pavlov at Nebius argues that it’s time to focus on the Total Cost of Ownership of AI

 

Artificial intelligence (AI), machine learning (ML), and generative AI (GenAI) are restructuring industries around the world, forcing incumbents to react and creating opportunities for disruptive start-ups.

 

As these incumbents adapt and add AI and ML applications to their product lineup or internal toolsets, they will have to understand and overcome the potentially high costs of using public cloud infrastructure.

 

Given the unique requirements of developing and deploying AI at scale, one strategy is to focus on the Total Cost of Ownership (TCO), rather than just considering headline compute costs.

 

As Andreessen Horowitz has highlighted, relying solely on large cloud computing providers can pose challenges to managing operating costs and maintaining healthy profit margins.

 

While reliable and able to operate at scale, they are seldom the most cost-effective source of compute and other resources. By that estimate, the top 50 public software companies that rely on cloud infrastructure are collectively forfeiting approximately $100 billion in market value as a result.

 

Given the current boom in AI, the compute requirements of both training and deploying GenAI applications, and the associated demands for GPU access, the present situation is conceivably even worse. AI start-ups spend most of their funding on compute, so deploying that capital optimally can be a make-or-break decision. While the stakes may not be quite as high for established businesses, failing to implement AI cost-effectively at the start can cause major problems down the line. 

 

One issue with using major cloud providers to develop and deploy AI models is pricing. Not only are they among the most expensive compute providers when considering the typical per-hour rate for access to NVIDIA Hopper GPUs, but they also charge for additional computing resources, data transfer, load balancing, IP addresses, and more.
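To see why headline GPU rates understate the true bill, it can help to itemise a month of spend. The sketch below is purely illustrative: every rate and line item is an assumed placeholder, not any provider's actual pricing.

```python
# Illustrative monthly TCO for a single 8-GPU training node.
# All figures are hypothetical placeholders, not real list prices.

gpu_hour_rate = 12.00  # $/GPU-hour for a Hopper-class GPU (assumed)
gpus = 8
hours = 730            # hours in an average month

compute = gpu_hour_rate * gpus * hours

# Line items that are easy to overlook when comparing headline rates:
extras = {
    "data_egress":   2_500,  # $ for moving datasets and checkpoints out
    "load_balancing":  300,
    "static_ips":       40,
    "block_storage": 1_200,
}

tco = compute + sum(extras.values())
print(f"compute only: ${compute:,.2f}")
print(f"with extras:  ${tco:,.2f} (+{sum(extras.values()) / compute:.0%})")
```

Even with these modest made-up extras, the real monthly figure sits several per cent above the headline compute cost, and egress charges in particular scale with data volume.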

 

While it is possible to estimate the TCO, the best pricing is generally reserved for large companies prepared to commit to a certain amount of compute capacity. Locking in a 36-month deal might sound like an obvious solution for steady demand, but it is unlikely to be cost-effective if a company’s compute demands are volatile, as is usually the case with start-ups.
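Whether a long reservation pays off comes down to utilisation. A rough break-even sketch, using invented on-demand and discount rates, shows how quickly volatile demand erodes a reserved-capacity discount:

```python
# Hypothetical rates: on-demand vs a 36-month reservation discount.
on_demand_rate = 12.00  # $/GPU-hour, billed only for hours used (assumed)
reserved_rate = 7.20    # $/GPU-hour at a 40% discount, billed for every hour

def monthly_cost(hours_needed, reserved_hours=730):
    """Return (on-demand, reserved) cost for one GPU for one month."""
    on_demand = hours_needed * on_demand_rate
    # A reservation bills the full committed block whether used or not.
    reserved = reserved_hours * reserved_rate
    return on_demand, reserved

for utilisation in (0.3, 0.5, 0.9):
    od, res = monthly_cost(730 * utilisation)
    cheaper = "reserved" if res < od else "on-demand"
    print(f"{utilisation:.0%} utilisation: on-demand ${od:,.0f} "
          f"vs reserved ${res:,.0f} -> {cheaper} wins")
```

With these assumed numbers, the reservation only wins above roughly 60 per cent sustained utilisation; a start-up whose demand swings between training bursts and quiet periods can easily fall below that line.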

 

Optimising and maintaining cloud infrastructure also demands a different set of skills from developing applications. Unless an organisation already has a DevOps or MLOps team, it will likely need to hire additional employees or contractors with the required skill set.

 

Many companies underestimate these costs: effective cloud infrastructure management typically requires 24/7 support from these professionals, and organisations should factor that in when calculating the TCO.
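The staffing line is larger than it first appears, because round-the-clock coverage cannot be provided by a single hire. A back-of-envelope sketch, with an assumed fully loaded salary figure:

```python
# Estimating the headcount behind 24/7 infrastructure support.
# The salary figure is an illustrative assumption, not market data.

coverage_hours = 24 * 7  # 168 hours of coverage needed per week
fte_hours = 40           # one engineer's working week
buffer = 1.2             # handover, holiday and sickness allowance

ftes_needed = coverage_hours / fte_hours * buffer
loaded_cost_per_fte = 150_000  # $/year, salary plus overheads (assumed)

annual_ops_cost = ftes_needed * loaded_cost_per_fte
print(f"FTEs for 24/7 cover: {ftes_needed:.1f}")
print(f"annual staffing line in the TCO: ${annual_ops_cost:,.0f}")
```

Five engineers rather than one is the realistic shape of this cost, which is why it so often surprises teams budgeting for their first production AI deployment.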

 

While ditching public cloud providers and deploying servers onsite is an option, it demands major upfront infrastructure investment and requires a large and experienced engineering department. For large enterprises it may well be the best way to optimise the TCO, but for most businesses it is likely to require too much financial and human capital.

 

For most companies, the best way to optimise TCO is to use an AI-specific cloud provider. These offer more transparent pricing and flexible implementation options, which give businesses greater control over the costs of developing AI applications and deploying them to customers at scale.

 

Not only that, but because their infrastructure is optimised for AI and ML workloads, they are typically cheaper than the big three hyperscalers. While they may not offer the same wide range of services, staying focused on AI applications keeps their costs down, with flexible pricing models and per-token pricing so that users only pay for the resources they need.
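Per-token pricing makes inference costs straightforward to forecast from expected traffic. A minimal estimate, using invented per-token rates and a hypothetical traffic profile:

```python
# Hypothetical per-token rates, not any provider's actual pricing.
input_rate = 0.20 / 1_000_000   # $ per input token (assumed)
output_rate = 0.60 / 1_000_000  # $ per output token (assumed)

def monthly_inference_cost(requests_per_day, in_tokens, out_tokens, days=30):
    """Forecast a monthly inference bill from average request sizes."""
    per_request = in_tokens * input_rate + out_tokens * output_rate
    return requests_per_day * per_request * days

# e.g. 50,000 requests/day, 800 input and 300 output tokens each:
cost = monthly_inference_cost(50_000, 800, 300)
print(f"estimated monthly inference bill: ${cost:,.2f}")
```

Because the bill scales linearly with tokens, a team can model cost per customer or per feature before launch, rather than discovering it on the first invoice.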

 

As the cloud provider handles all the hardware and deployment optimisation, there’s typically no need to hire additional DevOps or MLOps staff. Instead, the existing engineering team should already have the necessary skills, helping to keep the TCO down.

 

AI-specific cloud providers can also allow developers to provision their own resources through self-service platforms. Transparent, on-demand pricing makes it easy to calculate costs without committing to reserved capacity that risks sitting unused.

 

The current AI boom shows no signs of stopping. The companies best positioned to take advantage of the opportunities it creates will be those that are able to optimise their costs. When it comes to AI, that means focusing on TCO.

 


 

Danila Pavlov is CFO at Nebius, which offers flexible pricing models for GPU cloud from as little as $1.50 per GPU hour. Companies can also access its AI Studio inferencing service via APIs with per-token pricing.

 

Main image courtesy of iStockPhoto.com and MicroStockHub

Business Reporter

Winston House, 3rd Floor, Units 306-309, 2-4 Dollis Park, London, N3 1HF

23-29 Hendon Lane, London, N3 1RT

020 8349 4363

© 2024, Lyonsdown Limited. Business Reporter® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543