How do AI business models relate to data business models?

Image by NongAsiMo | Bigstockphoto

We’re living in a time when AI is disrupting many services. But what is the impact of AI on business models? That is still quite unclear. Some AI solutions are implemented on top of very traditional business models. Other approaches to adopting AI could change the whole software market, and there are many variations in between. But is it really about AI, or is it more about data and making all services more data- and customer-oriented?

Pouring money into hardware

Even VC investors have yet to settle on their approach to the AI market. They have invested heavily in some startups that focus on training AI models and need a lot of expensive hardware. Some feel they can achieve better AI models that way and then offer SaaS-type AI services. But it is easy to challenge this model: hardware always becomes more powerful, so training an excellent AI model gives you a competitive advantage only for a limited time. After that, other players can make similar attempts with cheaper hardware.

On the other hand, I just saw a reply from a VC to an investment deck. They didn’t feel it would be possible to sell more software and data to enterprises in the future, especially considering that enterprises can do so much more in-house with their own AI. Should we really believe that each enterprise will train its own AI models to perform all kinds of tasks and will no longer buy external services (SaaS) and data? It is a very interesting thought. However, I’m sceptical for several reasons.

Currently, many companies live in their own data silos. They basically use their own data plus data they have been able to buy from data brokers. And as I wrote earlier, many companies are not yet innovative enough to explore new models for working with data. Companies haven’t even successfully combined data from their own silos (just think how bad most chatbots are). New AI models, including LLMs, open new opportunities to use different data sources as input, but it is still hard to see companies fully utilizing their own data, and even harder to see them utilizing external data (legally and without paying).

Do you have rights to data?

Data ownership and copyright are becoming more complex. These considerations also raise a more important question: whose data can you really use for training? OpenAI and Google are already facing several lawsuits on those matters. We see multiple battles over how generative AI can use a person’s appearance or voice. There are also questions about the legality of using content from books to train AI. And in many countries, privacy developments are strengthening the protection of personal data.

This also has an impact on business models. It is not simple to get data for free, not even training data. Then there is the question of the quality of the data sources. With ChatGPT, we have already seen many cases where the data sources are not good or vetted, and the model’s output is unreliable.

So, we will soon find ourselves in a situation where the question is not really about creating AI models and using or selling their processing capacity, but about being able to use the right data. Either you pay for that data or offer concrete value to those who make it available, or you just sell AI processing to those who have the data (and even then, you still need to pay or offer value to those who enable you to train your models).

Positions to make money

What roles might exist in this more data-oriented value chain in the future? Here are some plausible scenarios:

  1. Companies that focus on licensing original data, i.e., parties that own data that is valuable for AI training or AI services. There will probably be different kinds of platforms where the licensing is done, and there will be different models for compensation (not only money but also the personal value gained from sharing data somewhere).
  2. Companies that facilitate data access and processing for AI training and services. This doesn’t necessarily mean they hold rights to all the data; rather, they enable its use, for example with federated AI-type solutions that train models without combining the data.
  3. Companies that focus on training superior AI models and services for generic use or niche use cases.

One company can, of course, have more than one position in the value chain. But as we can see, this value chain is more about data than AI.

It is about data

There are also differences between consumer-focused data and services and enterprise-first data and solutions. The fundamentals are similar, although each needs different implementation details (e.g., facilitating consumer data and making services for consumers rather than making enterprise services). In both cases, you need to offer value to data owners, respect copyright, and have a lot of top-quality data to train proper AI models and operate AI services.

Having top-level data is fundamental for anyone who wants to train and offer top-level AI services. In the future, anyone will be able to make poor or not-so-reliable AI services. They can generate mediocre, unreliable information and content; they can generate second-rate songs or movies using some people’s voices and appearances; and they can offer consumer and enterprise AI services that are not top of their class. But to make something really competitive, of top quality, and that people are willing to trust and use, you need access to top data. Even if you can use open-source data or other publicly available data, you need to know which data to use and how.

Many companies, startups and investors should start to think more about data strategies and value chains rather than only about AI service business cases. For now, you may still be able to buy powerful hardware and train with publicly available data, but that doesn’t guarantee a long-term position. It also takes competence to find the optimal data sources, e.g., for many expert and niche services. So, you need access to top-quality data, but to be successful, you also need to understand which data makes you successful in your business.
