
It is clear nowadays that data is valuable, and many parties try to gather it. But it was interesting to hear that, for startups, a shortage of data can be a bigger issue than a shortage of money. At the same time, many businesses have lots of data that is not fully utilized. There is still a gap to fill in this data business era by adjusting the models used to create data and AI applications.
A friend who teaches AI at a top university and consults for many tech companies had some interesting comments. In the current market, it is difficult for AI startups to raise money, but getting data is sometimes a bigger issue. He said many startups have big AI plans but no data to implement them.
This is, of course, obvious because you need data for AI. But he was able to crystallize it well, and it also made me think again about all the startups I have looked at most recently. Which of them have data, which of them have access to data, which of them could get or generate unique data and which companies have data that is not utilized?
Data is available
I have been working with a startup called RAIN.global, which creates AI development tools that can be used in telcos’ radio access networks (RAN). We have talked about how telcos (or CSPs) have a lot of data they still do not utilise. For example, they could easily save on electricity costs by better optimizing the use of network components and by using stored power from batteries.
Others concentrate on consumer data and applications that allow consumers to utilize their own data, i.e., user-held data models. Consumers typically have the right to use the data themselves. But this is still very under-utilized. At the same time, many companies try to develop new ways of accessing data from people but are finding less success with that model.
There is also a lot of public data (e.g. traffic, weather, air quality, available services) that could be combined with proprietary data sources (including telco and user-held data) to make new services. For example, weather forecasts can be combined with RAN data to better manage electricity needs in hot weather, or air quality data can be combined with exercise data.
LLM and GPT tools can also utilize a great deal of text data, including news and open source intelligence (OSINT). These tools could be utilized for risk management, for logistics planning and for reacting to certain situations.
These examples demonstrate that there are many places and new concepts where getting data for AI would be quite straightforward. But it also looks like companies, including startups, are quite fixed on a business model that relies on other parties giving them data.
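To make the weather and RAN example a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the names `SiteHour`, `join_by_hour` and `hours_to_watch`, the thresholds, and the numbers are made up and do not come from RAIN.global or any real telco system. It simply joins an hourly public forecast with an operator's own hourly load data and picks the hours where both are high.

```python
from dataclasses import dataclass


@dataclass
class SiteHour:
    hour: int               # hour of day, 0-23
    forecast_temp_c: float  # public weather forecast (made-up value)
    site_load_kw: float     # operator's own RAN site load (made-up value)


def join_by_hour(forecast: dict[int, float], load: dict[int, float]) -> list[SiteHour]:
    """Combine the two sources, keeping only hours present in both."""
    hours = sorted(forecast.keys() & load.keys())
    return [SiteHour(h, forecast[h], load[h]) for h in hours]


def hours_to_watch(rows: list[SiteHour],
                   temp_limit_c: float = 30.0,
                   load_limit_kw: float = 4.0) -> list[int]:
    """Return hours where both temperature and load reach the (hypothetical) limits."""
    return [r.hour for r in rows
            if r.forecast_temp_c >= temp_limit_c and r.site_load_kw >= load_limit_kw]


# Illustrative data only: hour -> value
forecast = {12: 28.5, 13: 31.0, 14: 33.5, 15: 32.0}
load = {13: 3.2, 14: 4.6, 15: 4.1, 16: 3.9}

rows = join_by_hour(forecast, load)
print(hours_to_watch(rows))  # prints [14, 15]
```

The point is not the code itself but how little is needed: neither data set has to be handed over to a third party, and the public side is free to use.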
New approaches needed
Getting data from other parties, businesses, and consumers has become more difficult. Data privacy laws and regulations are one factor, but as parties better realize the value of data and the risks of giving it away, it will become even harder to access. Now it sounds almost naïve if a startup makes a business plan that depends on attracting other parties to hand over significant amounts of data.
Based on the examples above, we can have at least three models to make new AI solutions so that data is readily available:
- Make AI tools and applications for businesses that have data but haven’t been able to utilize their data properly.
- Make user-held data-based AI solutions for consumers and tools that make it easy to collect their data for their personal AI services.
- Utilize publicly available data (including OSINT, commercially available reports or data sources, and public sector free data).
The main issue is not that there are no opportunities to make AI solutions that could access needed data; the problem is the mindset of companies wanting to get data for themselves.
This is not very surprising when we have read for 20 years how valuable it is to get data and own it. And many times, this has been the case. Of course, it helps a company to protect its own position if it owns that data and the AI tools to utilize it. But it is important to realize it is getting more difficult all the time, and there are plenty of opportunities to do business with other data models.
VCs are also a part of the problem
They have liked the model of owning data, and some seem to be sceptical about other models to utilize data and develop AI. They should also see that new models can offer new opportunities to disrupt the market.
Many companies indeed have problems because they have a business and technology model where they should get and own a lot of data, but they cannot get data. At the same time, it is also true that there is a lot of data in many places that is publicly available.
So, it is not true that a shortage of data, as such, is the main problem. Too many business and technology plans are based on a data ownership and availability model that is no longer feasible. It means that to make new AI tools, we also need models that offer other, better ways to get and utilize data.