Big data is becoming heavy data – and that’s a problem

Image credit: Little Adventures / Shutterstock.com

Data mass is beginning to exhibit gravitational properties – it’s getting heavy – and eventually it will be too big to move. How will networks handle this? And what does it mean for digital identity, privacy and trust?

Those are questions Guy Lupo, General Manager, Head of NaaS 2020 at Telstra, has been asking as he ponders the future of digital transformation. He has some ideas about the answers.

Lupo has created an interesting slide that envisions an intelligent distributed network powered by open APIs.

He discussed it recently at TM Forum Action Week, in a session on the tension between the need for ubiquitous availability of data and the expectations of trust that customers (and, in many countries, regulators) have about how data is used. Communications service providers (CSPs) and their partners must weigh this tension as they develop network and IT architectural models, including the Open Digital Architecture (ODA).

Centers of data, not data centers

Lupo suggests that as data becomes too big to move, it will become centralized.

“If you look at IoT, where hundreds and thousands of sensors are creating massive amounts of data, it’s getting too big to move – we are starting to have centers of data, not data centers,” he says. “And if you have centers of data…you will be waking up to a world of APIs.”

Lupo predicts that in the future everything will be ‘as a service’, automated and accessible via APIs. In addition, everything will have a digital identity with a trusted computing base so that data can be pulled from centralized locations.

“In the next six to ten years, there will be identity for sensors, identity for data, even identity for parts of your data records,” Lupo says. “You will be able to expose parts of your health record because your data elements will have an identity.”

Once data is centralized, he expects that “AI agents” will travel between the centers of data, learning and executing policy, without carrying copies of the data with them, only the learnings: “Data and learning are localized, and AI moves around making decisions and taking decisions within the geographical jurisdiction of the governing body that owns the data.”
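
The pattern Lupo describes, where an agent moves between centers of data and takes away only what it has learned, can be illustrated with a minimal Python sketch. Everything here is hypothetical: the DataCenter and Agent classes, the site names and the readings are invented for illustration, and a running sum and count stand in for whatever "learnings" a real agent would carry.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """Hypothetical 'center of data': raw readings stay inside it."""
    name: str
    jurisdiction: str
    readings: list[float] = field(default_factory=list, repr=False)

    def learn(self) -> tuple[float, int]:
        """Return only a local summary (sum, count), never the raw readings."""
        return sum(self.readings), len(self.readings)


@dataclass
class Agent:
    """Hypothetical travelling agent that carries learnings, not data."""
    total: float = 0.0
    count: int = 0
    visited: list[str] = field(default_factory=list)

    def visit(self, center: DataCenter) -> None:
        # The computation runs where the data lives; only the summary leaves.
        local_sum, local_count = center.learn()
        self.total += local_sum
        self.count += local_count
        self.visited.append(center.name)

    def mean(self) -> float:
        """Global average derived purely from the per-center summaries."""
        return self.total / self.count if self.count else 0.0


# Illustrative data only.
centers = [
    DataCenter("sydney", "AU", [20.0, 22.0]),
    DataCenter("kuala-lumpur", "MY", [30.0, 32.0, 34.0]),
]

agent = Agent()
for center in centers:
    agent.visit(center)

print(agent.visited, agent.mean())
```

The agent ends up with a global average without ever holding a copy of any center's readings, which is the property the quote emphasizes. A real system would also need to enforce which jurisdiction each decision is taken in, and this sketch does not attempt that.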

The time is now

It’s critical to start thinking about data security and privacy now. Once AI agents are traversing networks and making decisions, it will be too late to overlay architectural principles. Lupo says we need to consider data in terms of minimal exposure, privacy, confidentiality, longevity, availability, integrity and digital footprint.

For example, most people have no idea how large their digital footprint is.

“Do you really know how much of you is out there?” Lupo asks. “Google is your public digital footprint… At the moment, the cost of compute is not low enough for Google to be able to afford a CPU dedicated to you, but at some point they will have one. At that point you won’t be able to hide in plain sight inside statistics as you do today.”

Realms matter when it comes to data privacy

Lupo and other TM Forum members discussed the role the Forum can play in helping CSPs navigate these issues of rapidly increasing amounts of big data, network and IT transformation, and customer privacy and trust.

“There is no language around this, so there’s an opportunity within the Forum to talk about it because we sit in the IT realm and are the hosts of all this data,” Lupo says.

One important area of focus could be how data can be used within specific realms. Many architectural models, including ODA, assume that data will be ubiquitously available as part of a giant data lake, but there is no real understanding or agreement about how trust requirements may affect the use of this data.

In the EU, for example, strict rules about customers’ privacy are enforced through the General Data Protection Regulation (GDPR). These rules give consumers a right to privacy and a right to be forgotten, and often specify in which realms data can be used.

“If you’re using data for something you’ve got permission to use it for, you can put as much data into the data lake as you like,” says George Glass, VP, Architecture & APIs, TM Forum, who before joining the Forum was Chief Systems Architect at BT. “But you can run into problems even when you think you’ve anonymized the data.”

Anonymizing data isn’t enough

Glass gives a real-world example of a GPS satellite navigation system, which uses anonymized subscriber data to improve route guidance and give users driving directions based on time of day, typical patterns of congestion, etc.

He explains: “The data is completely anonymized. However, if [as a CSP] I’ve got the GPS data and I can link that to mobile phone data, which also gives me location, I can very, very quickly work out what the anonymization ID is of an individual, and I can tell you exactly where he’s been based on his GPS information. If I bring the anonymized, safe GPS data into my data lake and start processing it and link it to the individual, he is going to get very agitated that I know where he’s been because I don’t have his consent to use the data in that way.”
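
The linkage Glass describes can be reproduced in a few lines. The Python sketch below is illustrative only: the pseudonyms, subscriber IDs, grid cells and the min_overlap threshold are all invented, and real location data would need fuzzy matching in time and space rather than exact set intersection.

```python
# Anonymized navigation traces: pseudonym -> set of (hour, grid_cell) sightings.
gps_traces = {
    "anon-7f3a": {(8, "A1"), (9, "B2"), (18, "C3")},
    "anon-19c2": {(8, "D4"), (12, "B2"), (18, "E5")},
}

# Mobile network records held by the same operator: subscriber -> sightings.
mobile_records = {
    "subscriber-001": {(8, "D4"), (12, "B2"), (18, "E5"), (22, "D4")},
    "subscriber-002": {(8, "A1"), (9, "B2"), (18, "C3"), (23, "A1")},
}


def link(traces, records, min_overlap=3):
    """Map each pseudonym to the subscriber whose sightings overlap most.

    A pseudonym is linked only if at least `min_overlap` sightings coincide.
    """
    matches = {}
    for pseudonym, points in traces.items():
        best, score = max(
            ((subscriber, len(points & seen)) for subscriber, seen in records.items()),
            key=lambda pair: pair[1],
        )
        if score >= min_overlap:
            matches[pseudonym] = best
    return matches


reidentified = link(gps_traces, mobile_records)
print(reidentified)
```

With only three coinciding sightings per person, both pseudonyms resolve to a named subscriber. The anonymized dataset was safe on its own; the re-identification comes entirely from joining it with a second location source.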

Forum members who are working on ODA could explore this idea of realms by dissecting an example like GPS, exploring how data is stored and processed using machine learning, and looking at decisions that are made and implemented using AI.

“We’re starting to talk about data, machine learning and AI, and we’re also talking about business practices in terms of owners of data and purpose of use within a realm,” Glass says. “We should be coming up with patterns or guidance on what you can do with the data – or to say these are the kinds of things you can do with the data if you join two realms.”
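
One way to picture the kind of guidance Glass has in mind is a purpose-of-use check that runs before data from two realms is joined. The Python sketch below is an assumption, not a TM Forum or ODA pattern: the Dataset class, the realm and purpose names, and the rule that a join is allowed only for purposes permitted on every dataset involved are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset tagged with its realm and consented purposes."""
    name: str
    realm: str
    purposes: frozenset[str]


def allowed_purposes(*datasets: Dataset) -> frozenset[str]:
    """Purposes permitted when the datasets are used together.

    Assumed rule: a purpose is allowed only if every dataset permits it.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    result = datasets[0].purposes
    for dataset in datasets[1:]:
        result = result & dataset.purposes
    return result


def can_use(purpose: str, *datasets: Dataset) -> bool:
    """True if `purpose` is permitted for this combination of datasets."""
    return purpose in allowed_purposes(*datasets)


# Illustrative realms and purposes only.
navigation = Dataset(
    "gps_traces", "navigation", frozenset({"route_guidance", "traffic_stats"})
)
mobile = Dataset(
    "cell_locations", "mobile_network", frozenset({"network_planning", "traffic_stats"})
)

print(can_use("route_guidance", navigation))          # single realm
print(can_use("route_guidance", navigation, mobile))  # two realms joined
print(can_use("traffic_stats", navigation, mobile))   # purpose shared by both
```

Under this assumed rule, route guidance is fine within the navigation realm but is refused once the mobile network data is joined in, while aggregate traffic statistics remain allowed because both realms permit them.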

If you’re interested in participating in this work, please contact George directly via wgglass@tmforum.org. To learn more about Guy Lupo’s views on the future of digital transformation, check out his blog.

Written by Dawn Bushaus, managing editor at TM Forum | Original story posted at TM Forum Inform
