OSS data – more useless than useless!


About 6-8 years ago, I was becoming achingly aware that I’d passed well beyond an information overload (I-O) threshold. More information was reaching my brain each day than I was able to assimilate, process and archive. What to do?

Well, I decided to stop reading newspapers and watching the news and, in fact, almost all television. I figured that those information sources were empty calories for the brain. At first it was just a trial, but I found that I didn’t miss it much at all, so I kept going. Really important news seemed to find me at the metaphorical water-cooler anyway.

To be completely honest, I’m still operating beyond the I-O threshold, but at least it’s (arguably) now a healthier information diet. I’m now far more useless at trivia game shows, which could be embarrassing if I ever sign up as a contestant on “Who Wants to be a Millionaire.” And missing out on the latest news sadly makes me far less capable of advising the Queen on how to react to Meghan Markle’s latest royal “atrocity.” The crosses we bear.

But I’m digressing markedly (and Markle-ey) from what this blog is all about – O.S.S.

Let me ask you a question – is your OSS data like almost everybody else’s (i.e. also in I-O mode)?

Seth Godin recently quoted 3 rules of data:
“First, don’t collect data unless it has a non-zero chance of changing your actions.
Second, before you seek to collect data, consider the costs of processing that data.
Third, acknowledge that data collected isn’t always accurate, and consider the costs of acting on data that’s incorrect.”

Removing the double-negative from rule #1 gives: only collect data if it has even the slightest chance of changing your actions.

Most people take the perspective that we might as well collect everything because storage is just getting so cheap (and we should keep it, not because we ever use it, but just in case our AI tools eventually find some relevance locked away inside it).

In the meantime, pre-AI (rolls eyes), Seth’s other two rules bring further sanity to the situation. Storing data is cheap, except where it has to be timely and accurate enough to base decisive, reliable actions on.

So, let me go back to my revised version of rule #1. How much of the data in your OSS database / data-lake / data-warehouse / etc has even the slightest chance of changing your actions? As a percentage?
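If you wanted to put a rough number on that percentage, one option is to measure how much of what you store has not been read in, say, the last year. The sketch below is only an illustration under some loud assumptions: the inventory list, its field names (`size_bytes`, `last_read`), the dataset names and the 365-day window are all hypothetical, and "not read recently" is only a proxy for "has no chance of changing your actions."

```python
from datetime import datetime, timedelta


def unused_share(datasets, now, window_days=365):
    """Return the percentage of stored bytes not read within the window.

    `datasets` is a hypothetical inventory: a list of dicts with
    `size_bytes` (int) and `last_read` (datetime, or None if never read).
    """
    cutoff = now - timedelta(days=window_days)
    total = sum(d["size_bytes"] for d in datasets)
    if total == 0:
        return 0.0  # nothing stored, so nothing is going unused
    stale = sum(
        d["size_bytes"]
        for d in datasets
        if d["last_read"] is None or d["last_read"] < cutoff
    )
    return 100.0 * stale / total


# Made-up numbers purely for illustration, not real OSS figures.
now = datetime(2024, 1, 1)
inventory = [
    {"name": "alarm_history", "size_bytes": 600, "last_read": None},
    {"name": "perf_counters", "size_bytes": 300, "last_read": datetime(2021, 6, 1)},
    {"name": "active_inventory", "size_bytes": 100, "last_read": datetime(2023, 12, 15)},
]

print(f"{unused_share(inventory, now):.0f}% of stored bytes unread in the last year")
```

Weighting by bytes rather than by table count is a design choice here; counting tables or fields instead would answer a slightly different question.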

I suspect a majority is never used. And as most of it ages, it becomes even more useless than useless. One wonders, why are we storing it then?
