Trivia Question: When was the first example of network-based music streaming launched?
I’ll bet many of you guessed that it was Spotify in 2006, or Pandora in 2000. Maybe some of you guessed RealAudio, back in 1995.
But the actual answer is over a century earlier. It was the Théâtrophone, first demonstrated in 1881 in Paris, with commercial services around Europe from 1890. It allowed people to listen to concerts or operas with a telephone handset, from another location across town. It even supported stereo audio, using a headset. It finally went out of business in the 1930s, killed by radio, although by then another form of remote audio streaming – Muzak, delivering cabled background music for shops and elevators – was also popular.
Why is this important? Because these services used “remote sound” (from the Greek tēle + phōnē) over networks. They were voice/audio communications services.
Yet they were not “phone calls”.
Over the last century, we’ve started to use the words “voice communications”, “telephony” and “phone calls” interchangeably, especially in the telecoms industry. But they’re actually different. We often talk about “voice” services being a core component of today’s fixed and mobile operators’ service portfolios.
But actually, most telcos just do phone calls, not voice in general. One specific service, out of a voice universe of hundreds or thousands of possibilities. And a clunky, awkward service at that – one designed 100+ years ago for fixed networks, or 30+ years ago for mobile networks.
*Phone rings, interrupting me*
“Oh, is that Dean Bubley?”
“Yes, that’s me”
“Hi, I’m from Company X. How are you today?”
“I’m fine, thanks. How can I help you?”
… and so on.
It’s unnatural, interruptive and often unwanted. A few years ago a 20-something gave me some words of wisdom: “The only people who phone me are my parents, or people I don’t want to talk to”. He’s pretty much right. Lots of people hate unsolicited calls, especially from withheld numbers. They’ll leave their phones on silent. (They hate voicemails even more.)
I used to go into meetings at operators and ask them “Why do people make phone calls? Give me the top 10 reasons”. I’d usually get “to speak to someone” as an answer. Or maybe a split between B2B and B2C. But never a list of actual reasons – “calling a doctor”, “chatting to a relative”, “politely speaking to an acquaintance but wishing they’d get to the point”.
Now don’t get me wrong – ad-hoc, unscheduled phone calls can still be very useful. Person A calling Person B for X minutes is not entirely obsolete. It’s been good to speak to friends and relatives during lockdown, or a doctor, or a bank or prospective client. There are a lot of interactions where we don’t have an app to coordinate timings, or an email address to schedule a Zoom call.
But overall, the phone call is declining in utility and popularity. It’s an undifferentiated, lowest-common-denominator form of communications, with some serious downsides. Yet it’s viewed as ubiquitous and somehow “official”. Why do web forms always insist on a number, when you never want to receive a call from that organisation?
Partly this relates to history and regulation – governments impose universal service obligations, release numbering, collect stats & make regulations about minutes (volume or price), determine interconnect and wholesale rates and so on. In turn, that has driven revenues for quite a lot of the telecom industry – and defined pricing plans.
But it’s a poor product. There are no fine-grained controls – perhaps turning up the background noise-cancellation for a call from a busy street, and turning it down on a beach so a friend can hear the waves crashing on the shore. There’s no easy one-click “report as spam” button. I can’t give cold-callers a score for relevance, or see their “interruption reputation” stats. I can’t thread phone calls into a conversation. Yes, there’s some wizardry that can be done with cPaaS (comms platforms-as-a-service) but that takes us beyond telephony and the realm of the operators.
Beyond that, there’s a whole wider universe of non-call voice (and audio) applications that operators either don’t consider at all, or consider only a few of. For instance:
- Easy audioconferencing
- Voice-to-text transcription (for consumers)
- Voice analytics (e.g. for behavioural cues)
- Voice collaboration
- Voice assistants (like Alexa)
- Audio streaming
- One-way voice / one-way video (e.g. for a doorbell)
- Telecare and remote intercom functions for elderly people
- Telemedicine with sensor integration (e.g. ultrasound)
- IoT integrations (from elevator alarms to smartwatches)
- “Whisper mode” or “Barge-in” for 3-person calls
- Voice biometric security
- In-game voice with 3D-positioning
- Veterinary applications – who says voices need to be human?
There are dozens, maybe hundreds of possibilities. Some could be blended with a “call” model, while others have completely different user-interaction models. Certain of these functions are implemented in contact centre and enterprise UCaaS systems, but others don’t really fit well with the call/session metaphor of voice.
I’ve talked about contextual communications in the past, especially with WebRTC as an enabling technology, which allows voice/video elements to be integrated into apps and browser pages. I’ve also written before about the IoT integration opportunities – something which is only now starting to pick up (Disclosure: I’m currently working with specialist platform provider iotcomms.io to describe “people to process” and event-triggered communications).
But what irritates me is that the mainstream telecoms industry has just totally abdicated its role as a provider and innovator of voice services and applications. You only have to look at the mobile industry currently talking about Vo5G (“5G Voice”) as a supposed evolution from the VoLTE system used with 4G. It’s basically the same thing – phone calls – that we’ve had for over 100 years on fixed networks, and 30 years on mobile. It’s still focused on IMS as a platform, dedicated QoS metrics, roaming, interconnection and so on. But it’s still exactly the same boring, clunky, obsolescent model of “calls”.
There was a golden opportunity to rethink everything for 5G and say “Hey, what *is* this voice thing in the 2020s? What do people actually want to use voice communications *for*? What interaction models and use-cases? What would make it broader & more general-purpose?” In fact, I said exactly the same thing around 10 years ago, when VoLTE was being dreamed up.
Nothing’s changed, except better codecs (although HD voice was around on 3G) and lame attempts to integrate it with the even-worse ViLTE video and perennially-useless RCS messaging functions. The focus is on interoperability, not utility. Interop & interconnection are nice-to-haves for communications. Users need to actually like the thing first.
Some of the vendors pay lip-service to device integration and IoT. But unless you can tune the underlying user interface, codecs, acoustic parameters, audio processing, numbering/identity and 100 other variables in some sort of cPaaS, it’s useless.
I don’t want a phone call on a smartwatch – I want an ad-hoc voice-chat with a friend to ask what beer he wants when I’m at the bar. I want tap-to-record-and-upload of conversations, from my sunglasses, when someone’s trying to sell me something & I suspect they’re scamming me. I want realtime audio-effects, like an audio Instagram filter that makes me sound like I’m a cartoon character, or 007. (I don’t want karaoke, but I imagine millions do.)
So remember: the telecoms industry doesn’t do “voice”. It just does one or two voice applications. VoLTE is actually ToLTE. It’s not too late – but telcos and their suppliers need to take a much broader view of voice than just interoperable PSTN-type phone calls. Maybe start with Théâtrophone 2.0?
Written by Dean Bubley, an outspoken industry analyst, strategy advisor & chair/speaker on telecoms, 5G, Wi-Fi, spectrum policy, IoT & futurism. This post was first published via Dean’s LinkedIn Newsletter.