One of the
most chatty, interesting and passionate people about voice that I met at the
former ETel was Thomas
Howe. So I figured the other week it would be good to interview him to get
his perspective. Due to distance we used Skype for the interview. Tom is the
CEO of the Thomas Howe Company
and was formerly a CTO for a business unit at Comverse.
The interview can be downloaded
here in 64kbps-cbr mp3 format (it's 15 meg in size and 34 minutes in
length). Although I cannot hear a difference, if you believe you can, a
96kbps-cbr version is here
in mp3 format (it's 23 meg in size).
Below is
some text to give you an idea without listening to the audio interview, but it
is by no means a transcript or a full account of the interview (so
please listen to it). Rather, it is taken from notes I made in real time. I've
also added hyperlinks where I believe they could be beneficial to
understanding.
I started
by asking him whether, in the sphere of person-to-person communications, voice
has ceased to be the jewel, that voice is now somewhat passé, to which he
responded:
yeah, yeah
I think so, the message I am trying to give is that, what has happened for
voice, is that we tend to think of the applications as being voice applications,
so if you're talking about voice applications, you're talking about PBXs and
conferencing, and pre-paid, PBX sort of functionalities, when, I think in
reality, voice is much more powerful
when it is used to enhance another application which has nothing to do with
voice at all.
Tom then
went on to give an example using Morisky Surveys:
if you're
a patient and you're going to get a prescription for drugs, there is a 4 question
survey that you could take that would predict if you're going to finish your
course of care, finish your drugs, and
if you're able to use a voice form to ask the patient as they give the drugs, if
they are going to take those drugs, you can give them a much better experience
of care, and that has nothing to do with
voice at all, it has to do with making
people more healthy. But by using voice you can reach these people who you
could not reach before in a very controlled and inexpensive way. So I think the next wave of voice is
nothing to do with voice applications but using voice to enhance other
applications.
Asked if
voice then takes a secondary seat:
...voice
could be applied to a hundred thousand applications, just none of them happen
to have anything to do with voice
Once I had
a grasp of where Tom was coming from, I fired a more extended question at him: "Do
you see these applications that voice could be added to, as being new applications
as in completely new or do you see this as just voice enhancing existing
applications - is this new space or enhancing existing space?", to which he
replied:
...I'm not talking about voice being used to
solve new problems. I'm talking about voice being used to solve older problems,
better.
Asked for another
example Tom replied:
One good
example is in the area of logistics, where you have an organisation that is
responsible for managing computers, it could be vending machines, it could be
computers, could be fleet management, anything to do with taking care of
inventory outside of a corporate wall. One problem is tracking data and what
happens to those assets. One way of
extending your business process outside the firewall is by using voice to any
black phone, so imagine that you're doing asset repair out in the field, you
could use your phone to call into the corporation, give what happened to the
asset and hang up. This has some real great advantages. First of all none of
the corporate data goes across the firewall. Secondly it works with every single
phone. Smart phone or not. Thirdly it is very inexpensive to implement for the
enterprise manager. The fourth thing is it is a very easy business case to make.
The fifth thing is there is really no other technology today that could give
you that sort of bang for the buck. Not even web browsers because if you're
running them in the truck, you have the whole issue of wireless Internet access
and smart phones, it's a real pain. So
by using voice in the asset tracking applications, you make that problem much
easier to solve, but it has nothing to do with voice.
When asked
for example clients which the Thomas
Howe Company works with, he replied:
I'll give
you a couple of them. The first client that we are working with, actually it
is a pretty neat project, is a company called Perot Systems down in Texas and with Perot
we're doing a password reset for their corporate facility. So what we are doing
at Perot is taking a job function which
is a guy sitting at a desk who resets
passwords when people forget them and we're replacing it with a voice script. So instead of calling a human being to get
your password reset, you call this voice script. Now this not only reduces
their headcount on the desk, which is really important to them, but also
ensures that they have a controlled, repeatable and consistent process of applying security
to passwords...
We're also
doing an application for a charity called Poverty Action, they are funded by
the Bill and Melinda Gates Foundation and they do research into finding ways of
alleviating poverty in the third world using communications, so we did one
application for them that allows agents in the field to give micro-loan
offers, in a controlled and auditable way, to merchants using any phone. It doesn't require any smart phones,
these are poor countries, but they have fraud issues and by using voice to
carry that application to the field they are able to enable a much wider reach
of their field workers. I can give you more examples, we're working with the
large financials in New York City;
it's a big market for us, we're sort of focusing in on disease management and pharmaceuticals,
and this financial stuff.
In relation
to his talk he said he would bring along more examples and "hard data" from
Forrester and other analyst groups, which say exactly how much money
enterprises save by deploying voice mashups, and that the "numbers are very very
impressive".
Asked if he
was working with many companies' APIs,
he responded that he was. Asked specifically if he was working yet with Ribbit (conference sponsor) he said that he
was currently learning it.
I said that
Microsoft seemed to be pushing in the voice mashup direction with
NetworkMashups.com. Tom replied:
...I must
admit I spend more time on ProgrammableWeb.com
than the Microsoft site...Microsoft has obviously recognized the importance of
mashup technology, I hope they keep an ethos and keep it open...NetworkMashups.com has 200-300 that
are listed there, ProgrammableWeb has almost 3000.
Following
on from this I put it to him that one of the innovation paths that BT is looking down is providing an API to its
new network, known as BT 21CN,
and so I asked whether or not he'd had a chance to look at the BT 21CN API:
I surely
have, and I had a real great opportunity to speak with the general manager of
future voice at the Sylantro global user summit last fall and he shared with me
some of their work and their efforts and I'm really thrilled by what they are
doing. I mean if I was to predict the future of carriers, in a world where most
voice is, or at least most voice applications have mashup infrastructures, I
really think British Telecom is the shining light and is doing a fantastic job.
The API is very interesting, I think the fact that it is running on BT's network,
you can guarantee that it will always work, it is not going to change
capriciously, is a very valuable thing....it is amazing that a company that size is
being that entrepreneurial.
I followed up
by asking if the cost is still prohibitive:
It's
funny, I never think about that. I tell you why. The cost BT has for their offering is only prohibitive if you are
thinking about horizontal services. If you're thinking about vertical services
and I go back to the Morisky Survey, the cost to the country, to the health
care providers, to the people, in real money, that can be saved simply by
identifying those people who will not finish their dose of penicillin, makes
the cost of an API infinitesimal, who cares, it does not matter.
Really when I am looking at these applications, the returns, the ROIs for my customers are so high and
the money they are saving so big and their relative volume is so small that it
really doesn't matter what the costs are, given that they are somewhat
reasonable.
Asked if he
had looked at Vodafone's Betavine
and how it contrasts with the 21CN API:
the great
thing about the Vodafone API is that they are focusing in on location based
services, which I think is a wonderful addition to what we're doing and I know
there is a company here in Massachusetts called Where which is working with
other carriers but really in my mind Vodafone is the leading company in enabling
applications ...you know we've heard about location based services quite a bit
and unfortunately for me, most of the times I hear about location based services,
I hear about advertising possibilities or ...actually
I am more interested in how they can help the enterprise business process
and I think the fact that Vodafone has made their API so widely available,
allows these mid to large companies to do their experiments to understand what
sort of LBS applications make sense. One
that I happen to like is workforce automation that figures out when
the Comcast technician goes into your house, how long does he stay there and
when does he leave? If you can get that information, you can figure out how
much to bill that customer or you could do efficiency studies, you could relate
that sort of repair took that long and just by figuring out how long someone
was at a certain place, is enough to understand all that data.
Asked if
there is an API service out there that lets you take speech in and send out the
transcript via SMS, Tom said you could gang it together with a speech-to-email
API, of which there are a few, and then you just pipe that into SMS.
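To make that chaining idea a little more concrete, here is a rough Python sketch of my own, not anything Tom showed. The `transcribe` and `send_sms` callables are hypothetical stand-ins for whichever speech-to-text and SMS services you pick; the only real logic here is cutting the transcript into pieces that fit the classic 160-character limit of a single SMS.

```python
from typing import Callable, List

SMS_SEGMENT_CHARS = 160  # classic single-SMS limit for GSM 7-bit text


def split_for_sms(text: str, limit: int = SMS_SEGMENT_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, breaking on spaces."""
    segments: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is hard-cut into pieces.
        while len(word) > limit:
            if current:
                segments.append(current)
                current = ""
            segments.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            segments.append(current)
            current = word
    if current:
        segments.append(current)
    return segments


def speech_to_sms(
    audio: bytes,
    transcribe: Callable[[bytes], str],    # hypothetical speech-to-text service
    send_sms: Callable[[str, str], None],  # hypothetical SMS gateway: (number, body)
    to_number: str,
) -> int:
    """Transcribe the audio, then send the transcript as one or more SMS messages."""
    transcript = transcribe(audio)
    segments = split_for_sms(transcript)
    for segment in segments:
        send_sms(to_number, segment)
    return len(segments)
```

In practice you would pass in small wrappers around the real services; a speech-to-email API, as Tom suggests, would simply add an inbox-reading step inside `transcribe`.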
Tom
finished off by stating:
By using voice in their application they can
enforce consistency that is hard to do in other ways. So they can do a consistent
collection of data by using voice forms that they would not have to do
otherwise.
I had to
fire in just one more and asked about the discrepancy between the global reach
of Internet players' APIs such as Google Talk compared to
a pretty nationally focused company like British Telecom, in particular
questioning interactivity between someone in the States and someone in the UK.
Tom replied:
depends on
how important localization is going to be