Category Archives: Siri

The State of Siri

The question reporters have asked me most often in the last couple of weeks, as we get ready for Apple’s Developer Conference, is whether Apple’s Siri personal assistant is behind its competitors, and whether it can catch up. The answer is more complicated than a simple yes or no, and having some context for next week’s announcements is important to evaluating them properly when the time comes.

Comparing Apples with Apples

First off, many of the comparisons being made at the moment set apples against oranges (no pun intended). What I mean by that is that we’re at a point in the calendar where we’re comparing everyone else’s 2016 products (and in many cases announcements of products that aren’t yet available) with Apple’s 2015 versions. Since Apple only makes major changes to Siri, and to its software in general, once a year, we’re still looking at last year’s versions ahead of WWDC next week, while all the other major consumer tech companies have already held their developer events this year and thus shown their hands.

Adding to the “unfairness” of the comparison, in many cases competing products aren’t actually available yet, and won’t be for months. So evaluating Apple’s position in digital assistants (and artificial intelligence more broadly) today makes a lot less sense than it will this time next week, when we know what Siri will look like in the second half of 2016. If past patterns continue, at least some of the new features will likely be available to developers almost immediately, to participants in Apple’s iOS beta program shortly after, and to everyone with an iPhone in September.

Three components to digital assistants

Even though people talk about voice-based assistants in a unitary fashion, there are really three main components to these products, and if you want to evaluate an individual example, you have to break it into these constituent parts. Those three parts are:

  • Voice recognition – turning sounds into individual words
  • Natural language processing – turning collections of words into phrases and sentences with specific meanings
  • Responses – serving up results from a cloud service

An effective digital assistant needs to be good at all three of these things in order to do the overall job well. First, it has to recognize the words accurately, then it has to properly identify the meaning of the set of words the user says, and then it has to serve up a response based on the set of things it’s capable of doing.
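
To make that division of labor concrete, here’s a minimal sketch of the three-stage pipeline in Swift. To be clear, every type and protocol name here is hypothetical, invented purely for illustration; none of this reflects Apple’s (or anyone else’s) actual APIs.

```swift
// A minimal, hypothetical sketch of the three-stage pipeline described above.
// All of these names are invented for illustration; they are not real APIs.

import Foundation

struct Intent {
    let action: String                 // e.g. "set_alarm"
    let parameters: [String: String]   // e.g. ["time": "7:00 AM"]
}

protocol SpeechRecognizer {
    // Stage 1: turn raw audio into a string of words.
    func transcribe(_ audio: Data) -> String
}

protocol LanguageUnderstander {
    // Stage 2: turn a string of words into a structured intent.
    func parse(_ utterance: String) -> Intent
}

protocol ResponseService {
    // Stage 3: fulfil the intent using first or third party services.
    func respond(to intent: Intent) -> String
}

struct Assistant {
    let recognizer: SpeechRecognizer
    let understander: LanguageUnderstander
    let services: ResponseService

    // A weakness at any one of the three stages degrades the whole experience.
    func handle(_ audio: Data) -> String {
        let words = recognizer.transcribe(audio)
        let intent = understander.parse(words)
        return services.respond(to: intent)
    }
}
```

The point of laying it out this way is that the stages are chained: a weakness at any single one drags down the whole experience, which is why it’s worth evaluating each separately.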

Siri is competitive but not a leader today

For today, Apple’s Siri is decent but not stellar on the first two points. Google’s voice search and Amazon’s Echo both do a better job of recognizing individual words and of ascribing meaning to phrases and sentences. The gap isn’t huge, but it’s noticeable, while Microsoft’s Cortana generally performs roughly on par with Siri in my experience. On the third point, Siri has expanded the range of tasks it can perform, but it’s still limited mostly to things Apple’s own services can do, with a handful of third party services feeding in data on particular topics. Google’s voice search can pull in a little more third party data and has a much wider range of first party data to draw from, while Amazon’s Alexa assistant has an open API that has resulted in connections to many third party services, though a large number are from tiny companies you’ve never heard of. Cortana is, again, roughly on par with Siri here.

On balance, then, Siri is roughly in the same ballpark as its competitors, but lags slightly behind both Google and Amazon in all three of the key areas. Though it’s not available yet, Google has also announced the next generation of its digital assistant, called simply “the Google assistant”, which will be able to respond to text as well as voice queries and engage in conversations with users. This is a capability Cortana already has, but others, including Siri, don’t. It should roll out to users over the next few months, but it’s hard to evaluate how effective it will be based on keynote demos alone.

Where Apple might go next week

Returning to my first point, as of right now we’re comparing the 2016 versions of others’ products to the 2015 versions of Apple’s, so the question becomes how Apple might move Siri forward at WWDC and close the gap in these various areas. Across those three areas, the most likely changes are:

  • Voice recognition – Apple has been continually improving its voice recognition, and although this is the area where we’ve seen the fewest concrete rumors ahead of time, I would expect it to talk up further improvements at WWDC
  • Natural language processing – Apple has acquired a variety of companies with expertise in artificial intelligence recently, and among them is VocalIQ, which specializes in conversational voice interactions. I would expect significant improvements in natural language processing including multi-step conversations to be announced at WWDC, which should move Apple forward in a big way in this area
  • Responses – the biggest thing holding Siri back right now is its lack of third party integrations, and especially the inability of developers to make functionality in their apps available to Siri. Were that to change at WWDC – which seems likely with a Siri API – it would dramatically improve the utility of Apple’s voice assistant (see the sketch below for what such an API might look like).
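
For a sense of what such an API might look like, here’s a purely hypothetical sketch of a third party intent handler, using a ride-hailing app as the example. Every name in it is invented; we won’t know the real shape of any Siri API until Apple actually announces one.

```swift
// Purely hypothetical: a sketch of what a third party Siri API might look like,
// using a ride-hailing app as the example. None of these types exist today.

struct RideRequestIntent {
    let pickup: String        // parsed by Siri from the user's speech
    let destination: String
}

protocol RideRequestHandling {
    // Siri would hand the parsed intent to the app, and the app would return
    // a confirmation for Siri to speak or display back to the user.
    func handle(_ intent: RideRequestIntent, completion: (String) -> Void)
}

final class ExampleRideHandler: RideRequestHandling {
    func handle(_ intent: RideRequestIntent, completion: (String) -> Void) {
        // The app would book the ride with its own backend here, then reply.
        completion("Booking a car from \(intent.pickup) to \(intent.destination).")
    }
}
```

The appealing part of a design like this is that Apple would keep the voice recognition and natural language processing on its side, handing third party apps a structured intent and asking only for a result Siri can speak or display.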

I’ve so far focused mostly on the voice aspects of these digital assistants, but Apple also added other elements last year, and might continue to build on them this year. In 2015, it introduced the Proactive elements of Siri, which proactively serve up contacts, apps, news, and other content through notifications and in the Spotlight pane in iOS. The main area where I’d like to see it add more functionality this year is text interactions with Siri, which could happen either in the standard Siri interface or through iMessage, such that Siri would appear as just another contact you could exchange text messages with. Apple could even open up iMessage as a platform for bots and conversational user interfaces from third parties, which would help it keep pace with announcements from Facebook, Microsoft, and Google in this area.

The other thing worth bearing in mind is that these digital assistants are only useful when they’re available. Amazon’s Echo does very well where it’s present, but its biggest weakness is that Amazon has only sold around three million of the devices, and its Alexa assistant isn’t available on phones, the devices we carry everywhere with us. Google’s assistant is pervasive, available on Android and iPhone, on the web, and elsewhere, while Siri is available in some form on most of Apple’s devices (with the exception of the Mac). Cortana is available on PCs running recent versions of Windows, but its availability on phones does little to help, since there are so few Windows phones in use. If Apple extends Siri to the Mac at this year’s WWDC, another credible rumor, that will make it even more ubiquitous in the lives of those committed to the Apple ecosystem.

Changing the narrative

What Apple faces as it heads into WWDC is a growing narrative that it’s falling behind in both AI generally and digital assistants specifically. As I’ve already said, given the quirks of the calendar, Apple has naturally been silent while others have revealed their 2016 plans, so this comparison is partly unfair. But Apple has a chance at its developer conference to demonstrate that it’s committed not just to keeping up but to establishing leadership in these areas. By Monday afternoon, we’ll be in a much better position to judge whether it has succeeded in changing the narrative.

Quick thoughts: on Apple’s subtle machine learning improvements

John Gruber had a post on Tuesday about the improvements to Siri over time. He was mostly talking about the core functions of Siri itself, but I’ve noticed some other, subtler improvements over the past couple of years that I’ve been meaning to write up for ages. Technically, these aren’t part of Siri per se, but they sit in the same category as Cortana’s or Google Now’s equivalent functionality, so I see them as almost an extension of Siri. What’s striking to me is that, as far as I can tell, Apple hasn’t explicitly talked about them, and since they don’t trigger explicit notifications, many users might not even notice them.

I grabbed a screenshot of an example of this back in February 2014, and I think it’s the best way to explain the sort of thing I’m talking about:

[Screenshot: 2014-02-08 07.08.45]

It’s the first paragraph there that I’m focusing on, and I need to provide some context so you can understand what happened here. Early last year I played for a few weeks in a church basketball tournament. I had games regularly each Saturday at a similar time, and at the same location. In my calendar I put simply “Basketball” and never entered the address, since I already knew how to get there. What Apple’s machine learning engine did here was, as far as I can guess [1], something like the following (sketched in rough code after the list):

  • Note that I had an item called “Basketball” in my calendar for that morning
  • Make a connection with past appointments on Saturday mornings also called “Basketball”
  • Look up past location behavior in its location database to connect a particular location with past instances of “Basketball” in my calendar
  • Look up this address and calculate driving time between my current location and this destination
  • Present it to me at a relevant time in the Today screen.
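
For the technically inclined, here’s a rough Swift sketch of that chain of inferences as I imagine it. This is pure speculation on my part: the types and logic are invented for illustration and bear no relation to Apple’s actual implementation.

```swift
// A rough sketch of the heuristic I'm guessing at above. The types and logic
// are my speculation, purely for illustration, not Apple's actual code.

import Foundation

struct CalendarEvent {
    let title: String
    let start: Date
    let location: String?    // nil when, as in my case, no address was entered
}

struct LocationVisit {
    let latitude: Double
    let longitude: Double
    let arrival: Date
}

func inferLocation(for event: CalendarEvent,
                   pastEvents: [CalendarEvent],
                   visits: [LocationVisit]) -> LocationVisit? {
    // 1. Find past events with the same title ("Basketball").
    let matches = pastEvents.filter { $0.title == event.title }

    // 2. For each past occurrence, find where the device actually was
    //    around that event's start time (within an hour, say).
    let candidates = matches.flatMap { past in
        visits.filter { abs($0.arrival.timeIntervalSince(past.start)) < 3600 }
    }

    // 3. If the same place keeps showing up, treat it as the likely
    //    destination (trivially here, just take the most recent candidate).
    return candidates.last
}

// The remaining steps (calculating driving time from the current location and
// surfacing it on the Today screen at the right moment) would build on this.
```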

Again, Apple has talked up some functionality around using locations explicitly entered in your calendar to provide these sorts of alerts, but I’m not sure it’s ever talked about the deeper machine learning in evidence here. I’ve never seen exactly this sort of extrapolation from past behavior since this occasion, but I have received other notifications on this screen that it’s time to leave for appointments where I’ve explicitly entered a location in my calendar, based on heavy traffic (it happened to me this past week at CES, for example).

I wonder how long it will be before Apple starts surfacing this information more proactively with notifications rather than merely displaying it reactively when users open this screen. And I wonder if Apple will begin talking about this functionality more in future – I note that Microsoft is currently running ads which state that Siri can’t tell you if you need to leave for a meeting based on traffic. That’s technically true – Siri is entirely reactive today – but iOS does tell you, if you know where to look. That seems like something Apple could easily fix, but it would turn its machine learning capabilities from a background feature so subtle that many users likely miss it into a headline feature. And with that might come concerns about the data Apple is gathering and how it’s using it. It may have decided that, for now at least, it would rather keep quiet about all this, but I wonder how long that will last.

Notes:

  1. Another possibility is that Apple merely built a pattern of my past behavior on Saturday mornings without explicitly connecting them to the calendar item. I don’t see that it makes a big difference either way.