A year and a half with Alexa

Adrian Sanabria

Overall, if success means that my family regularly uses Amazon’s voice-enabled concierge, then yes — this product is a success. If success means that we’ve used Alexa to purchase things from Amazon, then the answer is also yes. We’ve spent about $30 on spontaneous music purchases with it since we received it almost a year and a half ago and have used it as a faster and more convenient way to order routine items we get from Amazon.

It isn’t without flaws and challenges, however. Though I’d say our experiences have been positive for the most part, this post will focus on some of the unexpected side effects and areas where this technology in general (not just Alexa) needs to improve in the future.

“Ain’t no thing like me except me!”

Also, before we continue:

  • I use Echo and Alexa interchangeably to refer to the Amazon product that is the focus of this post. I know, it’s confusing, but Amazon does it constantly, so I feel like it kinda isn’t my fault.
  • Any puns you suspect or perceive to exist as you read this post were probably intended. No, I can’t really help it.
  • When I talk about wake words, they’ll be in all CAPS, because I’m shouting. Why am I SHOUTING? Because that’s how you’re supposed to TALK to voice activated ROBOTS and FOREIGNERS.

The technical challenges of living with an always-on voice-enabled digital assistant

Random interruptions

Living with Alexa is a bit like living next door to a high-energy child with no concept of time or the boundaries a neighbor’s closed door should represent. In mid-conversation, she’ll often activate. Sometimes we’ll trace it back to a phrase, like “acting out” that could have maybe sounded like Alexa if you squint your ears (can ears squint?). A lot of the time, there’s nothing we can identify that’s even remotely like the device’s wake word that should have activated it.

“But you can train Alexa to better recogni-”

Sorry to cut you off, but I know I can spend every day training Alexa. I just don’t want to. This isn’t the scope on a sniper rifle — most of us reach a point with a consumer product where it isn’t worth investing our own time in improving and adjusting and fiddling anymore. If it’s gonna work, it’s gonna work. If not, we’ll sell it or send it back. In the US, no one is going to take a class on something marketed as intuitive. I signed up to beta test a year and a half ago — we’re long past that now.

“Shut up, Alexa.” “No one was talking to you Alexa.” “Alexa, cancel. Cancel. CANCEL!”

Also, the 7-microphone array on Alexa is really good. If anyone says Alexa on TV, she’ll activate. If Alexa is said in a song, she’ll activate. If one of my kids has a friend named Alexa (my daughter does), she’ll activate every time she (the friend) comes up in conversation. Every time. Often, there’s nothing to do but press the hardware mute button on top of the device, which turns Alexa an angry red.

On days when the kids are watching YouTube on the Xbox, we often get a double dose of unwelcome listeners. The Kinect also has sensitive microphones and enables hands-free control through voice commands. The Xbox, however, seems to think everything sounds like “Xbox”. Anyone in the room is treated to a litany of “STOP LISTENING!” punctuated by an occasional “CANCEL!”, the commands that tell the Xbox and Alexa, respectively, to cancel active listening.

Little by little, science fiction technology becomes reality…

In fact, it is more common to hear someone shout “STOP LISTENING” or “CANCEL” in our house than “GO TO BED” or “BRUSH YOUR TEETH”. It’s just a thing now. It could be worse, I suppose — or better. Imagine if the voice commands to activate and deactivate listening mode were configurable? We could be shouting “AUTOBOTS” to wake Alexa up and “TRANSFORM AND ROLL OUT” to put her back to sleep. Or “COMPUTER” to wake her up and “KHAAAAAAN” to silence her.

Why only two wake words?

UPDATE: It has been brought to my attention that there are actually three options now — “Echo” was added as a third some time after my original purchase.

Jan 14, 2017 UPDATE: Just shy of nine months from when I originally published this essay, Amazon has enabled COMPUTER as a wake word!

This brings me to what must easily be the most requested feature for Amazon’s Echo — the ability to set a custom wake word. The wake word is the only reason Alexa doesn’t think you’re constantly talking to her instead of other people in the room, someone on the phone, or the voices in your head. Without a wake word, the device wouldn’t know when to expect a command.
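To make the gating idea concrete, here is a minimal sketch of how a wake word separates "talking to the device" from "talking near the device". It works on text transcripts rather than audio, and the `extract_command` helper and the `WAKE_WORDS` tuple are made up for illustration; this is not how the Echo is actually implemented.

```python
import re

# Illustrative wake words only (an assumption, not Amazon's configuration).
WAKE_WORDS = ("alexa", "amazon", "echo", "computer")

def extract_command(transcript, wake_words=WAKE_WORDS):
    """Return the words after the first wake word, or None if none is heard."""
    words = re.findall(r"[a-z']+", transcript.lower())
    for i, word in enumerate(words):
        if word in wake_words:
            # Everything after the wake word is treated as the command.
            return " ".join(words[i + 1:])
    return None  # no wake word: the device should stay quiet

print(extract_command("Alexa, order more coffee"))       # order more coffee
print(extract_command("How was your day?"))              # None
print(extract_command("Is Alexa coming over tonight?"))  # coming over tonight
```

The last line is the accidental wake in miniature: a simple gate like this cannot tell a command from a sentence that merely contains the word.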

Currently AMAZON and ALEXA are the only two options. I suspect both are probably equally annoying in terms of random accidental wake-ups, but people are practically frothing at the mouth to have this thing activate when they utter “COMPUTER!” in their best Picard. Trekkies everywhere would be crossing off bucket list items. Half the current Echo owners would probably change the wake word to COMPUTER overnight if they could. That wouldn’t end the problems with accidental wake-ups though. I suspect even CTHULHU would result in the occasional accidental wake (intended).

Jan 14, 2017 update: Computer is finally an option!

I can only suspect Amazon is well aware of accidental wakes, and is afraid that some well-meaning engineer will unintentionally make things even worse by changing the wake word to MARY, because his wife’s name is Mary, and he wants to surprise his wife by showing her how much he loves her.

“Hey, welcome home Mary, how was your CANCEL! day?”

“Before you get started on something, Mary, I had a CANCEL! surprise for you…”

“You know what, Mar<ahem>Wife that I Love, why don’t you go ahead and do whatever you were planning a moment ago. I’ve got to figure something out.”

It creeps out my wife

It is no coincidence I’m bringing this up immediately after mentioning random interruptions. Sometimes the house is completely silent and Alexa activates. My wife looks up from her Nook at me and says, “I really do need you to take a closer look at this thing”. My background is in computer security, so she’s referring to my technical due diligence skills. In other words, “could you please, please make sure this thing isn’t spying on us?”

Learning to use Alexa can be an exciting and empowering experience

Alexa’s greatest achievement and greatest flaw are the same thing: she alternates between convincing you she can answer just about any question and inevitably failing to do so. The fact that the ability to answer new questions is constantly being updated encourages us to try things. When it works, it is a beautiful thing, and you beam like a parent who just witnessed their kid learn something new. You want to tell everyone about it.

The magic of discovering a new feature

My wife and I were planning to finally go see Deadpool. I had seen the movie without her and knew she’d love it. The day before, she reminded me I needed to check times. I knew if I opened a new tab on my laptop to check movie times, I was risking getting sucked into a rabbit hole of everything else that’s new in movie-land, or unfinished tasks and projects represented by other browser tabs (like this article). Instead, I decided to try Alexa.

“Alexa, when are showings for Deadpool tomorrow?”

Alexa promptly stated the nearest movie theater and started reading off times. It then continued with the next closest movie theater and respective times. She then asked if I wanted to hear times for theaters farther away. I said no, as she had already given me exactly what I needed to know.

Yessss.

FIRST TRY!

In another example, my wife had been reminding me for a few days that we needed to order more coffee, which we buy from Amazon (Kicking Horse’s Kick Ass). Upset at myself for repeatedly forgetting, I opened up a new tab and stopped — “wait, isn’t this exactly what Alexa is for?” So I tried it:

“Alexa, order more coffee.”

“Judging from your ordering history, it looks like Kicking Horse Coffee, Kick Ass, Whole Bean Coffee, 2.2 Pound is the best match. Would you like to order it now for $29.00?”

Worked on the first try. Magic. This is the perfect example of what you hope the technology of the future will be like. You think something, you say it, it happens.

The kids use the heck out of it too. They use it to settle debates, check the weather, set timers, play music and make each other laugh. When friends are over, they love to demonstrate it. Kids are really the best at selling this sort of technology. Their enthusiasm is infectious as their curiosity drives them to explore what Alexa can and can’t do.

The value of simple interfaces for the… ‘focus-challenged’ among us

This is me. ©2009–2014 H. Caldwell Tanner

There’s a chance this observation might only be relevant to an adult with ADHD like myself, but the significance of it is very important to me. Typically, if I have a single task to do that requires using a computer (laptops and phones qualify), I have to be extremely diligent not to get distracted. The siren song of the Internet is one I’m particularly susceptible to.

I originally made fun of smartwatches. Over time, however, I realized how a smartwatch could potentially reduce the problem illustrated so well in this comic and others. In the coffee example above, I realized Alexa was addressing this same problem. Both are simple, more elegant interfaces for accomplishing simple, Internet-related tasks. I wonder how far we can take this…

Learning to use Alexa can be like playing a game of Simon Says where Simon’s name could be practically anything

Sadly, it doesn’t always go so smoothly, and that’s Alexa’s greatest failing. On the one hand, you’ve just been completely amazed that something you didn’t know she could do two minutes ago worked on the first try. Then, you try something else, and that glee turns to disappointment. Why? Because this device has done amazing things, so when Alexa fails to answer a question or handle a request, your first instinct is that it might be your fault for not asking it correctly. This often results in a bizarre guessing game, as you ask the same question again and again, moving around words, commands and grammar, looking for the format Alexa might be expecting.

Here’s the exchange that actually got me sitting down to write this post.

My son was sitting at the table, inspecting a Brazilian ‘real’ (Brazil’s smallest bill). He wondered aloud what it was worth in US dollars. I wondered aloud if Alexa could tell us. My son asked first.

“Alexa, how much is one Brazilian real?”

“Sorry, I couldn’t find the answer to your question.”

Fair enough, he wasn’t specific. Undeterred, he tried again.

“Alexa, what’s the conversion rate from real to American Dollars?”

“Searching for How many American dollars?”

“$1 is worth, for example, €0.89 or ¥110.93 JPY.”

Having used my son’s two tries to think a bit about how to word things, I decided to give it a try.

“Alexa, what is one US dollar worth in brazil?”

“$1 is 3.57 Brazilian reals.”
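Strictly speaking, this is the inverse of what my son asked: he wanted the value of one real in dollars. Getting there from Alexa's answer is a single division, sketched below using the rate she quoted (the variable names are just for illustration).

```python
# Rate quoted by Alexa in the exchange above: 1 US dollar = 3.57 Brazilian reals.
usd_to_brl = 3.57

# Invert the rate to get the value of one real in US dollars.
brl_to_usd = 1 / usd_to_brl

print(f"1 real is about ${brl_to_usd:.2f}")  # 1 real is about $0.28
```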

Not too bad — we got there after three tries. Not satisfied with letting Alexa off that easy, my son posed a followup question.

“Alexa, what is an American dollar worth in Honduras?”

“Sorry, I couldn’t find the answer to your question.”

Next try.

“Alexa, what is one American dollar worth in hunder us?” (this is what Alexa heard, not what my son said)

“Sorry, I couldn’t find the answer to your question.”

“Alexa, how much is one dollar in Honduras?”

“Sorry, I couldn’t find the answer to your question.”

Before long, we can’t even keep straight which syntax we have and haven’t tried.

My son moves on to Norway: “Alexa, what currency does Norway have?”

“Sorry, I couldn’t find the answer to your question.”

“Alexa, what form of currency does Norway use?”

“Sorry, I couldn’t find the answer to your question.”

No dice. I step in again: “What unit of currency is used in Norway?”

“Norway’s official currency is the Krone”

My son moved on to Iceland’s currency and struck out twice again before giving up.
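I have no idea how Alexa actually parses questions, but the behavior above feels like the difference between the two toy matchers in this sketch. Everything here is hypothetical (the `RIGID` template, the `KNOWN_COUNTRIES` list, the function names); it only illustrates why one phrasing can work while two equivalent ones fail.

```python
import re

def normalize(utterance):
    """Lowercase, drop a leading wake word and punctuation, collapse spaces."""
    text = utterance.lower()
    text = re.sub(r"^alexa\b", "", text)
    text = re.sub(r"[^a-z ]", " ", text)
    return " ".join(text.split())

# Hypothetical rigid template: exactly one phrasing is understood.
RIGID = re.compile(r"^what unit of currency is used in (?P<country>[a-z ]+)$")

def rigid_match(utterance):
    """Return the country only if the utterance fits the template exactly."""
    m = RIGID.match(normalize(utterance))
    return m.group("country") if m else None

# Hypothetical list of countries the loose matcher knows about.
KNOWN_COUNTRIES = ("norway", "iceland", "honduras", "brazil")

def loose_match(utterance):
    """Return the country if 'currency' and a known country both appear."""
    words = normalize(utterance).split()
    if "currency" not in words:
        return None
    for country in KNOWN_COUNTRIES:
        if country in words:
            return country
    return None

questions = [
    "Alexa, what currency does Norway have?",
    "Alexa, what form of currency does Norway use?",
    "What unit of currency is used in Norway?",
]
for q in questions:
    print(q, "| rigid:", rigid_match(q), "| loose:", loose_match(q))
```

Only the third question satisfies the rigid template, while the keyword-based matcher accepts all three. The loose version has its own problems (it would happily misread unrelated sentences that mention a currency and a country), which is part of why this is harder than it looks.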

“But Adrian”, you might say, “they send out newsletters and have online help and reference…”

The thing is, you’ll have it right one day, and it’s easy to completely forget the correct syntax the next. It’s a lot like writing code — “I know I missed a bracket or parens somewhere…” In fact, I’ve had to avoid turning into ‘that guy’. I encourage my family to use Alexa, but then I started admonishing them for not recalling the precise sequence of words that unlocked the dusty tomb filled with knowledge of exchange rates and types of currency.

“No”, I say, “you’ve got to put the command first when you say it!”

My kids and I are okay with the learning curve; I’ve even written down commands to remember how to say them. My wife though, she’ll ask once or twice, and if it doesn’t work, she’ll shrug and do the task some other way.

It’s a reminder that we’re not all so enamored of the idea of a digital voice-enabled assistant that we’re willing to work for it. After all, who’s supposed to be assisting whom here?

Is the future we hope for forever doomed to be flawed?

The expectations for this technology are high. We’ve all seen Star Trek. We already have preconceived notions about how this should work. Intuitiveness is what’s on sale here — something like the Amazon Echo doesn’t work if you need a manual. Complicating this is that I just started using Google’s equivalent in the car. I’ve quickly discovered that after activating it with the “Ok, Google” watchphrase, Alexa’s lexicon comes out of my mouth, which often fails. I realized it is ME that has been trained. Now, if I want equivalent capabilities in the car, I have to learn Google’s command language. I’ve had guests come over and yell “Siri” at Alexa.

Once again, fragmentation and walled gardens sour the sweetness of innovation. Why? When innovation comes from business, it has to be competitive and make a profit. Businesses want their products to be better than the competition, so they differentiate. “Our product has this thing that theirs doesn’t have.”

And then, at the end of this evening, my son sums it up in just a few words: “this thing can’t handle vocal ambiguity”. <proudparent>As I recovered my jaw from the table</proudparent>, I realized he was exactly right, and that digital assistants have a ways to go before they can succeed in handling the majority of questions we casually toss at them, no matter how we might phrase them.