The Pavlov’s Beep post got me thinking a bit about how user expectations can set a product up for failure out of the gate… and that made me think of the late, lamented (by some), lambasted (by many) Newton.

Ignoring the obvious hype surrounding the rollout—something the product could never live up to—one of the Newton’s coolest features was also the one that caused the most frustration: handwriting recognition.

Handwriting and speech recognition are two areas that have seen a lot of advancement in the past few years—for example, the recognizer that comes with Windows XP Tablet PC Edition 2005 Elite Pro Extreme (or whatever that product is called) is exceptional, and from all reports is improving in its next iteration. And, according to those who use these things routinely, like David Pogue, the latest Naturally Speaking (ex-Dragon, now ScanSoft) is surprisingly accurate as well.

All that said, though, they’re doomed to teeny market segments, because no matter how accurate they are, when they make mistakes they frustrate their users.

Why? Because we’ve asked the computer to play by our rules, rather than agreeing to play by its rules.

When a device imposes an unnatural input method on its user, any mistakes you make are your own. Consider Graffiti, as used on Palm OS: it specifically requires you to make certain “letter shapes” if you want them to be recognized. It doesn’t try to recognize your handwriting. If a letter comes out wrong, you might be frustrated, but you know it’s “your fault”—you didn’t make the shape right.

But, with the Newton, it was supposed to recognize your handwriting. Not an alphabet of its own design. And, when it didn’t, the first thing that ran through your head was: this thing sucks! Even if it got 90% right, which is pretty remarkable considering many people can’t read their own handwriting!

Speech is hard, too—especially if used in an “unrestricted vocabulary” scenario. Not only do you have to “speak” all punctuation (try it sometime), but correction is an incredible pain. And speech is such a natural part of most people’s daily interaction that it’s near impossible for a recognizer to rise to an acceptable level of performance.

Newton, near the end of its life, tried to get around this problem by taking a bit of a Graffiti approach: it encouraged people to print, and there’s a lot less letter variation in printed characters. Suddenly, with the new Rosetta printed character recognizer (still in use in Mac OS X today), people started saying the Newton did a good job recognizing handwriting when—in actuality—it just shifted the burden onto the user some more by restricting input.

Similarly, speech is becoming more common in situations where the vocabulary is highly restricted and expectations are low: package tracking #s; pick 1-2-3 menus; interacting with “speed trade” stock systems. Correction is pretty easy and, given the relatively robotic nature of the interaction, expectations are low. (You’ll note that as the systems got better, they started using more animated, natural voices at the other end—raising expectations they knew, from testing, they could meet.)

So—if you’re designing a UI, remember that the way you “frame” the interaction often sets user expectations.

Leaving aside speech and handwriting, free-form input can be really, really cool—see Simson Garfinkel’s SBook and Microsoft Outlook’s date/address fields for examples.

But if you put up a free-form field, you’d better make damn sure that it accepts all sorts of wacky variations. If date-based, expect things like “tomorrow”, “next wednesday”, etc.—every one you miss, given the free-form nature of the field you presented, is the fault of the program, from the user’s perspective. Similarly, what about international addresses? And how do you tell the user you’ve made the inevitable mistake?
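To make that concrete, here’s a rough sketch (in Python; the phrase list and function name are mine, purely for illustration, not lifted from any real product) of what “accepting the wacky variations” means for a date field. The important bit is the last line: when the program can’t make sense of the input, it should say so plainly instead of guessing silently.

    # A minimal sketch of the leniency a free-form date field implies.
    # A shipping parser would need far more phrases and formats than this.
    from datetime import date, timedelta

    WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
                "friday", "saturday", "sunday"]

    def parse_fuzzy_date(text, today=None):
        """Return a date for loose input like 'tomorrow' or 'next wednesday',
        or None when the phrase isn't understood (so the UI can say so)."""
        today = today or date.today()
        words = text.strip().lower().split()

        if words == ["today"]:
            return today
        if words == ["tomorrow"]:
            return today + timedelta(days=1)
        if len(words) == 2 and words[0] == "next" and words[1] in WEEKDAYS:
            target = WEEKDAYS.index(words[1])
            days_ahead = (target - today.weekday()) % 7 or 7
            return today + timedelta(days=days_ahead)

        # Fall back to an explicit format before giving up.
        try:
            return date.fromisoformat(text.strip())
        except ValueError:
            return None  # tell the user plainly; don't guess silently

    # e.g. parse_fuzzy_date("next wednesday") -> the coming Wednesday

Every phrase you teach it is one less moment where the user decides your program is broken.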

A lot of words to say something obvious, I know. But it’s a mistake I make all the time… e.g. “Safety Clone”