adaptive vocabulary


Suppose that you're issuing a complex command and you make an error when specifying one part of the command. Most programs would simply beep. As you reissue the command you're justifiably a little annoyed at having to be bothered, so although you avoid the original error, you now make an error in another part of the command. The program again beeps.

So once again you issue the command, but this time you do the human thing and focus only on the part that caused you trouble last time, completely forgetting the rest of the command. As far as you're concerned, you've clarified your meaning completely because you're used to talking to people. Based on the context of the previous mangled command, they would usually understand exactly what you mean. The program, however, simply beeps again.

Why couldn't the computer see that if you mess up a command the very next thing you're likely to do is to repeat the command? Further, why doesn't it know that if you mess up one part of a command, you're likely to focus only on that part when you repeat the command? If it kept at least that much context, it would know enough to compare the first two command versions you issued and perhaps be able to extract the correct form of the command you intended.
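The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: a command is modeled as a dictionary of named parts, and `KNOWN_VALUES`, `is_valid`, and `merge_attempts` are hypothetical names invented for this example, not part of any real program.

```python
# Hypothetical table of acceptable values for each part of a command.
KNOWN_VALUES = {
    "source": {"report.txt"},
    "dest": {"/backup"},
    "mode": {"fast", "safe"},
}


def is_valid(part, value):
    """Assumed validity check: the value must be known for that part."""
    return value in KNOWN_VALUES.get(part, set())


def merge_attempts(attempts):
    """Combine the valid parts of successive attempts at one command.

    attempts is a list of dicts (oldest first) mapping a part name to
    what the user typed. A later attempt may mention only some parts.
    """
    merged = {}
    for attempt in attempts:
        for part, value in attempt.items():
            if is_valid(part, value):
                merged[part] = value  # a later valid value replaces an earlier one
    return merged


# Three tries: a bad destination, then a bad source, then only the source.
attempts = [
    {"source": "report.txt", "dest": "/bakcup", "mode": "fast"},
    {"source": "reprot.txt", "dest": "/backup", "mode": "fast"},
    {"source": "report.txt"},
]
intended = merge_attempts(attempts)
```

None of the three attempts is a complete, correct command on its own, yet together they contain every part the program needs.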

This problem arises because programs typically don't pay attention to sequences of commands. Searching suffers from the same blindness.

Today's computers typically pay no attention to the sequence of search requests from a particular user. If a user, for example, first searches for information about the Kon-Tiki expedition, then immediately searches for "Heyerdahl", that user is probably interested in pages describing Thor Heyerdahl, the leader of the Kon-Tiki expedition, and not Jim Heyerdahl, a used car salesman in Akron, Ohio.
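As a rough sketch of what paying attention to the sequence might look like, the following keeps the terms of a user's recent queries and ranks matching pages by how many of those terms they contain. The `SearchSession` class, the page texts, and the scoring rule are all assumptions made up for this illustration; a real search engine would be far more elaborate.

```python
from collections import deque


class SearchSession:
    """Remembers the terms of a user's most recent queries."""

    def __init__(self, history_size=5):
        self.recent_terms = deque(maxlen=history_size)

    def rank(self, query, pages):
        """Return titles of pages containing the query, best match first.

        pages maps a title to the page text. Pages that share more words
        with the user's recent queries are listed earlier.
        """
        q = query.lower()
        context = {term for past in self.recent_terms for term in past}

        def score(title):
            words = set(pages[title].lower().split())
            return len(words & context)

        hits = [title for title in pages if q in pages[title].lower()]
        hits.sort(key=score, reverse=True)  # stable sort: ties keep their order
        self.recent_terms.append(set(q.split()))
        return hits


# Made-up page texts for the example.
pages = {
    "Jim": "Jim Heyerdahl sells used cars in Akron Ohio",
    "Thor": "Thor Heyerdahl led the Kon-Tiki expedition across the Pacific",
}
session = SearchSession()
first = session.rank("Kon-Tiki expedition", pages)
second = session.rank("Heyerdahl", pages)
```

Without the earlier query, both Heyerdahl pages would score the same and appear in their stored order. With it, the page about Thor Heyerdahl comes first.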

Ignoring sequence is one of the big reasons computers appear so mindless (another is their utter inflexibility about simple spelling and other errors, and a third is their utter ignorance of any context for any command).

Web search engines are prime examples of this mindlessness. They frequently return on the order of a million hits. What is the user to make of that? On the other hand, if a search returns zero hits then again there's something wrong because the user is not making up queries at random, so it's likely that there is indeed something to be found.

In the first case the search engine should analyze the user's past searches to find more restrictive constraints on the search before reporting so many hits. In the second case the search engine should automatically expand the search by using a thesaurus and should also parse the search request looking for misspellings before reporting that no results were found.
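The second case, refusing to report zero hits too early, might be sketched as follows. The index and thesaurus here are tiny made-up dictionaries, `search_with_fallback` is a hypothetical name, and the spelling check uses `difflib.get_close_matches` from Python's standard library as one simple stand-in for a real spelling corrector.

```python
import difflib


def search_with_fallback(term, index, thesaurus):
    """Try the term, then likely misspellings, then synonyms.

    index maps a search term to a list of pages; thesaurus maps a term
    to a list of synonyms. Returns (term_actually_used, pages).
    """
    term = term.lower()
    if index.get(term):
        return term, index[term]

    # Look for an indexed term that the user may have misspelled.
    for candidate in difflib.get_close_matches(term, list(index), n=3, cutoff=0.8):
        if index[candidate]:
            return candidate, index[candidate]

    # Widen the search with synonyms before giving up.
    for synonym in thesaurus.get(term, []):
        if index.get(synonym):
            return synonym, index[synonym]

    return term, []  # only now report that nothing was found


# Made-up data for the example.
index = {"automobile": ["page1"], "expedition": ["page2"]}
thesaurus = {"car": ["automobile"]}

misspelled = search_with_fallback("expidition", index, thesaurus)
synonym = search_with_fallback("car", index, thesaurus)
nothing = search_with_fallback("zzz", index, thesaurus)
```

The order of the fallbacks (spelling first, then thesaurus) is a design choice of this sketch, not something the argument above depends on.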

That isn't enough, however, since there still remains the problem of educating the user about the space of pages---or, rather, to put it the right way around, educating the user's computer about the user's way of viewing the space of pages.

The system's vocabulary of search terms should adapt to the vocabularies that its user typically uses. Suppose, for example, the user is searching for the string "copy machine" but the system cannot find such a string, nor anything even remotely similar---even assuming misspellings.

In such a case the system should ask the user for alternative strings. If any of those alternatives results in the user finding the desired information, then all the failed alternatives (starting with the first failure in the chain, "copy machine") should be stored as aliases of the string that led to the successful search (perhaps the successful search string was "Xerox machine"). In future, the user, or other users sharing the same system, when searching for the same thing will be much more likely to find it in one try, all without the annoyance of being forced to learn the computer's puny and rigid vocabulary.
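The alias scheme just described can be sketched as follows, under some simplifying assumptions: the index is a plain dictionary from search string to pages, a search counts as successful as soon as it returns anything, and `AliasStore` and `search_session` are hypothetical names chosen for this example.

```python
class AliasStore:
    """Maps strings that failed in the past to the string that worked."""

    def __init__(self):
        self.aliases = {}

    def resolve(self, term):
        """Return the stored alias for a term, or the term itself."""
        term = term.lower()
        return self.aliases.get(term, term)

    def record_chain(self, failed_terms, successful_term):
        """Store every failed string as an alias of the successful one."""
        for term in failed_terms:
            self.aliases[term.lower()] = successful_term.lower()


def search_session(attempts, index, store):
    """Try each string in turn; on success, remember the failures as aliases."""
    failed = []
    for term in attempts:
        key = store.resolve(term)
        hits = index.get(key, [])
        if hits:
            store.record_chain(failed, key)
            return hits
        failed.append(term)
    return []


# Made-up index for the example.
index = {"xerox machine": ["manual.txt"]}
store = AliasStore()

# First time: two failures before the user hits on the system's wording.
first = search_session(["copy machine", "photocopier", "Xerox machine"], index, store)

# Later: the original wording now succeeds in one try.
later = search_session(["copy machine"], index, store)
```

Because the store is separate from any one session, several users sharing the same system could share it too, so one person's failed searches spare the next person the same trouble.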


