Software

The hardware by itself was clearly not enough for the aide we were building; driver software was also needed to integrate the aide for use by the client. We decided to write that software in an authoring system called CanDo™. It allows programming in a high-level language and provides full support for hardware interrupts, multitasking, and interprocess communication. This gave us an extremely flexible environment and greatly eased our integration tasks. CanDo™ is preprogrammed with information about the hardware of the Amiga computer and has provisions for both interrupt-driven and buffered handling of external devices. It can also start and communicate with external processes in a multitasking environment.

We decided to base the user interface on a menu system driven by eight buttons. The software goal was to maximize capability without sacrificing ease of use. We wanted to make the output of the aide as normal and comfortable (as compared to human speech) as possible. We also wanted a facilitator who is not computer literate to be able to modify and program this communication aide easily.

The first step in developing the software for the aide was to break down the selection strategy for efficient use with the eight buttons (see Figure 6). In keeping with the machine already familiar to the client, Michael’s Touch Talker™, we decided to provide the same forty-nine base selections, each made by pressing a two-key combination. These are referred to as Quick Keys; see Table 2. (We denoted each selection by its position in the button layout to increase ease of use and allow for a consistent methodology.) We copied the initial contents of these Quick Keys from the setup of the Touch Talker™ system.

Figure 6: Speech Device Layout

Table 2: Quick Keys Selection Formation Based on Client Interface

First Button Goes From Main Menu to:

Submenu 1	Submenu 2	Submenu 3	Infinite Menu
Submenu 4	Submenu 5	Submenu 6	Submenu 7

Second Button Goes From Submenu N to Selection:

Selection [N,1]	Selection [N,2]	Selection [N,3]	Goto Main Menu
Selection [N,4]	Selection [N,5]	Selection [N,6]	Selection [N,7]

Note: to convert Selection [N,M] to a number 1-49:

number = (N-1)*7 + M

This means that the first button pressed takes you to a submenu and the second button finalizes a selection which is then spoken. The upper right button was set to default to return to the main menu. The only time that the upper right button does not return the user to the main menu is when the user is on the main menu in which case it takes the user to the Infinite Menu; see next paragraph for detailed information on the Infinite Menu.
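As an illustration, the conversion in Table 2 can be sketched in Python. The aide itself was written in CanDo™, so this is not the original code; the function names are hypothetical, and the inverse mapping is our own addition for checking the formula.

```python
def quick_key_number(n, m):
    """Convert Selection [N,M] to a Quick Keys number 1-49 (formula from Table 2)."""
    if not (1 <= n <= 7 and 1 <= m <= 7):
        raise ValueError("N and M must each be between 1 and 7")
    return (n - 1) * 7 + m


def quick_key_pair(number):
    """Inverse mapping (hypothetical helper): number 1-49 back to (N, M)."""
    if not 1 <= number <= 49:
        raise ValueError("number must be between 1 and 49")
    n, m = divmod(number - 1, 7)
    return n + 1, m + 1


if __name__ == "__main__":
    # Submenu 3, selection 4 is Quick Key number (3-1)*7 + 4 = 18.
    print(quick_key_number(3, 4))
    print(quick_key_pair(18))
```

Because every (N, M) pair maps to a distinct number, the seven submenus of seven selections cover exactly the forty-nine Quick Keys.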

The Infinite Menu is provided for conveying communication beyond the 49 selections available from the Quick Keys. It consists of six infinite list selections (corresponding to six lists of letters, words, and phrases), a return to the main menu, and a key reserved for the client to edit entries (left for later implementation), as shown in Table 3.

Table 3: Infinite List Selection

Letters	Words	Mwords	Goto Main Menu
Numbers	Phrases	Mphrases	reserved

The six choices are defined by what they build:

Letters for building words
Numbers for expressing quantities
Words for building phrases
Phrases for building or expressing complete thoughts beyond the forty-nine Quick Keys combinations
Mwords for words that the client built from letters or numbers and decided to keep
Mphrases for phrases that the client built from words or phrases and decided to keep

Each of these choices takes the client to a different list, of arbitrary length, for further selection. This extension to the existing selection method requires a new methodology for selecting entries from a list and building them into the desired words, phrases, or ideas. The first step we took was to transform the list, which is alphabetically sorted, into a matrix, as demonstrated in Table 4. This matrix has N columns and N rows, where N is the ceiling (the integer formed by rounding up any fraction) of the square root of the number of entries. The client is placed in the "center" of the list by setting the currently selected entry to the entry at position (length of list)/2 (lady in Table 4). The client’s interface is then configured so that four of his buttons navigate in one of four directions within the matrix. Each time one of these buttons is pressed, it "moves" the cursor in the corresponding direction. The buttons are assigned as shown in Table 5.

Table 4: List To Matrix Conversion

Initial List			Transformed List
big		big	dad	happy
dad	⇒	home	lady	man
happy		mom	sad	small
home
lady
man
mom
sad
small
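
The conversion in Table 4 can be sketched as follows. This is a minimal Python illustration, not the CanDo™ code used in the aide; the function names are hypothetical, and we assume the starting position (length of list)/2 is a zero-based index with the fraction dropped, which is what places the cursor on lady in the nine-word example.

```python
import math


def list_to_matrix(entries):
    """Split a sorted list into rows of N entries, N = ceil(sqrt(len(entries)))."""
    if not entries:
        return []
    n = math.ceil(math.sqrt(len(entries)))
    # The last row may be shorter when the list length is not a multiple of N.
    return [entries[i:i + n] for i in range(0, len(entries), n)]


def starting_index(entries):
    """Assumed 'center' of the list: zero-based position len // 2."""
    return len(entries) // 2


if __name__ == "__main__":
    words = ["big", "dad", "happy", "home", "lady", "man", "mom", "sad", "small"]
    for row in list_to_matrix(words):
        print("\t".join(row))
    print("start:", words[starting_index(words)])
```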

Table 5: Button Assignment Within Infinite List Mode

right 1 word	up 1 row of words	add current to output	return main menu
left 1 word	down 1 row of words	say current output	add output to list

After navigating to a desired entry, the client presses the "add current" button. This allows him to build a statement by adding items one at a time. When he has finished adding selections to the statement he wishes to be said, he presses the "say" button and the communication aide says the statement he has built. If he wishes to use this statement again, he can then press the "add output" button and it will be stored in either the Mwords or the Mphrases list.
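
The navigate, add, and say cycle described above can be sketched in Python. This is an illustration under stated assumptions rather than the aide's CanDo™ implementation: the class and method names are hypothetical, a move that would leave the list is simply ignored, moving left or right past the end of a row continues onto the neighbouring row, and built items are joined with spaces (letters would presumably be joined without them).

```python
import math


class InfiniteListSession:
    """Hypothetical model of the client's cursor within one Infinite List."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.columns = math.ceil(math.sqrt(len(self.entries)))
        self.index = len(self.entries) // 2  # start at the "center" of the list
        self.output = []                     # statement built so far

    def move(self, direction):
        """Move one word left/right or one row up/down (Table 5)."""
        step = {"right": 1, "left": -1,
                "up": -self.columns, "down": self.columns}[direction]
        target = self.index + step
        if 0 <= target < len(self.entries):  # assumption: ignore moves off the list
            self.index = target

    def current(self):
        return self.entries[self.index]

    def add_current(self):
        """'add current to output' button."""
        self.output.append(self.current())

    def say(self):
        """'say current output' button: return the text that would be spoken."""
        return " ".join(self.output)

    def add_output_to(self, saved_list):
        """'add output to list' button: keep the statement (e.g. in Mphrases)."""
        saved_list.append(self.say())


if __name__ == "__main__":
    words = ["big", "dad", "happy", "home", "lady", "man", "mom", "sad", "small"]
    session = InfiniteListSession(words)
    session.move("up")       # lady -> dad
    session.add_current()
    session.move("right")    # dad -> happy
    session.add_current()
    print(session.say())
```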

The next part of the software we considered was the facilitator interface. We began with the main menu, shown in Figure 7. We made this menu fully configurable: the names of the submenus, the number of seconds that a spoken phrase will appear on the LCD screen (0 seconds causes the LCD not to display spoken phrases), and whether or not a beep will sound after a button is pressed are all programmable by the facilitator on this menu.

Figure 7: Main Menu Screen

We made the facilitator interface so that activating the button portion of the menu takes the facilitator to the applicable submenu (e.g., pressing the left mouse button within the box above Michael in the main menu takes the facilitator to the Michael submenu, Figure 8).

Figure 8: Michael Submenu Screen

The submenus corresponding to Quick Keys selections allow the response phrases to be edited (via a special editing screen, see Figure 9) when the appropriate button areas are selected.

Figure 9: Editing Screen From Quick Keys Selection

This enables the facilitator to bring up the editing screen for any of the seven button combinations under a submenu simply by selecting the button area that corresponds to it. To edit the Hello message from the Michael submenu (Figure 8), the facilitator would press the mouse button in the box above Hello and the editing screen would appear as shown in Figure 9. The editing screen has two windowed areas, one for what will be said and one for what will be displayed on the LCD. This is particularly handy when saving disk space by using a single digitized word for multiple words that have the same pronunciation (e.g., two, to, and too) or when spelling phonetically. Additional boxes provide for non-digitized speech, saving changes, and other convenient functions.

When the client or facilitator selects the Infinite Menu from the main menu they are taken to the screen shown in Figure 10.

Figure 10: Infinite List Selection Screen

They then select the list they wish to work with: Letters, Numbers, Words, Phrases, Mwords, or Mphrases. This puts them on the Infinite Options screen (Figure 11) with the selected list loaded.

Figure 11: Infinite List Mode Screen

To modify a list entry, the facilitator selects that entry with the mouse and is taken to the list entry editing screen (Figure 12). The editing function here is nearly identical to that of the Quick Keys, except that it has a delete entry button. Note that it is also possible to put anything into any list (e.g., a phrase into the letters list), in keeping with maximum flexibility. Also on this screen are eight boxes corresponding to the eight buttons, which allow the facilitator to simulate client navigation and use. Figure 12 shows an example in which the facilitator has built the statement "Michael is a good boy".

Figure 12: Editor Screen From Infinite List Mode

We designed a system that has forty-nine Quick Keys phrases and six Infinite Lists to select from, mapped onto our eight arcade buttons. From the main menu, a button selection takes the client to a submenu or to the Infinite Menu. On a submenu, a button selection results in a Quick Keys phrase being selected and said. On the Infinite Menu, the second button selects the list the client will be working with. These two-key mappings are shown in Figure 13.

Figure 13: Client Interface Mapping

The numbered circles correspond to the Quick Keys selections (1-49), and L1-L6 correspond to the six Infinite Lists. Every entry is fully configurable by a facilitator. The system is simple to use: in an informal study of 10 non-disabled test subjects at various demonstrations, training time was under 5 minutes for the client interface and under 15 minutes for the facilitator interface. We consider this sufficient evidence that the system is easy to use (a combined training time of 20 minutes cannot be achieved on a system that is not easy to use). It is fast for the client to use, requiring only 2 button selections to activate one of the Quick Keys phrases. The delay between selection and vocalization is no greater than a few seconds. We maintained capability through the Infinite Lists, which require a reasonably small number of button selections. This facility is capable of saying anything the client wishes.

The LCD continuously displays feedback information. It has two display modes, one showing the current button definitions and one displaying the phrase that was just spoken. In the Quick Keys section this means breaking the display into 10 areas, eight corresponding to the buttons and two informing the client of the submenu on which they are currently working. In the Infinite Lists, this means dividing the screen into 7 sections (4 on top and 3 on bottom) which show:

the selections corresponding to the 4 "directions" buttons
the current selection (where the client is within the list)
the current built up statement
the current menu name

When the aide is displaying the current statement it displays the text assigned to the display box of the selected entry. This produces a display layout looking like the following:

Table 6: LCD Display General Layout

Quick Keys Display

Item 1	Item 2	Item 3	: MainMenu	| Current
Item 4	Item 5	Item 6	Item 7	| menu

Infinite List Mode Display

Phrase right	Phrase up	Current phrase		| Cur list
Phrase left	Phrase down	Currently built-up super-phrase

Displayed Speech Window

The last thing said by the speech device

Examples taken from the testing of the communications aide include:

Table 7: LCD Examples

Michael Submenu

Want	: Remember	: Mess	: MainMenu	| Michael
Bathroom	: Warm	: Walk	: Walk	|

Infinite Words List

birth	: a	: boy		| words
cold	: happy	Michael is a good boy

Last Phrase

I have to go to the bathroom.

The speed of conversation varies greatly, depending on how it is calculated. The factors involved are:

Number of button pushes required.
Disk access delay.
Speed at which words are said.
Number of words in selection.

A 1 second delay per button push, to allow the LCD to display the new information, will be used for calculations. This means that Quick Keys selection will be much faster than Infinite Lists selection because it requires fewer button presses. Quick Keys require 2 button presses, so a 2 second delay will be used for their button presses. Infinite Lists require 3 button presses plus as many as (square root of the number of entries) + 1 presses per selection, the final press being the "add" button. Based on a target list length of 200-225 entries, the aide requires a maximum of 15 button presses per selection and an average of 8. This is an efficient number of button presses, but we feel that further research should be able to cut this number in half.

The disk access delay is between 0 seconds (words in memory) and 15 seconds (time to load the speech synthesizer), with an average delay of 0.4 seconds per unique word loaded. The words are spoken at full speed with no additional delay and average 0.5 seconds in length. Any selection can contain as many words as desired, but the average phrase length was 7 words.

Putting it all together, we get 50 words per minute from the Quick Keys, 25 words per minute for the first selection from the Infinite Lists (phrase selections), and 31 words per minute for each subsequent utterance from the list. The speed of Infinite Lists word selection (based on 7 words) is 4.3 words per minute, and letter selection (based on 5 letter words) is 4.5 words per minute. These figures represent the capabilities of the system; they do not count the time it takes the client to press the buttons (if greater than 1 second). We feel that these results are good, but that there is still room for improvement. By rearranging the selection method and adding a word fill algorithm we should be able to improve letter selection to about 20 words per minute.
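
Two of these figures can be checked with a short calculation, sketched here in Python. The function names are hypothetical, and the model rests on assumptions of ours: the cursor starts at the center of a full N by N matrix, each move costs one press, the final "add" press is counted, and every word in a phrase is unique and must be loaded from disk. The Infinite Lists rates are not reproduced, because the text does not break down the press counts behind them.

```python
def presses_from_center(n):
    """Return (maximum, mean) presses to reach and add an entry in an n x n matrix.

    Assumes the cursor starts at the center cell, each move costs one press,
    and one extra press ("add") completes the selection.
    """
    center = n // 2
    counts = [abs(row - center) + abs(col - center) + 1
              for row in range(n) for col in range(n)]
    return max(counts), sum(counts) / len(counts)


def words_per_minute(words, presses, press_delay=1.0, load_delay=0.4, word_length=0.5):
    """Speaking rate for one utterance, using the delays quoted in the text.

    Assumes every word is unique, so each one incurs the average load delay.
    """
    seconds = presses * press_delay + words * (load_delay + word_length)
    return words / seconds * 60


if __name__ == "__main__":
    # A 225-entry list becomes a 15 x 15 matrix.
    maximum, mean = presses_from_center(15)
    print(maximum, round(mean, 2))
    # Quick Keys: 2 presses for an average 7-word phrase.
    print(round(words_per_minute(7, 2), 1))
```

Under these assumptions a 225-entry list gives a maximum of 15 presses and a mean of about 8.5, and the Quick Keys rate comes out at roughly 50 words per minute (7 words in 8.3 seconds), in line with the figures quoted.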

File space was a very big consideration, as the whole system had to fit onto a single 880k floppy disk; however, by adding $200 to the system cost we can increase the storage of the aide by a factor of 100. With this in mind, we needed some supplementary programs to conserve space. The first was a lossy sound compression program (the output is similar to, but not exactly the same as, the original input). Many techniques were tried, but the two that worked were decreasing the resolution and transforming the sound into a series of exponential differences; both gave a two-to-one compression factor.

The resolution of the sound is the number of bits that make up each sound sample (music CDs have a resolution of 16 bits). The words were initially digitized with 8 bits of resolution and then reduced to 4 bits by ignoring the least significant 4 bits. This produced an acceptable result in which the words were clearly recognizable, but there was some noise that did not occur in the original 8 bit sample. This noise took the form of static and was much more pronounced on poorer amplifiers.

The second method is based on the fact that the samples, when put together, make a waveform (see Appendix A). This means that the current value tends to be a "small" offset from the previous value. This method takes the difference between the current sample and the previous one, and applies the log₂ function to the result. The result is that small differences are portrayed very accurately and large differences are not. Exponential differences perform slightly better than decreasing resolution at low sampling rates (such as the 8,000 samples per second we were using) and produce nearly perfect results at higher sampling rates (such as the 44,100 samples per second that CDs use). Because of its performance, exponential differences was the method used in the aide delivered to the client.
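
Both compression methods can be sketched in Python. This is an illustration only, not the program used in the aide. The step table, the starting level of 128, the choice of the nearest available step, and all function names are assumptions of ours; the text states only that the logarithm of each difference is stored and that the result takes half the space.

```python
# Assumed step table: sixteen signed powers of two (plus zero), so that each
# difference fits in one 4-bit code.
STEPS = [-128, -64, -32, -16, -8, -4, -2, -1, 0, 1, 2, 4, 8, 16, 32, 64]
START = 128  # assumed starting level for unsigned 8-bit samples


def clamp(value):
    """Keep a reconstructed level inside the 8-bit range 0-255."""
    return max(0, min(255, value))


def truncate_to_4_bits(samples):
    """Method 1: reduce resolution by dropping the 4 least significant bits."""
    return [s >> 4 for s in samples]


def restore_from_4_bits(codes):
    """Expand 4-bit codes back to 8 bits; the dropped bits stay lost."""
    return [c << 4 for c in codes]


def encode_differences(samples):
    """Method 2: store, per sample, the code of the step nearest the difference.

    The encoder tracks the level the decoder will reconstruct, so rounding
    errors do not accumulate from one sample to the next.
    """
    codes, level = [], START
    for target in samples:
        code = min(range(16), key=lambda c: abs(clamp(level + STEPS[c]) - target))
        codes.append(code)
        level = clamp(level + STEPS[code])
    return codes


def decode_differences(codes):
    """Rebuild the waveform by applying each coded step in turn."""
    samples, level = [], START
    for code in codes:
        level = clamp(level + STEPS[code])
        samples.append(level)
    return samples


def pack_nibbles(codes):
    """Pack two 4-bit codes per byte, giving the two-to-one size reduction."""
    if len(codes) % 2:
        raise ValueError("expected an even number of codes")
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
```

In this sketch small differences map to exact steps (0, 1, 2) while large ones are approximated by the nearest power of two, which mirrors the behaviour described above: smooth waveforms are reproduced closely and sharp jumps less so.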

We found other programs within the public domain and shareware fields that were useful. These included Playsound, a flexible program that plays a list of digitized sounds based on a "command line" of Playsound sample1, sample2, ..., sampleN; Cache-Disk, a program that speeds up repetitive drive access by setting aside space in memory to keep track of previous disk accesses; Nuke, a program that transparently compresses and decompresses files very quickly; and finally Turbo Imploder, another compression program that decompresses in a fast, transparent fashion. We would like to thank the authors of those programs for permission to use them, all free of charge, and would like to congratulate them on the overall quality of their programs.
