Mathtalk: Usable Access to Mathematics
This paper describes the design of the user interface to the Mathtalk program, which aims to give visually disabled readers an active reading of standard algebra notation. The paper introduces the themes of enhancing external memory and control of information flow as the guiding principles behind the design of the user interface. Fast and accurate control of the information flow is vital for active reading. Mathtalk uses structured browsing functions and a specially developed command language to achieve this active reading. Finally, an audio glance called algebra earcons is introduced that enables readers to get a high-level view of an expression and plan the reading process.
This paper describes the design of a user interface that aims to enable a visually disabled reader to gain an active reading of standard algebra notation using synthetic speech and non-speech audio. Two concepts are central to this paper: "external memory" and "information flow control." The aim is to reduce the reader's mental demands and to make him or her active in reading complex material, in this case algebra. The paper describes the Mathtalk program, which uses prosodic cues in synthetic speech to improve the usability of the auditory presentation and browsing to make the reading process active. Two facilities are important in making the reading process active: browsing and planning. Mathtalk includes browsing functions that are powerful but easy to use and an "audio glance" that facilitates the planning of the reading process. It is hoped that the design principles underlying the Mathtalk program can be extended to interfaces that facilitate reading many types of complex information.
The following section describes the problem in more detail. The next two sections then describe in some detail two of the most important elements of the solution as embodied by the Mathtalk program: control of the reading process through "browsing" and planning of the reading process through the provision of an "auditory glance." Finally there is a summary and some conclusions are drawn.
There is a significant need to make access to mathematics easier for visually disabled people. Mathematics is a compulsory subject in all school systems. Though many children may find it hard, many blind students do not have a fair chance to develop mathematical skills simply because the notations used are inaccessible to them (Rapp and Rapp, 1992; Kim and Servais, 1985; Cahill and Boormans, 1994). The mathematical abilities of visually disabled people are not in doubt; there is no reason to believe that they do not (potentially) have the same range of mathematical skills as the rest of the population; they simply do not have the means to use and develop those skills. This is the reason that visually disabled people are so poorly represented in mathematical, scientific and technical subjects at all levels in education and employment.
There is a need to improve access to mathematics in the area of reading, writing and manipulation. However, merely giving access is not enough. The way in which access is provided has to be usable. Several computer systems have been built that can transform an unambiguous representation of algebraic grouping such as LaTeX (Lamport, 1985) into an accessible form such as Braille (Arrabito, 1990) and synthetic speech (Stevens and Edwards, 1993; Raman, 1992). There is now a need to take such access systems a stage further and create usable, interactive systems for reading, writing and manipulating algebra that have been fully tested for their usability. The basis for the design of the Mathtalk program comes from an examination of the high-level processes involved in visual reading. This is not an examination of the cognitive processes of reading, but the external, physical processes that enable visually based reading to take place in an active manner. Printed algebra notation, and all other print, acts as an "external memory" for a reader (Larkin, 1989). The printed expression is a permanent representation that relieves the reader from the onerous task of remembering a large amount of complex information. A mathematician will write down intermediate steps in a calculation because there is too much to be remembered, or held in the head, between steps. This is particularly true in the case of algebra notation (Larkin, 1989), where a large amount of often complex information is present and the omission of any single item of that information will lead to miscomprehension.
An important feature of printed algebra is the two-dimensional nature of the notation, and this is one of the features that is difficult to reproduce in non-visual forms. For example, the grouping of one or more characters as a superscript may denote exponentiation. Kirshner (1989) showed that these spatial cues acted as implicit parsing marks that aided the parsing process for many readers. The visual system allows the reader to have very fast and accurate control over the reading process, with little cognitive overhead. The eyes can move from any part of an expression, to any other, aided by the print cues with very little conscious thought. This control also allows different views of the expression to be obtained by the reader; a high-level view of an expression that allows overall shape and complexity to be gauged and a low-level view, where attention is focused upon the detail of an expression (Ernest, 1987; Ranney, 1987). Such different views allow a visual reader to plan his or her reading of an equation, thus making the whole process more effective and efficient.
For the sighted reader the availability of an external memory together with fast and accurate control of the information flow from that external memory to the reader's internal cognitive processes are the twin processes that make reading active. For a visually disabled reader, working in the auditory modality, there is often an inadequate external memory and poor control over the flow of information from that source. A classic example of this is recorded speech.
Listening is essentially a passive process (Rayner and Pollatsek, 1989) and users of taped books often complain of lapses in concentration and an overwhelming amount of information to try to remember (Aldrich and Parkin, 1988b; Aldrich and Parkin, 1988a). Taped speech is an external memory in that it is a permanent record of information external to the reader, but it is ineffective because of the poor control over the flow of information from that external source. The controls afforded by a tape recorder do not allow fast and accurate control over which part of the external source is currently being used. Reading (or more accurately listening) often defaults to a passive reception of information at a pace dictated by the speech. This lack of control also means the listener has no overall view of what is to be read until it has been read. This important part of the reading interaction is completely missing for the listening reader.
This section has described the main problems of making written mathematical notations accessible in non-visual forms. The next section describes some of the facilities that have been developed within Mathtalk to address the problems.
CONTROLLING THE INFORMATION FLOW
A compensation for the poor external memory and provision of fast and accurate control over information flow are the two main issues addressed in the design of the Mathtalk program. Prosody (varied pitch, timing and rhythm of speech) is used to enhance the speech-based presentation of algebra using a synthetic voice (Stevens and Edwards, 1993; Edwards and Stevens, 1993; Stevens et al., 1994b). The use of prosody replaces some of the visual parsing cues present in print, makes an expression easier to remember and generally increases its usability.
To take advantage of this enhanced speech presentation, the reading is made active by adding browsing functions. Browsing enables the reader to visit any part of the expression quickly and accurately, at a pace dictated by the reader, not the external device. By making it possible to visit any part of the expression quickly, some effects of the transience of speech may be overcome. If it is easy to access information, the burden of remembering that information does not fall on the reader.
The browsing is based on the structural components of an expression. This is meant to mimic the type of browsing a sighted reader may undertake based on the spatial features of a printed expression. The explicit and implicit parsing marks within an expression described by Kirshner (1989) provide prominent features by which the reader can direct his or her gaze through an expression. For instance, the use of a distinctive typeface signals to the reader the beginning and end of an expression; white space divides an expression into terms; horizontal juxtaposition implies multiplication; vertical juxtaposition and the fraction line indicate division, while diagonal juxtaposition indicates exponentiation.
Providing this type of structure-based browsing would give a visually disabled user very accurate control of the information flow that is based upon the sorts of reading/mathematical tasks a reader would have to undertake. For example, to evaluate a polynomial for a given value, a reader will probably want to move term-by-term through an expression evaluating each term in turn. For this reason a basic set of browsing moves would be to move forward and backward through an expression term-by-term. Another example would be the requirement to move straight to a parenthesized sub-expression in order to evaluate its contents prior to the rest of the expression. All the Mathtalk browsing functions are based on moving to structural features within an expression.
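The structural moves described above can be sketched as a cursor over an expression's terms. This is an illustrative sketch only, not Mathtalk's actual implementation; the class and method names are assumptions chosen to mirror the browsing moves named in the text.

```python
# Illustrative sketch: structure-based browsing as a cursor over terms.
# The term strings below are hypothetical spoken renderings.

class ExpressionBrowser:
    """Cursor over the terms of one expression, e.g. 3x^2 + 2x - 5."""

    def __init__(self, terms):
        self.terms = terms          # spoken form of each term
        self.pos = 0                # index of the current term

    def current_term(self):
        return self.terms[self.pos]

    def next_term(self):
        # Move forward one term; stay put at the end of the expression.
        if self.pos < len(self.terms) - 1:
            self.pos += 1
        return self.current_term()

    def previous_term(self):
        # Move backward one term; stay put at the beginning.
        if self.pos > 0:
            self.pos -= 1
        return self.current_term()

    def end_expression(self):
        # Jump straight to the last structural feature.
        self.pos = len(self.terms) - 1
        return self.current_term()

browser = ExpressionBrowser(["3 x squared", "plus 2 x", "minus 5"])
print(browser.next_term())       # "plus 2 x"
print(browser.end_expression())  # "minus 5"
```

Because every move targets a structural feature rather than a character position, the reader's control maps directly onto reading tasks such as term-by-term evaluation.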
These browsing functions have the potential to make the reading active. Here a difficult point is encountered. The control used by the sighted reader is essentially subconscious. Moving the gaze from one expression to another or within an expression itself seemingly requires no mental effort or explicit issuing of a command. In this case the control process can be said to be internal. For a visually disabled reader any means of control is, by necessity, external. This means that the control of information flow will itself intrude into the reading process, disrupting the flow of information and imposing a further cognitive load on the reader. In addition, the interaction via browsing is necessarily complex in order to be effective.
The structural nature of the browsing gives accurate control suitable for the tasks involved, but does require the reader to use a set of labels that describe the structure of an expression.
Mathtalk covers the core of algebra notation and already has nine labels or syntactic targets that are used to direct browsing. As the algebra covered by the Mathtalk program increases so will this pool of labels that must be learned. A further problem with the external nature of such browsing is that the labels have to be consistent with those used by the reader and those used in the reader's environment. Whether such a system is usable by a reader can only be found by evaluation of the system.
Given that there will be some cognitive burden placed on the reader by using an external means of control, this burden must be made as small as possible. The command language used within Mathtalk to control the browsing, and therefore the reading process, has been designed so that commands can be issued quickly, learned easily and extended readily to other tasks within and beyond the algebraic domain. A basic principle of the design is to keep the reading process of primary importance. The control process is essentially part of the reading process, and how this control is manipulated by the reader must be transparent or the reading will be disrupted.
An important feature of Mathtalk that makes the reading task easier is the folding of syntactically complex items. A complex item has more than one term grouped by explicit parsing marks or spatial location. A term is a group of one or more operands separated by operators of lowest precedence. During browsing Mathtalk speaks all simple items in full, but only reveals complex items by referring to their type. So instead of speaking all the contents of a fraction, Mathtalk simply states that the current item is "a fraction" and allows the reader to control when and how the contents of such items are spoken. Such a mechanism greatly reduces the amount of speech that may be produced as the result of a single browsing move.
This folding of complex items influences how a listening reader views an expression. This hiding of complex items is the first stage in the creation of an overview or glance, allowing the reader to see the overall structure without necessarily having to deal with all the detail.
The folding of complex items also proves useful in the default reading style provided by Mathtalk. This strategy allows the reader to move term-by-term through an expression at a pace determined by the reader. If any one term holds a complex item, the speech stops at that item and utters its type. The next stage takes the reader into the complex item and unfolds it term-by-term, reducing the whole expression to its constituent simple parts. For instance, the formula for the solution of a quadratic equation would be described as follows (with a pause between each item during which the user would press the space-bar to signal that he or she wants to hear the next component):
- equals a fraction;
- numerator negative b;
- plus or minus the square root of a quantity;
- the quantity b squared;
- minus four a c;
- denominator 2a.
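The folding and unfolding behaviour described above can be sketched as a walk over a nested expression tree. This is a minimal illustration, not Mathtalk's implementation; the (type, children) representation and the exact item wordings are assumptions modelled on the example reading.

```python
# Illustrative sketch of folding: a complex item is announced only by its
# type until the reader chooses to step into it.

def speak(node):
    """Return what a folded presentation would utter for one item."""
    if isinstance(node, str):           # simple item: speak in full
        return node
    kind, _children = node              # complex item: speak type only
    return f"a {kind}"

def unfold(node):
    """Yield the term-by-term reading, descending into complex items."""
    if isinstance(node, str):
        yield node
        return
    kind, children = node
    yield f"a {kind}"                   # announce the type first
    for child in children:
        yield from unfold(child)

# Rough structure of (-b +/- sqrt(b^2 - 4ac)) / 2a as (type, children) pairs
quadratic = ("fraction", [
    "numerator negative b",
    ("square root", ["the quantity b squared", "minus four a c"]),
    "denominator 2 a",
])

print(speak(quadratic))        # folded top-level view: "a fraction"
for item in unfold(quadratic): # one item per space-bar press
    print(item)
```

Each yielded item corresponds to one press of the space-bar, so the pace of the unfolding stays entirely under the reader's control.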
The Mathtalk command language is based upon a command consisting of an action word and a target word. The action words used are "speak," "current," "next," "previous," "into," "out-of," "beginning" and "end." The target words used are "expression," "term," "item," "superscript," "quantity," "fraction," "numerator," "denominator" and "level." For example, the command "next expression" is used to move to the next expression in the list of expressions. "Current term" causes the current term to be spoken and "end expression" allows attention to be moved to the last item in the expression. A reader can move "into quantity" to explore the contents of a complex item or use "speak quantity" to reveal the contents without moving inside the sub-expression.
This command language was designed to be easy to learn and extendible by the user. All commands can be generated from a relatively small set of command words. The commands fall naturally into a spoken form that should be easy both to learn and to teach. Within the algebra domain the user should be able to generate appropriate commands simply by knowing the actions and targets. For example, knowing that "end expression" works, a user can infer that "end fraction" will work in a similar manner. The mnemonic keyboard mapping that implements this language should be quick to use and, combined with the accuracy of the browsing, gives the reader good control over the information flow.
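The generative action-plus-target pattern can be sketched as a simple parser. The word lists are taken from the description above; the function itself is an illustrative assumption, not Mathtalk's code.

```python
# Illustrative sketch of the action-plus-target command language.

ACTIONS = {"speak", "current", "next", "previous",
           "into", "out-of", "beginning", "end"}
TARGETS = {"expression", "term", "item", "superscript",
           "quantity", "fraction", "numerator", "denominator", "level"}

def parse_command(phrase):
    """Accept any 'action target' pair, e.g. 'next term' or 'end fraction'."""
    action, _, target = phrase.partition(" ")
    if action in ACTIONS and target in TARGETS:
        return (action, target)
    raise ValueError(f"unknown command: {phrase!r}")

# Any action composes with any target, so knowing that "end expression"
# works lets the reader generate "end fraction" without further learning.
print(parse_command("end expression"))  # ('end', 'expression')
print(parse_command("end fraction"))    # ('end', 'fraction')
```

The small, orthogonal vocabulary is what keeps the learning burden low: eight actions and nine targets yield the whole command set.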
In addition to these speech facilities, Mathtalk also uses non-speech sounds. The use of algebra earcons is discussed in the next section, but simple sound cues are also used to indicate when the user has moved to the end of an expression or of one of the internal complex structures.
Like all other components of the Mathtalk interface, the browsing functions and associated command language have been evaluated. The technique used for this evaluation is co-operative evaluation (Monk et al., 1993). In this sort of study a small number of subjects are used to elicit basic usability problems with the interface. The design recommendations are then implemented and retested. At this stage answers to some quite narrow questions were required:
- Is the language both teachable and learnable?
- Do the browsing functions cover all the reading moves a user may want to make?
- How well were navigation and orientation maintained by the reader?
Generally the results from the experiment were positive. Users were able to learn the core of the command language very quickly and use their knowledge of the actions and targets to generate new appropriate commands. Even after a short period of time readers developed strategies for reading expressions that were more effective than simply listening to the whole expression. The default browsing strategy was particularly popular.
The evaluation was useful in revealing several usability problems and software bugs that have now been resolved. The only usability issue discussed here is that of navigation and orientation. If interrupted during a task or when reading a complex expression, users would often become unsure of their orientation within an expression. There is obviously a need for a reorientation facility that acts like a map. The glance, which is discussed below, might be one component of such a facility.
The browsing functions and associated command language offer the reader the opportunity to become the active agent in the reading process. All parts of an expression can be reached with speed and accuracy. This component of Mathtalk has been evaluated to ensure the browsing functions allow the reader to perform reading tasks and that the command language fulfills basic usability requirements.
In the evaluation, the non-speech sounds marking the ends of equations and structures proved useful, but users could not discriminate between types of complex item and in particular whether an end sound was the end of an expression or not. These cues have been redesigned, taking advantage of the environmental information used in the audio glance described below, to aid orientation within an expression. We believe that well designed non-speech sounds can be a useful supplement to speech. The next section discusses the main use of such sounds: algebra earcons, to provide an auditory glance.
PLANNING THE CONTROL OF INFORMATION FLOW
In order to use the browsing functions effectively and efficiently a planning facility is needed. Some idea of the syntactic complexity of an expression is required so that the reader can choose strategies for using the browsing functions available. Mathtalk uses earcons to provide an audio glance, allowing the reader the possibility of viewing an expression quickly and efficiently without being overwhelmed by detail.
Ernest (1987) proposes planning and decision-making as the first stage in the process of reading an expression visually. Ernest suggests that part of this planning is to scan the expression to judge complexity and length and observe any features unfamiliar to the reader. This scanning or glancing is a vital stage in the reading process. Even a simple notion of the size of an expression would prove enormously useful to a visually disabled reader. A very simple expression such as a + b needs no special treatment and could be apprehended by a simple full utterance, but a long expression would simply overwhelm a listener if spoken all at once in that way. The listener has no notion of even the length of an expression without hearing the whole expression. The situation can be summed up as a "Catch-22" of not knowing how to read an expression until the expression has been read. In such a situation it would be difficult to develop effective and efficient strategies for reading an expression.
A blind reader needs a glance to make the best use of the browsing functions available. Even the ability to choose between a full utterance and a term-by-term strategy would make the reading more usable. This is demonstrated by subjects in the evaluation of the command language who developed a strategy that began the reading of an expression with the current level command. This speaks the whole expression, but folds complex items and refers to them only by their type. For example, at the top level, the formula for the solutions to a quadratic equation would be described as "x equals a fraction." In a syntactically complex expression this approach can greatly reduce the amount of speech and was the only way in which the readers could get even a partial overview of an expression. The use of this strategy shows that there is the need for a glance. A glance is a high-level view of the expression to be read. A glance shows the general shape, length and complexity of an expression. To choose an appropriate browsing or reading strategy the nature of the structure needs to be known. Simply being able to quickly assess the length of an expression would enable a reader to choose between a full utterance and a term-by-term unfolding. A more detailed impression of structure would enable the need for other tactics to be anticipated and grouping ambiguities to be resolved.
Finally, a detailed glance could provide a cognitive framework into which detail gained from reading an expression could be slotted as it was read. The glance does not need to show any of the detail of the expression, for example, the exact nature of the letters, numbers and operators contained within the expression. It is only required to know that a fraction is present, and to gain a rough idea of its size, in order to plan the reading. Such detail is only of real importance when the expression is read in full. These are the basic requirements for a glance: it needs to be quick and at least indicate the length and complexity of an expression.
The need to remove the detail from the glance led to the use of non-speech audio. The number and syntactic types of the content must be preserved in the glance if all the criteria described above are to be fulfilled. If speeded-up speech is used, some of the prosodic form, and therefore the structure, is preserved, but too much information about the type of structure can be lost. Speech was also rejected as a means of providing this glance because a description would often be longer than the expression itself. One of the principles of Mathtalk is that the interface should perform no mathematical interpretation, as is the case with a printed expression. This rules out a high-level mathematical description as a glance. Instead, non-speech sound was explored as an option for an audio glance. Non-speech sounds have the potential for communicating complex messages to a listener quickly, in a form that does not interfere with speech. Also non-speech audio gives an abstract presentation of an expression, which is non-interpretive and hides the expression's content, both necessary components of a glance.
Blattner et al. (1989) define earcons as non-verbal audio messages that are used in the computer/user interface to provide information to the user about some computer object, operation or interaction. Earcons are composed of motives that are short, rhythmic sequences of pitches with variable intensity, timbre and register (Brewster, 1992a). An adapted form of earcons was developed as the technique to provide a glance. The rules which had already been devised for the prosody of spoken algebra were reapplied as the basis for structuring earcons. Some interesting similarities emerged between the guidelines for earcon design (Brewster et al., 1992) and those for algebraic prosody. Both earcons and prosody are described by the same parameters of rhythm, pitch, intensity, duration, register and dynamics.
The principal construction within an earcon is the "motive." In algebraic prosody a tone unit carries one basic unit of information (Halliday, 1970), which is the term, the equivalent of a motive. To improve recognition of earcons, a pause is used to separate motives. The tone unit or term in speech is separated from the next by a pause, improving the parsing into terms and retention of information.
The first note of a motive is usually emphasized, as is the first item in a spoken term. It is recommended to lengthen the final note of a motive, and this also happens in the spoken term. Finally, pitch is used in earcons to discriminate between different elements in the message; the same is true of the role of prosody in speech. An ordinary earcon is abstract: its structure bears no relation to the information it conveys.
In an algebra earcon the rules for construction are indirectly related to the prosodic rules, and directly related to the syntactic structure. Algebra earcons work by representing only the syntactic type and not the instance of an item in an expression. Different musical timbres were used to represent the basic syntactic types within an expression. The sounds used are shown in the table below. The timing, pitch and amplitude characteristics of these sounds were then manipulated according to the rules below. A priority was to establish a rhythm by which a listener could group items together, enabling algebraic structure to be presented. Pitch and amplitude cues also helped in this task.
- Base-level operands: acoustic piano
- Binary operators: silence
- Relational operators: rim-shot
- Superscripts: violin
- Fractions: pan pipes
- Sub-expressions: cello
The rules for constructing algebra earcons are too complex for discussion here, but the following example demonstrates the process. As discussed above an algebraic term is the equivalent of a tone-unit in speech and a motive in an earcon. So the basic unit of an algebra earcon is the term and the bar length of the algebra earcon is determined by the term length.
Details of the term-length calculation can be found in Stevens et al. (1994a). The expression 3x + 4 = 7 has three terms, making a three-bar algebra earcon. The first term '3x' has a length of four beats: a note of one beat for the '3', two beats for the 'x' and one silent beat for the '+' which separates it from the following term. The second term '4' has a length of three beats and the final term '= 7' has a length of four beats. Therefore, the bar length of this earcon is four beats. The first and third terms already fit into this bar length. The second has an extra silent beat added to make it fit this length. This bar length determines the rhythm of the algebra earcon, an important part of the earcon's usability (Deutsch, 1982; Brewster et al., 1992).
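Under the beat values given above, the bar-length rule can be sketched as follows. The function is an illustrative assumption, not the published calculation in full; it covers only the padding step described in the text.

```python
# Illustrative sketch of the bar-length rule for algebra earcons:
# the bar length is the longest term (in beats), and each shorter term
# is padded with silent beats so that every term fills one bar.

def bar_length(term_beats):
    """Return the bar length and the silent beats added to each term."""
    bar = max(term_beats)
    return bar, [bar - beats for beats in term_beats]

# 3x + 4 = 7: terms '3x' (4 beats), '4' (3 beats) and '= 7' (4 beats)
bar, silent = bar_length([4, 3, 4])
print(bar)     # 4
print(silent)  # [0, 1, 0]: one silent beat pads the second term
```

Keeping every term to the same bar length is what establishes the rhythm by which a listener can group items into terms.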
The next stage is to assign pitch and timbre to each item in the algebra earcon. A piano note at C3 is used for the '3' and one at B4 for the 'x'. For the start of the new term, the note representing '4' is again played at C3. The rim-shot timbre used for '=' is played at A4. To emphasize the pitch fall at the end of the expression, the piano note for '7' is played two notes below this at F4.
The example 3(x + 4) = 7 has the same lexical content as the previous expression, but a different syntax and therefore a different earcon. Now there are two terms, '3(x + 4)' and '= 7'. The sub-expression '(x + 4)' has a length representing the two internal terms but with no separation for the '+', giving a length of four beats. The coefficient '3' adds a further beat and a silent beat is added to separate this term from the next. As before the '= 7' has a bar length of four beats. No adjustment for bar length is needed as there are only two terms. The piano timbre used for '3' is played at C3. The sub-expression is played as a single note at A6 with a cello timbre. Finally the '= 7' is played as before.
The two algebra earcons sound very different, despite the same lexical content. The first 3x + 4 = 7 has a group of two piano notes, a single piano note, a rim-shot sound for the equals and a final single piano note. In contrast 3(x + 4) = 7 has a single piano note for the coefficient, a long cello note at a low pitch for the sub-expression and a space followed by the same rim-shot and piano note.
Two experiments were performed to evaluate the ability of algebra earcons to present high-level syntactic information to a listener. These experiments are reported in full in Stevens et al. (1994a). The first experiment used a multiple choice paradigm to probe the basic ability of listeners to recover syntactic information from an algebra earcon and use that information to recognize an expression. The multiple choice design allowed all aspects of the rules for algebra earcons to be tested and the experiment revealed several errors. Despite the errors the algebra earcons seemed to be successful in enabling listeners to recognize syntactic structure.
A second experiment was performed to test the redesigned rules. However, before the recognition part of the experiment, the listener was asked to recall what he or she could about the expression just presented. This technique was used to investigate the type of representation a listener could derive from an audio glance. Evaluation of the data from the recall part of the experiment suggested that the following types of representation were derived from the audio glance:
- Idea of complexity or length.
- Low-level knowledge of complexity: equation or expression, balance of left and right sides. Some knowledge of syntactic items.
- Knowledge of major syntactic features, some detail and knowledge of their order.
- Detailed representation of structure. A framework into which detail could accurately be placed during reading.
All of these representations could be useful as a glance because they would indicate the syntactic complexity of an equation. However, a strong, but inaccurate framework has the potential to mislead a reader. As algebra earcons were only designed to provide a glance, such inaccuracies would not be too great a problem because any glance is not supposed to be entirely accurate. A good representation of the equation would be a bonus for the reader. Recovering information from the glance may be a difficult task, as described by many subjects, but this may be exacerbated by the novelty of the audio glance and the artificial nature of the experiment. In addition, the difficulty of using the audio glance has to be balanced against having to use a full utterance to guide the reading process.
These experiments showed that algebra earcons work. However, discovering whether this audio glance is useful and usable for reading will have to wait for the evaluation of the full Mathtalk program. Nevertheless, there is a clear need for such a glance, and algebra earcons seem to provide a good first attempt at such a facility.
SUMMARY

This paper has described the design of the Mathtalk interface, which aims to turn the reading of algebra using speech from a passive listening process into an active reading process. A simple analysis of why the visual reading process is active results in the key notions of external memory and fast and accurate control of information flow. The permanent print representation is knowledge in the world which, being quickly accessible by the visual system, does not have to be remembered by the reader. The order of precedence instantiated in the spatial printing of the algebra, combined with the fast and accurate control of the information flow by the reader, makes the sighted reading process active. In contrast, the listening reader is usually passive. Typically, the pace of the interaction is dictated by the external device, for example a tape recorder. The control of information flow from such a device is slow and inaccurate, making the reading process tedious and frustrating. The taped speech is an external memory, but the frequently poorly spoken versions of algebra expressions, together with the lack of control, reduce the quality of that external memory.
The ability to generate an accurate computer-based representation of an algebraic expression gives the opportunity to design a reading interface that avoids the problems of poor external memory and lack of control. The interface of the Mathtalk program compensates for the lack of external memory by enhancing the synthetic speech presentation using prosody. Advantage is taken of the improved auditory display by making the reading active, giving the listening reader control over the information flow using browsing.
Prosodic rules have been implemented in the Mathtalk program so that any algebraic expression can be spoken by a voice synthesizer with appropriate prosody. The effects of prosody were then evaluated experimentally and shown to promote the recovery of syntactic structure, enhance the retention of lexical content, and reduce the mental workload involved. The prosodic cues act in much the same way as the rules governing the printing of algebra, instantiating the order of precedence and thus facilitating parsing. By increasing the usability of the synthetic speech, they make the reading task easier.
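One simple way such prosodic rules could work is to scale pause length with structural depth, so that the spoken form carries the same grouping information that spatial layout carries in print. The rule and timings below are invented for illustration and are not Mathtalk's actual prosodic rules.

```python
# Illustrative prosody sketch (not Mathtalk's actual rules): pauses after
# operators are longer at shallower structural boundaries, cueing which
# operation is outermost.

def speak(node, depth=0):
    """Return a list of (word, pause_ms) pairs for a nested expression."""
    pause = max(0, 300 - 100 * depth)   # shallower boundary -> longer pause
    if isinstance(node, str):           # leaf: a number or a variable
        return [(node, 0)]
    op, left, right = node              # e.g. ("plus", "x", ("times", "2", "y"))
    words = speak(left, depth + 1)
    words.append((op, pause))
    words.extend(speak(right, depth + 1))
    return words

# "x plus 2 times y": the pause after "plus" is longer than the pause
# after "times", marking the addition as the outermost operation.
utterance = speak(("plus", "x", ("times", "2", "y")))
```

A listener hearing the longer pause after "plus" can group "2 times y" as a single operand, much as the printed layout groups it spatially.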
The external memory only becomes truly effective with the addition of control over the information flow. Unless a reader can access any part of the structure with speed and accuracy, the burden of memory falls on the reader rather than on the external memory. The browsing functions available in Mathtalk enable any part of an expression's structure to be visited. In contrast to the sighted reader, a visually disabled reader has to mediate his or her control of information flow externally. A command language has therefore been developed and evaluated for the Mathtalk program that enables fast and transparent manipulation of the browsing functions.
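Structured browsing of this kind can be pictured as a cursor moving over the expression tree: into a subterm, out to the enclosing expression, or along to the next sibling. The command names and data layout below are illustrative assumptions; Mathtalk's actual command language differs.

```python
# Sketch of structured browsing over an expression tree: a cursor that
# can move into, out of, and along the structure. Command names are
# illustrative, not Mathtalk's command language.

class Browser:
    def __init__(self, tree):
        self.tree = tree     # nested lists: ["plus", ["times", "3", "x"], "2"]
        self.path = []       # child indices from the root to the current node

    def current(self):
        node = self.tree
        for i in self.path:
            node = node[i]
        return node

    def into(self):          # descend to the first subterm
        if isinstance(self.current(), list):
            self.path.append(1)

    def out(self):           # ascend to the enclosing expression
        if self.path:
            self.path.pop()

    def next(self):          # move to the next sibling subterm
        if self.path:
            self.path[-1] += 1

browser = Browser(["plus", ["times", "3", "x"], "2"])
browser.into()               # cursor now on the subterm "3x"
browser.next()               # cursor now on the subterm "2"
```

Because every move is relative to the structure rather than to a flat stream of words, the reader can revisit any part of an expression quickly and accurately.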
The key to efficient use of the browsing functions, and to the development of reading strategies to control information flow, is planning. A sighted reader uses a glance to assess the basic nature of an expression and to formulate a strategy for reading it. An audio glance, combining the cues used in speech to indicate structure with the ability of abstract sounds called earcons to present structure non-verbally, provides this facility in a form called algebra earcons.
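The idea behind an algebra earcon can be sketched as a mapping from structural units to short musical motifs, so the listener hears the shape of the expression without its lexical content. The motif pitches and durations below are invented for illustration and are not the motifs actually used in the algebra earcon experiments.

```python
# Illustrative "algebra earcon" sketch: each structural unit of an
# expression is rendered as a short motif of (pitch, ms) notes, giving
# a non-verbal overview of the expression's shape. Motifs are invented.

MOTIFS = {
    "term":     [("C4", 120)],                 # one short note per term
    "fraction": [("E4", 120), ("C4", 120)],    # falling pair for a fraction
    "operator": [("G4", 60)],                  # brief note for + or -
}

def earcon(structure):
    """Flatten a sequence of structural labels into a playable note list."""
    notes = []
    for unit in structure:
        notes.extend(MOTIFS[unit])
    return notes

# The glance for "term + fraction" (e.g. 3x + 2/y) as a note sequence:
glance = earcon(["term", "operator", "fraction"])
```

Hearing this brief motif sequence, a listener could recognize "a simple term added to a fraction" and plan a browsing strategy before reading a single symbol.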
Evaluation of algebra earcons has demonstrated that listeners can derive a considerable amount of structural information from an audio glance. At best, the listener gains a detailed structural representation into which he or she can place lexical information as the expression is read. Such a representation should allow the reader to gain a quick overview of an expression and make decisions about how to use the available browsing functions, without having to listen tediously to a potentially long expression and then reread it by browsing.
An issue emphasized in this paper is the evaluation of the interface as it is developed. Each component of the Mathtalk interface has been evaluated separately to ensure that it achieves its task. Eventually the whole Mathtalk interface will be evaluated in a task-based manner to ensure its usability. The Mathtalk program tackles only the problem of designing an interface to promote active reading of algebraic notation. A project called "Maths" is being funded by the European Tide (Technology for the Integration of Disabled and Elderly People) Initiative to develop an algebra workstation. (See a paper on Maths in this issue of "Information Technology and Disabilities.")
This will be a multimedia system that will enable visually disabled school children to read, write and manipulate standard algebraic notation. The workstation will use speech, non-speech audio and Braille as output media. The writing and manipulation of algebra are even more complex than the reading interactions described here for the Mathtalk program. However, the principles outlined for the design of a reading interface form a firm base for this future development.
Work described in this paper has been supported by Research Studentship 91308897 from the UK Engineering and Physical Sciences Research Council and by the European Union through its Tide Initiative (Project Number 0133).