Digital Talking Books On A PC: A Usability Evaluation Of The Prototype DAISY Playback Software

Sarah Morley
National Centre for Tactile Diagrams
University of Hertfordshire, Hatfield, Herts, UK

This paper describes the design and evaluation of the first system to play digital talking books on a PC: the DAISY Playback Software V1.0. The features of the software for navigating through structured digital audio are described. A detailed usability evaluation of this prototype software was designed and conducted, in which 13 blind or partially sighted participants completed a series of realistic tasks and answered detailed usability questions on the system. Recommendations for improvements are presented which might inform designers of similar systems, such as other digital talking book systems or WWW browsers.


Blind readers generally have to rely on Braille books or audio-taped books (called talking books) for their study, work or leisure, and there are many known problems in using these media for structured material such as text books [see 1, 2, 12]. For example, a large number of Braille volumes or audio-tapes may be required for one book, and readers often find it difficult and time-consuming to navigate through and locate specific information especially from a serial audio recording on one or many tapes. It is also hard to skim-read audio-taped talking books in a useful manner. Many readers would also appreciate a higher quality of talking book output.

Digital talking books are the next generation of talking books. Now a whole book or even several whole books can be recorded onto one compact disk (CD) instead of onto many audio-tapes, with high-quality recording and output, and readers can jump quickly to different parts of a book. Digital talking books are intended to support the high demands of rapid access to complex, highly structured material. Additional goals of their development are to provide a system which is easily integrated into existing talking book production systems, which is able to make use of future technologies, and which gives access to the information in a book that is equal to, or better than, the access a fully able reader has to a print book.

User requirements have been drawn up by members of the European Blind Union [6], and these have been influential in the design of digital talking book systems. They cover a range of considerations, from the type of material recorded and the format used to store the material, to the organization of library and distribution services and the design of players, and they identify the need for developing standards in the structuring and presentation of digital books. Similar research work is also being undertaken by The National Library Service for the Blind and Physically Handicapped, Library of Congress in the USA, and the National Information Standards Organization (NISO) in order to specify performance characteristics and to find suitable standards for such systems [4, 10].

As part of a global initiative involving many major blindness organizations and industrial companies, a system for recording and coding structured digital audio, called DAISY (Digital Audio-Based Information System), was developed [by 7] at the request of the Swedish Braille and Talking Book Library. The DAISY recording system allows digital audio to be recorded and coded at several levels of detail: phrases, groups of phrases and section headings. This allows readers rapid non-serial access to information on a digital talking book. There are currently two prototype systems for playing and navigating DAISY Talking Books. They differ in their design in order to support a range of potential end-users.

a) The PlexTalk Player [developed by 13]. This stand-alone device is similar to the current talking book machines, in that it is portable and has a small number of buttons on the casing. It allows readers to play the DAISY book, jump through sections, phrases or paragraphs, use page and time information, and to insert simple bookmarks. This system is likely to appeal to readers who require simple and/or portable access to their books. It has been evaluated in a world-wide field trial coordinated by the Royal National Institute for the Blind, UK.

b) The DAISY Playback Software for a PC [developed by 7]. This software allows users to read and navigate DAISY books on their Windows PC. The Software has its own digitized system messages and can therefore be used without a screenreader. Readers can jump through headings, phrases and groups of phrases, insert bookmarks, use page information, search the section headings, and perform many other useful functions. This system is expected to appeal to readers who require more sophisticated access to their talking books.

Both systems have a digitized female voice for the interface prompts and commands, so the entire reading system consists of digitized human voices which are very pleasant to listen to.

One of the further goals of digital talking book development is the addition of linked synchronous text files so that readers have simultaneous linked access to the full text of their digital talking book. This concept was demonstrated in the EU DigiBook Project, using digital audio books and SGML-encoded electronic documents [5].

The Sensory Disabilities Research Unit was asked by the Royal National Institute for the Blind, UK, (a member of the DAISY initiative) to conduct a usability evaluation of the prototype DAISY Playback Software. The evaluation results presented in this paper can be used to improve the Software.


Books recorded using the DAISY approach are typically structured to match the organization of the printed book as far as possible, usually including page numbers in the recording, making it very easy for blind and sighted readers to use the same information. A range of DAISY books have been recorded to suit different types of readers, from novels with little structuring, to highly structured books such as a cookbook, tour guide, textbook, and the four gospels.

Books vary in their levels of sectioning. For example, a novel might only have 6 chapter headings, and no further sub-headings. A textbook, on the other hand, might have 15 chapter headings, with each chapter divided into many sub-sections, each of which is also broken into further sub-sections. DAISY allows access to all these levels of section headings. Once the reader finds the required heading they can simply start reading the section. In addition, DAISY books are encoded during recording at the levels of a phrase (approximately a sentence) and a group of phrases (approximately a paragraph). Readers are therefore able to read phrase by phrase, and group by group. The text of headings is encoded in DAISY, allowing users to perform text searches in the headings. Currently the body of each section is not encoded at the text level, and therefore full text searches cannot be performed, but this is under development.
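The levels of structure described above can be pictured as a small data model. The following Python sketch is illustrative only and is not the DAISY file format: the class names (Phrase, Group, Section), their fields, and the sample book are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phrase:
    """Roughly one sentence of recorded audio (audio_ref is a hypothetical handle)."""
    audio_ref: str


@dataclass
class Group:
    """Roughly one paragraph: an ordered run of phrases."""
    phrases: List[Phrase] = field(default_factory=list)


@dataclass
class Section:
    """A heading at some level, its body, and any nested sub-sections."""
    heading_text: str   # heading text is stored, so headings can be searched
    level: int          # 1 = chapter, 2 = sub-section, and so on
    groups: List[Group] = field(default_factory=list)
    subsections: List["Section"] = field(default_factory=list)


def count_headings(section: Section) -> int:
    """Count this heading plus every heading nested beneath it."""
    return 1 + sum(count_headings(s) for s in section.subsections)


# A made-up chapter with one paragraph of two phrases and two sub-sections.
chapter = Section(
    "Chapter 1",
    level=1,
    groups=[Group([Phrase("phrase-1"), Phrase("phrase-2")])],
    subsections=[
        Section("First sub-section", level=2),
        Section("Second sub-section", level=2),
    ],
)
```

In this picture a novel would be a short list of level 1 sections with no sub-sections, while a textbook would nest sections several levels deep.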


The DAISY Playback Software was designed for advanced users of digital talking books who require fast and effective access to a range of structured talking books on their PCs for education, work or leisure. There are several advantages provided by the Software over the PlexTalk Player. Firstly, the Software is easily integrated into the users' PC and produces all required output including system messages in digitized audio, and therefore a separate screenreader is not required. Secondly, the Software is usable by blind, partially sighted and sighted users, since it is keyboard and mouse driven, with a display capable of a range of font sizes. Thirdly, text searches in the headings can be performed (not possible with the PlexTalk Player since it has no alpha-numeric keyboard).

The DAISY Playback Software commands were designed to be consistent and easy to learn, and are available using both the standard keyboard and the numeric keypad (numpad) for most commands. A description of the commands and their keyboard equivalents (in italics) is presented below. Each of these commands can also be executed using the visual display with a mouse (see Figure 1). The commands provide users with many of the features shown in other research to be necessary for navigating through and controlling audio material [2, 11, 12].

Figure 1:

The DAISY Playback Software Visual Interface, Version 1.0

Showing some of the section headings in "The Facts about Alcohol, Aggression and Adolescence" by Coggans and McKellar.

The reading position is in Chapter 1.

Sub-sections are shown indented from their parent section.

The section headings are searchable.


Readers have fine control over the amount of information presented to them with one reading command, and can specify whether to read a phrase, jump to a new paragraph, or jump to the next heading on the same level, to a sub-heading or a parent heading. Readers execute the 'Play' command to read continuously, and can still jump through the material using the relevant keys and the output will continue reading. To simply navigate rather than play continuously, readers execute the 'Stop' command, and while in 'stop mode' the navigation commands output the heading or the phrase and then stop. If the reader tries to jump to the next sub-section or the next group when in fact there is no further sub-section or group, the system jumps to the next heading it can find. Thus users never find 'dead-ends' - the system automatically takes them to the next suitable item for reading.

Play/Stop. (Spacebar) Starts continuous playback from the start of the phrase. Stop halts the output immediately.

Next Phrase. (Right arrow / numpad 6 or plus) Jumps to the next phrase in the book.

Previous Phrase. (Left arrow / numpad 4 or minus) Jumps to previous phrase in the book.

First Phrase. (Home / numpad 7) Jumps to the first phrase of the current section.

Last Phrase. (End / numpad 1) Jumps to the last phrase of the current section.

Next Group. (Down arrow / numpad 2) Jumps to the first phrase of the next group (paragraph).

Previous Group. (Up arrow / numpad 8) Jumps to the first phrase of the previous group (paragraph).

Next Page. (PgDn / numpad 3) Jumps to the first phrase of the next printed page.

Previous Page. (PgUp / numpad 9) Jumps to the first phrase of the previous printed page.

Next Section. (Ctrl + down arrow / Ctrl + numpad 2) Jumps to next section heading in current level. e.g. from 1.0 to 2.0, or from 1.1 to 1.2, etc

Previous Section. (Ctrl + up arrow / Ctrl + numpad 8) Jumps to previous section heading in current level. e.g. from 2.0 to 1.0, or from 1.2 to 1.1, etc

Level Up. (Ctrl + left arrow / Ctrl + numpad 4) Jumps to the parent heading. e.g. from 1.2 to 1.0, or from 1.2.5 to 1.2, etc

Level Down. (Ctrl + right arrow / Ctrl + numpad 6) Jumps to the child heading. e.g. from 1.0 to 1.1, or from 1.1 to 1.1.1, etc

First Section. (Ctrl + Home / Ctrl + numpad 7) Jumps to the first heading at current level. e.g. from 5.0 to 1.0, or from 1.9 to 1.1, etc

Last Section. (Ctrl + End / Ctrl + numpad 1) Jumps to the last heading at current level. e.g. from 1.0 to last section 5.0, or from 1.1 to last section 1.9, etc.

Jump Forward 15 Sections. (Ctrl + PgDn / Ctrl + numpad 3) Jumps forward 15 section headings (includes any sub-sections). e.g. from 1.0 to 15.0 if no sub-sections between them.

Jump Back 15 Sections. (Ctrl + PgUp / Ctrl + numpad 9) Jumps backwards 15 section headings (includes any sub-sections). e.g. from 15.0 to 1.0 if no sub-sections between them.
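The section-level commands listed above, together with the rule that a jump with no valid target moves to the next heading the system can find, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Software's implementation: the flat list of (level, label) headings, the function names, and the choice to stay in place at the end of the book are all hypothetical. The text does not say what Level Up does at a top-level heading, so level_up is shown without the fallback.

```python
from typing import Callable, List, Optional, Tuple

Heading = Tuple[int, str]  # (level, label), listed in reading order
Command = Callable[[List[Heading], int], Optional[int]]

# A made-up book: 1.0 > 1.1 > 1.1.1, then 1.2, then 2.0 > 2.1
BOOK: List[Heading] = [
    (1, "1.0"), (2, "1.1"), (3, "1.1.1"), (2, "1.2"),
    (1, "2.0"), (2, "2.1"),
]


def next_section(book: List[Heading], i: int) -> Optional[int]:
    """Next heading at the same level under the same parent, or None."""
    level = book[i][0]
    for j in range(i + 1, len(book)):
        if book[j][0] < level:   # left the parent section: no further sibling
            return None
        if book[j][0] == level:
            return j
    return None


def level_down(book: List[Heading], i: int) -> Optional[int]:
    """First child heading of the current section, or None."""
    j = i + 1
    if j < len(book) and book[j][0] > book[i][0]:
        return j
    return None


def level_up(book: List[Heading], i: int) -> Optional[int]:
    """Parent heading of the current section, or None at the top level."""
    level = book[i][0]
    for j in range(i - 1, -1, -1):
        if book[j][0] < level:
            return j
    return None


def jump(book: List[Heading], i: int, command: Command) -> int:
    """Apply a command; with no valid target, fall back to the next heading."""
    target = command(book, i)
    if target is None:
        # Assumption: at the very end of the book the position is unchanged.
        target = i + 1 if i + 1 < len(book) else i
    return target
```

With this fallback, jump(BOOK, 3, level_down) lands on "2.0" even though "1.2" has no sub-sections, so a listener who hears only the heading cannot tell whether the jump reached a child, a sibling or a higher-level section.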


Users have fine control over the audio output and the speed at which the book is presented.

Mute. (Esc / Enter / numpad Enter) Halts both book and interface output immediately.

Speeds: Low, Standard, Brisk, High, Turbo. (L / S / B / H / T) Elongates or reduces length of pauses between phrases. Some speech compression is used at the higher speeds.


Overview information is provided by the commands: 'Section Information' which outputs the current section heading, and repeated commands output all parent headings; 'Page Information' which outputs the current printed page number; and 'Current Position' which announces a number assigned to the current section. This last command can be a useful orientation aid if the author did not include section numbers in the original book.

Section Information. (I / numpad 5) Announces the heading of the current section. Repeated keypresses announce all parent headings.

Current Position. (C / Ctrl + numpad 5) Assigns and announces a number to each section (not related to actual numbering in print book).

Page Information. (P) Announces current printed page number (if recorded).

Goto Page. (G) Opens dialog box, type page number to jump to. Enter executes, Esc cancels. Jumps to first phrase on page.


Readers can insert up to 9 numbered bookmarks with a shortcut command, and jump to them with a single keypress. The search task currently looks for strings within the headings of the book, after the current reading position. Readers can also jump to the next and previous occurrence of the located string.

Set Bookmark. (Ctrl + 1-9) Places the specified bookmark number at the current reading position.

Goto Bookmark. (1-9) Jumps to phrase where the specified bookmark was set.

Find String. (F) Opens dialog box where user types text string to search in headings after current reading position. Alt + R to read the entered string, Esc to cancel. Enter to find string. Jumps to first occurrence.

Find Next. (V) Jumps to next occurrence of string.

Find Previous. (R) Jumps to previous occurrence of string.
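The heading search can be sketched as a scan over the heading texts, starting just after the current reading position for Find String and Find Next, and scanning backwards for Find Previous. The sketch below is hypothetical: the function names are invented, and case-insensitive substring matching is an assumption, since the text does not say how strings are compared. The sample headings use those mentioned for Figure 1 plus two placeholders.

```python
from typing import List, Optional


def find_next(headings: List[str], query: str, position: int) -> Optional[int]:
    """Index of the first heading after `position` that contains `query`."""
    needle = query.lower()  # assumption: matching ignores case
    for i in range(position + 1, len(headings)):
        if needle in headings[i].lower():
            return i
    return None


def find_previous(headings: List[str], query: str, position: int) -> Optional[int]:
    """Index of the nearest heading before `position` that contains `query`."""
    needle = query.lower()
    for i in range(position - 1, -1, -1):
        if needle in headings[i].lower():
            return i
    return None


HEADINGS = [
    "Chapter 1",                          # placeholder heading
    "Chapter 2: Drinking Patterns",
    "How much do young people drink?",
    "Methodological issues",
    "Chapter 3",                          # placeholder heading
]

first = find_next(HEADINGS, "drink", 0)       # Find String from the start
second = find_next(HEADINGS, "drink", first)  # Find Next
back = find_previous(HEADINGS, "drink", second)  # Find Previous
```

Because only heading text is searched, a string that occurs solely in the body of a section is not found, which is the limitation participants later commented on.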


The Software provides a DAISY-style Audio Help System, giving readers headings for the common activities, leading to lists of keyboard commands for each task. This is navigated in the same way as a normal DAISY book. The Windows-based Visual Help System for low-vision users contains the same information but is not spoken.

Audio Help. (F1) Jumps to Audio Help System, providing keyboard commands for all commands, presented as a DAISY book. Esc to leave.

Visual Help. (Shift + F1) Jumps to Windows-based DAISY Playback Help System for keyboard command information. Esc to leave.

Open. (O) Opens dialog box for selecting drive and title of book. Arrow keys to select a title/initial letter for drive, Enter to open, Esc cancels.

Quit. (Q) Prepares to quit DAISY Playback Software. Press Y to confirm, any other key to cancel.


The main aims of this usability evaluation of the DAISY Playback software were to:


The evaluation had a multi-faceted design using a range of complementary objective and subjective usability measures found to be useful in other research [8, 11, 12]. Thirteen blind or partially sighted participants (9 males and 4 females, aged 18-75) took part in the evaluations, which is above the 8-10 participants recommended for usability evaluations [see 9], and all participants used the software non-visually. Participants were representative of potential end-users of the system, having a range of experience of audio books and complex auditory navigation, thus increasing the chance that most usability problems would be revealed, as well as identifying preferences of both novice and expert users for a range of tasks. The participants included those who were employed, retired and studying; almost all used computers every day for work, education and/or leisure, and 9 participants were frequent audio-tape talking book users. Each participant completed a training session, a practice session, an evaluation session consisting of tasks in 2 or 3 different books, and a questionnaire-based interview. Usability issues and proposed improvements to the system were discussed in detail with participants.


The evaluations were mainly conducted on a 100MHz Pentium PC under Windows 95, with 32MB RAM and a 2GB hard disk. A standard training procedure was developed and participants were introduced to the concepts of structured digital talking books. Participants completed a range of practice tasks to familiarize themselves with the system. A set of evaluation tasks was devised to test realistic use of 2 or 3 different books using all the commands. The order of the tasks was not pre-determined, but each task was to be completed while reading each book. Examples of these tasks include: "Let's skim through the main chapter headings to see what this book contains"; "Can you get to Section 2.5 and put a numbered bookmark there?"; "Show me how to read phrase by phrase, and skip group to group"; "Can you find out the heading of the section you're reading, and its parent sections?"; "What page number are we on, and can you now jump to page 10?"; etc. Detailed notes were taken of the participant's performance on these tasks and were used to supplement the ratings participants gave about the Software and the books.

A standard usability questionnaire was developed using 5-point rating scales, open-ended and closed questions in order to record in detail: ratings and opinions on usability, problems encountered and on proposed improvements to the software; participants' own suggestions for improvements; and any specific problems encountered with different books.


Results are based on observed task performance and the participants' own ratings and comments on the system. All ratings given are the means (m) on a 1 (low) to 5 (high) scale, for the given number of participants (n).


Overall the DAISY Playback Software was well-received, with participants praising the level of access and potential for using structured books. They liked the stand-alone nature of the Playback Software which meant they could use it without necessarily having a Windows 95 screenreader.

As shown in Table 1, participants were generally very impressed with the Software. They were able to start using it independently quite quickly, and rated it as fairly easy to learn, quite easy to use, and quite easy to remember the commands for, and they liked using the Software very much. Most participants preferred using the standard shortcuts rather than the numpad, since the shortcuts were more memorable and more easily located in distinct groups on the keyboard, although both methods were rated as equally easy to use.

Table 1: General Mean Usability Ratings

Usability Issues
Ease of Learning to Use
Overall Ease of Use
Ease of Remembering Commands
How Much Liked Using Software

Participants were keen to express their pleasure at using the DAISY Playback Software, and said that it was easy not only to read books using the basic commands, but also to access highly structured information in a non-serial manner. The positive features mentioned most frequently by participants include:

Some participants commented that they would have found a Braille printout of the Table of Contents useful as an overview of the book's contents and structure. All participants stressed that they require character and word navigation (which will later be possible using a synchronous text file and speech synthesis).

Despite the many positive features of the Software, there were distinct problems in navigation caused by a lack of information and feedback, which meant the system could not be used to its full potential. However, many simple solutions to these problems were proposed and participants gave their opinions on these improvements. These are discussed in the following sections.


Generally, participants found reading very easy and rated all the implemented reading commands to be very useful and important (overall mean rating for ease of reading was 4.72, n=13); the ratings for the different reading commands are presented in Table 2. Participants remarked that the reading commands were consistently implemented, and they were able to remember and locate the commands easily. However, when reading group-to-group, participants expected to hear the whole group rather than just the first phrase of the group, which became very confusing.

Table 2: Mean Usability Ratings for Reading and Navigation Commands


Ease of Use
Play/Stop Command
Phrase-Phrase Reading
Group-Group Reading
Navigating Around Headings in Book
Jump to First/Last Section in Level
Jump Back/Forwards 15 Sections

Navigation through the headings in the book was very difficult for participants (m=2.46) [see Table 2], who stressed that effective navigation was essential. In addition, when navigating through headings participants found it difficult to know whether a particular section contained sub-sections. Also, when reading the body of a section, participants were unable to distinguish between a phrase and a heading. The weaknesses of the system are described in detail below followed by proposed solutions. Participants felt that although the 'Jump Back/Forwards 15 Sections' command was easy to use, it was not very useful and most participants never used this command in the tasks.


The main weakness of the DAISY Playback Software when used non-visually is the lack of information about the book's structure. Sighted users can easily recognize the organization of the book into sections shown by the indenting of sub-sections (see Figure 1). This visual cue leads sighted users to choose the correct commands for navigating through sections and sub-sections, and they can distinguish a heading from a phrase simply by matching the output with the heading displayed on the screen.

However, auditory readers are given no similar cues, and consequently find it extremely difficult to navigate effectively between section levels, and cannot identify section headings easily when reading the body of a section.


An additional problem which increased participants' confusion is that the reading and navigational commands jump to another group or section level when a "dead-end" is reached (i.e. if there are no further phrases, groups or headings in the current level). For example, if a reader tries to jump to a sub-section which does not exist, instead of indicating to the reader that there are no sub-sections to jump to (i.e. it is a dead-end), the system jumps to the next heading - which might be a heading at the same level or even at a higher level. Thus users believe this heading to be a sub-section, when in fact it may not be. Although this "jumping" of reading and navigational commands might have been designed to reduce the number of commands required for navigating, participants found it extremely confusing and disorienting and often ended up with a completely incorrect model of the structure of the book.


These two navigational problems can be broken down into the information required by auditory readers, which had already been included in the visual display:

(a) Is this phrase a section heading and what level is it?

(b) Does this section contain any sub-sections?

Participants rated both of these as being extremely important to know while reading and navigating, and discussed different proposals for presenting the information. The proposed use of non-speech sounds was enthusiastically accepted for providing short, unobtrusive and non-verbal information. To indicate a heading and its level, for instance, a different number of "bips" could be played. Using Figure 1 as an example, the level one chapter heading might be presented as "bip Chapter 2: Drinking Patterns", a sub-section (a level two heading) might be "bipbip How much do young people drink?", and a further sub-section (a level three heading) might be "bipbipbip Methodological issues". However, this solution would not be useful in a book with many levels of headings, and does not indicate the actual section number.

Some test scenarios have found that using the appropriate numbers of tones to indicate section numbering is ineffective and requires too much effort to comprehend [e.g. 14]. Other possibilities are to use different tones, rhythms or instruments to indicate the level of a heading [e.g. 3], which is likely to be more effective in a highly structured book. In this way, very short, subtle sounds could be played quickly before each heading. Care should be taken that the sounds are short, but that they contain the maximum information possible without requiring concentrated effort to comprehend.
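The simplest of these proposals, one "bip" per heading level, can be written down directly. The Python sketch below is hypothetical and stands in for real audio: the function name is invented, and the cue is shown as text in front of the heading rather than as a sound.

```python
def announce_heading(level: int, title: str) -> str:
    """Prefix a heading with one 'bip' per level (text stands in for audio)."""
    if level < 1:
        raise ValueError("heading levels start at 1")
    return "bip" * level + " " + title


# The three headings used as examples for Figure 1.
level_one = announce_heading(1, "Chapter 2: Drinking Patterns")
level_two = announce_heading(2, "How much do young people drink?")
level_three = announce_heading(3, "Methodological issues")
```

The length of the cue grows with the depth of the heading, which illustrates why this scheme becomes unwieldy in a book with many levels and why distinct tones or instruments per level may scale better.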

The proposed use of non-speech sounds to indicate whether a heading contains any sub-sections was also well-liked. Some participants favoured a sound played after a section heading to indicate the presence of further sub-sections, whereas others favoured only a 'dead-end' sound when a section heading had no sub-sections.

Participants were adamant that the reading and navigational commands should not 'jump' and that the system should instead inform them of a 'dead-end', using a non-speech sound. In this way, readers would be confident that the commands only accessed the appropriate information which would enable them to build up the correct structure of the book.
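The behaviour participants asked for can be sketched as navigation that reports a dead-end instead of moving, plus a cue after any heading that has sub-sections. This Python sketch is a hypothetical illustration of the proposal, not a description of the Software: the flat (level, label) heading list, the function names, and the text placeholders standing in for non-speech sounds are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

Heading = Tuple[int, str]  # (level, label), listed in reading order
Command = Callable[[List[Heading], int], Optional[int]]

# A made-up book: 1.0 contains 1.1 and 1.2, followed by 2.0
BOOK: List[Heading] = [(1, "1.0"), (2, "1.1"), (2, "1.2"), (1, "2.0")]

DEAD_END = "<dead-end sound>"              # placeholder for a non-speech cue
HAS_SUBSECTIONS = "<sub-sections sound>"   # placeholder for a non-speech cue


def has_subsections(book: List[Heading], i: int) -> bool:
    """True when the heading that follows is at a deeper level."""
    return i + 1 < len(book) and book[i + 1][0] > book[i][0]


def level_down(book: List[Heading], i: int) -> Optional[int]:
    """First child heading, or None when there are no sub-sections."""
    return i + 1 if has_subsections(book, i) else None


def next_section(book: List[Heading], i: int) -> Optional[int]:
    """Next heading at the same level under the same parent, or None."""
    level = book[i][0]
    for j in range(i + 1, len(book)):
        if book[j][0] < level:
            return None
        if book[j][0] == level:
            return j
    return None


def announce(book: List[Heading], i: int) -> str:
    """Heading label, followed by a cue when it contains sub-sections."""
    label = book[i][1]
    return f"{label} {HAS_SUBSECTIONS}" if has_subsections(book, i) else label


def jump(book: List[Heading], i: int, command: Command) -> Tuple[int, str]:
    """Move only to a valid target; otherwise stay put and signal a dead-end."""
    target = command(book, i)
    if target is None:
        return i, DEAD_END
    return target, announce(book, target)
```

Here jump(BOOK, 1, level_down) leaves the reader on "1.1" and plays the dead-end cue, so every heading that is announced is guaranteed to be the kind of heading the command asked for.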


When a DAISY book is inserted, the system jumps to the location that was last read. This was well-liked and rated as being very useful. Some participants requested that they be able to choose whether to open the book at their last location or at the beginning, possibly with separate commands.

While navigating, participants used various commands to help them find out where they were when they were disoriented (due to the problems outlined above). They made good use of the orientation commands provided by the Software, and all participants used the 'Section Information', 'Current Position', 'Page Information' and 'Goto Page' commands frequently to determine their current location and to re-orient themselves. The 'Section Information' command was especially helpful for orientation, since repeated keypresses report all parent headings.

Participants were also very pleased with the page information and page navigational facilities and commented that all books should include the printed page information in the audio recording, and that it would be useful to hear the current page number with the total number of pages. The mean usability and usefulness ratings for the orientation commands are given in Table 3.

Table 3: Mean Usability Ratings for Orientation Commands


Ease of Use
Remembers location when insert disk
Section Information
Current Position
Page Number
Goto Page
Page-to-Page jumps

Most of these commands were easy to use, but the lack of structural information and feedback sometimes confused readers. In addition, the assigned 'Current Position' number bears no relation to the actual sectioning of the book, which can be very confusing and not very useful: e.g. Chapter 1 might be the 5th heading in the book, and therefore is assigned the number 'Section 5'. The proposed solution is to classify introductory headings starting with 'Intro 1' and then classifying sections from the first main heading as 'Section 1', to make the structure more useful to readers.
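The proposed renumbering can be sketched as follows. This Python sketch is hypothetical: the function name is invented, the sample headings are made up, and it assumes the position of the first main heading is known (for example, marked by the book's producer), which the text does not specify.

```python
from typing import List


def position_labels(headings: List[str], first_main: int) -> List[str]:
    """Label headings before `first_main` as 'Intro n', the rest as 'Section n'."""
    labels = []
    for i in range(len(headings)):
        if i < first_main:
            labels.append(f"Intro {i + 1}")
        else:
            labels.append(f"Section {i - first_main + 1}")
    return labels


# Made-up front matter followed by the main chapters.
HEADINGS = ["Title", "Contents", "Preface", "Foreword", "Chapter 1", "Chapter 2"]
labels = position_labels(HEADINGS, first_main=4)
```

In this example Chapter 1 is the 5th heading in the book but is announced as 'Section 1' rather than 'Section 5'.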


Participants were very impressed with the bookmark facility, especially that they could set bookmarks at the phrase level. They found setting and locating bookmarks very easy and very useful (see Table 4). Most participants said that 9 bookmarks would be sufficient for everyday reading, but would like a command indicating the number of their highest set bookmark, the ability to use more bookmarks, and a bookmark manager to assign names and review their bookmarks.

When setting bookmarks participants requested that the system reads the phrase at which it was set as well as the bookmark number as confirmation. In addition, participants suggested that when reading the book, the system should indicate the number of any set bookmarks when they are reached.

Table 4: Mean Usability Ratings for Bookmarks and Search Commands



Ease of Use
Set Bookmark
Goto Bookmark
Search Next/Previous

Participants were very pleased to have a search function on their talking book, and a basic search for text in the headings was not difficult to do (see Table 4). However, participants requested access to the full book, not just the headings. The search next/previous commands were found slightly hard to use because participants could not remember the keys to use. The spatial arrangement needs to be made clear in Help and other documentation for these keys to be memorable.


Participants welcomed the capability to change the reading speed of the book (usefulness rating m=4.23, n=13), but were disappointed that the speeds merely changed the length of the pauses rather than changing the actual speed of the output. Shorter/longer pauses were appreciated at the different speeds, but participants stressed that they would also require the speed of the speech output to increase/decrease. In addition, participants requested that a single control to increase/decrease the speed was used, rather than 5 different shortcut keys for the 5 speeds. A volume control was requested using a similar increase/decrease control. The Help System was well-received (usefulness rating m=4.85, n=13), especially since it was presented in a DAISY book format and the system guidance was helpful in leading new users through the help system.


There are several commands which are not provided by the Playback Software which participants reported they would find useful. Some of these commands were proposed to participants for their assessment, and others were suggestions made by the participants themselves. Mean ratings of potential usability are given where possible (m=mean rating, n=number of participants).

  • Repeat. (No ratings given). Most participants requested a repeat command to re-play the most recent output (whether the book's contents or system information). This would save readers having to go back and forwards to hear the information again.
  • Overview of Structure of Whole Book. Potential usefulness m=4.38, n=13. This command would provide an overview of the book's sectioning, indicating the number of levels of heading for example.
  • Overview of Structure of Current Main Section. Potential usefulness m=4.54, n=13. A slightly more useful command might give detail about the structure of the current section.
  • Overview of Book's Contents. (No ratings given). Some participants suggested that the producers might write a short overview of the book's contents to give readers a summary of what the book is about.
  • Jump to Start/End of Book. Potential usefulness m=4.46, n=13. Almost all participants said that they would have liked a command which jumped them straight to the very start or very end of the book. These commands should be similar to those used in other software.
  • Time Information. Potential usefulness m=3.23, n=13. Half the participants requested that the system could inform them of time passed and time remaining, to give them an impression of their location in the book. Some participants thought this was information they would not need very often.
  • Bookmark Management. (No ratings given). Several participants requested the facility to (a) set more than 9 bookmarks, (b) assign names to their bookmarks not just numbers, (c) view a list of their bookmarks and jump to them from the list as well as by number, and (d) hear the highest bookmark number already set.


These usability evaluations showed that the DAISY Playback Software is fairly easy to use and supports a range of users working with a variety of books for different tasks. All participants expressed their keen interest in using the system again for both work and leisure, and were looking forward to a general release of the Software and books.

The current serious navigational problems caused by the lack of structural information can be easily remedied with the inclusion of non-speech sounds, and together with the addition of a few new commands, the system will become very usable and widely acceptable. To summarize the main areas requiring attention:

  • Addition of non-speech sounds to indicate: (a) level of section, (b) further sub-sections, (c) 'dead-end', (d) play/stop toggle, (e) 'please wait'
  • Prevent commands cycling - provide 'dead-ends'
  • Improvements to several commands to increase their usability
  • Additional overview commands
  • Additional navigational commands
  • Additional facilities e.g. bookmark management
  • Word and character spelling with speech synthesis
  • Full text search
  • User configuration for sounds used and default settings
  • Consistent recording and coding of books

The prototype system as it stands provides excellent access to structured information, but these improvements would significantly increase the usability and acceptability of the Software. Further user-involvement and evaluations are strongly encouraged to ensure appropriate design. Similar considerations may inform designers of other audio interfaces.


