Explore the use of hearing and touch in human-computer interaction through Sensational Computing, addressing the digital divide with speech audio interfaces, non-speech sound, handwriting recognition, and tangible computing. Discover how the Simputer project offers user-friendly, multilingual computing to aid the illiterate. Learn about speech recognition systems and the power of different senses in computing interfaces.
Outline of Unit 13: Sensational Computing • The digital divide • Speech audio interfaces • Non-speech sound • Handwriting recognition • Tangible computing and gesture computing • Ubiquitous computing Arab Open University - Riyadh
Sensational computing The study of how the senses of hearing and touch can be used to represent information and to mediate human–computer interaction. Unit 12 showed how to represent information using the visual sense. Why is it important to use other senses to represent information?
Sensational computing Some reasons for using other senses to represent information are: To make information accessible to as many people as possible. To exploit these senses and modes of information representation to overcome the digital divide. To increase the richness of communication between humans and computers. Some kinds of information, such as music, are best represented in non-visual forms.
The digital divide The information gap between developed countries and developing ones. It also describes the information gap between people within the same country. A major reason for this gap is the poor availability of the computers and telecommunication infrastructure needed to use the information.
The digital divide Other reasons for the gap: The cost of computers and the lack of technical infrastructure. Computers are physically large and need space in which to be sited and used. The need for an electricity distribution system. Many people in developing countries are illiterate. Contributing factors are that many languages, especially African ones, have no written form, and that many languages are not supported by computer applications. There are still many people around the world who are not familiar with the English language.
The Simputer To develop a solution to the digital divide, we need user-friendly interfaces that do not rely on text but make use of different senses. One such solution is the Simputer. Simputer stands for Simple, Inexpensive, Multilingual People’s Computer. The Simputer project has been led by a group of Indian scientists and engineers.
The Simputer The Simputer is a self-contained, hand-held computer designed for use in environments where computing devices such as PCs are deemed inappropriate. Because of its low cost, it was also considered a suitable way to bring computing power to developing countries. It was designed to help the poor and illiterate join the information age.
The Simputer The Simputer has software that reads web pages aloud in native Indian languages, so that the 35% of Indians who cannot read can find out about aid projects targeted at them. To keep the cost down, open standards have been used, and the Simputer runs the Linux operating system.
Information Markup Language The primary interface of the Simputer is a browser that can render the Information Markup Language (IML). IML is a new XML application developed by the project team. One of IML’s main roles is to specify how pages should be displayed on the Simputer and which text on a web page should be read out. The text can be turned into artificial-sounding but nevertheless understandable speech in languages such as Hindi, Kannada and Tamil, using a library of sounds stored on the computer.
Speech audio interfaces Speech becomes important when providing interfaces for people who are illiterate or have poor literacy skills. Speech is also important to people with visual impairment, and in situations where a keyboard cannot be used or the eyes cannot read a computer display. Speech recognition: computers recognize spoken words. Speech synthesis: computers utter recognizable speech.
Speech recognition There are two uses for speech recognition systems: Dictation: translation of the spoken word into written text. Computer control: control of the computer and its software applications by speaking commands. Speech recognition is one of the most desired assistive technologies. People believe speech recognition is a natural and easy method of accessing the computer.
Speech recognition As the user speaks, the microphone transforms the sound waves into an analogue signal. The analogue signal is then converted into a digital one. The digital signal is then split into words. (A low signal level indicates a break between one word and the next.) Once the words are separated, they must be recognized. This is done by a speech recognition system.
Speech recognition Types of speech recognition systems: Simple speech recognition systems: A telephone answering system is an example, where key presses are replaced with spoken numbers. A simple speech recognition system recognizes individual words (numbers). Such systems are called Isolated Word Recognizers, because they are designed to recognize individual words only. A break between one word and the next allows each word to be isolated and then recognized on its own. Another category of speech recognizers is called Speaker-Enrolment Systems, where the software has been trained to recognize a single individual. In general, speaker-independent systems recognize a much smaller vocabulary than systems that have been trained to recognize one person’s speech.
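The word-splitting step described above can be sketched in a few lines. This is only a minimal illustration, not how any particular recognizer works: the function name `split_on_silence`, the `threshold` and `min_gap` values, and the toy sample list are all assumptions made for the example.

```python
def split_on_silence(samples, threshold=0.1, min_gap=3):
    """Return (start, end) index pairs for runs of loud samples.

    A word is assumed to end once `min_gap` consecutive samples
    fall below `threshold` (a "low signal").
    """
    segments = []
    start = None   # index where the current word began, if any
    quiet = 0      # number of consecutive quiet samples seen so far
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                # The word ended where the quiet run began.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # The signal ended while a word was still open.
        segments.append((start, len(samples) - quiet))
    return segments


# Two bursts of sound separated by three quiet samples.
signal = [0, 0, 0.5, 0.6, 0.4, 0, 0, 0, 0.7, 0.8, 0, 0]
words = split_on_silence(signal)
```

Each returned pair marks one candidate word, which would then be passed to the recognizer. Real speech rarely has such clean gaps, which is one reason isolated-word systems ask the speaker to pause between words.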
Speech recognition Advanced speech recognition systems: There might be a large number of candidate words that match the word spoken by the user, because of background noise, lack of clarity on the part of the speaker, or the conversion process from sounds to electrical signals. A sophisticated speech recognition system therefore needs large databases of words, language and grammar rules, information on the frequency with which words are used in the user’s language, and the probabilities that a certain word follows another word, in order to identify likely words from a range of possibilities. As an example: suppose that I have dictated the words ‘The dog barked in the morning’ and the speech recognition system has identified that the first two words were ‘the’ and ‘dog’ and that the possible candidates for the third word were ‘barged’, ‘barked’, ‘barred’ and ‘boiled’. Rules in the speech recognition system could reveal that the probability of the word ‘barked’ following the noun ‘dog’ is higher than that of the other candidates. ‘Barked’ would then be chosen as the recognized word.
Speech recognition Speech recognition is not speech understanding; confusing the two is a common misconception. New computer systems with Artificial Intelligence are attempting to gain some understanding of the meaning of words. This understanding is based on common sense: knowledge that we take for granted when determining the meaning of words. Common sense knowledge about barking will be linked with common sense knowledge about dogs, and the linking goes on. The drawback of the common sense approach is that it requires many millions of pieces of knowledge, incredibly sophisticated programming and enormous amounts of computing power to interpret that knowledge correctly.
Speech synthesis Speech synthesis: using a machine to reproduce human speech. Automata (forerunners of modern robots) were among the early inventions capable of sounding individual vowels and consonants. The difficulties associated with speech synthesis systems relate to formulating the rules for converting the source text into speech. Speech using stored fragments: a computer stores fragments of speech, which are assembled as required to complete sentences (e.g. a telephone service that tells the time).
Speech synthesis techniques Techniques used by speech synthesis systems: Using phonemes to produce speech. Using diphones to produce speech. Model-based speech synthesis.
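The ‘dog barked’ example can be expressed as a small lookup of word-pair probabilities. The sketch below is illustrative only: the probability values in `BIGRAM_PROB` are invented for the example, and a real system would combine them with acoustic scores and grammar rules.

```python
# Invented probabilities that the second word follows the first.
BIGRAM_PROB = {
    ("dog", "barked"): 0.60,
    ("dog", "barged"): 0.05,
    ("dog", "barred"): 0.02,
    ("dog", "boiled"): 0.001,
}


def pick_next_word(previous, candidates, bigram_prob=BIGRAM_PROB):
    """Choose the candidate most likely to follow `previous`.

    Word pairs missing from the table are treated as probability 0.
    """
    return max(candidates, key=lambda w: bigram_prob.get((previous, w), 0.0))


chosen = pick_next_word("dog", ["barged", "barked", "barred", "boiled"])
```

With these made-up figures the function picks ‘barked’, mirroring the reasoning in the example above.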
Speech synthesis techniques Using phonemes to produce speech. The individual sounds produced by humans are called phonemes. Each language has a number of phonemes (English uses about 45 phonemes while Chinese uses about 2000). A speech synthesizer joins together appropriate phonemes in order to construct words. Example: the word CAT can be constructed by joining the 3 phonemes K, A and T. The speech system concatenates phonemes to produce speech.
Speech synthesis techniques Using diphones to produce speech. Diphones are fragments that span two phonemes. They stretch from the middle of one phoneme to the middle of the following phoneme. Continuing with the ‘cat’ example, a diphone consisting of the second half of the ‘k’ phoneme and the first half of the ‘a’ phoneme would be concatenated with a second diphone consisting of the second half of the ‘a’ phoneme and the first half of the ‘t’ phoneme, and so on. Diphones are then joined together to form sentences. This produces much smoother speech than the phoneme approach.
Speech synthesis techniques Model-based speech synthesis: The most advanced technique. It relies on modelling the way in which humans speak. It simulates the human vocal tract, which produces a sound and then shapes it into speech.
Speech Synthesis Problems facing systems that convert text into speech: How to pronounce a word of text. Ambiguous words that are spelt identically but pronounced differently, called homographs (e.g. read/read). The computer has no understanding of what it is reading, so it cannot infer the correct pronunciation while speaking the text. Overcoming these problems will produce more advanced speech synthesis systems.
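Phoneme concatenation can be sketched as looking up a stored sound for each phoneme and joining the results. Everything named here is hypothetical: `LEXICON`, `PHONEME_SOUNDS` and the tiny number lists standing in for recorded audio are placeholders for the example.

```python
# Hypothetical pronunciation dictionary: word -> phoneme sequence.
LEXICON = {"cat": ["K", "A", "T"]}

# Stand-ins for recorded audio; a real system would hold waveforms.
PHONEME_SOUNDS = {
    "K": [0.1, 0.3],
    "A": [0.5, 0.7, 0.5],
    "T": [0.2, 0.0],
}


def synthesize(word):
    """Concatenate the stored sound of each phoneme in the word."""
    samples = []
    for phoneme in LEXICON[word.lower()]:
        samples.extend(PHONEME_SOUNDS[phoneme])
    return samples


cat_audio = synthesize("CAT")
```

Because each sound is simply butted up against the next, the joins fall at phoneme boundaries, where speech changes fastest. That is the roughness the diphone approach tries to avoid.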
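To make the ‘cat’ example concrete, the sketch below lists the diphones needed for a phoneme sequence. The padding with a silence marker `"_"` at each end is an assumption made so that the first and last half-phonemes are also covered; the function name `to_diphones` is likewise invented for the example.

```python
def to_diphones(phonemes):
    """List the diphones (phoneme-to-phoneme transitions) for a word.

    "_" marks silence, so the word starts and ends with a
    silence-to-sound or sound-to-silence transition.
    """
    padded = ["_"] + list(phonemes) + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


cat_diphones = to_diphones(["K", "A", "T"])
```

Each diphone name stands for one recorded fragment running from the middle of the first sound to the middle of the second, so the joins fall in the steadier middle of each phoneme.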
Different types of sound The different types of sound may be categorized according to the type of information the sound contains, the ways in which the sounds are used, or how they support our interactions with a computer. Music: which can accompany other things to enhance enjoyment or create atmosphere, or be enjoyed for its own sake. Alerts: sound effects such as beeps, used to get attention. Warnings: loud sound effects used to grab attention. Noise: unwanted sounds that can appear at different frequencies and amplitudes.
Music Digital technologies are becoming an increasingly important part of music technology. One reason is that music stored in digital form can be copied easily without any loss of quality. This is not true of analogue form: there will be a difference in quality between an original tape and a copy. A recording is a type of representation medium for music, whether it is stored in analogue or digital form (this unit is concerned with digital form).
Music Sampling is the technique used to convert analogue sounds into digital ones. Digital sounds are then stored in CD or MP3 format. A CD recording has higher fidelity. An MP3 recording gives a smaller file, which makes transmission easy.
Manipulating digital music After storing sound we need to manipulate it. A computer system enables the recording technician to join parts of different performances easily. Unwanted noises such as coughs can be removed from a recording. Old recordings can be corrected.
Musical Instrument Digital Interface Another way of storing and manipulating music by computer is to use MIDI. MIDI: Musical Instrument Digital Interface, widely used in the music industry. A MIDI file contains instructions that electronic instruments (such as an electronic keyboard) can interpret in order to play individual notes. MIDI defines an interface standard for connecting electronic instruments to your PC, which allows music to be played back or even recorded through these instruments. A piece of music can be orchestrated for different instruments. The file can be edited and individual notes can be changed.
Digital composition Digital synthesizers are typically controlled by an electronic keyboard similar to a piano keyboard. Digital synthesizers work on a digitized sound source. The sound is transformed by changing its frequency or by filtering, and is then converted into an analogue signal suitable for loudspeakers. There are 2 ways in which the computer is used in music composition (capturing and processing the notes): A computer program is used to input musical scores by direct annotation of notes (the keyboard and the mouse can be used to input notes). MIDI input is used, which allows the data to be captured and also edited later.
Using sound effects in computer interfaces Hearing is our richest sense after sight, so there is a case for more applications that use sound in user interfaces. Our visual and auditory senses are independent but they work well together. Sound reduces the load on the user’s visual system. Sound reduces the visual attention that must be paid to a device. Sound is attention grabbing. Sound helps make computers more usable by people with visual impairment. For these reasons researchers are working on the use of non-speech sound in human-computer interfaces. The term Earcons is used to describe the non-verbal messages used in interfaces to tell the user something about computer operations.
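Sampling can be illustrated by measuring a pure tone at regular intervals and rounding each measurement to a whole number. This is a minimal sketch with made-up parameters: the function name `sample_tone`, the 8000 samples-per-second rate and the 8-bit resolution are assumptions chosen to keep the numbers small, not the values used on a CD.

```python
import math


def sample_tone(freq_hz, num_samples, sample_rate=8000, bits=8):
    """Sample a sine wave and quantize each value to a signed integer.

    The "analogue" signal is the continuous function sin(2*pi*f*t);
    sampling reads it every 1/sample_rate seconds, and quantization
    rounds each reading to one of the levels that `bits` can hold.
    """
    max_level = 2 ** (bits - 1) - 1   # 127 for 8 bits
    return [
        round(max_level * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(num_samples)
    ]


# One full cycle of a 1000 Hz tone at 8000 samples per second.
one_cycle = sample_tone(1000, 8)
```

A higher sample rate and more bits per sample give a closer copy of the original sound at the cost of a larger file, which is the trade-off between fidelity and file size noted above.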
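The idea that MIDI stores instructions rather than sound can be shown with the standard three-byte "note on" message: a status byte (0x90 plus the channel number), a note number and a velocity, where note 60 is middle C. The helper names `note_on` and `transpose` are invented for this sketch, which builds the bytes by hand and does not talk to any real instrument.

```python
def note_on(note, velocity=64, channel=0):
    """Build a MIDI note-on message as three bytes."""
    if not (0 <= note <= 127 and 0 <= velocity <= 127 and 0 <= channel <= 15):
        raise ValueError("note/velocity must be 0-127 and channel 0-15")
    return bytes([0x90 | channel, note, velocity])


def transpose(notes, semitones):
    """Edit a melody by shifting every note number by `semitones`."""
    return [n + semitones for n in notes]


c_major_chord = [60, 64, 67]                  # C, E, G around middle C
octave_up = transpose(c_major_chord, 12)      # the same chord one octave higher
messages = [note_on(n) for n in octave_up]
```

Because the file holds note numbers rather than recorded audio, changing a note or moving a whole passage to another pitch is a matter of editing a few integers.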
Writing systems Most computers are programmed to respond to mouse clicks and to keystrokes on a keyboard. A keyboard is a device containing a number of keys arranged in the QWERTY layout, which looks arbitrary but is inherited from the early mechanical design of manual typewriters. For those of us who use the Latin alphabet (26 letters, 10 digits and a few other characters) this keyboard works well. It is not suitable for many other writing systems, such as Japanese, which is made up of many thousands of characters. An alternative to the keyboard is needed: handwriting.
Handwriting recognition One computer market that has become viable only in the last few years is the hand-held or pocket computer. Decreasing the size of the device means decreasing the size of the keyboard, which creates an input problem (poor usability). Handwriting recognition via a touch-sensitive screen is the solution.
Handwriting recognition Difficulties facing handwriting recognition: The wide diversity of writing systems (Latin, Arabic,…), each with its own rule for writing direction. Large individual differences in writing style; each person forms the characters differently. Human beings are extremely good at resolving ambiguity in characters (we find it simple to distinguish the number 5 from the letter S), but producing a programmed solution to this problem is difficult. Humans rely on common sense knowledge (we do not expect a number 5 in the middle of a word, so the character must be an S), and this is difficult to codify in a computer program.
Simplifying handwriting recognition Techniques and conventions used to simplify the task of handwriting recognition are: Restricting the range of symbols that can be used, for example to just the uppercase letters. Requiring that characters are written in predefined boxes. Accepting only handwritten characters that are not joined up. Redesigning the interface so that it is very clear what input is required.
Neural networks and their use in handwriting recognition Many handwriting recognition systems use neural network technology to overcome these difficulties. The term neural network refers to the network of neurons inside the human nervous system. It also refers to artificial neural networks (programming constructs that mimic the properties of the neurons of the nervous system). An artificial neural network must be trained before it becomes useful. This is done by presenting the network with known data and recording its response. If the network produces the correct answer, it moves on to the next example; otherwise the software adjusts the network and repeats the test until the answer is correct. This technique has been used by handwriting recognition systems.
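The ‘5 or S’ rule of thumb can be written down as a crude piece of context checking. This is only a toy to show why such knowledge is awkward to codify: the `CONFUSABLE` table and the "more letters than digits" rule are assumptions invented for the example, and they fail on many real inputs (a postcode or a product code, for instance).

```python
# Hypothetical table of digits that are easily mistaken for letters.
CONFUSABLE = {"5": "S", "0": "O"}


def resolve_with_context(token):
    """If a token is mostly letters, read look-alike digits as letters."""
    letters = sum(c.isalpha() for c in token)
    digits = sum(c.isdigit() for c in token)
    if letters > digits:
        return "".join(CONFUSABLE.get(c, c) for c in token)
    return token


word = resolve_with_context("BU5Y")
year = resolve_with_context("2025")
```

The rule rescues ‘BU5Y’ and leaves ‘2025’ alone, but every exception needs another rule, which is why such systems turn to techniques that learn from examples.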
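The train-test-repeat cycle can be shown with a single artificial neuron (a perceptron), the simplest relative of the networks used in practice. This is a sketch under stated assumptions, not the method of any real handwriting system: the 3×3 pixel ‘images’, the labels (1 for a vertical stroke, 0 for a horizontal one) and the function names are all made up for illustration.

```python
def predict(weights, bias, pixels):
    """Fire (1) if the weighted sum of the pixels exceeds zero."""
    total = sum(w * x for w, x in zip(weights, pixels)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20, rate=1.0):
    """Show known examples repeatedly, nudging weights after each mistake."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for pixels, target in examples:
            error = target - predict(weights, bias, pixels)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, pixels)]
                bias += rate * error
        if mistakes == 0:   # every example answered correctly: stop
            break
    return weights, bias


# 3x3 images flattened row by row; 1 = ink, 0 = blank.
EXAMPLES = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], 1),   # full vertical stroke
    ([0, 0, 0, 1, 1, 1, 0, 0, 0], 0),   # full horizontal stroke
    ([0, 1, 0, 0, 1, 0, 0, 0, 0], 1),   # short vertical stroke
    ([0, 0, 0, 1, 1, 0, 0, 0, 0], 0),   # short horizontal stroke
]

weights, bias = train_perceptron(EXAMPLES)
```

After training, `predict(weights, bias, pixels)` classifies all four examples correctly. Real recognizers use many connected neurons and far larger training sets, but the cycle of presenting known data and correcting errors is the same.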
Neural networks and their use in handwriting recognition A good example of handwriting recognition with neural networks is the Newton MessagePad, released by Apple in 1993. It uses powerful neural net software to interpret handwriting. The Newton has 2 advantages: there is no need to change your handwriting style, and it learns to recognize your writing. When the user enters a word, the system attempts to match it with the words in its internal dictionary; if the word is found, it is recognized. Otherwise the user can tell the Newton to add the word to its internal dictionary. As time went on, the Newton became more and more accurate.
Neural networks and their use in handwriting recognition With the Newton there were still some problems to be overcome in order to produce a viable handwriting recognition system. A handwriting recognition system must run on a pocket computer, yet recognition is a demanding computing task that needs a large amount of memory and processing power. A large amount of memory and processing power greatly reduces battery life. Palm Computing wanted to produce a pocket computer with a lower price and a battery life of weeks or even months by using a slow microprocessor and a small memory. Simplifying the way letters are entered simplifies the task of recognition. The solution is glyphs.
Graffiti – an alternative to handwriting recognition A glyph is an element of writing. The glyphs Palm used are highly stylized equivalents of letters, numbers and common punctuation characters. Most glyphs can be completed in a single stroke of the stylus, and each is sufficiently different from all the others to make the recognition process tractable, even on a relatively slow microprocessor. Palm called its handwriting recognition system Graffiti. Users first needed to learn the Graffiti alphabet before using the Palm system. Experiments show that the Palm system overcame most of the problems of the Newton system.
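Why single-stroke glyphs are cheap to recognize can be seen in a sketch that reduces a stylus stroke to a short string of compass directions and looks it up in a table. The table `GLYPHS` below is invented for illustration and is not the actual Graffiti alphabet; the function names and the assumption that screen y grows downward are also choices made for this example.

```python
def stroke_directions(points):
    """Reduce stylus points to compass moves, merging repeated moves.

    Screen coordinates are assumed: x grows rightward (E), y downward (S).
    """
    moves = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue   # the stylus did not move
        if abs(dx) >= abs(dy):
            move = "E" if dx > 0 else "W"
        else:
            move = "S" if dy > 0 else "N"
        if not moves or moves[-1] != move:
            moves.append(move)
    return "".join(moves)


# Hypothetical glyph table: direction string -> character.
GLYPHS = {"S": "I", "SE": "L", "E": " "}


def recognize(points):
    """Return the character for a stroke, or "?" if it is not in the table."""
    return GLYPHS.get(stroke_directions(points), "?")


letter = recognize([(0, 0), (0, 5), (0, 10), (6, 10)])   # down, then right
```

The whole recognizer is a loop and a dictionary lookup, which suggests how such a scheme could run on a slow microprocessor with little memory. The cost is moved to the user, who has to learn the stylized strokes.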
Tangible computing and gesture computing Two related means of communicating and interacting with computers using our sense of touch: tangible computing, which involves devices that can be used to interact with representations of information in the digital world; gesture computing, where computers are programmed to interpret human gestures and movements. This area of human–computer interaction is sometimes known as haptic computing. In contrast to the visual and auditory senses, which are used primarily for output from the computer to the user, haptic computing is bi-directional.
Tangible computing A tangible interface is an interface that gives a physical form to digital information. A physical object can be both a representation of digital information and a controller for that information. A PDA (personal digital assistant) becomes an example of a tangible user interface when extra controls and sensors are added to it so that the user can physically manipulate the PDA. Tilting or squeezing it, for instance, controls the display of information on the PDA screen. Some devices can provide feedback through the sensation of resistance to movement (for example, driving simulators that let the user feel resistance through the steering wheel when turning a corner too fast).
Gesture computing Another way of communicating, appropriate when it is necessary to use a computer without a keyboard or screen, or when a user is not able to hear or speak. The most commonly used sign language is probably American Sign Language (ASL). Gesture computing is based on recognizing the special signs made by the user and responding to them. So the problem is to develop a recognizer that recognizes sign language.
Gesture recognition Developing such a recognizer is not easy, for several reasons, chiefly that sign language is performed free-form, in the air, primarily with the hands. There are 2 solutions to this problem: The person making the signs wears special gloves, which make it easier for an image-recognition system to track the hands against a general background. The signer wears special sensors, which allow the computer to track the position of the hand in 3 dimensions. Research is also under way on recognizing the movements made by the human eye.
Ubiquitous computing Making many computers available throughout the physical environment, while making them invisible to the user. Example: a sensor inside a washing machine. The computers are embedded in the physical environment and in other equipment. The computers will be small, unlike conventional computers. They will be invisible in the sense that users will not be aware that they are using a computer.
What’s next? • Unit 14 : Hiding data: an introduction to security. • Check out TMA04