Speech Recognition Technology - Essay Example

Summary
From the paper "Speech Recognition Technology" it is clear that engineers are frustrated with how background noise as well as other “noises” like out-of-vocabulary words and sounds, dialect variations and other related variables could cause speech recognition technologies to fail…

Extract of sample "Speech Recognition Technology"

SPEECH RECOGNITION TECHNOLOGY

Introduction

When Apple released the iPhone 4S, one of its applications, Siri, became an instant success. In subsequent campaigns for its flagship phone, Apple highlighted this technology as a selling point to widen its competitive margin. The attention paid to Siri, and to its efficiency and convenience for users, underscores the current advances in voice recognition technologies. More people are becoming aware of the technology, which could propel its development and contribute further to productivity in other fields; unbeknownst to many, speech recognition is vital in the medical field and in research.

The historical development of speech recognition technology spans at least 50 years. These years can be divided into decades, which most researchers identify as generations. The 1950s and the 1960s were the period of first-generation speech recognition technologies. Market conditions for the technology were unfavorable during this period and those immediately following, because only in the latter part of the 1990s did the technology become cheap and available to many consumer markets. The late 1960s through the 1970s saw the second-generation technology, the late 1970s through the 1980s the third generation, and so forth. The earliest speech recognition technologies include the system for isolated digit recognition developed by the American company Bell Laboratories, as well as the technology developed by RCA Laboratories that recognized distinct syllables spoken by a speaker (Chen and Jokinen, 2010, p. 2). These technologies, along with the subsequent attempts of various laboratories, were classified as Automatic Speech Recognition (ASR) systems, which were primarily based on acoustic-phonetic approaches. The second-generation technologies entailed several breakthroughs.
Most of the systems developed used dynamic programming methods such as the Viterbi algorithm, which became an indispensable technique in ASR (Chen and Jokinen, p. 3). Many companies, such as IBM, began developing their own speech recognition technologies, as did organizations overseas, in Japan and the then Soviet Union. By the 1980s, the third generation had already developed technologies that could recognize a fluently spoken string of connected words, alongside other developments such as statistical modeling and the continuous speech recognition concept developed by DARPA (p. 4). From the 1990s to the present, development became robust as other technologies that depended on it advanced rapidly. Speed and accuracy became the focus. In recent years, the so-called fourth generation of speech recognition has been undergoing intensive development. The focus is to simulate the way humans recognize speech, which means fewer errors as voices and speech are interpreted by machines.

A Case for Necessity

Essentially, what drove the emergence of the speech recognition field was the technological development that led to increasing interaction between humans and machines. It was not created by demand for the technology. Instead, it was part of the technological development process, because transcribing or expressing human speech in a language understood by machines enables the successful automation of numerous tasks. The core principle at work here is communication. It is the main tool by which people interact with each other, and it facilitates tasks, especially when they require the cooperation of individuals. This is also true of the relationship between humans and machines today. In a highly detailed investigation, Adelheid Voskuhl (2004) spent three months studying the design and use of three ASR technologies.
Voskuhl posited that the technologies studied “exemplify the production and reproduction of human qualities in machines insofar as they are designed to mimic distinctive capacities to ‘listen’ and ‘understand’” (p. 394). This is important because machines are perceived to be prone to errors, since they are not social in nature as humans are. The ability to understand and engage in dialogue is considered crucial to effective human communication because it accommodates the vague norms of social interaction. This is significant because man-machine interaction can itself be considered social interaction. If one party fails to understand the other well, that party is at a disadvantage and may misinterpret the conversation, which in turn compromises the other party's capacity to convey and understand messages. According to Voskuhl, interactions between humans and machines are deeply social activities, not unlike the classic social interactions in which conversation plays an important role in the production of knowledge (p. 394). This is especially true given that the technologies existing at the time had strong grammatical constraints and a limited capability to understand “context” in speech. At the time of Voskuhl's research, the technology was often about transcription and less often about conversation. Realizing the concept of “conversation” was seen as the Holy Grail by the scientists observed in the study.

Today, after several years, this objective is slowly being realized. Apple’s Siri is an excellent example of this breakthrough. Dubbed an “intelligent personal assistant,” Siri is speech recognition software for the iPhone’s iOS operating system. It owes much of its technology to the research gains of the Defense Advanced Research Projects Agency (DARPA), one of the most aggressive developers of speech recognition technologies since the 1950s.
A user employs Siri to ask questions, find directions, store appointments, ask for recommendations and perform a number of other tasks relating to iPhone use. At first glance, the technology appears simply to execute user commands based on a set of pre-programmed actions. What is interesting, however, is that the software was designed to store all information provided by the user so that, later on, the answers, recommendations and actions in its interactions with the iPhone owner become personalized. If Siri were simple speech generation software, a command given to two iPhones would yield the same response. But if Apple is to be believed, and the personal experiences of a number of iPhone owners suggest it may be, then Siri is indeed a sophisticated application capable of “understanding” humans, conversing and interacting with them effectively to the point that it can perceive contexts and respond to them intelligently. According to Apple, Siri “knows what you mean” (Apple 2012). Nor is the application a passive participant in its interaction with an individual, timidly awaiting commands. It can initiate dialogue or volunteer information without being prompted. What Siri demonstrates is that speech recognition has become very sophisticated, efficiently helping people with daily tasks and addressing the earlier inability to understand “contextual” language.

The implication of this kind of technology is very important, because it means speech recognition can make meaningful contributions to society. Today, it is already helping fields such as healthcare and defense. For example, many believe that electronic medical records will function better if speech recognition technology is adopted, especially as a faster data-entry mechanism (Carter, p. 12). Speech recognition technology is also becoming increasingly important for defense-related activities and practices.
It is being adopted particularly in eyes-busy and hands-busy environments such as battle management, the interaction between personnel and a knowledge database or resource management system, and manufacturing equipment control, among others (McTear, 2004, p. 10). Even the diplomatic world uses speech recognition technology as a tool. One example is the project called “DIPLOMAT,” the subject of research by Frederking et al. (2000). They found that the project was able to create a rapid-deployment, wearable, bidirectional speech-translation system able to perform translations in the shortest possible time, making communication and interaction within a multilingual organization such as the United Nations easier and more efficient (p. 27).

Development Issues and Trends

Speech recognition technology is continually being developed. A schematic account of the currently available technology involves three fundamental stages: 1) the perception of the acoustic signal; 2) the generation of an internal representation of the signal; and 3) the recognition that emerges from a comparison between that representation and the stored representations of a specific vocabulary (Voskuhl, p. 396). One of the most critical technological challenges in this development is the problem of “noise.” According to Voskuhl, engineers are frustrated with how background noise, as well as other “noises” like out-of-vocabulary words and sounds, dialect variations and other related variables, can cause speech recognition technologies to fail (p. 398). This is one of the grueling tasks that scientists must address. Another challenge involves the validation of the speech databases and the body of language collected over the past years. This was revealed in research undertaken by Van Bael, van den Heuvel and Strik (2007).
According to their investigation, there are two fundamental problems that need to be addressed: 1) no reference transcription can fully represent the phonetic truth; and 2) phonetic transcriptions are often generated to serve various purposes, none of which is considered when the transcriptions are compared to a reference transcription that was not made with the same purpose in mind (p. 129). Levinson (1995) offered criticisms in a similar vein. He argued that advanced speech recognition technology is necessary and that the current state is far from the desired level, because what is available are experimental systems with significant error rates (p. 9953). The solutions he proposed included more research and fewer attempts to commercialize the technology prematurely. His concerns were, of course, legitimate. However, they do not cloud the current surge of speech recognition as an important technology. Apple’s Siri demonstrated this best. The application also highlighted the expected demand from a rapidly growing and profitable sector, mobile computing. This demand should help spur research and development, because speech recognition is compatible with current trends in the telecommunications and information technology sectors; hence, its prospects are positive and exciting.

Conclusion

All in all, speech recognition technologies are important today. They underpin the growing interaction between man and machine. They are among the most important platforms by which this relationship enables participants to understand each other, participate in a dialogue and, finally, achieve a social interaction that can result in a meaningful production of knowledge and accomplishment of tasks. Many fields, such as healthcare and defense, are starting to turn to speech recognition because of its enabling capability.
These are the reasons why it is gaining popularity nowadays and will inevitably be one of the factors that shape the technologies of the future.

References

Carter, J. (2001). Electronic medical records: a guide for clinicians and administrators. ACP Press.
Chen, F. and Jokinen, K. (2010). Speech Technology: Theory and Applications. Berlin: Springer.
Frederking, R., Rudnicky, A., Hogan, C. and Lenzo, K. (2000). "Interactive Speech Translation in the DIPLOMAT Project." Machine Translation, vol. 15, pp. 27-42.
Levinson, S. (1995). "Speech Recognition: A Critique." Proceedings of the National Academy of Sciences, vol. 92, no. 22, pp. 9953-9955.
McTear, M. (2004). Spoken dialogue technology: toward the conversational user interface. London: Springer.
"Siri: Your wish is its command." (2012). Apple. Retrieved from.
Van Bael, C., van den Heuvel, H. and Strik, H. (2007). "Validation of phonetic transcriptions in the context of automatic speech recognition." Language Resources and Evaluation, vol. 41, no. 41, pp. 129-146.
Voskuhl, A. (2004). "Humans, Machines, and Conversations." Social Studies of Science, vol. 34, no. 3, pp. 393-421.
Cite this document
Technology - Voice Recognition Essay Example | Topics and Well Written Essays - 1500 words. Retrieved from https://studentshare.org/miscellaneous/1589249-technology-voice-recognition
