Adaptive Multi-Rate Narrow-Band Coders - Assignment Example

Summary
The "Adaptive Multi-Rate Narrow-Band Coders" paper focuses on adaptive multi-rate (AMR) narrow-band coders, audio data compression schemes used for speech coding. They were introduced in 1998 as the basic speech codec by 3GPP, and today they are applied in GSM.

Introduction

Adaptive multi-rate (AMR) narrow-band coders are audio data compression schemes used for speech coding. They were introduced in 1998 as the basic speech codec by 3GPP, and today they are applied in GSM. AMR operates at bit rates of 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbit/s. It works on frames that contain one hundred and sixty samples and are twenty milliseconds in length. AMR applies several distinct techniques, which encompass Algebraic Code Excited Linear Prediction (ACELP) compression, Discontinuous Transmission (DTX), Voice Activity Detection (VAD) and comfort noise generation. The application of AMR requires optimised link adaptation that chooses the best codec mode to meet the local radio channel and capacity needs. AMR is also a file format that is used to record spoken audio (Sauter, 13). Numerous mobile phones nowadays store audio in the AMR format. AMR applies a split-band methodology in which the input signal, sampled at sixteen kilohertz, is divided into two equal bands, 0–4 kHz and 4–8 kHz, each of which is decimated to an eight-kilohertz sampling rate. The lower band is coded with the AMR narrow-band speech coders; together with channel coding for the various GSM channels and dynamic rate adaptation, these coders readily meet the selected criteria and were ranked second in the 3GPP AMR wideband selection testing. In addition to high performance, further advantages of the embedded split-band methodology include ease of implementation, reduced complexity and simplified interoperability with narrow-band speech coders.

Pre-processing of speech

Pre-processing of speech is an important step in the creation of an effective and efficient speaker or speech recognition system. It entails separating the voiced parts from the silent parts of a captured signal. This is done because a large part of the speech characteristics is contained in the voiced part of the speech signal. A common way of labelling a speech signal is the three-state representation: the silent region, where no sound is produced; the unvoiced region, where the vocal cords do not vibrate and the resulting waveform is random in nature; and the voiced region, where the vocal cords are tensed and vibrate periodically, so the resulting waveform is quasi-periodic (16). The short-term energy methods applied to separate the silent/unvoiced regions from the voiced portion are generally fast; however, they are hindered by the fact that the threshold they require has to be selected in an ad hoc manner. This means that the recognition system has to be retuned every time there is a small change in the ambience.

LPC analysis (linear predictive coding analysis)

LPC is a model of speech signal production which holds that the speech signal is generated by a particular model, so a coding scheme closely matched to that model can give an efficient representation of the signal. Speech signals are produced by convolving the excitation source with the time-varying vocal-tract system components (22). LPC is most commonly applied in speech recognition and verification, speech storage, speech coding and speech synthesis. For deconvolution of the speech into excitation and vocal-tract system components, methods based on homomorphic analysis have been developed.
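To make the source-filter idea concrete, the following is a minimal sketch of estimating LP coefficients for a single speech frame using the autocorrelation method and the Levinson-Durbin recursion. The frame length, predictor order, windowing and function name here are illustrative assumptions for this sketch; it is not the AMR reference algorithm.

```python
import numpy as np

def lpc_autocorrelation(frame, order=10):
    """Estimate LP coefficients a[1..order] for one speech frame.

    Uses the autocorrelation method with the Levinson-Durbin recursion,
    so that frame[n] is approximated by sum_k a[k] * frame[n - k].
    Illustrative sketch only -- not the AMR reference implementation.
    """
    # Autocorrelation lags 0..order of the windowed frame.
    windowed = frame * np.hamming(len(frame))
    r = np.array([np.dot(windowed[: len(windowed) - k], windowed[k:])
                  for k in range(order + 1)])

    a = np.zeros(order)          # predictor coefficients
    error = r[0]                 # prediction error energy
    for i in range(order):
        # Reflection coefficient for this stage.
        acc = r[i + 1] - np.dot(a[:i], r[i::-1][:i])
        k = acc / error
        # Levinson-Durbin coefficient update.
        a_prev = a.copy()
        a[i] = k
        a[:i] = a_prev[:i] - k * a_prev[:i][::-1]
        error *= (1.0 - k * k)
    return a, error

# Example: one 20 ms frame (160 samples at 8 kHz) of synthetic data.
frame = np.random.randn(160)
coeffs, residual_energy = lpc_autocorrelation(frame, order=10)
```

In the actual AMR coder the resulting coefficients would then be converted to line spectrum pairs and quantised, as described in the encoder section below.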
Linear predictive analysis is formulated in order to reduce the computational complexity of separating the source and system components in the time domain. This methodology is important for speech recognition with arbitrary speakers. In this methodology, the ratios between the vocal tracts corresponding to a new speaker and a reference speaker are determined from the training speech data of several typical vowels. Normalised LPC measures based on these ratios can then be determined for any speech data. Compared with other ways of normalising speech measures, this methodology does not need to estimate formant frequencies and is simple and reliable. LPC analysis divides speech into two parts, namely:
a) a filter function, whose parameters are the LPC coefficients, based on the assumption that the source function has been filtered through a tube of varying cross-section;
b) a source function, which can be the original signal inverse-filtered through the filter function, or a stylised version consisting of white noise or a pulse train at the required pitch for voiced speech.
The LP methodology gives very accurate estimates of the speech parameters, and it is very effective and efficient. The LP methods can be used to reduce the amount of information to be handled and are also called system estimation methods. They include the covariance, autocorrelation, lattice, inverse filter and spectral estimation formulations, as well as maximum likelihood methods. The main advantage of this approach is that it is derived from a simplified vocal-tract model and a source-filter model of the speech production system. Its limitation is that the harmonic spectrum produces spectral aliasing, which is seen especially in high-pitched signals and affects the harmonics of the signal.

Quantisation

Quantisation is the process of mapping an array of data values onto a small set of output values that approximate the original data. It may also refer to converting a sampled signal that is discrete in time but continuous in value into a signal that is also discrete in value. Quantisation makes the range of a signal discrete, so that the quantised signal takes only a discrete, finite set of values (37). Quantisation cannot be reversed: the information it discards is lost. Quantisation makes a sampled signal digital and ready to be processed by a computer. Computers operate in ones and zeros; this makes the conversion from analogue to digital a close approximation rather than an exact copy. In the case of music, for instance, the quantised signal must preserve the succession and amplitude of the values as well as their timing. For example, when recording in a music studio, the microphone picks up analogue sound waves that are then converted into digital form. The audio signal can be sampled at 44,100 hertz and quantised to 8, 16 or 24 bits, with larger bit depths carrying more information and thus allowing a closer reconstruction of the original waveform. The main choice in quantisation is the number of discrete quantisation levels to use; the central trade-off is the quality of the signal against the amount of data required to represent all the samples. The basic type of quantisation is zero-memory quantisation, in which the quantisation of a sample is independent of the other samples. Here the signal amplitude is simply represented using some countable number of levels, independent of the sample time and of the neighbouring samples.
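As a small illustration of zero-memory quantisation, the sketch below maps every sample independently onto one of 2^bits uniformly spaced levels; the test signal, amplitude range and bit depths are illustrative assumptions rather than any codec's actual quantiser.

```python
import numpy as np

def uniform_quantise(x, bits, x_min=-1.0, x_max=1.0):
    """Zero-memory uniform quantiser: each sample is rounded to the nearest
    of 2**bits levels spanning [x_min, x_max], independently of its
    neighbours. Returns the reconstructed (quantised) signal."""
    levels = 2 ** bits
    step = (x_max - x_min) / levels
    # Index of the interval each sample falls into, clipped to the valid range.
    idx = np.clip(np.floor((x - x_min) / step), 0, levels - 1)
    # Reconstruct at the mid-point of each quantisation interval.
    return x_min + (idx + 0.5) * step

# More bits -> smaller quantisation error.
signal = np.sin(2 * np.pi * 440 * np.arange(160) / 8000.0)  # 20 ms of a 440 Hz tone
for b in (8, 16):
    err = np.mean((signal - uniform_quantise(signal, b)) ** 2)
    print(f"{b}-bit quantisation, mean squared error: {err:.2e}")
```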
Principles of the adaptive multi-rate speech encoder

The AMR codec consists of eight source codecs with bit rates of 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbit/s. The AMR codec is modelled on the code-excited linear prediction (CELP) coding approach. The AMR encoder takes its input as a thirteen-bit uniform PCM signal, coming either from the audio part of the mobile station or, on the network side, from the PSTN via an eight-bit to thirteen-bit uniform PCM conversion (13). The encoded speech at the output of the encoder is delivered to a channel encoder unit; in the receiving direction, the inverse operations take place. The codec is based on the CELP coding model (13). A tenth-order linear prediction (LP), or short-term, synthesis filter is used. The pitch synthesis filter is implemented using the adaptive codebook approach. In the CELP speech synthesis model, the excitation signal at the input of the short-term LP synthesis filter is constructed by adding two excitation vectors, one from the adaptive codebook and one from the fixed (innovative) codebook. The speech is synthesised by feeding the two properly chosen vectors from these codebooks through the short-term synthesis filter. The optimum excitation sequence in a codebook is chosen using an analysis-by-synthesis search procedure in which the error between the original and the synthesised speech is minimised according to a perceptually weighted distortion measure.

The coder operates on speech frames of twenty milliseconds, corresponding to one hundred and sixty speech samples at a sampling frequency of eight thousand samples per second. For each one-hundred-and-sixty-sample frame, the speech signal is analysed to extract the parameters of the CELP model. These parameters are encoded and transmitted. At the decoder, the parameters are decoded, and the speech is synthesised by filtering the reconstructed excitation signal through the LP synthesis filter. LP analysis is performed twice per frame for the 12.2 kbit/s mode and once for the other modes. For the 12.2 kbit/s mode, the two sets of LP parameters are converted to line spectrum pairs (LSPs) and jointly quantised using split matrix quantisation (SMQ) with thirty-eight bits. For the other modes, the single set of LP parameters is converted to line spectrum pairs and vector quantised using split vector quantisation. Each speech frame is divided into four subframes of five milliseconds (forty samples) each. The adaptive and fixed codebook parameters are transmitted for every subframe. The quantised and unquantised LP parameters, or their interpolated versions, are used depending on the subframe. An open-loop pitch lag is estimated in every other subframe based on the perceptually weighted speech signal.

Principles of the adaptive multi-rate speech decoder

The AMR decoder extracts the transmitted indices delivered in the bitstream. The indices are decoded to obtain the coder parameters for each frame: the LSP vectors, the fractional pitch lags, the innovative code vectors, and the pitch and innovative gains. The LSP vectors are converted to LP filter coefficients and interpolated to obtain an LP filter for each subframe (13). Then, for each subframe, the excitation is constructed by adding the adaptive and innovative code vectors scaled by their respective gains, and the speech is reconstructed by filtering this excitation through the LP synthesis filter. Finally, the reconstructed speech signal is passed through an adaptive post-filter.
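To make the per-subframe decoding step concrete, the following minimal sketch forms the excitation by gain-scaling and adding the adaptive and fixed code vectors and then filters it through the LP synthesis filter 1/A(z). The subframe length, gains, placeholder vectors and coefficient sign convention are illustrative assumptions, and the adaptive post-filter is omitted.

```python
import numpy as np
from scipy.signal import lfilter

def synthesise_subframe(adaptive_vec, fixed_vec, gain_pitch, gain_code, lp_coeffs):
    """Reconstruct one subframe of speech from decoded CELP parameters.

    excitation[n] = gain_pitch * adaptive_vec[n] + gain_code * fixed_vec[n]
    speech        = excitation filtered through 1 / A(z),
    where A(z) = 1 - sum_k lp_coeffs[k] * z^-(k+1) (illustrative convention).
    The adaptive post-filter of the real decoder is omitted here.
    """
    excitation = gain_pitch * adaptive_vec + gain_code * fixed_vec
    a = np.concatenate(([1.0], -np.asarray(lp_coeffs)))  # denominator of 1/A(z)
    return lfilter([1.0], a, excitation)

# Example with placeholder values: one 40-sample (5 ms) subframe at 8 kHz.
subframe_len = 40
adaptive = np.random.randn(subframe_len)   # stands in for the adaptive-codebook vector
fixed = np.zeros(subframe_len)
fixed[[5, 17, 29]] = 1.0                   # a sparse, ACELP-like fixed-codebook pulse pattern
lp = 0.1 * np.random.randn(10)             # stands in for the 10 decoded LP coefficients
speech = synthesise_subframe(adaptive, fixed, gain_pitch=0.8, gain_code=1.2, lp_coeffs=lp)
```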


Works Cited

Sauter, Martin. From GSM to LTE: An Introduction to Mobile Networks and Mobile Broadband. John Wiley & Sons, 2010.