Speech coding in digital cellular communication systems

The article reviews the general principles of digital speech coding in telecommunications and describes in some detail the very complex coding processes used in digital cellular communication systems. Theoretical research and original engineering solutions have made it possible to create an elegant, compact subscriber radiotelephone. From this article the reader will learn about the complex processes taking place inside it, of which users, and even many telecommunications specialists, are unaware.

The mysteries of speech signals attracted the attention of researchers long before the advent of electrical communication. Back in the 18th century, one of the greatest mathematicians, the St. Petersburg academician Leonhard Euler (1707-1783), wrote in a letter to a German princess dated June 16, 1761: "... the most important invention ... The construction of such a machine does not seem impossible to me."

The idea of inventing a talking machine excited the minds of many inventors, who sought to build it not only in the form Euler imagined but also as a means of transmitting speech over a distance. The inventor of the telephone, A. G. Bell (1847-1922), for example, worked on the design of such a machine. In the end, however, it turned out that speech can be transmitted over a distance without one, and quite simply. A microphone converted the air vibrations carrying speech into oscillations of electric current, which were transmitted over wires; at the receiving end, a telephone receiver converted them back into air vibrations.

This method of transmission is called analog because of the obvious analogy between the air vibrations that carry sound and the electrical oscillations that transmit it. Studies of analog speech transmission with amplitude modulation have shown that a frequency band from 300 to 3400 Hz is sufficient for normal speech reproduction quality. This band was adopted as an international standard, and the worldwide telephone network was built on it. Today the principle of operation of this network is familiar not only to every communications specialist but also to the general public.

Digital speech transmission in wired communication networks

Fundamental changes in the approach to organizing telephone communications came with the transition to digital technology. The advantages of digital transmission are widely known; we recall only the most important one: digital technology makes it possible to provide any predetermined quality of communication. For digital speech transmission, the speech signal must undergo analog-to-digital conversion: the analog signal is sampled, quantized and coded. The combination of these operations is called pulse code modulation (PCM). To describe the shape of a speech signal accurately, according to the Kotelnikov theorem it has to be sampled at a frequency of 8 kHz (i.e., a sample is taken every 125 μs), and to obtain normal speech reproduction quality each sample must be quantized on a scale divided into 8192 levels (when a uniform quantization scale is chosen). Encoding each sample value as a binary number takes 13 bits.

As a result, transmitting a telephone conversation as a sequence of binary pulses requires a rate of 8 x 13 = 104 kbps (which corresponds to a frequency band of 52 kHz with optimal coding). Comparing this figure with the 3100 Hz bandwidth required for analog transmission, one cannot help but be struck by the enormous increase in required bandwidth, which is the price paid for the benefits of digital transmission. It is natural to try to reduce the transmission rate when implementing a digital transmission system.
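The arithmetic above can be sketched in a few lines. This is only an illustration: the function names and the normalization of samples to the range [-1, 1) are assumptions made for the example, not part of any standard.

```python
import math

SAMPLE_RATE_HZ = 8000   # sampling frequency from the text (one sample every 125 us)
BITS_PER_SAMPLE = 13    # 2**13 = 8192 levels on a uniform scale


def quantize_uniform(x, bits=BITS_PER_SAMPLE):
    """Map a sample in [-1.0, 1.0] to a signed integer code on a uniform scale."""
    levels = 2 ** bits
    code = math.floor(x * levels / 2)
    # Clamp so that x = 1.0 does not overflow the highest code.
    return max(-(levels // 2), min(levels // 2 - 1, code))


def pcm_bit_rate(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate of the PCM stream in bit/s."""
    return sample_rate_hz * bits


if __name__ == "__main__":
    # A 1 kHz test tone, eight consecutive samples.
    tone = [0.5 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE_HZ) for n in range(8)]
    print([quantize_uniform(s) for s in tone])
    print(pcm_bit_rate(), "bit/s")
```

With the values from the text, `pcm_bit_rate()` gives 104000 bit/s, i.e. the 104 kbps quoted above.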

The first step in this direction is fairly obvious. Quantization into 2^13 levels is necessary because the levels of analog speech signals can vary over a range of 60 dB. With a uniform quantization scale, high-level signals are quantized with the same step as low-level signals. But since human hearing perceives signals in proportion to the logarithm of the signal level, it would be natural to quantize high-level signals more coarsely and low-level signals more finely. With non-linear quantization following a logarithmic law, eight bits per sample are sufficient while maintaining almost the same transmission quality. As a result, the bit rate becomes 64 kbps. This rate has become the most widely used: it is fixed in CCITT Recommendation G.711, and PCM equipment operates at it in many countries.
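The effect of a logarithmic scale can be illustrated with a minimal sketch. It assumes the continuous mu-law formula with mu = 255; the article does not say which law is meant, and the G.711 recommendation specifies segmented approximations (A-law and mu-law) rather than this exact formula. All function names are hypothetical.

```python
import math

MU = 255  # assumed compression parameter; not taken from the article


def compress(x, mu=MU):
    """Logarithmic compression of a sample in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def encode_8bit(x):
    """Compress, then quantize uniformly to one of 256 codes (0..255)."""
    y = compress(x)
    return min(255, int((y + 1.0) / 2.0 * 256))


def decode_8bit(code):
    """Take the centre of the code's interval and expand it back."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return expand(y)


if __name__ == "__main__":
    for x in (0.01, 0.1, 0.5):
        restored = decode_8bit(encode_8bit(x))
        print(x, restored, abs(restored - x) / x)
    print(8000 * 8, "bit/s")
```

In this sketch the relative error stays at a few percent for both weak and strong samples, which is the point of the logarithmic scale: eight bits then cover a range that would need 13 bits on a uniform scale.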

Can the speed be reduced further?

The analog speech signal has a lot of redundancy. This makes it possible to predict the next sample and transmit only the difference between the actual and predicted value of each sample. With a good prediction scheme, the range of this difference is smaller than the range of the signal itself, which reduces the amount of information to be transmitted. This principle underlies differential PCM (DPCM) and adaptive differential PCM (ADPCM), which make it possible to reduce the speech bit rate to 32 kbps and lower at the cost of more complex transceiver equipment. By complicating the equipment further, the rate can be reduced to 100-300 bps. One can imagine, for example, a speech-to-text converter on the transmitting side and a reading machine on the receiving side.
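The differential principle can be shown with a deliberately simple sketch: a first-order predictor that takes the previous reconstructed sample as the prediction, and a fixed quantization step. Real DPCM and ADPCM coders use better predictors and adaptive steps; the step value and function names here are assumptions for illustration only.

```python
def dpcm_encode(samples, step=16):
    """First-order DPCM: predict each sample as the previous reconstructed one."""
    codes = []
    predicted = 0
    for s in samples:
        diff = s - predicted
        code = round(diff / step)    # coarse quantization of the difference
        codes.append(code)
        predicted += code * step     # track exactly what the decoder will rebuild
    return codes


def dpcm_decode(codes, step=16):
    """Rebuild the samples by accumulating the quantized differences."""
    out = []
    value = 0
    for code in codes:
        value += code * step
        out.append(value)
    return out


if __name__ == "__main__":
    samples = [0, 100, 210, 300, 380, 420]
    codes = dpcm_encode(samples)
    print(codes)
    print(dpcm_decode(codes))
```

Because the encoder predicts from the reconstructed value rather than from the original one, quantization errors do not accumulate: each decoded sample differs from the original by at most half a step, while the transmitted codes span a much smaller range than the samples themselves.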

There are known ways to reduce the speech transmission rate further, but we will not dwell on them. The fact is that equipment for digital voice transmission at 64 kbps satisfied everyone, because it proved effective on the simplest symmetrical twisted-pair cables. The IKM-30 equipment began its triumphal march with the multiplexing of trunk lines between city telephone exchanges. Where previously a cable pair could carry a trunk line for only one conversation, the IKM-30 equipment made it possible to carry 30 conversations over the same pair. Better utilization of such a pair with analog multichannel equipment was out of the question.

Later, the IKM-120 equipment and other high-capacity systems operating over coaxial cables and optical fibers appeared, and the question of reducing the speech transmission rate below 64 kbps in wired networks practically lost its urgency. Even the numerous 32 kbps digital transmission systems developed in many countries on the ADPCM principle (including the development carried out in our country under the leadership of M. U. Polyak) have not come into wide use. In wired communications, the choice between increasing the capacity of the transmission equipment and complicating the terminal equipment has so far favored the former.

Speech coding in digital cellular radio systems

Quite different prospects opened up in the late 1980s and early 1990s, when digital cellular radiotelephone systems began to develop. In wired networks, capacity can be expanded by laying new lines, i.e., the bandwidth resource is renewable. In radio networks, by contrast, the airwaves are strictly limited, and one has to deal with a non-renewable radio frequency resource. True, the idea of cellular communications is precisely to renew the radio frequency resource by reusing a transmission frequency in territory that the same-frequency signal of the interfering radio station does not reach. But the possibilities for such renewal are also limited, so further complication of the equipment for the sake of reducing the transmission rate turns out to be justified.

For example, in the GSM digital cellular system adopted in most European countries, the standard speech rates are 13 and 6.5 kbps. To implement such a transmission system, it was necessary to return to the old idea of Euler's machine and to look deeper into the mechanism of speech production.

As is known, one of the most important results of the modern theory of information transmission is the recommendation to separate the tasks of source coding and channel coding. The task of source coding is to describe the transmitted message in the most economical form, i.e., to remove redundancy from the message. The compressed message obtained in this way is more vulnerable to interference and may be corrupted during transmission. Therefore, source coding is followed by channel coding, whose task is to protect the transmitted message from interference. Channel coding requires some redundancy to be introduced into the transmitted message: not the haphazard redundancy that was present in the original message, but redundancy that is strictly justified theoretically and guarantees the specified transmission quality.

So far, we have considered only source encoding problems, which we will now approach from more general positions.

So, there is a digital version of an analog speech signal, i.e. a function that describes, for example, the law of current change with time. It is necessary to try to remove redundancy from such a signal. This problem can be solved in several ways. One of them is to try to find redundancy by a purely mathematical analysis of the function in question. Another way to solve the problem is to analyze the acoustic characteristics of this function (from the point of view of its perception by the organs of hearing). Finally, one can look for redundancy by modeling the process of speech production itself. It is the last of these methods that has found application in modern digital radio communication systems.

Speech sounds are formed as follows: the harmonic-rich sound of the vocal cords, which varies in strength and fundamental frequency, is further processed in the oral cavity. The latter works, first, as a resonator which, as it is retuned, emphasizes certain frequencies, the formants, that determine the differences between vowel sounds. Second, the movements of the tongue, teeth and lips modulate the sound, producing the different consonants. In the 1930s, Bell Telephone Laboratories (USA) built a machine based on Euler's idea; its operation was based on attempts to simulate the functioning of the human speech organs.

To synthesize speech at the receiving end of a communication system, we need an audio-frequency generator with a rich spectrum, a white noise generator, a set of formant filters (only a few are needed, since there are few vowels and each of them is quite well defined by two formants), and modulating circuits. With such a set of equipment at the receiving end, it is possible to transmit over the communication channel not the speech signal itself but only the commands that control the process of speech synthesis. The practical task, then, is to find a way to generate the necessary commands. It is this problem that the designers of cell phones solve.

In the first releases of the GSM system, the original digital stream of the speech signal, at a rate of 104 kbps, is divided into separate blocks of 160 samples, which are stored. Each block covers a time interval of 20 ms (in other words, sequences of 160 x 13 = 2080 bits are stored). The stored sequences are analyzed, and for each of them eight filter coefficients, which determine the corresponding resonances, and an excitation signal are found. It is this information that is transmitted to the receiver, which reproduces the original speech signal from it much as the human speech organs do (the "organ" is, as it were, tuned by eight parameters, and sound is then produced when it is excited).
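One textbook way to obtain eight filter coefficients and an excitation signal from a 160-sample block is linear prediction by the autocorrelation method with the Levinson-Durbin recursion. The sketch below is an assumption-laden illustration of that idea, not the procedure of the GSM codec itself, which defines its own fixed-point algorithm and parameter format. All function names are hypothetical.

```python
FRAME_LEN = 160   # samples per 20 ms block at 8 kHz
ORDER = 8         # number of filter coefficients named in the text


def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Find coefficients a[1..order] so that x[n] ~ sum(a[k] * x[n - k]).

    Returns the coefficients and the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0:
            break                      # frame is silent or perfectly predictable
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / error
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        error *= (1.0 - k_i * k_i)
    return a[1:], error


def residual(frame, coeffs):
    """Excitation: what is left of the frame after short-term prediction."""
    out = []
    for n in range(len(frame)):
        pred = sum(c * frame[n - k - 1]
                   for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        out.append(frame[n] - pred)
    return out
```

A typical use would be `r = autocorrelation(frame, ORDER)`, then `coeffs, err = levinson_durbin(r, ORDER)`, then `excitation = residual(frame, coeffs)`. The residual carries much less energy than the frame, which is why it can be described with fewer bits.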

However, this analysis covers comparatively short periods of time and cannot detect long vowel sounds that extend over neighboring blocks. Therefore, long-term prediction is used to eliminate the redundancy that arises when long vowels are pronounced. To this end, the transmitter stores previously transmitted sequences with a duration of 15 ms, with which the current sequences are compared. From those already transmitted, the sequence that has the highest correlation with the current one (i.e., is most similar to it) is selected, and only the difference between the current and selected sequences is transmitted. Since the sequences stored in the transmitter are known to the receiver, it is only necessary to indicate which of the stored sequences was used for the comparison. In this way a further reduction in the amount of transmitted information is achieved. As a result of the described processing, a 20 ms block of the digital speech signal is obtained that contains 260 bits and has a transmission rate of only 13 kbps (i.e., eight times lower than the original one). The described procedure is called regular pulse excitation with long-term prediction (English abbreviation RPE-LTP, which stands for Regular Pulse Excitation - Long Term Prediction).

At the next stage, channel coding comes into play, the task of which is to protect against interference in the communication channel. Modern coding technique is based on deep ideas of algebra and probability theory. Based on these ideas, various and very effective coding methods have been developed that solve certain problems in each specific case. We confine ourselves here to a brief review of some of the ideas used in the GSM system.

Code protection can serve either only to detect the fact of an error, or to correct errors that have occurred. The first option is much easier to implement, but less useful, because in this case it is necessary to request a retransmission of the message block in which an error was detected, or to take into account the presence of an error in some other way. Since individual bits in the digital speech signal obtained in the course of the source encoding procedures described above are not of equal importance, they are divided into three subclasses and subjected to different protection methods during channel encoding. Of the 260 bits of the received block, the most important are the bits that carry information about the filtering parameters, the amplitude of the block signal, and the parameters of the long-term prediction. These bits belong to the so-called subclass Ia (50 bits). Then comes subclass Ib (132 bits containing pointers and information about regular excitation pulses, as well as some parameters of long-term prediction). The remaining 78 bits are class II.

[Fig. 1. Convolutional coding scheme]

Two coding methods are used to protect the described block. First, a block code is used to detect errors that remain uncorrected. This code belongs to the class of cyclic codes, in which any cyclic shift of a code word is again a code word. When this code is applied, three check bits are added to the bits of subclass Ia, by which the decoder can determine whether this subclass contains uncorrected errors. If the decoder detects transmission errors in the subclass Ia bits, the entire 260-bit speech frame is discarded. In this case, the lost frame is reproduced by interpolation based on information about the previous frame. It was found that this gives better transmission quality than reproducing a frame with erroneous subclass Ia bits. Second, a convolutional code is applied to correct errors. The name of the code comes from the mathematical operation of convolution, which describes the processing of the encoded bit sequence. Unlike a block code, a convolutional code is continuous in the sense that encoding and decoding are performed not on fixed blocks but on a continuously running sequence of symbols.
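The error-detecting step can be illustrated with a small cyclic code that adds three check bits to 50 data bits. The generator polynomial x^3 + x + 1 used below is an assumption, as are the bit ordering and the function names; the GSM specification fixes its own conventions, which are not reproduced here. Frame replacement is also simplified to plain repetition of the previous frame.

```python
GENERATOR = [1, 0, 1, 1]   # x^3 + x + 1, an assumed generator polynomial


def remainder(bits, generator=GENERATOR):
    """Remainder of bits(x) divided by generator(x), arithmetic modulo 2."""
    work = list(bits)
    deg = len(generator) - 1
    for i in range(len(work) - deg):
        if work[i]:
            for j, g in enumerate(generator):
                work[i + j] ^= g
    return work[-deg:]


def add_check_bits(data):
    """Append three check bits so that the whole word divides by the generator."""
    parity = remainder(list(data) + [0, 0, 0])
    return list(data) + parity


def is_valid(codeword):
    """True if no error is detected in the received word."""
    return not any(remainder(codeword))


def select_frame(class_1a_with_check, decoded_frame, previous_frame):
    """Keep the decoded frame only if the check passes (simplified concealment)."""
    return decoded_frame if is_valid(class_1a_with_check) else previous_frame
```

Any single flipped bit in the 53-bit word leaves a non-zero remainder, so `is_valid` returns False and the frame is replaced. With only three check bits, some multi-bit error patterns go undetected, which is one reason the convolutional code is applied on top.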

The convolutional code is applied both to the bits of subclass Ia, together with the check bits, and to the bits of subclass Ib. These two sequences are combined and extended by four bits (see below in Fig. 2), which are set to zero. These bits serve to return the encoder to its initial state after encoding. The code used is characterized by the parameters r=1/2 and K=5. The rate r=1/2 indicates that for each bit entering the encoder, exactly two bits appear in the encoded sequence, and K=5 denotes the constraint length, i.e., the number of bits covered by the convolution operation. These characteristics can be understood from the convolutional coding scheme shown in Fig. 1, which also shows the modulo-2 adders (the logical operation "exclusive OR"). Thus, as a result of encoding, 378 bits are obtained from the 189 incoming bits, and the unprotected class II bits are added to them, so that the total block length is 456 bits (Fig. 2). This is exactly eight sub-blocks of 57 bits. From such sub-blocks, the bursts for time-division radio transmission are formed.
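A rate-1/2, constraint-length-5 encoder can be sketched as a four-cell shift register with two modulo-2 adders. The tap positions below (1 + D^3 + D^4 and 1 + D + D^3 + D^4) are an assumption, since the figure with the actual scheme is not reproduced here, and the simple concatenation of the input bits ignores any reordering a real system applies. The function names are hypothetical.

```python
K = 5                 # constraint length: current bit plus four stored bits
G0 = (0, 3, 4)        # assumed taps: 1 + D^3 + D^4
G1 = (0, 1, 3, 4)     # assumed taps: 1 + D + D^3 + D^4


def conv_encode(bits):
    """Rate-1/2 convolutional encoder: two output bits per input bit."""
    state = [0] * (K - 1)             # state[0] is the most recent previous bit
    out = []
    for b in bits:
        window = [b] + state          # window[d] is the bit delayed by d steps
        out.append(sum(window[d] for d in G0) % 2)
        out.append(sum(window[d] for d in G1) % 2)
        state = window[:K - 1]
    return out


def build_block(class_1a_with_check, class_1b, class_2):
    """Protect class I bits, then append the unprotected class II bits."""
    tail = [0] * (K - 1)              # four zeros return the encoder to state 0
    protected = conv_encode(list(class_1a_with_check) + list(class_1b) + tail)
    return protected + list(class_2)
```

With 53 bits of subclass Ia including check bits, 132 bits of subclass Ib and 4 tail bits, the encoder receives 189 bits and emits 378; adding the 78 class II bits gives the 456-bit block, i.e. eight sub-blocks of 57 bits.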

[Fig. 2. Structure of the coded 456-bit block]

This article is devoted to the coding of speech signals and, as can be seen from the above, the processor housed in a compact handset bears a rather large digital processing load. The processor's tasks are far from exhausted by this, however. As is known, instead of voice transmission a cellular system can set up a data transmission channel, which is encoded according to completely different rules. Moreover, in addition to the logical channels that carry useful (paid) information, a cell phone sets up a large number of logical channels for control signals. Each of these logical channels has its own requirements for encoding information and, accordingly, each contributes its share to the processor load.

A general idea of the coding schemes and of the formation of bursts for the transmission of all logical channels in the radiotelephone system is given in Fig. 3.

[Fig. 3. Coding schemes and burst formation for the logical channels]

Here, ten different logical channels are shown at the top level, with the sizes of the message blocks in these channels (given as specific numbers or, where the numbers can vary, as letters: P0, N0, etc.). The next level shows the first stage of encoding for the different logical channels, with the number of bits in the original sequence and in the sequence obtained after encoding. Whereas a cyclic error-detecting code is used for the speech channel, various error-correcting cyclic codes are used for the remaining channels, including the cyclic Fire code, which corrects bursts of errors. At the second stage of encoding, the convolutional code already mentioned is applied. Then (stage 3), to distribute the resulting 456 bits among individual bursts (each carrying two blocks of 57 bits), bit interleaving and block permutation (direct or diagonal) are applied.
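The distribution of 456 bits over bursts can be sketched in simplified form. The rule used below (bit k goes to sub-block k mod 8, and each burst pairs a sub-block of the current frame with one of the previous frame) is an assumption that only conveys the idea; the real system also permutes bit positions inside each sub-block according to its own formula. The function names are hypothetical.

```python
SUB_BLOCKS = 8
SUB_BLOCK_LEN = 57


def split_into_sub_blocks(block):
    """Spread neighbouring bits over eight different 57-bit sub-blocks."""
    if len(block) != SUB_BLOCKS * SUB_BLOCK_LEN:
        raise ValueError("expected a 456-bit block")
    return [block[b::SUB_BLOCKS] for b in range(SUB_BLOCKS)]


def merge_sub_blocks(sub_blocks):
    """Inverse of split_into_sub_blocks()."""
    block = [None] * (SUB_BLOCKS * SUB_BLOCK_LEN)
    for b, sub in enumerate(sub_blocks):
        block[b::SUB_BLOCKS] = sub
    return block


def diagonal_bursts(previous_subs, current_subs):
    """Each burst carries two 57-bit halves: one from the current frame
    (sub-blocks 0..3) and one from the previous frame (sub-blocks 4..7)."""
    return [(current_subs[i], previous_subs[i + 4]) for i in range(4)]
```

The purpose of the spreading is that a burst lost to interference removes only scattered bits from each coded block, which the convolutional decoder handles much better than a run of consecutive errors.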

The total volume of signal processing in a cell phone is estimated at millions of operations per second. Thus, unlike a conventional telephone, a cell phone is a miniature but very powerful computer. On the one hand, it analyzes "its own" speech signal, producing control commands for speech synthesis in the other party's handset; on the other hand, it implements Euler's idea, synthesizing the other party's speech from the control commands arriving over the communication channel.

Author: V. Neumann, prof., doctor of tech. Sciences, Moscow
