Basic introduction
Digital signal processing is the theory and technology of digitally representing and processing signals. Digital signal processing and analog signal processing are subsets of signal processing.
The purpose of digital signal processing is to measure or filter continuous analog signals in the real world. Therefore, it is necessary to convert the signal from the analog domain to the digital domain before digital signal processing, which is usually realized by an analog-to-digital converter. The output of digital signal processing is often converted to the analog domain, which is achieved through a digital-to-analog converter.
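The conversion into and out of the digital domain can be illustrated with a minimal sketch of an idealized converter pair. The function names `adc` and `dac`, the 8-bit resolution, and the 1.0 V reference are assumptions chosen for illustration; real converters add sampling, anti-aliasing, and reconstruction filtering that are not modeled here.

```python
import math


def adc(voltage, v_ref=1.0, bits=8):
    """Idealized ADC: map a voltage in [-v_ref, v_ref) to a signed integer code."""
    levels = 1 << (bits - 1)
    code = math.floor(voltage / v_ref * levels)
    # Clamp inputs that fall outside the converter's range.
    return max(-levels, min(levels - 1, code))


def dac(code, v_ref=1.0, bits=8):
    """Idealized DAC: map a signed integer code back to a voltage."""
    levels = 1 << (bits - 1)
    return code / levels * v_ref


# Hypothetical input: one period of a sine wave sampled at 16 points.
signal = [0.8 * math.sin(2 * math.pi * n / 16) for n in range(16)]
codes = [adc(v) for v in signal]
reconstructed = [dac(c) for c in codes]
```

With these assumed settings, each reconstructed value differs from the original by less than one quantization step (1/128 of the reference voltage).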
Digital signal processing algorithms need to use computers or special processing equipment such as digital signal processors (DSP) and application specific integrated circuits (ASIC). Digital signal processing technology and equipment have outstanding advantages such as flexibility, accuracy, strong anti-interference, small equipment size, low cost, and high speed, which are unmatched by analog signal processing technology and equipment.
Overview
Digital signal processing as a discipline has developed around three strands: theory, implementation, and application. Theoretical advances have promoted new applications; conversely, applications have driven improvements in the theory. Implementation is the bridge between theory and application.
Digital signal processing builds on many disciplines and covers a wide range. In mathematics, for example, calculus, probability and statistics, stochastic processes, and numerical analysis are all basic tools for digital signal processing. The field is also closely related to network theory, signals and systems, cybernetics, communication theory, and fault diagnosis. Some newer disciplines, such as artificial intelligence, pattern recognition, and neural networks, are inseparable from digital signal processing. In this sense, digital signal processing takes many classic theoretical systems as its foundation and in turn serves as the theoretical foundation for a series of emerging disciplines.
Implementation methods
Generally, there are several implementation methods of DSP:
(1) Implement it in software (for example, in Fortran or C) on a general-purpose computer (such as a PC);
(2) Add a dedicated accelerated processor to the general-purpose computer system;
(3) Use a general-purpose single-chip microcomputer (such as the MCS-51 or 96 series). This method can be used for less complex digital signal processing, such as digital control;
(4) Use a general-purpose programmable DSP chip. Compared with a single-chip microcomputer, a DSP chip has software and hardware resources better suited to digital signal processing and can be used for complex digital signal processing algorithms;
(5) Use a dedicated DSP chip. In some special cases the required signal processing speed is extremely high and difficult to achieve with general-purpose DSP chips, so dedicated chips are used for FFT, digital filtering, convolution, correlation, and similar algorithms. Such a chip implements the corresponding signal processing algorithm in hardware, so no programming is required.
Among these methods, the first is relatively slow and is generally used for simulating DSP algorithms. The second and fifth are highly specialized, which greatly restricts their application, and the second is also inconvenient for stand-alone operation of the system. The third is suitable only for simple DSP algorithms. Only the fourth has opened up new ground for the application of digital signal processing.
History
The world's first single-chip DSP is generally considered to be the S2811, released by AMI in 1978. The commercial programmable device 2920, released by Intel in 1979, was a major milestone for DSP chips. Neither chip had the single-cycle multiplier that modern DSP chips require. In 1980, the μPD7720 introduced by NEC Corporation of Japan was the first commercial DSP chip with a multiplier.
Current situation
After this, the most successful DSP chips have been a series of products from Texas Instruments (TI). TI launched its first-generation DSP chip, the TMS32010, in 1982, together with the related TMS32011 and TMS320C10/C14/C15/C16/C17. It then successively introduced the second-generation TMS32020 and TMS320C25/C26/C28, the third-generation TMS320C30/C31/C32, the fourth-generation TMS320C40/C44, the fifth-generation TMS320C5X/C54X, the improved second-generation TMS320C2XX, the high-performance TMS320C8X, which integrates multiple DSPs on one chip, and the sixth-generation TMS320C62X/C67X, currently the fastest. TI groups its commonly used DSP chips into three series: the TMS320C2000 series (including TMS320C2X/C2XX), the TMS320C5000 series (including TMS320C5X/C54X/C55X), and the TMS320C6000 series (TMS320C62X/C67X). TI's DSP products have become the most influential DSP chips in the world, and TI has become the world's largest DSP chip supplier, with nearly 50% of the global DSP market.
Features
Consider an example of digital signal processing: a finite impulse response (FIR) filter. In mathematical terms, an FIR filter is a series of dot products. It takes an input signal and a vector of coefficients, multiplies the coefficients by a sliding window of input samples, and adds up all the products to form one output sample.
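The sliding-window dot product can be sketched in a few lines. This is a minimal illustration of a direct-form FIR filter, not an optimized implementation; the function name `fir_filter` and the 4-tap moving-average coefficients are assumptions chosen for the example.

```python
def fir_filter(coeffs, samples):
    """Direct-form FIR: each output is the dot product of the coefficients
    with a sliding window of the most recent input samples."""
    n_taps = len(coeffs)
    # Delay line holding the most recent n_taps inputs, newest first.
    window = [0.0] * n_taps
    outputs = []
    for x in samples:
        # Slide the window by one sample.
        window = [x] + window[:-1]
        # One dot product (a chain of multiply-accumulates) per output sample.
        acc = 0.0
        for c, w in zip(coeffs, window):
            acc += c * w
        outputs.append(acc)
    return outputs


# Hypothetical example: a 4-tap moving average applied to a step input.
step_response = fir_filter([0.25, 0.25, 0.25, 0.25], [1.0] * 6)
```

Each output sample costs one multiplication and one addition per coefficient, which is why the multiply-accumulate operation dominates the workload.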
Operations like this recur in large numbers throughout digital signal processing, so devices designed for it must provide special support. This has driven the divergence of DSP devices from general-purpose processors (GPPs):
Support for intensive multiplication operations
GPPs are not designed for multiplication-intensive tasks. Even some modern GPPs require multiple instruction cycles for one multiplication. A DSP processor uses specialized hardware to implement single-cycle multiplication. It also adds an accumulator register to hold the sum of multiple products. The accumulator is usually wider than the other registers, with extra bits, called guard bits, added to avoid overflow.
To take full advantage of this specialized multiply-accumulate (MAC) hardware, almost all DSP instruction sets include explicit MAC instructions.
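The role of the extra accumulator bits (often called guard bits) can be shown with a small simulation. The widths used here (16-bit operands, 32-bit products, and a 40-bit accumulator with 8 extra bits) are assumptions for illustration, and `wrap_signed` and `mac` are hypothetical helper names, not any processor's instructions.

```python
def wrap_signed(value, bits):
    """Wrap an integer into a two's-complement register of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac(xs, ys, acc_bits):
    """Multiply-accumulate pairs of operands into an accumulator of acc_bits."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = wrap_signed(acc + x * y, acc_bits)
    return acc


# Four products of the largest positive 16-bit value.
xs = [32767] * 4
ys = [32767] * 4
exact = sum(x * y for x, y in zip(xs, ys))
narrow = mac(xs, ys, 32)  # no extra bits: the running sum wraps around
wide = mac(xs, ys, 40)    # 8 extra bits: the running sum stays exact
```

In this sketch the 32-bit accumulator overflows on the third product, while the 40-bit accumulator still holds the exact sum.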
Memory structure
Traditionally, GPPs use the von Neumann memory architecture, in which a single memory space is connected to the processor core through one set of buses (an address bus and a data bus). Normally, one multiplication involves four memory accesses, which consume at least four instruction cycles.
Most DSPs use the Harvard architecture, which divides the memory space in two, one part for programs and one for data. Two sets of buses connect them to the processor core, allowing both to be accessed simultaneously. This arrangement doubles the processor's memory bandwidth and, more importantly, supplies data and instructions to the processor core at the same time. With this layout, a DSP can execute a MAC instruction in a single cycle.
There is a complication: a typical high-performance GPP actually contains two on-chip caches, one for data and one for instructions, which are directly connected to the processor core to speed up memory access at run time. Physically, this arrangement of dual on-chip memories and buses is almost the same as the Harvard architecture. Logically, however, there are still important differences between the two.
A GPP uses control logic to decide which data and instruction words are held in the on-chip caches; the programmer does not specify this (and may not even know). In contrast, a DSP uses multiple on-chip memories and multiple sets of buses to guarantee several memory accesses in each instruction cycle. When using a DSP, the programmer must explicitly control which data and instructions are stored in on-chip memory, and must write the program so that the processor can make effective use of its dual buses.
In addition, DSP processors rarely have data caches, because typical DSP data is a stream: after the DSP processes each data sample, the sample is discarded and almost never reused.
Zero-overhead loop
A common feature of DSP algorithms is that most of the processing time is spent executing small loops, which explains why most DSPs have dedicated hardware for zero-overhead loops. In a zero-overhead loop, the processor spends no time checking the value of the loop counter, branching conditionally back to the top of the loop, or decrementing the loop counter.
In contrast, loops on a GPP are implemented in software. Some high-performance GPPs use branch prediction hardware, which achieves almost the same effect as a hardware-supported zero-overhead loop.
Fixed-point calculations
Most DSPs use fixed-point rather than floating-point arithmetic. DSP applications must pay close attention to numerical accuracy, which would be much easier to maintain in floating point, but low cost is also very important for DSPs, and fixed-point machines are cheaper (and faster) than comparable floating-point machines. To maintain numerical accuracy without floating-point hardware, DSP processors support saturation arithmetic, rounding, and shifting in both the instruction set and the hardware.
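Saturation, rounding, and shifting can be sketched in software using the common Q15 format, in which a 16-bit integer n represents the value n / 32768. The Q15 choice and the helper names below are assumptions for illustration; actual DSPs perform these steps in hardware, and their exact rounding modes vary.

```python
Q15_MAX = 32767
Q15_MIN = -32768


def saturate_q15(value):
    """Clamp to the 16-bit range instead of wrapping around on overflow."""
    return max(Q15_MIN, min(Q15_MAX, value))


def to_q15(x):
    """Convert a real number in [-1, 1) to Q15, saturating at the limits."""
    return saturate_q15(int(round(x * 32768)))


def add_q15(a, b):
    """Saturating addition of two Q15 values."""
    return saturate_q15(a + b)


def mul_q15(a, b):
    """Q15 x Q15 -> Q15: round the 32-bit product, shift right by 15, saturate."""
    product = a * b
    rounded = (product + (1 << 14)) >> 15
    return saturate_q15(rounded)


half = to_q15(0.5)
quarter = mul_q15(half, half)     # 0.5 * 0.5 in Q15
clipped = add_q15(30000, 10000)   # would overflow 16 bits, so it saturates
```

Saturating at the limit keeps an overflowing signal pinned at full scale, which usually distorts it far less than two's-complement wrap-around would.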
Special addressing mode
DSP processors often support special addressing modes that are very useful for common signal processing operations and algorithms, for example modulo (circular) addressing, which is useful for implementing digital filter delay lines, and bit-reversed addressing, which is useful for the FFT. These highly specialized addressing modes are rarely found in GPPs, where they can only be implemented in software.
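A software emulation gives a sense of what modulo (circular) and bit-reversed addressing compute. The class `CircularDelayLine` and the function `bit_reverse` are hypothetical names for this sketch; on a DSP the address generation unit performs the same index arithmetic with no extra instructions.

```python
class CircularDelayLine:
    """Delay line using modulo (circular) addressing instead of shifting data."""

    def __init__(self, length):
        self.buf = [0] * length
        self.head = 0  # index of the most recent sample

    def push(self, sample):
        # Advance the write pointer with wrap-around; no data is moved.
        self.head = (self.head + 1) % len(self.buf)
        self.buf[self.head] = sample

    def tap(self, delay):
        # Read the sample written `delay` pushes ago.
        return self.buf[(self.head - delay) % len(self.buf)]


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, as used to reorder FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


line = CircularDelayLine(4)
for s in [10, 20, 30, 40, 50]:
    line.push(s)
recent = [line.tap(d) for d in range(4)]            # newest to oldest
fft_order = [bit_reverse(i, 3) for i in range(8)]   # order for an 8-point FFT
```

The delay line overwrites the oldest sample in place, so a filter update touches one memory location instead of shifting the whole buffer.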
Prediction of execution time
Most DSP applications (such as cellular phones and modems) are strictly real-time applications, and all processing must be completed within a specified time. This requires the programmer to determine exactly how much processing time is required for each sample, or, at least, how much time is required in the worst case.
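The per-sample time budget is simple arithmetic once the clock rate and sample rate are known. The sketch below uses hypothetical figures (a 100 MHz instruction rate, a 48 kHz sample rate, and a 64-tap filter costing one cycle per tap plus 20 cycles of overhead); the function names are illustrative only.

```python
def cycles_per_sample(clock_hz, sample_rate_hz):
    """Whole instruction cycles available between two consecutive samples."""
    return clock_hz // sample_rate_hz


def meets_deadline(worst_case_cycles, clock_hz, sample_rate_hz):
    """True if the worst-case cost per sample fits in the available budget."""
    return worst_case_cycles <= cycles_per_sample(clock_hz, sample_rate_hz)


# Assumed figures for illustration only.
budget = cycles_per_sample(100_000_000, 48_000)
fir_cost = 64 * 1 + 20  # one cycle per tap plus fixed overhead
fits = meets_deadline(fir_cost, 100_000_000, 48_000)
```

The check is only as good as the worst-case cycle count fed into it, which is why predictable execution time matters.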
If a low-cost GPP is used for real-time signal processing, predicting execution time is unlikely to be a problem, because a low-cost GPP has a relatively simple structure whose execution time is easy to predict. However, most real-time DSP applications require more processing power than a low-cost GPP can provide.
Here the advantage of a DSP over a high-performance GPP is that, even on a DSP with a cache, the programmer (not the processor) decides which instructions are placed in it, so it is easy to determine whether an instruction will be read from the cache or from memory. DSPs generally do not use dynamic features such as branch prediction and speculative execution. Predicting the execution time of a given piece of code is therefore completely straightforward, which allows the programmer to determine the performance limits of the chip.
Fixed-point DSP instruction set
The fixed-point DSP instruction set is designed according to two goals:
·Enable the processor to complete multiple operations in each instruction cycle, improving the computational efficiency of each cycle.
·Minimize the memory space for storing DSP programs (because the memory has a great impact on the cost of the entire system, this problem is particularly important in cost-sensitive DSP applications).
To achieve these goals, the instruction set of a DSP processor usually allows the programmer to specify several parallel operations within one instruction. For example, a single instruction may include a MAC operation together with one or two data moves. In a typical example, one instruction contains all the operations required to compute one tap of an FIR filter. The price of this efficiency is an instruction set that is neither intuitive nor easy to use (compared with a GPP's instruction set).
GPP programmers usually do not care whether the processor's instruction set is easy to use, because they generally work in high-level languages such as C or C++. Unfortunately for DSP programmers, the main DSP applications are written in assembly language (or at least have their critical parts optimized in assembly language). There are two reasons for this. First, most widely used high-level languages, such as C, are not well suited to describing typical DSP algorithms. Second, the complexity of DSP architectures, with their multiple memory spaces, multiple buses, irregular instruction sets, and highly specialized hardware, makes it difficult to write efficient compilers for them.
Even when a compiler translates C source code into DSP assembly code, the optimization task remains heavy. Typical DSP applications have large computational requirements and strict cost constraints, which makes program optimization essential (at least for the most critical parts of the program). A key factor in choosing a DSP is therefore whether enough programmers are available who can work well with the processor's instruction set.
Requirements for development tools
Because DSP applications require highly optimized code, most DSP manufacturers provide some development tools to help programmers complete their optimization work. For example, most manufacturers provide processor simulation tools to accurately simulate the activity of the processor in each instruction cycle. Whether for ensuring real-time operation or optimizing code, these are very useful tools.
GPP vendors usually do not provide such tools, mainly because GPP programmers usually do not need information at this level of detail. The lack of cycle-accurate simulation tools for GPPs is a major problem for DSP application developers: it is almost impossible to predict the number of cycles a high-performance GPP needs for a given task, so it is hard to determine how to improve the code's performance.
Applications
The demand for data communication in modern society is becoming more diverse and personalized. Market demand for wireless data communication, a powerful means of providing the public with rapid, accurate, safe, flexible, and efficient data communication, is becoming increasingly urgent. Against this background, 3G, 4G, and 5G communications continue to be introduced, and whichever generation is considered, future communications will be inseparable from the digital signal processor (DSP). This powerful special-purpose microprocessor is mainly used for high-speed mathematical operations and real-time processing of data, voice, and video signals. DSPs can be expected to play a pivotal role in the future of communications.
To ensure that future communications can work freely and efficiently in various environments, the DSPs on which they are built must process signals at very high speed in order to carry out complex calculations, decompression, and encoding and decoding. At present, DSPs can be divided into fixed-point DSPs and floating-point DSPs according to their functional focus. Fixed-point DSPs are known for their low cost, and floating-point DSPs for their speed. If only one type of DSP is used, the potential of future communications cannot be fully realized. To combine the advantages of fixed point and floating point and break through the bottleneck of DSP technology, an advanced multiprocessing architecture has been introduced: the VLIW architecture, which can deliver strong digital signal processing capability without increasing the clock speed and can offer the advantages of both fixed-point and floating-point DSPs. To launch more high-end technology platforms, attention has turned to the development of DSP core technology. The DSP core is equivalent to a computer's CPU and is known as the heart of the DSP. A large number of algorithms and operations are carried out by it, so the quality of the core architecture directly affects the performance, power consumption, and cost of the entire DSP chip.
Considering the need for wireless Internet access and the future development of multimedia services, Sun in the United States is preparing to embed its leading product, the PersonalJava language, into DSPs in order to further improve the degree of automation and intelligence of DSP signal processing. Other languages, such as C, were embedded in earlier DSPs, but they are poorly suited to handling network resources and multimedia information. PersonalJava is a Java environment suited to personal network connections and applications, and a personal communication system based on this environment can download data and images from the network and the Internet. In addition, DSPs that conform to the MPEG-4 wireless decompression standard are being researched and developed, which will provide a basis for future communication systems to transmit various kinds of multimedia information.
As a case study, let’s consider the most common function in the digital field: filtering. Simply put, filtering is to process the signal to improve its characteristics. For example, filtering can remove noise or electrostatic interference from the signal, thereby improving its signal-to-noise ratio. Why use a microprocessor instead of an analog device to filter the signal? Let's take a look at its advantages:
The performance of analog filters (or more generally, analog circuits) depends on environmental factors such as temperature. The digital filter is basically not affected by the environment.
Digital filters are easy to replicate within very small tolerances, because their performance does not depend on a combination of components whose values deviate from their nominal values.
Once an analog filter is manufactured, its characteristics (such as passband frequency range) are not easy to change. Using a microprocessor to implement a digital filter, you can change the characteristics of the filter by reprogramming it.