AN2094 Freescale Semiconductor / Motorola, AN2094 Datasheet - Page 26

Manufacturer Part Number: AN2094
Description: ITU-T G.729 Implementation on StarCore SC140
Manufacturer: Freescale Semiconductor / Motorola
Details of Selected Functions
These modifications were first applied to the unoptimized C code to verify bit-exactness. The function was then
reoptimized in C, resulting in a speedup by a factor of 2.8 over the unoptimized version.
4.1.3 Assembly Implementation
The algorithm-modified code was then implemented in assembly. Loops that had been extracted to
separate functions were re-inlined, and techniques such as software pipelining and split summation were applied.
The result was a speedup by a factor of 6.1 over the initial version.
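To illustrate the split-summation technique named above, the following C sketch splits a dot product across four independent partial sums, so that no addition has to wait for the previous one and a core with several multiply-accumulate units (the SC140 has four ALUs) can work on them in parallel. This is only an illustration under stated assumptions: the function names dot_single and dot_split4 are hypothetical, the code uses plain wide integer arithmetic rather than the saturating ITU-T basic operators, and it is not the assembly code of this application note.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference form: a single accumulator, so every addition depends on
   the result of the previous one. */
int64_t dot_single(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)x[i] * y[i];
    }
    return acc;
}

/* Split summation (hypothetical helper): four independent accumulators
   that are combined only once, after the loop. */
int64_t dot_split4(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;

    /* Main loop: four products per iteration, one per accumulator. */
    for (; i + 4 <= n; i += 4) {
        acc0 += (int32_t)x[i]     * y[i];
        acc1 += (int32_t)x[i + 1] * y[i + 1];
        acc2 += (int32_t)x[i + 2] * y[i + 2];
        acc3 += (int32_t)x[i + 3] * y[i + 3];
    }

    /* Remainder loop for lengths that are not a multiple of four. */
    for (; i < n; i++) {
        acc0 += (int32_t)x[i] * y[i];
    }

    return acc0 + acc1 + acc2 + acc3;
}
```

With non-saturating arithmetic, as here, both forms return the same value. With saturating accumulation, reordering the additions can change the result whenever an intermediate sum saturates, so bit-exactness has to be verified separately in that case.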
4.1.4 Summary
Table 10 lists the Norm_Corr() cycle count and the code size for each version.
Appendix A lists the prototype and pseudocode for Norm_Corr().
4.2 Optimizations in ACELP_Codebook()
The codebook search procedure is the most time-consuming part of the G.729 vocoder. In the reference C code
provided with the G.729 standard, this procedure is performed in the ACELP_Codebook() function, which
consists of the following basic steps:
1. Pre-filter the selected codebook vector to enhance the harmonic components to improve the quality of
   reconstructed speech.
2. Call Cor_h(), which computes the correlation matrix of the impulse response h(n).
3. Call Cor_h_X(), which computes the correlation vector of the target signal x’(n) with the impulse
   response h(n).
4. Call D4i40_17() to perform an exhaustive search for the four pulses. This function effectively
   generates all possible code words and chooses the one that maximizes the ratio between the correlation
   squared and the energy.
4.2.1 Function-Level C Optimizations
The following C optimization procedures were applied to the initial G.729 version:
• Align the input data and local data on the stack.
• Initialize pointers and define local variables where they are used to minimize stack allocation and
  execution time.
• Apply multisample and software pipelining techniques to loops.
• Replace the tests that use subtraction with direct comparisons. For example, replace if (sub(a,b)>0)
  with if (a>b).
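The comparison rewrite listed among the function-level C optimizations, replacing if (sub(a,b)>0) with if (a>b), is safe because a saturating 16-bit subtraction preserves the sign of the true difference. The sketch below checks this over a sampled grid of operands. It rests on assumptions: sub_sat16 is a hypothetical stand-in written for this example that models only the saturating result of sub() (the overflow flag that the reference basic operator also updates is left out), and count_mismatches is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical stand-in for the saturating 16-bit subtraction.
   Assumption: only the returned value is modeled, not the overflow flag. */
int16_t sub_sat16(int16_t a, int16_t b)
{
    int32_t d = (int32_t)a - (int32_t)b;
    if (d > INT16_MAX) d = INT16_MAX;
    if (d < INT16_MIN) d = INT16_MIN;
    return (int16_t)d;
}

/* Counts the operand pairs for which the two test forms disagree.
   The 16-bit range is sampled with the given positive stride. */
long count_mismatches(int stride)
{
    long mismatches = 0;
    for (int32_t a = INT16_MIN; a <= INT16_MAX; a += stride) {
        for (int32_t b = INT16_MIN; b <= INT16_MAX; b += stride) {
            int via_sub = sub_sat16((int16_t)a, (int16_t)b) > 0;
            int direct  = (int16_t)a > (int16_t)b;
            if (via_sub != direct) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Calling count_mismatches() with a small stride covers the range densely; a stride of 1 is exhaustive but takes roughly four billion comparisons. The direct form lets the compiler emit a plain compare instead of a subtraction with saturation handling.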
ITU-T G.729 Implementation on the StarCore™ SC140/SC1400 Cores, Rev. 1
Table 10. Norm_Corr() Performance Summary

Version                          Cycle Count    Size (Bytes)
Initial version                  9235           1084
Function-level C optimization    4156           706
Algorithmic changes              3291           772
Assembly implementation          1512           692
