AN2094 Freescale Semiconductor / Motorola, AN2094 Datasheet - Page 32

Description
ITU-T G.729 Implementation on StarCore SC140
Manufacturer
Freescale Semiconductor / Motorola
Datasheet
Implementation Strategies
Compared to the ported version, we achieved an application improvement of 2.28 times for optimized C and 3.47
times for the best assembly implementation. These values translate into a StarCore optimization factor of 2.44 for
the optimized C code and 3.99 for the assembly. Thus, in assembly the performance improvement approaches the
ideal limit of four for a system with four ALUs.
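To make the comparison with the ideal limit concrete, the short sketch below expresses each quoted StarCore optimization factor as a fraction of the ideal factor of four. The function and variable names are illustrative only and do not come from the application note. With the figures above, optimized C reaches about 61 percent of the ideal and assembly more than 99 percent.

```python
# Fraction of the ideal four-ALU factor reached by each implementation.
# Names here are hypothetical; only the numeric factors come from the text.
NUM_ALUS = 4  # ideal limit quoted in the text for a system with four ALUs

# StarCore optimization factors quoted in the text
optimization_factor = {"optimized C": 2.44, "assembly": 3.99}


def fraction_of_ideal(factor, ideal=NUM_ALUS):
    """Return the optimization factor as a fraction of the ideal limit."""
    return factor / ideal


for name, factor in optimization_factor.items():
    share = fraction_of_ideal(factor)
    print(f"{name}: factor {factor} is {share:.1%} of the ideal {NUM_ALUS}")
```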
The primary reason for this dramatic improvement is that the assembly programmer fully exploits the potential of
the architecture: the registers are more fully used, and memory moves are performed in parallel with other
computations. The compiler cannot reliably apply these optimizations. The performance margin
between our results and the project target allowed us to study several hypothetical scenarios and to explore ways to
reduce the effort in future implementations.
5.3 Project Results
Figure 7 presents the results of the various implementation strategies for the G.729 vocoder and the effort needed
to achieve these results.
The squares in Figure 7 represent steps in the actual implementation. First, code representing 95 percent of the
original run time was optimized in C, with no assembly optimization (the ‘BestC-95 percent’ point on the graph).
Then, by writing only three functions in assembly (D4i40_17(), Cor_h(), and Norm_Corr()), we achieved a performance
of 10.49 MCPS. At this point, about 9.5 percent of the code is in assembly and 90.5 percent is optimized C
code. The effort required to achieve this target was only 10 man-months. In the next phase, three additional
functions (Syn_filt(), Az_lsp(), and Chebps()) were implemented in assembly for a total of six assembly
functions, resulting in a run time of 9.47 MCPS after a total effort of about 11 man-months. In the final phase,
represented by the ‘BestMixed’ point on the graph, 18 functions are implemented in assembly. These results are
summarized in Table 13.
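As a rough illustration of the trade-off described above, the sketch below computes how many MCPS were saved per additional man-month between the three-function and six-function assembly steps. It uses only the two data points quoted in the text. Treating "about 11 man-months" as exactly 11 is an assumption, as is pairing these steps with the BestC-95%+3Asm and BestC-95%+6Asm labels from the Figure 7 legend. The function name is hypothetical.

```python
# Marginal run-time saving per extra man-month between two implementation steps.
# Each step: (label, effort in man-months, run time in MCPS).
# Labels are assumed to correspond to the Figure 7 legend entries.
steps = [
    ("BestC-95%+3Asm", 10.0, 10.49),  # three assembly functions
    ("BestC-95%+6Asm", 11.0, 9.47),   # six assembly functions; "about 11" taken as 11
]


def marginal_gain(before, after):
    """Return MCPS saved per additional man-month between two steps."""
    _, effort_before, mcps_before = before
    _, effort_after, mcps_after = after
    return (mcps_before - mcps_after) / (effort_after - effort_before)


gain = marginal_gain(steps[0], steps[1])
print(f"{steps[0][0]} -> {steps[1][0]}: {gain:.2f} MCPS saved per man-month")
```

Under these assumptions, the extra man-month spent on the three additional assembly functions saved roughly 1 MCPS.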
Figure 7. Performance Versus Effort for Various G.729 Vocoder Implementations
[Figure not reproduced. Horizontal axis: Effort (man-months). Labeled points: BestC-86%, BestC-86%+6Asm, BestC-95%, BestC-95%+3Asm, BestC-95%+6Asm, BestMixed, and the Project Target.]
ITU-T G.729 Implementation on the StarCore™ SC140/SC1400 Cores, Rev. 1
Freescale Semiconductor
