AN2094 Freescale Semiconductor / Motorola, AN2094 Datasheet - Page 30


Manufacturer Part Number: AN2094
Description: ITU-T G.729 Implementation on StarCore SC140
Manufacturer: Freescale Semiconductor / Motorola
Implementation Strategies
These results show that the C compiler generates efficient code when optimization techniques are used. Code
generated for the inner loop is especially efficient: four MACs and two moves appear in the same execution set.
However, the compiler does not create the best code if two maximum values are used. Code generated for
comparisons is similar to the first assembly version, taking eight cycles to compute all four comparisons. Even
though the generated code is not optimal, it is very efficient and performs well.
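The gains behind these remarks can be quantified from the figures reported in Table 12 for Lag_max(). The sketch below only divides the published cycle counts and sizes; the dictionary layout and function names are illustrative assumptions, not part of the application note.

```python
# Cycles per frame and code size for Lag_max(), copied from Table 12.
# The dictionary layout and helper names are illustrative assumptions.
LAG_MAX = {
    "Initial C version": {"cycles": 12756, "size": 324},
    "Optimized C version": {"cycles": 3625, "size": 574},
    "Final assembly version": {"cycles": 3247, "size": 362},
}


def speedup_over_initial(table):
    """Return the cycle-count speedup of each version relative to the initial C version."""
    base = table["Initial C version"]["cycles"]
    return {name: base / row["cycles"] for name, row in table.items()}


def size_ratio_to_initial(table):
    """Return the code size of each version relative to the initial C version."""
    base = table["Initial C version"]["size"]
    return {name: row["size"] / base for name, row in table.items()}


if __name__ == "__main__":
    speedups = speedup_over_initial(LAG_MAX)
    sizes = size_ratio_to_initial(LAG_MAX)
    for name in LAG_MAX:
        print(f"{name}: {speedups[name]:.2f}x faster, {sizes[name]:.2f}x size")
```

With these figures the optimized C version is roughly 3.5 times faster than the initial C version and the assembly version roughly 3.9 times faster, so the optimized C code needs about 12 percent more cycles than the assembly code.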
5 Implementation Strategies
This chapter discusses different SC140 implementation strategies for complex DSP applications. It first examines
theoretical approaches that suggest a limit for the percentage of functions to be optimized in either C or
assembly; this limit is established at either 80 or 94 percent of application execution time. It then discusses
more practical approaches, such as optimizing only those functions that are strictly necessary to meet the
project performance target, or implementing a larger set of functions to improve the performance parameters.
The discussions in this section are based only on the encoder portion of the vocoder. The code generated from the
optimized C version of the encoder needed further improvement through assembly implementation, primarily in the
fixed codebook search module, to meet the performance target. The decoder met the performance target using only
optimized C.
All implementation strategies should start with an analysis of the profile data. This information indicates,
among other things, which functions are the most time consuming and which functions may be equivalent (require
the same number of cycles to execute). Optimizing C code is well worth the effort. Even if the optimizations are
not always well reflected in the compiler output, the optimized C code serves as a reference point for starting
the assembly implementation in the early stages of the project. The optimized C code maintains bit-exactness and
serves as an implementation pattern for assembly programming.
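The profile analysis described here can be sketched as a ranking of functions by cycle count, with a running share of total execution time. This is a minimal illustration only: the function names (func_a and so on) and cycle counts are made up, and the 80 and 94 percent thresholds are the limits mentioned at the start of this chapter.

```python
from collections import defaultdict


def rank_by_cycles(profile):
    """Return (name, cycles, cumulative_share) tuples in descending order of cycles."""
    total = sum(profile.values())
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    result = []
    running = 0
    for name, cycles in ranked:
        running += cycles
        result.append((name, cycles, running / total))
    return result


def functions_covering(profile, share):
    """Return the most expensive functions that together reach the given share of cycles."""
    selected = []
    for name, _cycles, cumulative in rank_by_cycles(profile):
        selected.append(name)
        if cumulative >= share:
            break
    return selected


def equivalent_groups(profile):
    """Group functions that need the same number of cycles (groups of two or more only)."""
    groups = defaultdict(list)
    for name, cycles in profile.items():
        groups[cycles].append(name)
    return {cycles: names for cycles, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical profile data: function name -> cycles per frame.
    profile = {"func_a": 500, "func_b": 300, "func_c": 100, "func_d": 100}
    for name, cycles, cumulative in rank_by_cycles(profile):
        print(f"{name}: {cycles} cycles, cumulative {cumulative:.0%}")
    print("Reach 80%:", functions_covering(profile, 0.80))
    print("Equivalent:", equivalent_groups(profile))
```

In this made-up profile the two most expensive functions already account for 80 percent of the cycles, whereas reaching 94 percent requires all four.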
One way to determine when to begin the assembly implementation is to plot performance against effort. The
resulting curve is asymptotic, especially when functions are optimized in descending order of execution time, so
each additional unit of effort yields a smaller gain. Given a deadline, assembly implementation should begin at
the point where the tangent to the curve crosses the project performance target only after the effort point
corresponding to the deadline. From that point on, continuing at the current rate of improvement would not reach
the target in time.
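The tangent criterion can be sketched numerically. The example below approximates the tangent by the line through the two most recent measurements and checks where that line reaches the target. It assumes performance is expressed as a cost where lower is better (for example cycles per frame) and effort in arbitrary time units. All names and numbers are hypothetical.

```python
def tangent_crossing_effort(history, target):
    """Estimate the effort at which the tangent reaches the performance target.

    history is a chronological list of (effort, cost) measurements with at
    least two entries; cost is a lower-is-better figure. The tangent is
    approximated by the line through the last two points. Returns None when
    the last step brought no improvement, so the line never reaches the target.
    """
    (e0, c0), (e1, c1) = history[-2], history[-1]
    if c1 <= target:
        return e1  # target already met
    slope = (c1 - c0) / (e1 - e0)
    if slope >= 0:
        return None
    return e1 + (target - c1) / slope


def should_start_assembly(history, target, deadline_effort):
    """Return True when the tangent reaches the target only after the deadline."""
    crossing = tangent_crossing_effort(history, target)
    return crossing is None or crossing > deadline_effort


if __name__ == "__main__":
    # Hypothetical measurements: (weeks of effort, cost in arbitrary units).
    history = [(0, 30.0), (2, 20.0), (4, 16.0), (6, 14.0)]
    target, deadline = 10.0, 8
    print(tangent_crossing_effort(history, target))
    print(should_start_assembly(history, target, deadline))
```

With these made-up figures the early steep part of the curve still projects to meet the target before the deadline, but by the last measurement the projected crossing falls after the deadline, which is the signal to move to assembly.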
5.1 Theoretical Background
The basic formula that links the performance of the group of functions targeted for optimization (G1) to the overall
application performance is shown in Equation 3.
$$S = \frac{1}{1 - P\,\dfrac{f - 1}{f}} = \frac{f}{P(1 - f) + f}$$    (Equation 3)

Table 12. Lag_max() Performance Summary

Version                   Cycles per Frame   Size
Initial C version         12756              324
Optimized C version       3625               574
Final assembly version    3247               362

Note: Includes three calls.

ITU-T G.729 Implementation on the StarCore™ SC140/SC1400 Cores, Rev. 1
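Equation 3 can be evaluated numerically to see how it behaves. The sketch below assumes the usual Amdahl's-law reading of the symbols, which this page does not spell out: P is the fraction of application execution time spent in G1 before optimization, f is the factor by which G1 is sped up, and S is the resulting overall speedup. The function names are illustrative, and the values 0.80 and 0.94 are the two limits quoted at the start of this chapter.

```python
def overall_speedup(p, f):
    """Evaluate Equation 3: S = f / (P * (1 - f) + f).

    p: assumed fraction of execution time spent in G1 (0 <= p <= 1).
    f: assumed speedup factor applied to G1 (f > 0).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if f <= 0.0:
        raise ValueError("f must be positive")
    return f / (p * (1.0 - f) + f)


def speedup_limit(p):
    """Limit of Equation 3 as f grows without bound: 1 / (1 - P), for p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be at least 0 and below 1")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    for p in (0.80, 0.94):
        for f in (2.0, 4.0, 10.0):
            print(f"P={p:.2f} f={f:>4}: S={overall_speedup(p, f):.2f}")
        print(f"P={p:.2f} limit: S={speedup_limit(p):.2f}")
```

Under this reading, doubling the speed of functions that account for 80 percent of execution time gives an overall speedup of about 1.67, and no amount of optimization of that group can exceed a factor of 5. For a 94 percent share the ceiling is about 16.7.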
