Code optimization of polynomial approximation functions on clustered instruction-level parallelism processors

Document Type



The authors propose a general code optimization method for implementing polynomial approximation functions on clustered instruction-level parallelism (ILP) processors. In the proposed method, we first introduce the parallel algorithm with minimized data dependency. We then schedule and map the data dependency graph (DDG) constructed based on the parallel algorithm to appropriate clusters and functional units of a specific clustered ILP processor using the proposed parallel scheduling and mapping (PSAM) algorithm. The PSAM algorithm prioritizes those nodes on the critical path to minimize the total schedule length and ensures that the resulting schedule satisfies the resource constraints imposed by a specific cluster ILP processor. As a result, our method produces schedule lengths close to the lower bounds determined by the critical path lengths of the DDGs. Experimental results of typical polynomial mathematical functions on TI 'C67x DSP show that the proposed method achieves significant performance improvement over the traditional computation method.


Computer and Systems Architecture | Digital Communications and Networking | Electrical and Computer Engineering | OS and Networks | Systems and Communications | Systems Architecture | Theory and Algorithms


Use Find in Your Library, contact the author, or interlibrary loan to garner a copy of the item. Publisher policy does not allow archiving the final published version. If a post-print (author's peer-reviewed manuscript) is allowed and available, or publisher policy changes, the item will be deposited.