A Fine-Grained Pipelined Implementation of LU Decomposition on SIMD Processors

Kai Zhang; Shuming Chen; Wei Liu; Xi Ning

doi:10.1007/978-3-642-40820-5_4

Conference Papers Year : 2013

Kai Zhang

Function : Author
PersonId : 767989
ORCID : 0000-0002-0068-8197

School of Computer [China]

Shuming Chen

Function : Author
PersonId : 1006883

School of Computer [China]

Wei Liu

Function : Author
PersonId : 756839
ORCID : 0000-0003-0480-2097

School of Computer [China]

Xi Ning

Function : Author

School of Computer [China]

Abstract

LU decomposition is a widely used method for solving dense linear algebra problems in many scientific computing applications. In recent years, single instruction multiple data (SIMD) technology has become a popular way to accelerate LU decomposition. However, pipeline parallelism and memory bandwidth utilization are low when LU decomposition is mapped onto SIMD processors. This paper proposes a fine-grained pipelined implementation of LU decomposition on SIMD processors. The fine-grained algorithm exploits the data dependences of the native algorithm to expose fine-grained parallelism across all computation resources. By transforming non-coalesced memory accesses into coalesced ones, the proposed algorithm achieves high pipeline parallelism and highly efficient memory access. Experimental results show that the proposed technique achieves a speedup of 1.04x to 1.82x over the native algorithm and reaches about 89% of the peak performance of the SIMD processor.
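For readers unfamiliar with the baseline, the following is a minimal sketch of a textbook LU decomposition (Doolittle form, no pivoting). It is not the authors' pipelined SIMD implementation, which the abstract does not detail. The names `lu_decompose` and `matmul` are hypothetical helpers chosen for illustration. The inner loop updates one row at a time, which is the kind of contiguous access pattern that coalescing aims for.

```python
def lu_decompose(a):
    """Factor a square matrix as A = L * U (Doolittle form, no pivoting).

    Illustrative sketch only: L is unit lower triangular, U is upper
    triangular. A zero pivot raises an error because no row exchange is done.
    """
    n = len(a)
    u = [row[:] for row in a]  # working copy that is reduced to U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        pivot = u[k][k]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):
            factor = u[i][k] / pivot
            l[i][k] = factor
            # Row-wise update: consecutive elements of row i are touched in order.
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
    return l, u


def matmul(x, y):
    """Plain triple-loop matrix product, used here only to check L * U == A."""
    n, m, p = len(x), len(y), len(y[0])
    return [[sum(x[i][k] * y[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


if __name__ == "__main__":
    a = [[4.0, 3.0], [6.0, 3.0]]
    l, u = lu_decompose(a)
    print(l)  # unit lower triangular factor
    print(u)  # upper triangular factor
    print(matmul(l, u) == a)
```

Each step `k` depends on the rows updated in step `k - 1`, which is the data dependence the abstract refers to when it describes exploiting fine-grained parallelism.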

Domains

Computer Science [cs]

Main file

978-3-642-40820-5_4_Chapter.pdf (215.53 KB)

Origin : Files produced by the author(s)

https://inria.hal.science/hal-01513757

Submitted on : Tuesday, April 25, 2017-2:33:24 PM

Last modification on : Tuesday, September 3, 2019-3:04:02 PM

Long-term archiving on : Wednesday, July 26, 2017-1:56:15 PM

Dates and versions

hal-01513757 , version 1 (25-04-2017)

Licence

Attribution

Identifiers

HAL Id : hal-01513757 , version 1
DOI : 10.1007/978-3-642-40820-5_4

Cite

Kai Zhang, Shuming Chen, Wei Liu, Xi Ning. A Fine-Grained Pipelined Implementation of LU Decomposition on SIMD Processors. 10th International Conference on Network and Parallel Computing (NPC), Sep 2013, Guiyang, China. pp.39-48, ⟨10.1007/978-3-642-40820-5_4⟩. ⟨hal-01513757⟩

Collections

IFIP-LNCS IFIP IFIP-NPC IFIP-LNCS-8147
