Basic Linear Algebra Subprograms
Notes
- The cuBLAS Library provides a GPU-accelerated implementation of the basic linear algebra subroutines (BLAS).[1]
- The BLAS are a small core library of linear algebra utilities, which can be highly optimized for various architectures.[2]
- The BLAS are used in a wide range of software, including LINPACK, LAPACK, and many other algorithms commonly in use today.[3]
- BLAS 2 and BLAS 3 modules in SCSL are optimized and parallelized to take advantage of SGI's hardware architecture.[3]
- SCSL also supports the C interface to the legacy BLAS set forth by the BLAS Technical Forum.[3]
- The BLAS (Basic Linear Algebra Subprograms) are high quality "building block" routines for performing basic vector and matrix operations.[4]
- Otherwise, an automatically optimized BLAS can be built using the ATLAS package.[4]
- This paper presents the implementation considerations and performance of the local BLAS, or BLAS local to each node of the system.[5]
- The implications of implementing BLAS on distributed memory computers are considered in this light.[5]
- Basic Linear Algebra Subroutines (BLAS) are routines that provide standard functions for basic vector and matrix operations.[6]
- Refer to BLAS (Basic Linear Algebra Subprograms) for more information on the BLAS functions.[6]
- In this paper, we only consider the single precision real and complex BLAS for vectors with positive strides.[7]
- The PB‐BLAS consist of calls to the sequential BLAS for local computations, and calls to the BLACS for communication.[8]
- Some of the linear algebra subprograms were designed in accordance with the Level 1 and Level 2 BLAS de facto standard.[9]
- The vector-scalar linear algebra subprograms include a subset of the standard set of Level 1 BLAS.[9]
- These subprograms include a subset of the standard set of Level 2 BLAS.[9]
- Some of the matrix operation subroutines were designed in accordance with the Level 3 BLAS de facto standard.[9]
- In this paper, we implement and evaluate the performance of some important BLAS operations on a matrix coprocessor.[10]
- Fortunately, many applications are based on intensive use of Level-3 BLAS with only a small percentage of Level-1 and Level-2 BLAS.[10]
- This paper describes a standard API for a set of Batched Basic Linear Algebra Subprograms (Batched BLAS or BBLAS).[11]
- This design makes it easy to add further functionality to the sparse BLAS in the future.[12]
- The full BLAS functionality for band-format and packed-format matrices is available through the low-level CBLAS interface.[13]
- This interface corresponds to the BLAS Technical Forum’s standard for the C interface to legacy BLAS implementations.[13]
- The library provides an interface to the BLAS operations which apply to these objects.[13]
- LINPACK could use a generic version of BLAS.[14]
- To gain performance, different machines might use tailored versions of BLAS.[14]
- BLAS for a vector machine could use the machine's fast vector operations.[14]
- Consequently, BLAS was augmented from 1984 to 1986 with level-2 kernel operations that concerned vector-matrix operations.[14]
- Their beauty has always been that computer manufacturers have been encouraged to implement the BLAS as efficiently as possible.[15]
- The vector BLAS are now called level-1 BLAS.[15]
- This paper updates the ongoing BLAS effort, and summarizes what types of operations are now available.[15]
- We now have flavors of BLAS for dense, banded, and sparse vector and matrix operations.[15]
- This paper proposes adding a set of Level 3 BLAS, which would be used to perform matrix-matrix operations.[16]
- Using the BLAS provides portability and ease of maintenance.[16]
- The authors discuss the reasoning used in selecting the operations to be included in the Level 3 BLAS.[16]
- An example illustrates how the Level 3 BLAS can be used to implement the Cholesky factorization as a block algorithm.[16]
- The original set of 38 BLAS was published in 1979.[17]
- The IMSL BLAS collection includes these 38 subprograms plus additional ones that extend their functionality.[17]
- Since extensions to this set were published in 1988 and 1990, it is customary to refer to the original 38 as the Level 1 BLAS.[17]
- These extensions are called the Level 2 BLAS.[17]
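The notes above repeatedly contrast the three BLAS levels (vector-vector, matrix-vector, matrix-matrix). As an illustrative sketch only, the following pure-NumPy reference implementations mirror what the canonical routines AXPY (Level 1), GEMV (Level 2), and GEMM (Level 3) compute; real BLAS libraries (ATLAS, SCSL, cuBLAS) provide heavily optimized versions of these same operations, and the function names here follow the standard routine names rather than any particular binding's API.

```python
import numpy as np

def axpy(alpha, x, y):
    """Level 1 (vector-vector): returns alpha*x + y, the AXPY operation."""
    return alpha * x + y

def gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): returns alpha*A@x + beta*y, the GEMV operation."""
    return alpha * (A @ x) + beta * y

def gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): returns alpha*A@B + beta*C, the GEMM operation."""
    return alpha * (A @ B) + beta * C
```

The level number tracks the ratio of arithmetic to data movement: Level 1 does O(n) work on O(n) data, Level 2 does O(n²) work on O(n²) data, while Level 3 does O(n³) work on O(n²) data, which is why Level-3-heavy code (as note [10] observes) is the easiest to run near peak speed.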
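Note [16] mentions using the Level 3 BLAS to express Cholesky factorization as a block algorithm. The sketch below, written in NumPy purely for illustration, shows the structure of that right-looking block algorithm: `np.linalg.solve` stands in for the triangular solve (TRSM) and the dense subtraction stands in for the symmetric rank-k update (SYRK); an actual implementation would call those Level 3 routines directly.

```python
import numpy as np

def blocked_cholesky(A, nb=64):
    """Right-looking blocked Cholesky: returns lower-triangular L with A = L @ L.T.

    Illustrative sketch only. The panel solve (a TRSM-style operation) and the
    trailing rank-nb update (a SYRK-style operation) are the Level 3 BLAS calls
    that dominate the flop count in the real block algorithm.
    """
    A = np.array(A, dtype=float)  # work on a copy
    n = A.shape[0]
    for k in range(0, n, nb):
        b = min(nb, n - k)
        # Factor the diagonal block (an unblocked factorization in practice).
        A[k:k+b, k:k+b] = np.linalg.cholesky(A[k:k+b, k:k+b])
        L11 = A[k:k+b, k:k+b]
        if k + b < n:
            # Panel: L21 = A21 @ inv(L11).T, i.e. a triangular solve (TRSM).
            A[k+b:, k:k+b] = np.linalg.solve(L11, A[k+b:, k:k+b].T).T
            L21 = A[k+b:, k:k+b]
            # Trailing update: A22 <- A22 - L21 @ L21.T (SYRK, Level 3 BLAS).
            A[k+b:, k+b:] -= L21 @ L21.T
    return np.tril(A)  # discard stale entries above the diagonal
```

Casting the factorization this way converts almost all of its arithmetic into matrix-matrix operations, which is exactly the motivation note [16] gives for the Level 3 BLAS.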
Sources
- [1] cuBLAS
- [2] Basic Linear Algebra Subprograms
- [3] Chapter 2. Basic Linear Algebra Subprogram (BLAS) Routines
- [4] BLAS: Basic Linear Algebra Subprograms
- [5] Local Basic Linear Algebra Subroutines (LBLAS) for the CM-5/5E
- [6] Basic Linear Algebra Subroutines
- [7] Basic linear algebra subprograms (BLAS) on the CDC CYBER 205
- [8] PB‐BLAS: a set of parallel block basic linear algebra subprograms
- [9] Guide and Reference
- [10] Performance Evaluation of Basic Linear Algebra Subroutines on a Matrix Co-processor
- [11] A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines
- [12] An Overview of the Sparse Basic Linear Algebra Subprograms: The New Standard from the BLAS Technical Forum
- [13] BLAS Support — GSL 2.6 documentation
- [14] Basic Linear Algebra Subprograms
- [15] An updated set of basic linear algebra subprograms (BLAS)
- [16] A set of level 3 basic linear algebra subprograms
- [17] Basic Linear Algebra Subprograms
Metadata
Wikidata
- ID : Q810007
Spacy pattern list
- [{'LOWER': 'basic'}, {'LOWER': 'linear'}, {'LOWER': 'algebra'}, {'LEMMA': 'Subprograms'}]
- [{'LEMMA': 'BLAS'}]