Introduction to High Performance Scientific Computing

Victor Eijkhout

Language: English

Pages: 482

ISBN: 1257992546

Format: PDF / Kindle (mobi) / ePub

This is a textbook that teaches the topics bridging numerical analysis, parallel computing, code performance, and large-scale applications.

number of disciplines and skill sets; correspondingly, being successful at using high performance computing in science requires at least elementary knowledge of, and skills in, all these areas. Computations stem from an application context, so some acquaintance with physics and the engineering sciences is desirable. Problems in these application areas are then typically translated into linear algebraic, and sometimes combinatorial, problems, so a computational scientist needs knowledge

now repeatedly reused, and are therefore more likely to remain in the cache. This rearranged code displays better temporal locality in its use of x[i].

Spatial locality. The concept of spatial locality is slightly more involved. A program is said to exhibit spatial locality if it references memory that is ‘close’ to memory it already referenced. In the classical von Neumann architecture, with only a processor and memory, spatial locality should be irrelevant, since one address in memory
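The effect of spatial locality can be sketched with a small illustrative example (not taken from the book's own code): summing the elements of a matrix stored in row-major order, once in an order that walks consecutive addresses and once in an order that strides through memory.

```c
#include <stddef.h>

#define N 256   /* matrix dimension for the illustration */

/* Row-major traversal: the inner loop touches consecutive addresses,
 * so each loaded cache line is fully used (good spatial locality). */
double sum_row_major(double a[N][N]) {
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Column-major traversal of the same row-major array: successive
 * accesses are N*sizeof(double) bytes apart, so spatial locality is
 * poor and most of each cache line goes unused. */
double sum_col_major(double a[N][N]) {
    double s = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += a[i][j];
    return s;
}
```

Both functions compute the same result; on a large enough matrix the row-major version typically runs noticeably faster because of its better use of cache lines.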

languages to machine instructions. However, on occasion we will discuss how a program at a high level can be written to ensure efficiency at the low level. In scientific computing we typically do not pay much attention to program code, focusing almost exclusively on data and how it is moved about during program execution. For most practical purposes it is as if program and data are stored separately. The little that is essential about instruction handling can be described as follows. The

90 percent. However, it should be noted that many scientific codes do not feature the dense linear system solution kernel, so performance on this benchmark is not indicative of performance on a typical code. Linear system solution through iterative methods (section 5.5), for instance, is much less efficient in a flops-per-second sense, being dominated by the bandwidth between CPU and memory (a bandwidth-bound algorithm). One implementation of the Linpack benchmark that is often used is
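A classic example of such a bandwidth-bound kernel is the Level 1 BLAS operation daxpy; the sketch below (an illustration of the concept, not code from the book) shows why its flop rate is limited by memory traffic rather than by the floating point units.

```c
#include <stddef.h>

/* daxpy: y <- a*x + y. Per element it performs 2 flops but moves
 * 24 bytes (load x[i], load y[i], store y[i]): an arithmetic
 * intensity of only 1/12 flop per byte. For large n, speed is
 * therefore set by CPU-memory bandwidth, not peak flop rate. */
void daxpy(size_t n, double a, const double *x, double *y) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

By contrast, the dense solve at the heart of Linpack does O(n^3) flops on O(n^2) data, which is why it can approach peak performance while kernels like the one above cannot.
<imports></imports>
<test>
#include <assert.h>
int main(void) {
    double x[3] = {1.0, 2.0, 3.0};
    double y[3] = {1.0, 1.0, 1.0};
    daxpy(3, 2.0, x, y);
    assert(y[0] == 3.0);
    assert(y[1] == 5.0);
    assert(y[2] == 7.0);
    return 0;
}
```
</test>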

as
• load the value of b from memory into a register,
• load the value of c from memory into another register,
• compute the sum and write that into yet another register, and
• write the sum value back to the memory location of c.
Looking at assembly code (for instance the output of a compiler), you see the explicit load, compute, and store instructions. Compute instructions such as add or multiply only operate on registers; for instance

addl %eax, %edx

7. Actually, a[i] is loaded before it
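The load/compute/store sequence described above can be made concrete with a one-line C statement and the kind of AT&T-syntax x86 assembly it may compile to (the register choices below are illustrative, not an actual compiler's output):

```c
/* The statement  c = b + c  decomposes into loads, a register-only
 * add, and a store, roughly:
 *
 *     movl  b, %eax        load b into a register
 *     movl  c, %edx        load c into another register
 *     addl  %eax, %edx     compute the sum in a register
 *     movl  %edx, c        store the sum back to c's memory location
 */
int add_and_store(int b, int *c) {
    *c = b + *c;   /* load b, load *c, add in registers, store *c */
    return *c;
}
```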
