Parallel Computing: Basic Concepts
Parallel Computing: Hardware Foundations and Performance Evaluation, 2024/9/16

How can the ever-growing demand for computing power be met?
  Use faster hardware, that is, reduce the time each instruction takes
  Optimize the algorithm (or the compilation)
  Use multiple processors to solve one problem at the same time: parallel computing

Serial computing vs. parallel computing

Levels of parallelism (granularity, from coarse to fine)
  Program-level parallelism
  Subprogram-level parallelism
  Statement-level parallelism
  Operation-level parallelism
  Micro-operation-level parallelism

FLOPS
  Floating-point Operations Per Second: the number of floating-point operations executed per second
  Theoretical peak = CPU clock frequency * floating-point operations per clock cycle * number of CPUs
  For illustration, 2 CPUs at 2.5 GHz that each complete 4 floating-point operations per cycle give 2.5 * 4 * 2 = 20 GFLOPS
  [Table: floating-point operations per clock cycle for selected processors]

www.top500.org

Top500, November 2007
  IBM's Blue Gene/L still tops the list. It has led by a wide margin for three consecutive years, since November 2004, and its computing power keeps growing: its Linpack benchmark performance is now 478.2 TFlop/s (478.2 trillion operations per second), up from 280.6 TFlop/s six months earlier.
  Second place also goes to IBM, this time with a recently completed Blue Gene/P. The new system, at the Jülich Research Centre in Germany, delivers 167.3 TFlop/s; according to IBM's design plans, Blue Gene/P is expected to break the 1 PFlop/s mark, i.e. one quadrillion operations per second.
  Third place is a newcomer and the first supercomputer of the New Mexico Computing Applications Center (NMCAC): an SGI system based on the Altix ICE 8200, delivering 126.9 TFlop/s.
  India entered the TOP10 for the first time: the HP Cluster Platform 3000 BL460c at India's Computational Research Laboratories took fourth place with 117.9 TFlop/s.

[Charts: TOP500 breakdown by vendor, country, architecture, application area, operating system, and processor family, each by system count and by computing power; system count; computing power]

[Chart: performance of China's TOP100 high-performance computers, 2007]

Parallelization methods
  Domain decomposition
  Task decomposition
  Pipelining

Domain decomposition
  First, decide how data elements should be divided among processors
  Second, decide which tasks each processor should be doing
  Example: vector addition

Domain decomposition: find the largest element of an array
  [Figure sequence: the array shown with CPU 0, CPU 1, CPU 2, and CPU 3]
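The "find the largest element of an array" example of domain decomposition can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method shown on the slides: the helper names `split_blocks` and `parallel_max`, the choice of four workers (mirroring CPU 0 to CPU 3), and the use of Python's `concurrent.futures` thread pool are all assumptions made for the sketch. The data is divided first, and every worker then runs the same task (a local maximum) on its own block.

```python
from concurrent.futures import ThreadPoolExecutor


def split_blocks(data, n_workers):
    """Split data into n_workers contiguous blocks of near-equal size."""
    size, extra = divmod(len(data), n_workers)
    blocks, start = [], 0
    for i in range(n_workers):
        # The first `extra` blocks receive one additional element.
        end = start + size + (1 if i < extra else 0)
        blocks.append(data[start:end])
        start = end
    return blocks


def parallel_max(data, n_workers=4):
    """Hypothetical helper: each worker finds the maximum of its own block,
    then the partial maxima are reduced to one global maximum."""
    if not data:
        raise ValueError("data must not be empty")
    # Drop empty blocks, which occur when there are more workers than elements.
    blocks = [b for b in split_blocks(data, n_workers) if b]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(max, blocks))
    return max(partial)
```

Calling `parallel_max([3, 17, 5, 42, 8, 23, 1, 16])` returns 42. In standard CPython builds, threads mainly illustrate the structure rather than deliver a speedup for CPU-bound work; a process pool could be swapped in where real parallel execution matters.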
Task (functional) decomposition
  First, divide tasks among processors
  Second, decide which data elements are going to be accessed (read and/or written) by which processors
  Example: event handler for a GUI

Task decomposition
  [Figure sequence: functions f(), g(), h(), q(), r(), s() assigned to CPU 0, CPU 1, and CPU 2]
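Task decomposition might be sketched as below: the work is divided by function rather than by data, so different functions run concurrently on the same input. The three task functions and the `analyze` helper are hypothetical stand-ins for the f(), g(), h() of the slides, and the thread pool is an assumption made for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Task 1 (hypothetical): count whitespace-separated words."""
    return len(text.split())


def char_count(text):
    """Task 2 (hypothetical): count characters, including spaces."""
    return len(text)


def longest_word(text):
    """Task 3 (hypothetical): return the first word of maximum length."""
    return max(text.split(), key=len)


def analyze(text):
    """Run three different tasks concurrently on the same input."""
    tasks = {"words": word_count, "chars": char_count, "longest": longest_word}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        # Each worker gets a different function: the tasks are divided, not the data.
        futures = {name: pool.submit(fn, text) for name, fn in tasks.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

Here all three tasks only read the shared input, so no coordination is needed; once tasks write to shared data, the second step on the slide (deciding which processor accesses which data elements) becomes the harder part.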
Pipelining
  A special kind of task decomposition
  "Assembly line" parallelism
  Example: 3D rendering in computer graphics
  [Figure: Input -> Model -> Project -> Clip -> Rasterize -> Output]

Processing one data set (steps 1 to 4)
  [Figure sequence: the data set advances through Model, Project, Clip, and Rasterize, one stage per step]
  The pipeline processes 1 data set in 4 steps

Processing two data sets (steps 1 to 5)
  [Figure sequence: the second data set enters the pipeline one step behind the first]
  The pipeline processes 2 data sets in 5 steps

Pipelining five data sets (steps 1 to 8)
  [Figure sequence: data sets 0 to 4 pass through the stages on CPU 0, CPU 1, CPU 2, and CPU 3]
  In general, n data sets pass through a k-stage pipeline in k + n - 1 steps, so 5 data sets through 4 stages take 8 steps
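The step counts in the pipelining slides (1 data set in 4 steps, 2 in 5 steps, 5 in 8 steps) can be reproduced with a small schedule simulation. This is a sketch under assumptions: the stage order Model, Project, Clip, Rasterize is taken from the rendering example, each stage is assumed to take exactly one step, and `pipeline_schedule` and `format_schedule` are hypothetical helper names.

```python
STAGES = ["Model", "Project", "Clip", "Rasterize"]


def pipeline_schedule(n_data_sets, stages=STAGES):
    """Return one dict per step mapping each stage to the index of the
    data set it is processing, or None if the stage is idle."""
    n_steps = len(stages) + n_data_sets - 1
    schedule = []
    for step in range(n_steps):
        row = {}
        for position, stage in enumerate(stages):
            # Data set d enters stage `position` at step d + position.
            d = step - position
            row[stage] = d if 0 <= d < n_data_sets else None
        schedule.append(row)
    return schedule


def format_schedule(schedule):
    """Render the schedule as text, one line per step (steps numbered from 1)."""
    lines = []
    for step, row in enumerate(schedule, start=1):
        cells = ", ".join(
            f"{stage}: {'idle' if d is None else f'data set {d}'}"
            for stage, d in row.items()
        )
        lines.append(f"Step {step}: {cells}")
    return "\n".join(lines)
```

For example, `len(pipeline_schedule(5))` is 8. Printing `format_schedule(pipeline_schedule(5))` shows that all four stages are busy only in steps 4 and 5; the pipeline spends the other steps filling and draining.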