語系:
繁體中文
English
說明(常見問題)
登入
回首頁
切換:
標籤
|
MARC模式
|
ISBD
Software/hardware co-design to impro...
~
ProQuest Information and Learning Co.
Software/hardware co-design to improve productivity, portability, and performance of loop-task parallel applications.
紀錄類型:
書目-語言資料,手稿 : Monograph/item
正題名/作者:
Software/hardware co-design to improve productivity, portability, and performance of loop-task parallel applications./
作者:
Kim, Ji Yun.
面頁冊數:
1 online resource (122 pages)
附註:
Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
Contained By:
Dissertation Abstracts International78-04B(E).
標題:
Computer engineering. -
電子資源:
click for full text (PQDT)
ISBN:
9781369343991
Software/hardware co-design to improve productivity, portability, and performance of loop-task parallel applications.
Kim, Ji Yun.
Software/hardware co-design to improve productivity, portability, and performance of loop-task parallel applications.
- 1 online resource (122 pages)
Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
Thesis (Ph.D.)
Includes bibliographical references
Computer architects are increasingly turning to programmable accelerators tailored for narrower classes of applications in order to achieve high performance and energy efficiency. A continuing challenge with accelerators is enabling the programmer to easily extract maximum performance without intimate knowledge of the underlying microarchitecture. It is important to consider productivity and portability, in addition to performance, as first-class metrics when developing and evaluating modern computing platforms. Software-centric approaches to achieving 3P computing platforms are compelling, but sacrifice efficiency and flexibility by hiding parallel abstractions from hardware and limiting the scope of the application domain. This thesis proposes a new software/hardware co-design approach to achieving 3P platforms, called the loop-task accelerator (LTA) platform, that provides high productivity and portability without sacrificing performance or efficiency across a wide range of applications. The LTA platform addresses the weaknesses of existing approaches that are identified through detailed experimentation with and analysis of modern application development. Discussion of an early attempt at a hardware-centric approach to achieving 3P platforms provides insight into area-efficient accelerator designs and highlights the need for innovations in both software and hardware. The LTA platform focuses on exploiting loop-task parallelism by exposing loop-tasks as a common parallel abstraction at the programming API, runtime, ISA, and microarchitectural levels. The LTA programming API uses the parallel_for construct to express loop-tasks that can be exploited both across cores and within a core, the LTA runtime distributes loop-tasks across cores, and a new xpfor instruction explicitly encodes loop-tasks as functions applied to a range of loop iterations. This thesis introduces a novel task-coupling taxonomy that captures how tasks can be coupled in both space and time. The LTA engine template can be configured at design time with variable spatial and temporal task coupling to accelerate the execution of both regular and irregular loop-tasks within a core. The LTA platform is evaluated with respect to the 3P's using a vertically integrated research methodology. Compared to an in-order multi-core baseline, the LTA platform yields average improvements of 5.5x in raw performance, 2.5x in performance per area, and 1.2x in energy efficiency, while offering high productivity and portability.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018
Mode of access: World Wide Web
ISBN: 9781369343991Subjects--Topical Terms:
569006
Computer engineering.
Index Terms--Genre/Form:
554714
Electronic books.
Software/hardware co-design to improve productivity, portability, and performance of loop-task parallel applications.
LDR
:03851ntm a2200349Ki 4500
001
910380
005
20180517123954.5
006
m o u
007
cr mn||||a|a||
008
190606s2017 xx obm 000 0 eng d
020
$a
9781369343991
035
$a
(MiAaPQ)AAI10240604
035
$a
(MiAaPQ)cornellgrad:10048
035
$a
AAI10240604
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
099
$a
TUL
$f
hyy
$c
available through World Wide Web
100
1
$a
Kim, Ji Yun.
$3
1181642
245
1 0
$a
Software/hardware co-design to improve productivity, portability, and performance of loop-task parallel applications.
264
0
$c
2017
300
$a
1 online resource (122 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
500
$a
Adviser: Christopher Batten.
502
$a
Thesis (Ph.D.)
$c
Cornell University
$d
2017.
504
$a
Includes bibliographical references
520
$a
Computer architects are increasingly turning to programmable accelerators tailored for narrower classes of applications in order to achieve high performance and energy efficiency. A continuing challenge with accelerators is enabling the programmer to easily extract maximum performance without intimate knowledge of the underlying microarchitecture. It is important to consider productivity and portability, in addition to performance, as first-class metrics when developing and evaluating modern computing platforms. Software-centric approaches to achieving 3P computing platforms are compelling, but sacrifice efficiency and flexibility by hiding parallel abstractions from hardware and limiting the scope of the application domain. This thesis proposes a new software/hardware co-design approach to achieving 3P platforms, called the loop-task accelerator (LTA) platform, that provides high productivity and portability without sacrificing performance or efficiency across a wide range of applications. The LTA platform addresses the weaknesses of existing approaches that are identified through detailed experimentation with and analysis of modern application development. Discussion of an early attempt at a hardware-centric approach to achieving 3P platforms provides insight into area-efficient accelerator designs and highlights the need for innovations in both software and hardware. The LTA platform focuses on exploiting loop-task parallelism by exposing loop-tasks as a common parallel abstraction at the programming API, runtime, ISA, and microarchitectural levels. The LTA programming API uses the parallel_for construct to express loop-tasks that can be exploited both across cores and within a core, the LTA runtime distributes loop-tasks across cores, and a new xpfor instruction explicitly encodes loop-tasks as functions applied to a range of loop iterations. This thesis introduces a novel task-coupling taxonomy that captures how tasks can be coupled in both space and time. The LTA engine template can be configured at design time with variable spatial and temporal task coupling to accelerate the execution of both regular and irregular loop-tasks within a core. The LTA platform is evaluated with respect to the 3P's using a vertically integrated research methodology. Compared to an in-order multi-core baseline, the LTA platform yields average improvements of 5.5x in raw performance, 2.5x in performance per area, and 1.2x in energy efficiency, while offering high productivity and portability.
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2018
538
$a
Mode of access: World Wide Web
650
4
$a
Computer engineering.
$3
569006
650
4
$a
Computer science.
$3
573171
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0464
690
$a
0984
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
710
2
$a
Cornell University.
$b
Electrical & Computer Engrng.
$3
1179333
773
0
$t
Dissertation Abstracts International
$g
78-04B(E).
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10240604
$z
click for full text (PQDT)
筆 0 讀者評論
多媒體
評論
新增評論
分享你的心得
Export
取書館別
處理中
...
變更密碼[密碼必須為2種組合(英文和數字)及長度為10碼以上]
登入