Navigating Heterogeneity and Scalability in Modern Chip Design.
Record type: Bibliographic - language material, manuscript : Monograph/item
Title/Author: Navigating Heterogeneity and Scalability in Modern Chip Design.
Author: Orenes-Vera, Marcelo.
Physical description: 1 online resource (187 pages)
Notes: Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
Contained By: Dissertations Abstracts International 85-12B.
Subject: Computer science.
Electronic resource: click for full text (PQDT)
ISBN: 9798382810263
Thesis (Ph.D.)--Princeton University, 2024.
Includes bibliographical references
Computing systems have become ubiquitous in the modern world but their design is far from one-size-fits-all. From battery-powered devices to supercomputers, deployment requirements are a primary driver of heterogeneity in computer design. As modern systems rely on parallelism and specialization to achieve their performance and power goals, new challenges arise. The system's complexity grows with the number of distinct hardware modules, complicating the verification of correct and secure behavior. Moreover, expanding parallelization across more processing units (PUs) increases the pressure on the memory hierarchy and inter-PU network, which results in severe bottlenecks for applications traversing graph-like data structures with indirect memory accesses (IMAs). These challenges call for re-thinking software abstractions and hardware designs to achieve scalable and efficient systems, as well as introducing robust methodologies to ensure their correctness. My dissertation aims to tackle these challenges with three main thrusts.
First, to facilitate hardware designers applying formal verification to their modules, this dissertation introduces AutoSVA, a toolflow that generates formal verification testbenches from module interface annotations. Testbenches generated with AutoSVA have uncovered bugs in open-source projects, including a widely used RISC-V CPU. Second, to alleviate IMA latency without increasing verification complexity, this dissertation introduces MAPLE, a network-connected memory-access engine that supports data pipelining and prefetching without requiring PU modifications. As such, off-the-shelf PUs can offload IMAs to MAPLE, and consume data via software-managed queues. Using MAPLE effectively mitigates memory latency, providing 2x speedups over software- and hardware-only prefetching. Third, to further the scalability of graph and sparse workloads, this dissertation co-designs scale-out architectures with a data-centric execution model, Dalorex, where IMAs are split into tasks that only access a confined address range and execute at the PU with dedicated access to that memory range. The parallelization of breadth-first-search on a billion-edge graph across a million PUs results in nearly an order of magnitude faster runtimes than Graph500's top entries.
By introducing novel hardware designs, execution models, and verification tools, this dissertation contributes towards addressing the challenges posed by the increasing demand for high-performance, energy-efficient, and cost-effective computing systems.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2024
Mode of access: World Wide Web
ISBN: 9798382810263
Subjects--Topical Terms:
573171
Computer science.
Subjects--Index Terms:
Processing units
Index Terms--Genre/Form:
554714
Electronic books.
LDR :03894ntm a22003737 4500
001 1148400
005 20240924101916.5
006 m o d
007 cr bn ---uuuuu
008 250605s2024 xx obm 000 0 eng d
020 $a 9798382810263
035 $a (MiAaPQ)AAI31294177
035 $a AAI31294177
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Orenes-Vera, Marcelo. $3 1474354
245 1 0 $a Navigating Heterogeneity and Scalability in Modern Chip Design.
264 0 $c 2024
300 $a 1 online resource (187 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
500 $a Advisor: Martonosi, Margaret; Wentzlaff, David.
502 $a Thesis (Ph.D.)--Princeton University, 2024.
504 $a Includes bibliographical references
520 $a Computing systems have become ubiquitous in the modern world but their design is far from one-size-fits-all. From battery-powered devices to supercomputers, deployment requirements are a primary driver of heterogeneity in computer design. As modern systems rely on parallelism and specialization to achieve their performance and power goals, new challenges arise. The system's complexity grows with the number of distinct hardware modules, complicating the verification of correct and secure behavior. Moreover, expanding parallelization across more processing units (PUs) increases the pressure on the memory hierarchy and inter-PU network, which results in severe bottlenecks for applications traversing graph-like data structures with indirect memory accesses (IMAs). These challenges call for re-thinking software abstractions and hardware designs to achieve scalable and efficient systems, as well as introducing robust methodologies to ensure their correctness. My dissertation aims to tackle these challenges with three main thrusts. First, to facilitate hardware designers applying formal verification to their modules, this dissertation introduces AutoSVA, a toolflow that generates formal verification testbenches from module interface annotations. Testbenches generated with AutoSVA have uncovered bugs in open-source projects, including a widely used RISC-V CPU. Second, to alleviate IMA latency without increasing verification complexity, this dissertation introduces MAPLE, a network-connected memory-access engine that supports data pipelining and prefetching without requiring PU modifications. As such, off-the-shelf PUs can offload IMAs to MAPLE, and consume data via software-managed queues. Using MAPLE effectively mitigates memory latency, providing 2x speedups over software- and hardware-only prefetching. Third, to further the scalability of graph and sparse workloads, this dissertation co-designs scale-out architectures with a data-centric execution model, Dalorex, where IMAs are split into tasks that only access a confined address range and execute at the PU with dedicated access to that memory range. The parallelization of breadth-first-search on a billion-edge graph across a million PUs results in nearly an order of magnitude faster runtimes than Graph500's top entries. By introducing novel hardware designs, execution models, and verification tools, this dissertation contributes towards addressing the challenges posed by the increasing demand for high-performance, energy-efficient, and cost-effective computing systems.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
650 4 $a Computer engineering. $3 569006
653 $a Processing units
653 $a Indirect memory accesses
653 $a Computing systems
653 $a Hardware designs
655 7 $a Electronic books. $2 local $3 554714
690 $a 0464
690 $a 0984
710 2 $a Princeton University. $b Computer Science. $3 1179801
710 2 $a ProQuest Information and Learning Co. $3 1178819
773 0 $t Dissertations Abstracts International $g 85-12B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31294177 $z click for full text (PQDT)