National Formosa University

Dynamically Managing FPGAs for Efficient Computing.

Record type:	Bibliographic - language material, printed : Monograph/item
Title/Author:	Dynamically Managing FPGAs for Efficient Computing.
Author:	Nguyen, Marie.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2020
Physical description:	144 p.
Notes:	Source: Dissertations Abstracts International, Volume: 82-04, Section: B.
Contained By:	Dissertations Abstracts International 82-04B.
Subject:	Engineering.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28148482
ISBN:	9798678110206

Thesis (Ph.D.)--Carnegie Mellon University, 2020.

This item must not be sold to any third party vendors.

Field-programmable gate arrays (FPGAs) have undergone a dramatic transformation from a logic technology to a computing technology. This transformation is pulled by the computing industry's need for more power/energy efficiency than software can achieve and, at the same time, more flexibility than ASICs offer. Nonetheless, FPGA designers still share a similar design methodology with ASIC designers. Most notably, at design time, FPGA designers commit to a fixed allocation of logic resources to modules in a design. In other words, FPGAs are mostly still used like an "ASIC" despite being runtime reprogrammable. Through partial reconfiguration (PR), parts of an FPGA design can be reconfigured at runtime while the remainder continues to operate without disruption. PR enables what has been possible on general-purpose processors for decades. For instance, multiple tasks can be time-multiplexed on a smaller FPGA, which can reduce area/device cost, power, and energy compared to statically mapping tasks on a larger FPGA. PR can become a relevant technology for an emerging class of AI-driven applications that (1) need to support many compute-intensive tasks with real-time requirements and (2) are often deployed on a small, low-end FPGA due to area, cost, power, or energy concerns (e.g., smart cars/robots/cameras at the edge). For such applications, using a large, expensive FPGA is typically not a viable option.
Though PR is a promising technology and has been supported by FPGA tools for over a decade, its commercial value has yet to be proven. The reconfiguration time (from a few to tens of milliseconds on today's FPGAs), also referred to as PR time, is often considered one of the major hurdles preventing more widespread use of PR. While the non-trivial PR time represents a technical challenge, we believe that a more important question to address is "When, how, and why should an FPGA designer consider using PR?"
Addressing this question requires (1) identifying applications that can tolerate PR time and still benefit from a PR approach, (2) designing good architectural and runtime management strategies to build efficient designs leveraging PR, and (3) evaluating whether the area/device cost, power, or energy benefits are large enough to justify a transition from a statically mapped design. This thesis seeks to advance the state of the art in the dynamism of computing FPGAs by tackling these challenges. Specifically, we demonstrate that a design exploiting PR can be more area/device cost, power, or energy efficient than a statically mapped design (ASIC-style design) with slack. Slack occurs when not all resources occupied by an ASIC-style design are active all the time. Using PR, a designer can attempt to reduce slack by changing the allocation of resources over time. In this work, we identify slack reduction as the most important opportunity for improvement available to PR-style designs. We refer to a PR-style design as a design in which logic resources are allocated to different modules of one design over time using PR. We develop efficient PR allocation and execution strategies to reduce slack, and show through analytical modeling and implemented designs that a PR-style design can outperform an ASIC-style design in challenging scenarios that have to deliver required performance under strict area, cost, power, and energy constraints. Further, we leverage the findings and analysis from our theoretical investigation to develop a soft-logic-realized framework for accelerating computer vision with real-time requirements (30+ fps). This framework includes the necessary architectural and runtime management strategies to support spatial and temporal sharing of the FPGA fabric at a very fine grain (i.e., the time interval between reconfigurations is within the millisecond range) while meeting performance requirements.
Using the framework, we design and implement efficient PR-style designs to quantify the performance, area/device cost, power, and energy benefits of PR-style designs relative to ASIC-style designs and to software implementations. Notably, we show that, under specific conditions, a PR-style design can be more power and energy efficient than an ASIC-style design even when frequently reconfiguring the fabric (i.e., when more than half of the execution time is spent reconfiguring the fabric). We also make projections on the impact of higher PR speed on the costs and benefits of using PR at a very fine grain. Through our study, we find that, while higher reconfiguration speed can make a PR-style design more area/device cost efficient, the power/energy overhead incurred in a PR-style design due to, for instance, fabric reconfigurations and additional data movement can make a PR approach less power/energy efficient than an ASIC-style design.
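The notion of slack in the abstract can be illustrated with a toy area-time calculation. The sketch below is not the thesis's analytical model: the task names, areas, execution times, PR time, and the assumption that tasks run one after another within a frame are all hypothetical, chosen only to show how a statically mapped design leaves area idle while a time-multiplexed PR region trades area for reconfiguration time.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Task:
    name: str
    area: float     # logic resources in arbitrary units (hypothetical)
    exec_ms: float  # compute time per frame in milliseconds (hypothetical)


def static_area(tasks: Sequence[Task]) -> float:
    """ASIC-style mapping: every module is resident at once."""
    return sum(t.area for t in tasks)


def static_slack(tasks: Sequence[Task], frame_ms: float) -> float:
    """Fraction of area-time that sits idle per frame in the static design,
    assuming each module is busy only for its own exec_ms."""
    used = sum(t.area * t.exec_ms for t in tasks)
    return 1.0 - used / (static_area(tasks) * frame_ms)


def pr_area(tasks: Sequence[Task]) -> float:
    """PR-style mapping: one region sized for the largest module."""
    return max(t.area for t in tasks)


def pr_frame_ms(tasks: Sequence[Task], pr_ms: float) -> float:
    """Time per frame when each task is loaded (pr_ms) and then executed."""
    return sum(t.exec_ms + pr_ms for t in tasks)


def pr_reconfig_fraction(tasks: Sequence[Task], pr_ms: float) -> float:
    """Share of the PR-style frame time spent reconfiguring."""
    return len(tasks) * pr_ms / pr_frame_ms(tasks, pr_ms)


if __name__ == "__main__":
    # Made-up workload: three vision stages at 30 fps with a 5 ms PR time.
    tasks = [Task("a", 40, 4), Task("b", 60, 6), Task("c", 50, 5)]
    frame_ms = 1000 / 30
    print("static area:", static_area(tasks))
    print("static slack:", round(static_slack(tasks, frame_ms), 3))
    print("PR area:", pr_area(tasks))
    print("PR frame time (ms):", pr_frame_ms(tasks, 5))
    print("meets 30 fps:", pr_frame_ms(tasks, 5) <= frame_ms)
    print("reconfig fraction:", pr_reconfig_fraction(tasks, 5))
```

With these made-up numbers, the PR region needs 60 area units instead of 150 and still fits the 33.3 ms frame budget while spending half of its time reconfiguring. The sketch models area and time only; it says nothing about the power/energy overhead of reconfiguration and data movement, which the abstract notes can offset the area benefit.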

ISBN: 9798678110206
Subjects--Topical Terms:

561152
Engineering.
Subjects--Index Terms:

Computer vision

LDR 05963nam a2200361 4500
001 1038043
005 20210910100702.5
008 211029s2020 ||||||||||||||||| ||eng d
020 $a 9798678110206
035 $a (MiAaPQ)AAI28148482
035 $a AAI28148482
040 $a MiAaPQ $c MiAaPQ
100 1 $a Nguyen, Marie. $0 (orcid)0000-0002-7226-1598 $3 1335377
245 1 0 $a Dynamically Managing FPGAs for Efficient Computing.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300 $a 144 p.
500 $a Source: Dissertations Abstracts International, Volume: 82-04, Section: B.
500 $a Advisor: Hoe, James C.
502 $a Thesis (Ph.D.)--Carnegie Mellon University, 2020.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0041.
650 4 $a Engineering. $3 561152
650 4 $a Computer science. $3 573171
650 4 $a Electrical engineering. $3 596380
653 $a Computer vision
653 $a Efficient computing
653 $a FPGA
653 $a Partial reconfiguration
690 $a 0537
690 $a 0984
690 $a 0544
710 2 $a Carnegie Mellon University. $b Electrical and Computer Engineering. $3 1182305
773 0 $t Dissertations Abstracts International $g 82-04B.
790 $a 0041
791 $a Ph.D.
792 $a 2020
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28148482