The University of Texas at San Antonio.
Layer-type Specialized Processing Engines for a Semi-Streaming Convolutional Neural Network Hardware Architecture for FPGAs.
Record type: Bibliographic - Language material, printed : Monograph/item
Title/Author: Layer-type Specialized Processing Engines for a Semi-Streaming Convolutional Neural Network Hardware Architecture for FPGAs.
Author: Shaydyuk, Nazariy.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2020
Description: 113 p.
Notes: Source: Masters Abstracts International, Volume: 81-12.
Contained By: Masters Abstracts International, 81-12.
Subject: Electrical engineering.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=27993166
ISBN: 9798641486406
Shaydyuk, Nazariy.
Layer-type Specialized Processing Engines for a Semi-Streaming Convolutional Neural Network Hardware Architecture for FPGAs. - Ann Arbor : ProQuest Dissertations & Theses, 2020. - 113 p.
Source: Masters Abstracts International, Volume: 81-12.
Thesis (M.S.)--The University of Texas at San Antonio, 2020.
This item must not be sold to any third party vendors.
Rapid research advancements in Convolutional Neural Networks (CNNs) have promoted a significant increase in machine vision applications with artificial intelligence capabilities on resource-constrained mobile devices. Field Programmable Gate Arrays (FPGAs) have become a popular hardware target for CNN deployment, following two main implementation approaches: streaming hardware architectures and single computation engines. The first approach implements every layer as a discrete processing unit and is suitable for small CNN architectures that fit onto resource-constrained targets. The second approach uses a scalable single engine capable of executing models of different complexities, but the achievable performance of such one-size-fits-all implementations can vary across CNNs with different workload attributes. To combine the benefits of both methods, this research proposes a new design paradigm called semi-streaming and offers a set of five layer-specialized configurable processing engines suitable for a semi-streaming hardware architecture. Preserving elements of the data streaming paradigm allows nearly independent CNN execution that minimizes interaction with the main processor, as in true streaming architectures, while using multiple specialized computation units allows blocks to be reused across similar layers and eliminates the need to explicitly implement every layer of a network. The processing engines were tailored to the 8-bit quantized MobileNetV2 as a potential target model, implementing normalized addition, depthwise, pointwise (expansion and projection), and standard 2D convolution layers capable of delivering 5.4 GOp/s, 16 GOp/s, 27.2 GOp/s, 27.2 GOp/s, and 89.6 GOp/s, respectively, with an overall energy efficiency of 5.32 GOp/s/W at a 100 MHz system clock and a total power of 6.2 W on the XCZU7EV SoC FPGA.
ISBN: 9798641486406
Subjects--Topical Terms: Electrical engineering.
Subjects--Index Terms: CNN
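The abstract above names the MobileNetV2 layer types that the five processing engines specialize in. As an orientation aid only, the minimal PyTorch sketch below shows how those layer types (pointwise expansion, depthwise, pointwise projection, and the residual addition, here taken to correspond to the abstract's "normalized addition") fit together in one MobileNetV2 inverted-residual block. The channel sizes, stride, expansion factor, and floating-point arithmetic are illustrative assumptions, not values from the thesis, which targets an 8-bit quantized model in FPGA hardware.

# Illustrative sketch only (PyTorch, floating point); not the thesis implementation.
# Assumed placeholders: 32 channels, expansion factor 6, stride 1.
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, expand: int = 6, stride: int = 1):
        super().__init__()
        hidden = in_ch * expand
        self.use_residual = stride == 1 and in_ch == out_ch
        # pointwise (expansion) 1x1 convolution
        self.expand_pw = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
        )
        # depthwise 3x3 convolution (one filter per channel)
        self.depthwise = nn.Sequential(
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
        )
        # pointwise (projection) 1x1 convolution, linear (no activation)
        self.project_pw = nn.Sequential(
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        y = self.project_pw(self.depthwise(self.expand_pw(x)))
        # element-wise residual addition; presumed to correspond to the
        # "normalized addition" engine, which in a quantized model would
        # also rescale the sum
        return x + y if self.use_residual else y

block = InvertedResidual(32, 32)
print(block(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 32, 56, 56])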
LDR 03046nam a2200349 4500
001 1037980
005 20210910100646.5
008 211029s2020 ||||||||||||||||| ||eng d
020 $a 9798641486406
035 $a (MiAaPQ)AAI27993166
035 $a AAI27993166
040 $a MiAaPQ $c MiAaPQ
100 1 $a Shaydyuk, Nazariy. $3 1335300
245 1 0 $a Layer-type Specialized Processing Engines for a Semi-Streaming Convolutional Neural Network Hardware Architecture for FPGAs.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300 $a 113 p.
500 $a Source: Masters Abstracts International, Volume: 81-12.
500 $a Advisor: John, Eugene.
502 $a Thesis (M.S.)--The University of Texas at San Antonio, 2020.
506 $a This item must not be sold to any third party vendors.
520 $a Rapid research advancements in Convolutional Neural Networks (CNNs) have promoted a significant increase in machine vision applications with artificial intelligence capabilities on resource-constrained mobile devices. Field Programmable Gate Arrays (FPGAs) have become a popular hardware target for CNN deployment, following two main implementation approaches: streaming hardware architectures and single computation engines. The first approach implements every layer as a discrete processing unit and is suitable for small CNN architectures that fit onto resource-constrained targets. The second approach uses a scalable single engine capable of executing models of different complexities, but the achievable performance of such one-size-fits-all implementations can vary across CNNs with different workload attributes. To combine the benefits of both methods, this research proposes a new design paradigm called semi-streaming and offers a set of five layer-specialized configurable processing engines suitable for a semi-streaming hardware architecture. Preserving elements of the data streaming paradigm allows nearly independent CNN execution that minimizes interaction with the main processor, as in true streaming architectures, while using multiple specialized computation units allows blocks to be reused across similar layers and eliminates the need to explicitly implement every layer of a network. The processing engines were tailored to the 8-bit quantized MobileNetV2 as a potential target model, implementing normalized addition, depthwise, pointwise (expansion and projection), and standard 2D convolution layers capable of delivering 5.4 GOp/s, 16 GOp/s, 27.2 GOp/s, 27.2 GOp/s, and 89.6 GOp/s, respectively, with an overall energy efficiency of 5.32 GOp/s/W at a 100 MHz system clock and a total power of 6.2 W on the XCZU7EV SoC FPGA.
590 $a School code: 1283.
650 4 $a Electrical engineering. $3 596380
653 $a CNN
653 $a FPGA
653 $a Hardware accelerator
653 $a Inference
653 $a MobileNetV2
690 $a 0544
710 2 $a The University of Texas at San Antonio. $b Electrical & Computer Engineering. $3 845615
773 0 $t Masters Abstracts International $g 81-12.
790 $a 1283
791 $a M.S.
792 $a 2020
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=27993166
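As a quick sanity check of the figures quoted in the 520 abstract, the short Python snippet below relates the reported per-engine peak rates, total power, and overall energy efficiency. It assumes (an interpretation not stated in the record) that "overall energy efficiency" means sustained throughput divided by total power.

# Back-of-the-envelope check of the figures quoted in the 520 abstract.
# Assumption (not stated in the record): overall energy efficiency
# = sustained throughput / total power.
peak_gops = {
    "normalized addition": 5.4,
    "depthwise convolution": 16.0,
    "pointwise convolution (expansion)": 27.2,
    "pointwise convolution (projection)": 27.2,
    "standard 2D convolution": 89.6,
}
total_power_w = 6.2           # reported total power on the XCZU7EV SoC FPGA
efficiency_gops_per_w = 5.32  # reported overall energy efficiency

implied_sustained_gops = efficiency_gops_per_w * total_power_w
print(f"Summed peak engine rates:     {sum(peak_gops.values()):.1f} GOp/s")
print(f"Implied sustained throughput: {implied_sustained_gops:.1f} GOp/s")
# Roughly 33 GOp/s implied sustained throughput vs. about 165 GOp/s summed
# peak, i.e. under this assumption the engines are not all saturated at once.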