Kumawat, Sachin.
An OpenCL Framework for Real-time Inference of Next-Generation Convolutional Neural Networks on FPGAs.
Record type: Bibliographic - language material, manuscript : Monograph/item
Title/Author: An OpenCL Framework for Real-time Inference of Next-Generation Convolutional Neural Networks on FPGAs. / Kumawat, Sachin.
Physical description: 1 online resource (93 pages)
Notes: Source: Masters Abstracts International, Volume: 57-05.
Contained by: Masters Abstracts International, 57-05(E).
Subject: Electrical engineering.
Electronic resource: click for full text (PQDT)
ISBN: 9780355764413
Kumawat, Sachin. An OpenCL Framework for Real-time Inference of Next-Generation Convolutional Neural Networks on FPGAs. - 1 online resource (93 pages)
Source: Masters Abstracts International, Volume: 57-05.
Thesis (M.S.)--University of California, Davis, 2017.
Includes bibliographical references.
Modern Convolutional Neural Networks (CNNs) require billions of multiplications and additions, which calls for parallel computing units such as GPUs, FPGAs, and other DSP processors. Consequently, General-Purpose GPU (GPGPU) computing has taken this field by storm. At the same time, there has been increasing interest in FPGA-based acceleration of CNN inference.
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018.
Mode of access: World Wide Web.
ISBN: 9780355764413
Subjects--Topical Terms: Electrical engineering.
Index Terms--Genre/Form: Electronic books.
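The scale the abstract cites ("billions of multiplications and additions") is easy to check with back-of-the-envelope arithmetic. A minimal sketch, using the standard published VGG-16 shapes for the first two convolution layers (these shapes are reference values, not taken from this record):

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate (MAC) count for one k x k convolution layer."""
    return h_out * w_out * c_out * (k * k * c_in)

# VGG-16 conv1_1 and conv1_2: 224x224 outputs, 3x3 kernels.
conv1_1 = conv_macs(224, 224, 3, 64, 3)    # ~87 million MACs
conv1_2 = conv_macs(224, 224, 64, 64, 3)   # ~1.85 billion MACs
print(conv1_1 + conv1_2)                   # ~1.9 billion for just two layers
```

Summed over all 13 convolution layers, VGG-16 comes to roughly 15 billion MACs per image — the workload an FPGA accelerator must sustain for real-time inference.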
LDR    03215ntm a2200349Ki 4500
001    917076
005    20181005115847.5
006    m o u
007    cr mn||||a|a||
008    190606s2017 xx obm 000 0 eng d
020    $a 9780355764413
035    $a (MiAaPQ)AAI10682711
035    $a (MiAaPQ)ucdavis:17550
035    $a AAI10682711
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Kumawat, Sachin. $3 1191002
245 13 $a An OpenCL Framework for Real-time Inference of Next-Generation Convolutional Neural Networks on FPGAs.
264  0 $c 2017
300    $a 1 online resource (93 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Masters Abstracts International, Volume: 57-05.
500    $a Adviser: Soheil Ghiasi.
502    $a Thesis (M.S.)--University of California, Davis, 2017.
504    $a Includes bibliographical references.
520    $a Modern Convolutional Neural Networks (CNNs) require billions of multiplications and additions, which calls for parallel computing units such as GPUs, FPGAs, and other DSP processors. Consequently, General-Purpose GPU (GPGPU) computing has taken this field by storm. At the same time, there has been increasing interest in FPGA-based acceleration of CNN inference.
520    $a In this work, we present FICaffe, a framework for FPGA-based Inference with Caffe, which provides fully automated generation and mapping of CNN accelerators on FPGAs. We target applications with critical latency requirements and design high-efficiency accelerators for CNN processing. The architecture is structured as a highly concurrent OpenCL library, which enables High-Level Synthesis tools to effectively exploit data, task, and pipeline parallelism. We propose a unified memory model that drives exploration of the optimal design by matching the on-chip and off-chip memory bandwidths available on FPGA platforms. We also identify the origins of all clock-cycle stalls and overheads inherent to CNN acceleration designs and provide a detailed model that predicts runtime latency to within 4% of on-board tests. Furthermore, FICaffe supports cross-network synthesis, so that it can process a variety of CNNs with reasonable efficiency and without hours of recompilation. FICaffe is integrated with the popular deep learning framework Caffe and is deployable to a wide variety of CNNs. FICaffe's efficacy is shown by mapping to a 28 nm Stratix V GXA7 chip, and both network-specific and cross-network performance are reported for AlexNet, VGG, SqueezeNet, and GoogLeNet. We show a processing efficiency of 95.8% for the widely reported VGG benchmark, which outperforms prior work. To the best of our knowledge, FICaffe also achieves more than a 2x speedup on the Stratix V GXA7 over the best previously published results on this chip.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538    $a Mode of access: World Wide Web
650  4 $a Electrical engineering. $3 596380
650  4 $a Computer engineering. $3 569006
655  7 $a Electronic books. $2 local $3 554714
690    $a 0544
690    $a 0464
710 2  $a ProQuest Information and Learning Co. $3 1178819
710 2  $a University of California, Davis. $b Electrical and Computer Engineering. $3 1178925
773 0  $t Masters Abstracts International $g 57-05(E).
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10682711 $z click for full text (PQDT)
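The abstract describes a unified memory model that matches on-chip and off-chip bandwidths, and a latency model accurate to within 4% of on-board tests. The thesis's actual model additionally accounts for clock-cycle stalls and overheads; purely as an illustration of the bandwidth-matching idea, here is a roofline-style lower bound with hypothetical platform numbers (none taken from the thesis):

```python
def layer_latency_s(macs, bytes_moved, peak_macs_per_s, dram_bytes_per_s):
    """Roofline-style lower bound on one layer's latency: the layer is
    limited by compute throughput or off-chip bandwidth, whichever is slower."""
    compute_s = macs / peak_macs_per_s
    memory_s = bytes_moved / dram_bytes_per_s
    return max(compute_s, memory_s)

# Hypothetical accelerator: 1 TMAC/s of DSP throughput, 12 GB/s DRAM bandwidth.
# VGG-16 conv1_2 (~1.85e9 MACs, ~7 MB of weights and feature maps moved)
# comes out compute-bound under these assumptions.
t = layer_latency_s(1.85e9, 7e6, 1e12, 12e9)   # ~1.85 ms
```

A design is balanced when no layer sits far below the roofline: if `memory_s` dominates, on-chip buffering and data reuse must increase; if `compute_s` dominates, adding parallel compute units pays off — which is the exploration the abstract's unified memory model drives.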