National Formosa University (國立虎尾科技大學)

FPGA Accelerator Architecture for Q-learning and its Applications in Space Exploration.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title/Author:	FPGA Accelerator Architecture for Q-learning and its Applications in Space Exploration./
Author:	Gankidi, Pranay Reddy.
Physical description:	1 online resource (85 pages)
Notes:	Source: Masters Abstracts International, Volume: 56-02.
Subject:	Computer engineering.
Electronic resource:	click for full text (PQDT)
ISBN:	9781369406658

Thesis (M.S.)--Arizona State University, 2016.

Includes bibliographical references

Achieving human-level intelligence is a long-term goal for many Artificial Intelligence (AI) researchers. Recent developments in combining deep learning and reinforcement learning have helped us move a step closer to this goal. Reinforcement learning using a delayed reward mechanism is an approach to machine intelligence that studies decision making with control and how a decision-making agent can learn to act optimally in environment-unaware conditions.

Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018

Mode of access: World Wide Web

Subjects--Topical Terms:
569006 Computer engineering.

Index Terms--Genre/Form:
554714 Electronic books.

LDR:03477ntm a2200361K 4500
001 915268
005 20180727125211.5
006 m o u
007 cr mn||||a|a||
008 190606s2016 xx obm 000 0 eng d
020 $a 9781369406658
035 $a (MiAaPQ)AAI10245815
035 $a (MiAaPQ)asu:16618
035 $a AAI10245815
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Gankidi, Pranay Reddy. $3 1188575
245 1 0 $a FPGA Accelerator Architecture for Q-learning and its Applications in Space Exploration.
264 0 $c 2016
300 $a 1 online resource (85 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 56-02.
500 $a Adviser: Jekanthan Thangavelautham.
502 $a Thesis (M.S.)--Arizona State University, 2016.
504 $a Includes bibliographical references
520 $a Achieving human-level intelligence is a long-term goal for many Artificial Intelligence (AI) researchers. Recent developments in combining deep learning and reinforcement learning have helped us move a step closer to this goal. Reinforcement learning using a delayed reward mechanism is an approach to machine intelligence that studies decision making with control and how a decision-making agent can learn to act optimally in environment-unaware conditions.
520 $a Q-learning is a model-free reinforcement learning strategy that uses temporal differences to estimate the performance of state-action pairs, called Q values. A simple implementation of the Q-learning algorithm uses a Q table in memory to store and update the Q values. However, as the state space grows with the complexity of the environment, and as the number of possible actions an agent can perform increases, the Q table reaches its space limit and becomes difficult to scale. Q-learning with neural networks eliminates the Q table by approximating the Q function with neural networks.
520 $a Autonomous agents need to develop cognitive properties and become self-adaptive to be deployable in any environment. Reinforcement learning with Q-learning has been very efficient in solving such problems. However, embedded systems such as space rovers and autonomous robots rarely implement these techniques because of constraints such as processing power, chip area, convergence rate, and chip cost. These problems create a need for a portable, low-power, area-efficient hardware accelerator to speed up such learning.
520 $a This problem is addressed by implementing a hardware architecture for Q-learning using artificial neural networks. The architecture combines the massive parallelism of neural networks with the dedicated fine-grained parallelism of a Field Programmable Gate Array (FPGA), thereby processing the Q values at high throughput. Mars exploration rovers currently use Xilinx space-grade FPGA devices for image processing, pyrotechnic operation control, and obstacle avoidance. The hardware resource consumption of the architecture was obtained by synthesis with a Xilinx Virtex-7 FPGA as the target device.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer engineering. $3 569006
650 4 $a Artificial intelligence. $3 559380
655 7 $a Electronic books. $2 local $3 554714
690 $a 0464
690 $a 0800
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Arizona State University. $b Engineering. $3 1178943
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10245815 $z click for full text (PQDT)