Comparing Human and Machine Visual Perception.
Record type: Bibliographic - language material, manuscript : Monograph/item
Title/Author: Comparing Human and Machine Visual Perception. / Veerabadran, Vijay.
Author: Veerabadran, Vijay.
Physical description: 1 online resource (125 pages)
Notes: Source: Dissertations Abstracts International, Volume: 85-10, Section: B.
Contained by: Dissertations Abstracts International, 85-10B.
Subject: Neurosciences.
Electronic resource: click for full text (PQDT)
ISBN: 9798382223575
Thesis (Ph.D.)--University of California, San Diego, 2024.
Includes bibliographical references
In this dissertation, we focus on examining differences in perception between humans and computer vision models and contribute novel research methods to increase their alignment. In recent studies comparing how humans and the deep neural networks used in computer vision perceive visual stimuli, we find extensive evidence that these highly performant models' visual perception often aligns poorly with human perception. For example, these models have been shown to classify objects in a scene based solely on a small fraction of border pixels in an image (Carter et al., 2021), to preferentially attend to information outside the human frequency sensitivity spectrum (Subramanian et al., 2023), and to (in)famously classify images by local texture rather than by global form (Geirhos et al., 2019b). These deviations of machine vision often stem from an overreliance on short-range features, and our first set of contributions directly addresses this by adding lateral connections (critical for long-range spatial feature processing in biological vision) to deep neural networks. First, in Chapters 2 and 3, we introduce the bio-inspired DivNormEI and V1Net models, which implement feedforward and recurrent lateral connections, respectively, in deep neural networks (DNNs). We show that these models develop bio-realistic orientation tuning and directly lead to robust object recognition/segmentation. We also show that recurrent lateral connections give rise to parameter-efficient contour integration (a task well known to test long-range feature integration capacity). In Chapter 4, we introduce LocRNN, a high-performing recurrent circuit evolved from V1Net, and propose combining it with Adaptive Computation Time (ACT) to learn a dynamic, instance-conditional number of RNN timesteps. ACT enables LocRNN to generalize in a zero-shot manner to novel test-time difficulty levels of challenging visual path integration tasks.
These chapters together highlight the effectiveness of our proposed bio-inspired design in creating human-like robustness to out-of-distribution settings. Complementary to bio-inspired design, we also propose a new way to compare human and machine perception; advancing this area helps us better identify factors of deviation between these systems and guides us in building future neural networks with stronger alignment. In an elaborate psychophysics study described in Chapter 5, we explored how humans and deep neural networks alike can be tricked by barely noticeable adversarial changes to images. We discuss the degree of alignment between the two visual systems and identify factors that influence this alignment. The actionable predictions we discuss in this chapter inspire the design of future neural network models with the goal of strengthening their alignment to human perception. We conclude this dissertation by laying out important future directions for extending the research described here to build the next generation of computer vision models, increasingly aligned with human vision.
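The "dynamic instance-conditional number of RNN timesteps" learned via Adaptive Computation Time can be illustrated with a minimal sketch. Everything below (the toy recurrent cell, the function name `act_recurrence`, and the random weights) is an illustrative assumption for exposition, not code from the dissertation: a halting probability is emitted at each step, and iteration stops once the accumulated probability reaches 1 - eps, with the remaining mass assigned to the final state.

```python
import numpy as np

rng = np.random.default_rng(0)

def act_recurrence(x, max_steps=20, eps=0.01):
    """Run a toy RNN cell until the accumulated halting probability
    reaches 1 - eps (ACT-style halting). Returns the halting-weighted
    state and the number of steps actually taken."""
    W = rng.normal(scale=0.5, size=(8, 8))   # toy recurrent weights (assumed)
    w_h = rng.normal(scale=0.5, size=8)      # toy halting-unit weights (assumed)
    h = np.tanh(x)
    cum_p = 0.0
    weighted = np.zeros_like(h)
    for t in range(1, max_steps + 1):
        h = np.tanh(W @ h + x)               # one recurrent update
        p = 1.0 / (1.0 + np.exp(-(w_h @ h))) # halting probability for this step
        if cum_p + p >= 1.0 - eps or t == max_steps:
            weighted += (1.0 - cum_p) * h    # remainder mass goes to last state
            return weighted, t
        weighted += p * h
        cum_p += p
```

Because the step count depends on the input (harder instances keep the halting mass low for longer), a network trained with this mechanism can, in principle, spend more iterations on harder test-time instances than it ever saw during training, which is the zero-shot difficulty generalization the abstract describes.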
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2024.
Mode of access: World Wide Web
ISBN: 9798382223575
Subjects--Topical Terms: Neurosciences.
Subjects--Index Terms: Adversarial machine learning
Index Terms--Genre/Form: Electronic books.
LDR    04448ntm a22004217 4500
001    1146254
005    20240812064353.5
006    m o d
007    cr bn ---uuuuu
008    250605s2024 xx obm 000 0 eng d
020    $a 9798382223575
035    $a (MiAaPQ)AAI30992102
035    $a AAI30992102
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Veerabadran, Vijay. $3 1471613
245 10 $a Comparing Human and Machine Visual Perception.
264  0 $c 2024
300    $a 1 online resource (125 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertations Abstracts International, Volume: 85-10, Section: B.
500    $a Advisor: de Sa, Virginia R.
502    $a Thesis (Ph.D.)--University of California, San Diego, 2024.
504    $a Includes bibliographical references
520    $a
In this dissertation, we focus on examining differences in perception between humans and computer vision models and contribute novel research methods to increase their alignment. In recent studies comparing how humans and the deep neural networks used in computer vision perceive visual stimuli, we find extensive evidence that these highly performant models' visual perception often aligns poorly with human perception. For example, these models have been shown to classify objects in a scene based solely on a small fraction of border pixels in an image (Carter et al., 2021), to preferentially attend to information outside the human frequency sensitivity spectrum (Subramanian et al., 2023), and to (in)famously classify images by local texture rather than by global form (Geirhos et al., 2019b). These deviations of machine vision often stem from an overreliance on short-range features, and our first set of contributions directly addresses this by adding lateral connections (critical for long-range spatial feature processing in biological vision) to deep neural networks. First, in Chapters 2 and 3, we introduce the bio-inspired DivNormEI and V1Net models, which implement feedforward and recurrent lateral connections, respectively, in deep neural networks (DNNs). We show that these models develop bio-realistic orientation tuning and directly lead to robust object recognition/segmentation. We also show that recurrent lateral connections give rise to parameter-efficient contour integration (a task well known to test long-range feature integration capacity). In Chapter 4, we introduce LocRNN, a high-performing recurrent circuit evolved from V1Net, and propose combining it with Adaptive Computation Time (ACT) to learn a dynamic, instance-conditional number of RNN timesteps. ACT enables LocRNN to generalize in a zero-shot manner to novel test-time difficulty levels of challenging visual path integration tasks.
These chapters together highlight the effectiveness of our proposed bio-inspired design in creating human-like robustness to out-of-distribution settings. Complementary to bio-inspired design, we also propose a new way to compare human and machine perception; advancing this area helps us better identify factors of deviation between these systems and guides us in building future neural networks with stronger alignment. In an elaborate psychophysics study described in Chapter 5, we explored how humans and deep neural networks alike can be tricked by barely noticeable adversarial changes to images. We discuss the degree of alignment between the two visual systems and identify factors that influence this alignment. The actionable predictions we discuss in this chapter inspire the design of future neural network models with the goal of strengthening their alignment to human perception. We conclude this dissertation by laying out important future directions for extending the research described here to build the next generation of computer vision models, increasingly aligned with human vision.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538    $a Mode of access: World Wide Web
650  4 $a Neurosciences. $3 593561
650  4 $a Computer science. $3 573171
650  4 $a Statistics. $3 556824
650  4 $a Experimental psychology. $3 1180476
653    $a Adversarial machine learning
653    $a Recurrent Neural Networks
653    $a Computer vision
653    $a Deep learning
653    $a Object recognition
655  7 $a Electronic books. $2 local $3 554714
690    $a 0800
690    $a 0317
690    $a 0984
690    $a 0623
690    $a 0463
710 2  $a ProQuest Information and Learning Co. $3 1178819
710 2  $a University of California, San Diego. $b Cognitive Science. $3 1186722
773 0  $t Dissertations Abstracts International $g 85-10B.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30992102 $z click for full text (PQDT)