Data Subset Selection and Its Applications in Computer Vision.
Record Type:
Bibliographic - Language material, manuscript : Monograph/item
Title/Author:
Data Subset Selection and Its Applications in Computer Vision.
Author:
Banerjee, Subhankar.
Description:
1 online resource (106 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
Contained By:
Dissertations Abstracts International, 85-12B.
Subject:
Computer engineering.
Electronic Resources:
click for full text (PQDT)
ISBN:
9798382779911
Data Subset Selection and Its Applications in Computer Vision.
Banerjee, Subhankar.
Data Subset Selection and Its Applications in Computer Vision.
- 1 online resource (106 pages)
Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
Thesis (Ph.D.)--The Florida State University, 2024.
Includes bibliographical references
Recent advances in deep learning have dramatically improved the performance of machine learning models in a variety of applications, including computer vision, text mining, speech processing, and fraud detection. Deep learning algorithms automatically learn a set of informative features from a given dataset and have demonstrated commendable performance on a variety of computer vision applications. However, efficient training of deep architectures with a large number of hidden layers depends heavily on high-end GPUs and distributed computing infrastructure, and training a deep neural network involves significant computational overhead. Some applications (such as those running on mobile platforms) are severely limited in memory and computational resources and face a fundamental challenge in handling large-scale training data. Cloud services can be leveraged for training, but raise concerns of data privacy and cost. Such applications necessitate a subset selection algorithm that identifies the informative samples in a large pool of training data, so that the deep network can be trained using only the selected subset. In the second chapter we describe a novel subset selection algorithm, DeepSub, to address this practical challenge. Our framework is computationally efficient, easy to implement, and enjoys appealing theoretical properties. Extensive empirical studies on three challenging computer vision applications (face, handwritten digit, and object recognition), using three popular deep learning architectures (AlexNet, GoogLeNet, and ResNet), corroborate the potential of DeepSub over competing baselines. We also propose two novel frameworks to address this important and practical challenge in a slightly different configuration.
We pose budgeted subset selection as an NP-hard integer quadratic programming (IQP) problem and derive two convex relaxations to solve it. Extensive empirical studies on two challenging vision datasets, using four common deep learning models, corroborate the potential of our framework for real-world, resource-constrained applications. Mini-batch gradient descent is the standard algorithm for training deep models: mini-batches of a fixed size are sampled randomly from the training data and passed through the network sequentially. In the third chapter, we present a novel algorithm that generates a deterministic (rather than random) sequence of mini-batches to train a deep neural network. Our rationale is to select each mini-batch by minimizing the Maximum Mean Discrepancy (MMD) between the already selected mini-batches and the unselected training samples. We pose mini-batch selection as a constrained optimization problem and derive a linear programming relaxation to determine the sequence of mini-batches. To the best of our knowledge, this is the first research effort to use the MMD criterion to determine a sequence of mini-batches for training a deep neural network. The proposed mini-batch sequencing strategy is deterministic and independent of the underlying network architecture and prediction task. Extensive empirical analyses on three challenging datasets corroborate the merit of our framework over competing baselines; we further study its performance on two applications beyond classification (regression and semantic segmentation) to validate its generalizability. In the fourth chapter we discuss a subset selection framework that determines the best test subset when the training domain may or may not be the same as the test domain. We exploit the geometry of the unlabeled data to identify a batch of representative samples that can reconstruct the target data with minimal error.
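The MMD-based mini-batch criterion described above can be illustrated with a minimal NumPy sketch. This is only an illustration of the selection criterion, not the dissertation's method: the thesis derives a linear programming relaxation, whereas this sketch uses a naive greedy search, and the RBF kernel and `gamma` value are assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - y_j||^2)
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    # Squared Maximum Mean Discrepancy between sample sets X and Y
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

def greedy_minibatch(X, batch_size, gamma=1.0):
    # Greedily pick indices whose inclusion minimizes the MMD between
    # the selected set and the remaining (unselected) samples.
    selected, remaining = [], list(range(len(X)))
    for _ in range(batch_size):
        best, best_val = None, np.inf
        for i in remaining:
            cand = selected + [i]
            rest = [j for j in remaining if j != i]
            val = mmd2(X[cand], X[rest], gamma)
            if val < best_val:
                best, best_val = i, val
        selected.append(best)
        remaining.remove(best)
    return selected
```

Minimizing this quantity pushes the selected mini-batch to be distributionally representative of the data that has not yet been consumed, which is the intuition behind the deterministic sequencing strategy.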
We pose the sample selection as an NP-hard optimization problem and solve it efficiently using an iterative algorithm with global convergence. Extensive empirical results on three benchmark image datasets, with several pre-trained deep neural networks, corroborate the promise of our method for real-world applications.
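The fourth-chapter idea of choosing representative samples that reconstruct the target data with minimal error can be sketched as a greedy least-squares procedure. This is a toy illustration under assumed details: the dissertation solves an NP-hard optimization with an iterative, globally convergent algorithm, while the greedy strategy and Frobenius-norm error here are simplifications.

```python
import numpy as np

def select_representatives(X, k):
    # Greedy sketch: pick k rows of X (samples) whose span best
    # reconstructs the whole data matrix in the least-squares sense.
    n = X.shape[0]
    selected = []
    for _ in range(k):
        best, best_err = None, np.inf
        for i in range(n):
            if i in selected:
                continue
            B = X[selected + [i]]  # candidate representative set (rows)
            # Least-squares coefficients expressing every sample in span(B)
            W = np.linalg.lstsq(B.T, X.T, rcond=None)[0]
            err = np.linalg.norm(X - (B.T @ W).T)  # reconstruction error
            if err < best_err:
                best, best_err = i, err
        selected.append(best)
    return selected
```

When the data actually lie near a low-dimensional subspace, a few well-chosen representatives suffice to reconstruct the rest with small error, which is the geometric intuition the framework exploits.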
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2024
Mode of access: World Wide Web
ISBN: 9798382779911
Subjects--Topical Terms:
Computer engineering.
Subjects--Index Terms:
Computer vision
Index Terms--Genre/Form:
Electronic books.
Data Subset Selection and Its Applications in Computer Vision.
LDR
:05560ntm a22004097 4500
001
1149715
005
20241022112621.5
006
m o d
007
cr bn ---uuuuu
008
250605s2024 xx obm 000 0 eng d
020
$a
9798382779911
035
$a
(MiAaPQ)AAI31146406
035
$a
AAI31146406
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
$d
NTU
100
1
$a
Banerjee, Subhankar.
$3
1476048
245
1 0
$a
Data Subset Selection and Its Applications in Computer Vision.
264
0
$c
2024
300
$a
1 online resource (106 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
500
$a
Advisor: Chakraborty, Shayok.
502
$a
Thesis (Ph.D.)--The Florida State University, 2024.
504
$a
Includes bibliographical references
520
$a
Recent advances in deep learning have dramatically improved the performance of machine learning models in a variety of applications, including computer vision, text mining, speech processing, and fraud detection. Deep learning algorithms automatically learn a set of informative features from a given dataset and have demonstrated commendable performance on a variety of computer vision applications. However, efficient training of deep architectures with a large number of hidden layers depends heavily on high-end GPUs and distributed computing infrastructure, and training a deep neural network involves significant computational overhead. Some applications (such as those running on mobile platforms) are severely limited in memory and computational resources and face a fundamental challenge in handling large-scale training data. Cloud services can be leveraged for training, but raise concerns of data privacy and cost. Such applications necessitate a subset selection algorithm that identifies the informative samples in a large pool of training data, so that the deep network can be trained using only the selected subset. In the second chapter we describe a novel subset selection algorithm, DeepSub, to address this practical challenge. Our framework is computationally efficient, easy to implement, and enjoys appealing theoretical properties. Extensive empirical studies on three challenging computer vision applications (face, handwritten digit, and object recognition), using three popular deep learning architectures (AlexNet, GoogLeNet, and ResNet), corroborate the potential of DeepSub over competing baselines. We also propose two novel frameworks to address this important and practical challenge in a slightly different configuration.
We pose budgeted subset selection as an NP-hard integer quadratic programming (IQP) problem and derive two convex relaxations to solve it. Extensive empirical studies on two challenging vision datasets, using four common deep learning models, corroborate the potential of our framework for real-world, resource-constrained applications. Mini-batch gradient descent is the standard algorithm for training deep models: mini-batches of a fixed size are sampled randomly from the training data and passed through the network sequentially. In the third chapter, we present a novel algorithm that generates a deterministic (rather than random) sequence of mini-batches to train a deep neural network. Our rationale is to select each mini-batch by minimizing the Maximum Mean Discrepancy (MMD) between the already selected mini-batches and the unselected training samples. We pose mini-batch selection as a constrained optimization problem and derive a linear programming relaxation to determine the sequence of mini-batches. To the best of our knowledge, this is the first research effort to use the MMD criterion to determine a sequence of mini-batches for training a deep neural network. The proposed mini-batch sequencing strategy is deterministic and independent of the underlying network architecture and prediction task. Extensive empirical analyses on three challenging datasets corroborate the merit of our framework over competing baselines; we further study its performance on two applications beyond classification (regression and semantic segmentation) to validate its generalizability. In the fourth chapter we discuss a subset selection framework that determines the best test subset when the training domain may or may not be the same as the test domain. We exploit the geometry of the unlabeled data to identify a batch of representative samples that can reconstruct the target data with minimal error.
We pose the sample selection as an NP-hard optimization problem and solve it efficiently using an iterative algorithm with global convergence. Extensive empirical results on three benchmark image datasets, with several pre-trained deep neural networks, corroborate the promise of our method for real-world applications.
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2024
538
$a
Mode of access: World Wide Web
650
4
$a
Computer engineering.
$3
569006
650
4
$a
Computer science.
$3
573171
653
$a
Computer vision
653
$a
Deep learning
653
$a
Mini-batch selection
653
$a
Model optimization
653
$a
Model ranking
653
$a
Subset selection
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0984
690
$a
0800
690
$a
0464
710
2
$a
The Florida State University.
$b
Computer Science.
$3
1473002
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
773
0
$t
Dissertations Abstracts International
$g
85-12B.
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31146406
$z
click for full text (PQDT)