National Formosa University (國立虎尾科技大學)

Topics in Statistical Inference for Massive Data and High-Dimensional Data.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:	Topics in Statistical Inference for Massive Data and High-Dimensional Data./
Author:	Peng, Liuhua.
Extent:	1 online resource (152 pages)
Note:	Source: Dissertation Abstracts International, Volume: 79-04(E), Section: B.
Contained By:	Dissertation Abstracts International 79-04B(E).
Subject:	Statistics.
Electronic resource:	click for full text (PQDT)
ISBN:	9780355334302

Thesis (Ph.D.)--Iowa State University, 2017.

Includes bibliographical references

This dissertation consists of three research papers that address three different problems in statistics concerning high-volume datasets. The first paper studies distributed statistical inference for massive data. As data size grows, computational complexity and feasibility must be taken into account in statistical analyses. We investigate the statistical efficiency of the distributed version of a general class of statistics. Distributed bootstrap algorithms are proposed to approximate the distribution of the distributed statistics. These approaches relieve the computational burden of conventional methods while preserving adequate statistical efficiency. The second paper deals with testing the identity and sphericity hypotheses for high-dimensional covariance matrices, with a focus on improving the power of existing methods. By exploiting sparsity in the underlying covariance matrices, the power improvement is achieved by using the banding estimator for the covariance matrices, which leads to a significant reduction in the variance of the test statistics. The last paper considers variable selection for high-dimensional data. Distance-based variable importance measures are proposed to rank and select variables while taking dependence structures into consideration. The importance measures are inspired by the multi-response permutation procedure (MRPP) and the energy distance. A backward selection algorithm is developed to discover important variables and to improve the power of the original MRPP for high-dimensional data.
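
As a rough illustration of the banding idea mentioned in the abstract, the sketch below zeroes out every entry of a sample covariance matrix that lies more than k positions from the main diagonal. The function name `band_covariance`, the bandwidth `k`, and the simulated data are assumptions made for illustration only; none of them come from the dissertation, and its test statistics are not reproduced here.

```python
import numpy as np


def band_covariance(sample_cov: np.ndarray, k: int) -> np.ndarray:
    """Return a banded copy of sample_cov.

    Entries (i, j) with |i - j| <= k are kept; all others are set to 0.
    The name and signature are hypothetical, chosen for this sketch.
    """
    p = sample_cov.shape[0]
    idx = np.arange(p)
    # Boolean mask that is True inside the band around the diagonal.
    mask = np.abs(idx[:, None] - idx[None, :]) <= k
    return np.where(mask, sample_cov, 0.0)


# Hypothetical data: 50 observations of 6 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))

S = np.cov(X, rowvar=False)          # 6 x 6 sample covariance matrix
S_banded = band_covariance(S, k=1)   # keep the diagonal and first off-diagonals
```

With k = 0 only the variances on the diagonal survive, and with k = p - 1 the sample covariance is returned unchanged. How the bandwidth would be chosen in practice is not stated in the abstract.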

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018

Mode of access: World Wide Web

ISBN: 9780355334302

Subjects--Topical Terms:
556824 Statistics.

Index Terms--Genre/Form:
554714 Electronic books.

LDR:02837ntm a2200325Ki 4500
001 920605
005 20181203094030.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355334302
035 $a (MiAaPQ)AAI10254829
035 $a (MiAaPQ)iastate:16261
035 $a AAI10254829
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Peng, Liuhua. $3 1195456
245 1 0 $a Topics in Statistical Inference for Massive Data and High-Dimensional Data.
264 0 $c 2017
300 $a 1 online resource (152 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 79-04(E), Section: B.
500 $a Advisers: Song Xi Chen; Dan Nettleton.
502 $a Thesis (Ph.D.)--Iowa State University, 2017.
504 $a Includes bibliographical references
520 $a This dissertation consists of three research papers that address three different problems in statistics concerning high-volume datasets. The first paper studies distributed statistical inference for massive data. As data size grows, computational complexity and feasibility must be taken into account in statistical analyses. We investigate the statistical efficiency of the distributed version of a general class of statistics. Distributed bootstrap algorithms are proposed to approximate the distribution of the distributed statistics. These approaches relieve the computational burden of conventional methods while preserving adequate statistical efficiency. The second paper deals with testing the identity and sphericity hypotheses for high-dimensional covariance matrices, with a focus on improving the power of existing methods. By exploiting sparsity in the underlying covariance matrices, the power improvement is achieved by using the banding estimator for the covariance matrices, which leads to a significant reduction in the variance of the test statistics. The last paper considers variable selection for high-dimensional data. Distance-based variable importance measures are proposed to rank and select variables while taking dependence structures into consideration. The importance measures are inspired by the multi-response permutation procedure (MRPP) and the energy distance. A backward selection algorithm is developed to discover important variables and to improve the power of the original MRPP for high-dimensional data.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Statistics. $3 556824
655 7 $a Electronic books. $2 local $3 554714
690 $a 0463
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Iowa State University. $b Statistics. $3 1182662
773 0 $t Dissertation Abstracts International $g 79-04B(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10254829 $z click for full text (PQDT)