Statistical Inference for Big Data.
ProQuest Information and Learning Co.
Record type:
Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:
Statistical Inference for Big Data./
Author:
Zhao, Tianqi.
Extent:
1 online resource (381 pages)
Notes:
Source: Dissertation Abstracts International, Volume: 78-11(E), Section: B.
Contained By:
Dissertation Abstracts International 78-11B(E).
Subject:
Statistics.
Electronic resource:
click for full text (PQDT)
ISBN:
9780355041750
Zhao, Tianqi.
Statistical Inference for Big Data.
- 1 online resource (381 pages)
Source: Dissertation Abstracts International, Volume: 78-11(E), Section: B.
Thesis (Ph.D.)
Includes bibliographical references
This dissertation develops novel inferential methods and theory for assessing uncertainty of modern statistical procedures unique to big data analysis. In particular, we mainly focus on four challenging aspects of big data: massive sample size, high dimensionality, heterogeneity and complexity. To begin with, we consider a partially linear framework for modeling massive heterogeneous data. The major goal is to extract common features across all sub-populations while exploring heterogeneity of each sub-population. In particular, we propose an aggregation type estimator for the commonality parameter that possesses the (non-asymptotic) minimax optimal bound and asymptotic distribution as if there were no heterogeneity. This oracle result holds when the number of sub-populations does not grow too fast.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018
Mode of access: World Wide Web
ISBN: 9780355041750
Subjects--Topical Terms:
556824
Statistics.
Index Terms--Genre/Form:
554714
Electronic books.
LDR
:04503ntm a2200373Ki 4500
001
911240
005
20180529081900.5
006
m o u
007
cr mn||||a|a||
008
190606s2017 xx obm 000 0 eng d
020
$a
9780355041750
035
$a
(MiAaPQ)AAI10284976
035
$a
(MiAaPQ)princeton:12188
035
$a
AAI10284976
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
099
$a
TUL
$f
hyy
$c
available through World Wide Web
100
1
$a
Zhao, Tianqi.
$3
1182939
245
1 0
$a
Statistical Inference for Big Data.
264
0
$c
2017
300
$a
1 online resource (381 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertation Abstracts International, Volume: 78-11(E), Section: B.
500
$a
Adviser: Han Liu.
502
$a
Thesis (Ph.D.)
$c
Princeton University
$d
2017.
504
$a
Includes bibliographical references
520
$a
This dissertation develops novel inferential methods and theory for assessing uncertainty of modern statistical procedures unique to big data analysis. In particular, we mainly focus on four challenging aspects of big data: massive sample size, high dimensionality, heterogeneity and complexity. To begin with, we consider a partially linear framework for modeling massive heterogeneous data. The major goal is to extract common features across all sub-populations while exploring heterogeneity of each sub-population. In particular, we propose an aggregation type estimator for the commonality parameter that possesses the (non-asymptotic) minimax optimal bound and asymptotic distribution as if there were no heterogeneity. This oracle result holds when the number of sub-populations does not grow too fast.
520
$a
The next problem concerns the challenge of high dimensionality. We propose a robust inferential procedure for assessing uncertainties of parameter estimation in high dimensional linear models, where the dimension p can grow exponentially fast with the sample size n. We develop a new de-biasing framework tailored for nonsmooth loss functions. Our framework enables us to exploit the composite quantile function to construct a de-biased CQR estimator. This estimator is robust, and preserves efficiency in the sense that the worst case efficiency loss is less than 30% compared to square-loss-based procedures. In many cases our estimator is close to or better than the latter.
520
$a
Next, we consider the problem of high dimensional semiparametric generalized linear models. We propose a new inferential framework which addresses a variety of challenging problems in high dimensional data analysis, including incomplete data, selection bias, and heterogeneity. First, we develop a regularized statistical chromatography approach to infer the parameter of interest under the proposed semiparametric generalized linear model without needing to estimate the unknown base measure function. Then we propose a new likelihood-ratio-based framework to construct post-regularization confidence regions and tests for the low dimensional components of high dimensional parameters. We demonstrate the consequences of the general theory using examples of missing data and multiple-dataset inference.
520
$a
Lastly, we study the rank likelihood as a powerful inferential tool in multivariate analysis. The computation of the full rank likelihood function is often intractable in large-scale datasets. Motivated by this, we resort to lower-order rank approximations and propose a new family of local rank likelihood functions. In particular, we show that the maximizer of the second-order local rank likelihood coincides with Kendall's tau correlation matrix for the transelliptical distribution family. Motivated by this new interpretation of Kendall's tau, we then investigate the third-order local rank likelihood, whose maximizer defines a new estimator that can be viewed as the third-order counterpart of Kendall's tau correlation matrix. We establish its asymptotic normality and calculate its limiting variance under the Gaussian copula model, which enables the construction of confidence intervals based on this new estimator.
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2018
538
$a
Mode of access: World Wide Web
650
4
$a
Statistics.
$3
556824
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0463
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
710
2
$a
Princeton University.
$b
Operations Research and Financial Engineering.
$3
1182940
773
0
$t
Dissertation Abstracts International
$g
78-11B(E).
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10284976
$z
click for full text (PQDT)