National Formosa University

Latent Data Modeling With Biostatistical Applications.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title/Author:	Latent Data Modeling With Biostatistical Applications./
Author:	Liu, Yan.
Physical description:	1 online resource (154 pages)
Notes:	Source: Dissertation Abstracts International, Volume: 79-04(E), Section: B.
Contained By:	Dissertation Abstracts International 79-04B(E).
Subject:	Statistics.
Electronic resource:	click for full text (PQDT)
ISBN:	9780355343748

Thesis (Ph.D.)

Includes bibliographical references

This dissertation consists of three projects that make use of latent variable modeling techniques. One focus of this dissertation research has been spatial and spatio-temporal modeling. The specific topics and motivating problems in this study have been fully supported and motivated by the Companion Animal Parasite Council (CAPC). In particular, the CAPC has developed a rather extensive database, which houses several common dog disease data sets collected throughout the conterminous United States. This data exists at the county level, was collected monthly over a span of 5 consecutive years, and exhibits strong spatial and temporal correlation structures. Further, due to non-reporting counties, a significant portion of the data is missing in both the spatial and temporal domains. The goal of our work in this area was to identify risk factors significantly related to the prevalence of the various diseases and to develop models that could be used to accurately forecast future disease trends nationwide. No similar work has been completed for these diseases on the spatio-temporal scale that we consider. To accomplish this task, we developed and implemented a Bayesian spatio-temporal regression model to analyze the data. Due to the relatively large spatial scale and complex structure of the data, a key challenge was developing computationally efficient algorithms that could be used to implement Markov chain Monte Carlo (MCMC) techniques. Once this was completed, we implemented our models to assess the relevance of the considered covariates and to forecast future trends.
In addition to the spatial and spatio-temporal modeling problems, this dissertation research also focuses on developing new modeling techniques for data collected on pooled specimens. The concept of using pooling as a more cost-effective data collection technique is becoming pervasive in the biological sciences and elsewhere. In particular, pooled data is collected by first amalgamating several specimens (e.g., blood or urine) collected from individuals into a pooled sample; this pooled sample is then measured for a characteristic of interest. For example, in infectious disease studies the pooled outcome is typically binary, indicating disease status, whereas in biological marker (i.e., biomarker) evaluation studies the outcome is continuous. In either case, information on several individuals is obtained at the expense of making only one measurement, thus reducing the cost of data collection. However, the statistical analysis of measurements (either binary or continuous) taken on pools is often fraught with challenges. In my dissertation research, I have developed regression methods for both continuous and binary outcomes measured on pools. For continuous outcomes, I proposed a general regression framework that can be used to analyze pooled outcomes under practically all parametric models. This was accomplished through the use of an advanced Monte Carlo sampling algorithm, which was implemented to approximate the observed data likelihood. Proceeding in this fashion also allows us to account for measurement error, which has not been accounted for previously, and led to the development of computationally efficient software that can be used to implement the proposed approach. For binary outcomes (usually referred to as group testing data), I developed a novel Bayesian generalized additive model. Specifically, the proposed approach assumes the linear predictor depends on several unknown smooth functions of some covariates as well as linear combinations of other covariates. In addition, our model can account for imperfect testing and can be used to analyze data collected according to any group testing process.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018

Mode of access: World Wide Web

ISBN: 9780355343748

Subjects--Topical Terms:
556824 Statistics.

Index Terms--Genre/Form:
554714 Electronic books.

Latent Data Modeling With Biostatistical Applications.
LDR:05078ntm a2200361Ki 4500
001 911264
005 20180529081901.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355343748
035 $a (MiAaPQ)AAI10601721
035 $a (MiAaPQ)clemson:14431
035 $a AAI10601721
040 $a MiAaPQ $b eng $c MiAaPQ
099 $a TUL $f hyy $c available through World Wide Web
100 1 $a Liu, Yan. $3 1060385
245 1 0 $a Latent Data Modeling With Biostatistical Applications.
264 0 $c 2017
300 $a 1 online resource (154 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 79-04(E), Section: B.
500 $a Advisers: Christopher S. McMahan; Colin M. Gallagher.
502 $a Thesis (Ph.D.) $c Clemson University $d 2017.
504 $a Includes bibliographical references
520 $a This dissertation consists of three projects that make use of latent variable modeling techniques. One focus of this dissertation research has been spatial and spatio-temporal modeling. The specific topics and motivating problems in this study have been fully supported and motivated by the Companion Animal Parasite Council (CAPC). In particular, the CAPC has developed a rather extensive database, which houses several common dog disease data sets collected throughout the conterminous United States. This data exists at the county level, was collected monthly over a span of 5 consecutive years, and exhibits strong spatial and temporal correlation structures. Further, due to non-reporting counties, a significant portion of the data is missing in both the spatial and temporal domains. The goal of our work in this area was to identify risk factors significantly related to the prevalence of the various diseases and to develop models that could be used to accurately forecast future disease trends nationwide. No similar work has been completed for these diseases on the spatio-temporal scale that we consider. To accomplish this task, we developed and implemented a Bayesian spatio-temporal regression model to analyze the data. Due to the relatively large spatial scale and complex structure of the data, a key challenge was developing computationally efficient algorithms that could be used to implement Markov chain Monte Carlo (MCMC) techniques. Once this was completed, we implemented our models to assess the relevance of the considered covariates and to forecast future trends.
In addition to the spatial and spatio-temporal modeling problems, this dissertation research also focuses on developing new modeling techniques for data collected on pooled specimens. The concept of using pooling as a more cost-effective data collection technique is becoming pervasive in the biological sciences and elsewhere. In particular, pooled data is collected by first amalgamating several specimens (e.g., blood or urine) collected from individuals into a pooled sample; this pooled sample is then measured for a characteristic of interest. For example, in infectious disease studies the pooled outcome is typically binary, indicating disease status, whereas in biological marker (i.e., biomarker) evaluation studies the outcome is continuous. In either case, information on several individuals is obtained at the expense of making only one measurement, thus reducing the cost of data collection. However, the statistical analysis of measurements (either binary or continuous) taken on pools is often fraught with challenges. In my dissertation research, I have developed regression methods for both continuous and binary outcomes measured on pools. For continuous outcomes, I proposed a general regression framework that can be used to analyze pooled outcomes under practically all parametric models. This was accomplished through the use of an advanced Monte Carlo sampling algorithm, which was implemented to approximate the observed data likelihood. Proceeding in this fashion also allows us to account for measurement error, which has not been accounted for previously, and led to the development of computationally efficient software that can be used to implement the proposed approach. For binary outcomes (usually referred to as group testing data), I developed a novel Bayesian generalized additive model. Specifically, the proposed approach assumes the linear predictor depends on several unknown smooth functions of some covariates as well as linear combinations of other covariates. In addition, our model can account for imperfect testing and can be used to analyze data collected according to any group testing process.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Statistics. $3 556824
650 4 $a Biostatistics. $3 783654
650 4 $a Epidemiology. $3 635923
655 7 $a Electronic books. $2 local $3 554714
690 $a 0463
690 $a 0308
690 $a 0766
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Clemson University. $b Mathematical Science. $3 1182500
773 0 $t Dissertation Abstracts International $g 79-04B(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10601721 $z click for full text (PQDT)