EarthSciBert : Pre-Trained Language Model for Information Retrieval in Earth Science.
Record type:
Bibliographic - Language material, manuscript : Monograph/item
Title/Author:
EarthSciBert : / Shrestha, Rishabh.
Other title:
Pre-Trained Language Model for Information Retrieval in Earth Science.
Author:
Shrestha, Rishabh.
Physical description:
1 online resource (67 pages)
Notes:
Source: Masters Abstracts International, Volume: 85-07.
Contained By:
Masters Abstracts International, 85-07.
Subject:
Aerospace engineering.
Electronic resource:
click for full text (PQDT)
ISBN:
9798381443578
LDR    03424ntm a22003977 4500
001    1151637
005    20241113060913.5
006    m o d
007    cr mn ---uuuuu
008    250605s2023 xx obm 000 0 eng d
020    $a 9798381443578
035    $a (MiAaPQ)AAI30813798
035    $a AAI30813798
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Shrestha, Rishabh. $3 1478441
245 10 $a EarthSciBert : $b Pre-Trained Language Model for Information Retrieval in Earth Science.
264  0 $c 2023
300    $a 1 online resource (67 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Masters Abstracts International, Volume: 85-07.
500    $a Advisor: Le, Thai.
502    $a Thesis (M.S.)--The University of Mississippi, 2023.
504    $a Includes bibliographical references
520    $a Large Language Models (LLMs), such as Generative Pre-trained Transformers (GPT) and Bidirectional Encoder Representations from Transformers (BERT), have significantly advanced Natural Language Processing (NLP), achieving state-of-the-art results on a variety of tasks. Notably used in systems like Google Search, these models can be adapted to a domain through domain-specific pre-training. In Earth Science, where massive volumes of data are generated and made publicly available by institutions such as the National Aeronautics and Space Administration (NASA), understanding how such datasets are used in the scientific literature is critical to assessing their scientific impact. This thesis introduces EarthSciBERT, a domain-specific BERT model built for Earth Science by pre-training BERT on Earth Science literature abstracts and then fine-tuning it for dataset retrieval and ranking in publications. EarthSciBERT's performance was compared against several BERT variants, including the standard BERT-Base, a BERT model pre-trained from scratch, and a BERT model continually pre-trained from BERT-Base, each with and without the added fine-tuning step. The effectiveness of these models in information retrieval, such as retrieving or suggesting appropriate datasets for Earth Science research, was evaluated using metrics such as Precision at k, Recall at k, and Mean Average Precision (MAP). The findings indicate that EarthSciBERT outperforms the original BERT model and the other variants across these metrics, suggesting that domain-adapted BERT models hold promise for specialized information retrieval tasks. This study offers a novel method for applying Machine Learning models to retrieve and rank datasets in Earth Science, and it lays the foundation for similar advances in other scientific domains, such as medicine and biology. It also contributes to Artificial Intelligence (AI) by highlighting the significant contributions such domain-specific language models can make.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538    $a Mode of access: World Wide Web
650  4 $a Aerospace engineering. $3 686400
650  4 $a Mathematics. $3 527692
650  4 $a Computer science. $3 573171
653    $a Earth Science
653    $a Information retrieval
653    $a Large Language Models
653    $a Scientific domains
653    $a Mean Average Precision
655  7 $a Electronic books. $2 local $3 554714
690    $a 0984
690    $a 0405
690    $a 0538
710 2  $a The University of Mississippi. $b Computer Science. $3 1185406
710 2  $a ProQuest Information and Learning Co. $3 1178819
773 0  $t Masters Abstracts International $g 85-07.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30813798 $z click for full text (PQDT)
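The 520 abstract above describes continually pre-training BERT on Earth Science literature abstracts before fine-tuning it for dataset retrieval. As a minimal sketch of what continual masked-language-model pre-training from BERT-Base can look like with the Hugging Face transformers and datasets libraries (the corpus file, output path, and hyperparameters are illustrative assumptions, not the thesis's actual setup):

```python
# Sketch: continual MLM pre-training of BERT-Base on a domain corpus.
# "earth_science_abstracts.txt" is a hypothetical file, one abstract per line.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # start from BERT-Base

corpus = load_dataset("text", data_files={"train": "earth_science_abstracts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

# The collator masks a random 15% of tokens per batch, the standard BERT MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="earthsci-bert-ckpt",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()  # the resulting checkpoint would then be fine-tuned for retrieval/ranking
```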
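The abstract also names Precision at k, Recall at k, and Mean Average Precision (MAP) as the evaluation metrics. These have standard definitions; a self-contained sketch (function names and the toy query are illustrative, not taken from the thesis):

```python
# Standard ranking metrics: Precision@k, Recall@k, and MAP.

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant) if relevant else 0.0

def average_precision(ranked, relevant):
    """Mean of precision values at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """MAP over (ranked_list, relevant_set) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Toy query: datasets d1 and d4 are the relevant ones.
ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d4"}
print(precision_at_k(ranked, relevant, 3))           # 0.333... (1 of top 3)
print(recall_at_k(ranked, relevant, 3))              # 0.5 (1 of 2 relevant)
print(mean_average_precision([(ranked, relevant)]))  # 0.75 = (1/1 + 2/4) / 2
```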