Meyer, Bradley B.
Using Syntactic Patterns to Enhance Text Analytics.
Record type:
Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:
Using Syntactic Patterns to Enhance Text Analytics./
Author:
Meyer, Bradley B.
Physical description:
1 online resource (139 pages)
Notes:
Source: Dissertation Abstracts International, Volume: 79-01(E), Section: B.
Contained By:
Dissertation Abstracts International 79-01B(E).
Subject:
Computer science.
Electronic resource:
click for full text (PQDT)
ISBN:
9780355152128
Using Syntactic Patterns to Enhance Text Analytics.
Meyer, Bradley B.
Using Syntactic Patterns to Enhance Text Analytics.
- 1 online resource (139 pages)
Source: Dissertation Abstracts International, Volume: 79-01(E), Section: B.
Thesis (Ph.D.)
Includes bibliographical references
Large-scale product and service reviews proliferate and are commonly found across the web. The ability to harvest, digest, and analyze a large corpus of reviews from online websites is, however, still a difficult problem. This problem is referred to as opinion mining. Opinion mining is an important area of research, as advances in the field enable consumers and businesses to make better-informed decisions from others' experiences.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018
Mode of access: World Wide Web
ISBN: 9780355152128
Subjects--Topical Terms:
573171
Computer science.
Index Terms--Genre/Form:
554714
Electronic books.
Using Syntactic Patterns to Enhance Text Analytics.
LDR
:04470ntm a2200397Ki 4500
001
911853
005
20180531103648.5
006
m o u
007
cr mn||||a|a||
008
190606s2017 xx obm 000 0 eng d
020
$a
9780355152128
035
$a
(MiAaPQ)AAI10289620
035
$a
(MiAaPQ)ncat:10753
035
$a
AAI10289620
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
099
$a
TUL
$f
hyy
$c
available through World Wide Web
100
1
$a
Meyer, Bradley B.
$3
1183942
245
1 0
$a
Using Syntactic Patterns to Enhance Text Analytics.
264
0
$c
2017
300
$a
1 online resource (139 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertation Abstracts International, Volume: 79-01(E), Section: B.
500
$a
Adviser: Marwan Bikdash.
502
$a
Thesis (Ph.D.)
$c
North Carolina Agricultural and Technical State University
$d
2017.
504
$a
Includes bibliographical references
520
$a
Large-scale product and service reviews proliferate and are commonly found across the web. The ability to harvest, digest, and analyze a large corpus of reviews from online websites is, however, still a difficult problem. This problem is referred to as opinion mining. Opinion mining is an important area of research, as advances in the field enable consumers and businesses to make better-informed decisions from others' experiences.
520
$a
Much of the research in opinion mining relies upon the Bag-of-Words (BOW) assumption, which yields computationally tractable methods. The BOW assumption disregards language constructs, considering each word to be independent. My research does not follow the often-used BOW model; rather, it diverges by examining recurring patterns found in written languages. This dissertation attempts to answer the question, "Can opinion mining tasks benefit by using syntactic patterns in text?" I answer this question by injecting information gained from decomposing and examining syntactic patterns into aspect extraction and sentence-level sentiment analysis methods and performing experiments accordingly.
520
$a
I propose a variant of the Latent Dirichlet Allocation (LDA) model, referred to as the LDA-POS model. The LDA-POS model examines short-range syntactic dependencies by conditioning the word assignment to the topic on both the previous word and the previous word's part-of-speech (POS). I also experiment with an LDA-POS model which filters the word assignment if the previous word emotes low sentiment. Using these models and two comparative models, I perform aspect extraction experiments on a large corpus of hotel reviews. My results show that the models which include additional information from syntactic patterns or sentiment signals typically outperform models which do not include this information.
520
$a
Aspect extraction is naturally complemented by sentiment analysis, allowing the researcher to understand the reviewer's bias toward the aspect. The adjective-noun pair is a standard document-level sentiment analysis method. Sentiment analysis, however, can be examined at different levels of granularity, such as document, paragraph, or sentence. It is at the sentence level where this method can fail. I propose a machine learning sentence-level sentiment classification technique which uses features constructed from syntactic sentence patterns. My experiments on a hotel reviews dataset have shown the efficacy of these methods versus lexicon BOW methods. I also find that these methods work well across domains, a common weakness of lexicon BOW sentiment analysis methods.
520
$a
Lastly, I demonstrate how contemporary network algorithms, which focus solely on the topological structure of nodes and edges, can be extended to incorporate additional information, allowing the researcher to study node-centric problems. I illustrate this by presenting the "Hotel Reviewers Problem", which requires the fusing of aspects and sentiment. To solve this problem, I propose a novel graph clustering algorithm which efficiently identifies communities of hotel reviewers who hold similar sentiments toward hotel aspects.
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2018
538
$a
Mode of access: World Wide Web
650
4
$a
Computer science.
$3
573171
650
4
$a
Linguistics.
$3
557829
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0984
690
$a
0290
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
710
2
$a
North Carolina Agricultural and Technical State University.
$b
Computational Science and Engineering.
$3
1180329
773
0
$t
Dissertation Abstracts International
$g
79-01B(E).
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10289620
$z
click for full text (PQDT)
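Illustrative note: the contrast the abstract draws between the Bag-of-Words assumption and syntactic patterns can be sketched in a few lines of Python. This is only a toy illustration, not the dissertation's method. The `TOY_POS` lexicon, the helper names, and the example sentences are all hypothetical, and a real system would use a trained part-of-speech tagger.

```python
from collections import Counter

# Hypothetical toy POS lexicon (an assumption for illustration only).
TOY_POS = {
    "clean": "ADJ", "rude": "ADJ", "great": "ADJ",
    "room": "NOUN", "staff": "NOUN", "location": "NOUN",
}

def tokenize(sentence):
    """Lowercase, drop commas and periods, split on whitespace."""
    return sentence.lower().replace(".", "").replace(",", "").split()

def bag_of_words(sentence):
    """Word counts only: word order and adjacency are discarded."""
    return Counter(tokenize(sentence))

def adjective_noun_pairs(sentence):
    """Adjacent (adjective, noun) pairs: a simple syntactic pattern."""
    tokens = tokenize(sentence)
    tags = [TOY_POS.get(tok, "X") for tok in tokens]
    return [
        (tokens[i], tokens[i + 1])
        for i in range(len(tokens) - 1)
        if tags[i] == "ADJ" and tags[i + 1] == "NOUN"
    ]

a = "Clean room, rude staff."
b = "Rude room, clean staff."

# Both sentences have identical word counts under BOW...
print(bag_of_words(a) == bag_of_words(b))
# ...but the adjective-noun pattern keeps which adjective describes which aspect.
print(adjective_noun_pairs(a))
print(adjective_noun_pairs(b))
```

The two sentences are indistinguishable as bags of words, while the adjective-noun pairs differ, which is the kind of information a pattern-based feature can retain.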