Supervised categorization for habitual versus episodic sentences.
Record type:
Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:
Supervised categorization for habitual versus episodic sentences./
Author:
Mathew, Thomas A.
Physical description:
1 online resource (75 pages)
Notes:
Source: Masters Abstracts International, Volume: 71-01.
Contained By:
Masters Abstracts International 71-01.
Subject:
Computer science.
Electronic resource:
click for full text (PQDT)
ISBN:
9781109157741
Thesis (M.S.)--Georgetown University, 2009.
Includes bibliographical references
In natural language, there are commonly used sentence constructions which express a form of genericity, a general property which summarizes groups of particular episodes or behavior - such sentences are referred to as habitual sentences. The availability of such constructions in human discourse serves a specific communication function which is to provide a mechanism to convey knowledge on common or regular behavior that defines and characterizes the environment we live in. This can be contrasted with episodic sentences which express some degree of detail surrounding irregular events in time and serve more of a reporting purpose in human communication. Given the different linguistic function of habitual sentences and episodic sentences and the different nature of information communicated by them, it can be argued that there is significance in a repeatable method, based on internal and possibly external sentence characteristics, which can make a distinction between these two sentence categories where applicable. This research is conducted with the primary goal of category disambiguation in situations where the verbal predicate of a sentence is known to be used in both a habitual and an episodic context; sentences for which the verbal predicate provides explicit categorization are not considered to prevent undue skew on the results. A secondary objective of the research effort is to attempt to statistically study and report the individual and collective impact of lexical and syntactic features which are known to influence the categorization of a sentence as either habitual or episodic. I have focused on the influence of syntactic and lexical features such as tense, aspect, noun phrase features, temporal modifiers, specific adverb modifiers, and specific verb auxiliaries on genericity. 
Another secondary objective is to attempt to build and evaluate a supervised machine learning classifier that disambiguates between habitual and episodic sentences using selected syntactic and lexical features and a previously categorized set of sentences for the purpose of training and evaluating the classifier. Using such features I have created a machine classifier that provides 86.3% precision in disambiguating habitual and episodic sentences. This compares against a baseline of 73.1% precision where every sentence is blindly categorized as belonging to the more commonly occurring episodic category. In order to support these objectives, a representative corpus sample was hand-annotated to provide a human perspective on an appropriate category per sentence. The final results support a claim that a machine based classifier trained on the features discussed can out-perform the baseline model. Implications for successful markup of habitual sentences include application in building knowledge-bases that describe general world behavior. Successful markup of episodic sentences can find use in more sensitive information extraction systems.
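The comparison described in the abstract, a trained classifier measured against a baseline that blindly assigns every sentence to the more common episodic category, can be illustrated with a minimal sketch. The labels below are toy data invented for illustration and are not drawn from the thesis corpus; the function names `majority_baseline` and `accuracy` are hypothetical. The abstract reports its figures as precision; the sketch computes the fraction of correctly labeled sentences, which is an assumption about how the baseline figure was obtained.

```python
from collections import Counter


def majority_baseline(labels):
    """Return (majority_label, score) for always predicting the most common label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def accuracy(gold, predicted):
    """Fraction of positions where the predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy gold annotations (hypothetical, NOT the thesis corpus).
gold = ["episodic"] * 7 + ["habitual"] * 3
# Hypothetical classifier output, for illustration only.
predicted = ["episodic"] * 6 + ["habitual"] * 3 + ["episodic"]

baseline_label, baseline_acc = majority_baseline(gold)
model_acc = accuracy(gold, predicted)
print(baseline_label, baseline_acc, model_acc)
```

A classifier is only informative to the extent that its score exceeds the majority-class score, which is why the abstract quotes both figures side by side.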
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2024
Mode of access: World Wide Web
ISBN: 9781109157741
Subjects--Topical Terms:
573171
Computer science.
Subjects--Index Terms:
Categorization
Index Terms--Genre/Form:
554714
Electronic books.
LDR :04394ntm a22004337 4500
001 1150563
005 20241028051755.5
006 m o d
007 cr bn ---uuuuu
008 250605s2009 xx obm 000 0 eng d
020 $a 9781109157741
035 $a (MiAaPQ)AAI1464760
035 $a (MiAaPQ)georgetown:10172
035 $a AAI1464760
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Mathew, Thomas A. $3 1477068
245 1 0 $a Supervised categorization for habitual versus episodic sentences.
264 0 $c 2009
300 $a 1 online resource (75 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 71-01.
500 $a Publisher info.: Dissertation/Thesis.
500 $a Advisor: Katz, Graham.
502 $a Thesis (M.S.)--Georgetown University, 2009.
504 $a Includes bibliographical references
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
650 4 $a Statistics. $3 556824
650 4 $a Linguistics. $3 557829
653 $a Categorization
653 $a Characterizing
653 $a Episodic
653 $a Generic
653 $a Habitual
653 $a Machine learning
655 7 $a Electronic books. $2 local $3 554714
690 $a 0290
690 $a 0463
690 $a 0984
710 2 $a Georgetown University. $b Linguistics. $3 1182048
710 2 $a ProQuest Information and Learning Co. $3 1178819
773 0 $t Masters Abstracts International $g 71-01.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1464760 $z click for full text (PQDT)