Search in Adverse Environments.
Soo, Jason J.
Record type: Bibliographic - language material, manuscript : Monograph/item
Title proper/Author: Search in Adverse Environments.
Author: Soo, Jason J.
Extent: 1 online resource (104 pages)
Notes: Source: Dissertation Abstracts International, Volume: 77-09(E), Section: B.
Subject: Information technology.
Electronic resource: click for full text (PQDT)
ISBN: 9781339672960
Soo, Jason J.
Search in Adverse Environments.
- 1 online resource (104 pages)
Source: Dissertation Abstracts International, Volume: 77-09(E), Section: B.
Thesis (Ph.D.)--Georgetown University, 2016.
Includes bibliographical references
Today, search is a ubiquitous task. This task often carries the expectation that relevant results shall be returned within the first 10 documents. While the advent of modern online search engines has created such expectations, there exist environments in which such approaches are not omnipotent. These environments are defined by their lack of vital resources, such as the Internet, query logs, user models, and refined algorithms. This amalgam of resources is the keystone of modern search systems. Without these resources, systemic error rates become intractable, and a novel, customized approach is required.
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018
Mode of access: World Wide Web
ISBN: 9781339672960
Subjects--Topical Terms: 559429 Information technology.
Index Terms--Genre/Form: 554714 Electronic books.
LDR 03291ntm a2200385K 4500
001 912771
005 20180608130007.5
006 m o u
007 cr mn||||a|a||
008 190606s2016 xx obm 000 0 eng d
020 $a 9781339672960
035 $a (MiAaPQ)AAI10103738
035 $a (MiAaPQ)georgetown:13335
035 $a AAI10103738
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Soo, Jason J. $3 1185291
245 1 0 $a Search in Adverse Environments.
264 0 $c 2016
300 $a 1 online resource (104 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 77-09(E), Section: B.
500 $a Adviser: Ophir Frieder.
502 $a Thesis (Ph.D.)--Georgetown University, 2016.
504 $a Includes bibliographical references
520 $a Today, search is a ubiquitous task. This task often carries the expectation that relevant results shall be returned within the first 10 documents. While the advent of modern online search engines has created such expectations, there exist environments in which such approaches are not omnipotent. These environments are defined by their lack of vital resources, such as the Internet, query logs, user models, and refined algorithms. This amalgam of resources is the keystone of modern search systems. Without these resources, systemic error rates become intractable, and a novel, customized approach is required.
520 $a Frequently, adverse environments host information of great value, for example medical records, personal information, historical documents, or national security data. These collections often contain errors introduced by users, by systematic processes (for example, Optical Character Recognition), or both. Accounting for such errors while still retrieving relevant documents is the focus of my research.
520 $a I assert that a solution that effectively considers both a term's context and its substring features can yield superior results with minimal external dependencies when searching under such adverse conditions.
520 $a In this dissertation, I present my solution for searching corrupted document collections in adverse environments. My solution, Segments, is a language-independent, domain-independent, unsupervised approach that I experimentally show is as good as or better than the prior art, the state of the art, and commonly deployed solutions. Segments achieves its results by analyzing context and substring features of corrupted terms. Segments is in use within the Archives Section of the United States Holocaust Memorial Museum to search multilingual collections with sparse query logs.
520 $a This document is dedicated to describing my experimental results and to demonstrating both the strengths and the drawbacks of Segments for real-world deployments.
520 $a Index words: Optical Character Recognition (OCR), Spelling Correction, Post Processing, Corrupted Documents.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Information technology. $3 559429
650 4 $a Information science. $3 561178
655 7 $a Electronic books. $2 local $3 554714
690 $a 0489
690 $a 0723
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Georgetown University. $b Computer Science. $3 1185292
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10103738 $z click for full text (PQDT)