National Formosa University (國立虎尾科技大學)

Similarity search and satisfaction.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title/Author:	Similarity search and satisfaction./
Author:	Williams, Kyle.
Extent:	1 online resource (168 pages)
Notes:	Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
Contained By:	Dissertation Abstracts International 78-04B(E).
Subject:	Information technology.
Electronic resource:	click for full text (PQDT)
ISBN:	9781369405576

Thesis (Ph.D.)

Includes bibliographical references

The Web and online search engines have greatly simplified information access. This has led to advantages in many areas, including education, disaster management, science, and community development. However, along with these advantages, several challenges have arisen, such as those related to data redundancy, query construction, the ethical use of the Web, and the design of appropriate evaluation methods. This dissertation focuses on two general problems in information retrieval: similarity and satisfaction.

Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018

Mode of access: World Wide Web

ISBN: 9781369405576

Subjects--Topical Terms:

559429
Information technology.
Index Terms--Genre/Form:

554714
Electronic books.

Similarity search and satisfaction.
LDR:04237ntm a2200385Ki 4500
001 909821
005 20180426091047.5
006 m o u
007 cr mn||||a|a||
008 190606s2016 xx obm 000 0 eng d
020 $a 9781369405576
035 $a (MiAaPQ)AAI10297144
035 $a AAI10297144
040 $a MiAaPQ $b eng $c MiAaPQ
099 $a TUL $f hyy $c available through World Wide Web
100 1 $a Williams, Kyle. $3 1180786
245 1 0 $a Similarity search and satisfaction.
264 0 $c 2016
300 $a 1 online resource (168 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 78-04(E), Section: B.
500 $a Adviser: C. Lee Giles.
502 $a Thesis (Ph.D.) $c The Pennsylvania State University $d 2016.
504 $a Includes bibliographical references
520 $a The Web and online search engines have greatly simplified information access. This has led to advantages in many areas, including education, disaster management, science, and community development. However, along with these advantages, several challenges have arisen, such as those related to data redundancy, query construction, the ethical use of the Web, and the design of appropriate evaluation methods. This dissertation focuses on two general problems in information retrieval: similarity and satisfaction.
520 $a Near duplication is common in document collections and refers to the case where a large amount of similarity exists among documents. This dissertation focuses on near-duplicate detection in scholarly big data, and state-of-the-art methods from the Web are shown to be effective at detecting near-duplicate scholarly documents. These findings are used in the design of an information extraction Web service that is scalable and efficient when processing scholarly big data. The Web service includes a near-duplicate matching backend to avoid redundant information extraction and is shown to lead to an 8.46% decrease in the amount of time required to extract metadata and citations from 3.5 million academic documents.
520 $a Similarity search is related to near-duplicate detection; however, instead of identifying all near duplicates, the goal is to find documents that are similar to a given query document. This is especially useful in situations where it is challenging to construct keyword queries for complex information needs. A similar document search engine that receives whole documents as queries and automatically finds similar files is proposed. The search engine is scalable and works with multiple similarity functions and document collections. It includes a recursive search algorithm that produces a search result tree that is used for ranking and that leads to a significant improvement in search performance.
520 $a There are many uses for similarity search on the Web. In this dissertation, a method for using similarity search to detect candidate sources of plagiarism from the Web is proposed. A single document is received as a query and potential sources of plagiarism are returned. The method achieves F-1 scores of 0.54 and 0.47 in offline and online evaluations, respectively. Similar methods are presented for detecting synthetic scientific articles and achieve precision and recall scores of 0.96 and 0.99, respectively.
520 $a Finally, evaluation is an important topic underlying much of information retrieval. Methods for measuring good abandonment in mobile search are presented, where good abandonment refers to users being satisfied in search without the need to click on results. Using gestures as signals, an accuracy of 75% is achieved when differentiating between good and bad abandonment. Furthermore, it is shown how good abandonment is driven by mobile answers, snippets, and images on the results page.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Information technology. $3 559429
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0489
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a The Pennsylvania State University. $3 845556
773 0 $t Dissertation Abstracts International $g 78-04B(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10297144 $z click for full text (PQDT)