National Formosa University (國立虎尾科技大學)

Efficient Data Management and Processing in Big Data Applications.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:	Efficient Data Management and Processing in Big Data Applications./
Author:	Cao, Xiang.
Extent:	1 online resource (107 pages)
Note:	Source: Dissertation Abstracts International, Volume: 78-12(E), Section: B.
Subject:	Computer science.
Electronic resource:	click for full text (PQDT)
ISBN:	9780355091328

Thesis (Ph.D.)--University of Minnesota, 2017.

Includes bibliographical references

In today's Big Data applications, huge amounts of data are being generated. With the rapid growth in data volume, data management and processing have become essential, and it is important to design efficient approaches to both. This thesis investigates data management and processing for Big Data applications.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018

Mode of access: World Wide Web

ISBN: 9780355091328

Subjects--Topical Terms:
573171 Computer science.

Index Terms--Genre/Form:
554714 Electronic books.

Efficient Data Management and Processing in Big Data Applications.
LDR:03487ntm a2200373K 4500
001 915315
005 20180727125212.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9780355091328
035 $a (MiAaPQ)AAI10287092
035 $a (MiAaPQ)umn:18201
035 $a AAI10287092
040 $a MiAaPQ $b eng $c MiAaPQ
100 1 $a Cao, Xiang. $3 1188635
245 1 0 $a Efficient Data Management and Processing in Big Data Applications.
264 0 $c 2017
300 $a 1 online resource (107 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 78-12(E), Section: B.
500 $a Adviser: David H. C. Du.
502 $a Thesis (Ph.D.)--University of Minnesota, 2017.
504 $a Includes bibliographical references
520 $a In today's Big Data applications, huge amounts of data are being generated. With the rapid growth in data volume, data management and processing have become essential, and it is important to design efficient approaches to both. This thesis investigates data management and processing for Big Data applications.
520 $a Key-value stores (KVS) are widely used in many Big Data applications because they provide flexible and efficient performance. Recently, Seagate developed a new Ethernet-accessed disk drive for key-value pairs called the "Kinetic Drive". It can reduce management complexity, especially in large-scale deployments.
520 $a It is important to manage the key-value pairs and store them in Kinetic Drives in an organized way. In this thesis, we present data allocation schemes on a large-scale key-value store system using Kinetic Drives. We investigate key indexing schemes and allocate data on drives accordingly. We propose efficient approaches to migrate data among drives.
520 $a It is also necessary to manage huge numbers of key-value pairs in a way that lets users search by attribute. In this thesis, we design a large-scale searchable key-value store system based on Kinetic Drives. We investigate an indexing scheme to map data to the drives. We propose a key generation approach that reflects the metadata of the actual data and supports users' attribute search requests.
520 $a MapReduce has become a very popular framework for processing data in many applications. Data shuffling usually accounts for a large portion of the total running time of MapReduce jobs. In recent years, scale-up computing architectures for MapReduce jobs have been developed. With a multi-processor, multi-core design connected via NUMAlink and large shared memories, the NUMA architecture provides powerful scale-up computing capability.
520 $a In this thesis, we focus on optimizing the data shuffling phase of the MapReduce framework on a NUMA machine. We concentrate on the differing bandwidth capacities of the NUMAlink(s) among memory locations in order to fully utilize the network. We investigate the NUMAlink topology and propose a topology-aware reducer placement algorithm to speed up the data shuffling phase. We extend our approach to a larger computing environment with multiple NUMA machines.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a University of Minnesota. $b Computer Science. $3 1180176
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10287092 $z click for full text (PQDT)