National Formosa University

yInMem: A Parallel Distributed Indexed In-Memory Computation System for Big Data Analytics.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:	yInMem :/
Other title:	A Parallel Distributed Indexed In-Memory Computation System for Big Data Analytics.
Author:	Huang, Yin.
Extent:	1 online resource (132 pages)
Note:	Source: Dissertation Abstracts International, Volume: 78-10(E), Section: B.
Contained By:	Dissertation Abstracts International 78-10B(E).
Subject:	Computer science.
Electronic resource:	click for full text (PQDT)
ISBN:	9781369807899

Thesis (Ph.D.)--University of Maryland, Baltimore County, 2017.

Includes bibliographical references

Cluster computing is experiencing a surge of interest in in-memory computing systems, driven by advances in hardware such as memory. However, in a typical cluster computing environment, the network has the smallest bandwidth compared to memory and disk. In addition, the sparse nature of graph applications, such as social networks, imposes new challenges for in-memory computing systems. Examples of such challenges are data locality, workload balance, and memory management. As a result, fine control over data partitioning and data sharing plays a crucial role in improving the speed of large-scale data-parallel processing systems by reducing cross-node communication. To maximize performance, an in-memory computing system should offer optimized data throughput for parallel computation in large-scale data analytics.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2018.

Mode of access: World Wide Web

ISBN: 9781369807899
Subjects--Topical Terms: Computer science. (573171)
Index Terms--Genre/Form: Electronic books. (554714)

LDR:03588ntm a2200373Ki 4500
001 908964
005 20180419104823.5
006 m o u
007 cr mn||||a|a||
008 190606s2017 xx obm 000 0 eng d
020 $a 9781369807899
035 $a (MiAaPQ)AAI10277192
035 $a (MiAaPQ)umbc:11647
035 $a AAI10277192
040 $a MiAaPQ $b eng $c MiAaPQ
099 $a TUL $f hyy $c available through World Wide Web
100 1 $a Huang, Yin. $3 1179406
245 1 0 $a yInMem : $b A Parallel Distributed Indexed In-Memory Computation System for Big Data Analytics.
264 0 $c 2017
300 $a 1 online resource (132 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertation Abstracts International, Volume: 78-10(E), Section: B.
500 $a Adviser: Yelena Yesha.
502 $a Thesis (Ph.D.) $c University of Maryland, Baltimore County $d 2017.
504 $a Includes bibliographical references
520 $a Cluster computing is experiencing a surge of interest in in-memory computing systems, driven by advances in hardware such as memory. However, in a typical cluster computing environment, the network has the smallest bandwidth compared to memory and disk. In addition, the sparse nature of graph applications, such as social networks, imposes new challenges for in-memory computing systems. Examples of such challenges are data locality, workload balance, and memory management. As a result, fine control over data partitioning and data sharing plays a crucial role in improving the speed of large-scale data-parallel processing systems by reducing cross-node communication. To maximize performance, an in-memory computing system should offer optimized data throughput for parallel computation in large-scale data analytics.
520 $a This dissertation presents yInMem: a parallel, distributed, indexed, in-memory computing system for big data analytics. With the goal of building an in-memory computing system that enables optimal data partitioning and improves the efficiency of iterative machine learning and graph algorithms, yInMem bridges the gap between HPC and Hadoop by parallelizing computation with MPI while gaining the advantages of distributed data storage, such as a NoSQL database built on top of Hadoop. The novelty of yInMem lies in introducing indexes, or associative arrays, to the in-memory computing system. Such a design offers fine control over data distribution with parallel computation to maximize computing resource usage in the cluster.
520 $a By analyzing the linear algebra characteristics of iterative machine learning and graph algorithms, such as spectral clustering and PageRank, we find that yInMem is capable of maximizing the usage of computing resources in the cluster. Leveraging the insights of Sparse Matrix-Vector Multiplication (SpMV), we also provide an optimal data partitioning algorithm on top of yInMem for load balance and data locality.
520 $a In order to evaluate yInMem, we investigate iterative machine learning and graph algorithms using both synthetic benchmarks and real user applications. yInMem matches or exceeds the performance of existing specialized systems.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2018
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 573171
655 7 $a Electronic books. $2 local $3 554714
690 $a 0984
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a University of Maryland, Baltimore County. $b Computer Science. $3 1179407
773 0 $t Dissertation Abstracts International $g 78-10B(E).
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10277192 $z click for full text (PQDT)