ProQuest Information and Learning Co.
Optimizing data locality in analytic workloads over distributed computing environments.
Record Type:
Language materials, manuscript : Monograph/item
Title/Author:
Optimizing data locality in analytic workloads over distributed computing environments./
Author:
Elshater, Yehia.
Description:
1 online resource (133 pages)
Notes:
Source: Dissertation Abstracts International, Volume: 75-01C.
Subject:
Computer science.
Online resource:
click for full text (PQDT)
Optimizing data locality in analytic workloads over distributed computing environments.
Elshater, Yehia.
Optimizing data locality in analytic workloads over distributed computing environments.
- 1 online resource (133 pages)
Source: Dissertation Abstracts International, Volume: 75-01C.
Thesis (Ph.D.)--Queen's University (Canada), 2017.
Includes bibliographical references
With the explosion of data that are generated every second, there is an emerging need for big data analytics using scalable systems and platforms for exploration, mining, and decision-making purposes. To gain better business insights, business users are interested in integrating different kinds of analytics to achieve their goals. These analytics may involve accessing the same data for different purposes. Modern data-intensive systems co-locate the computation as close as possible to the data to achieve greater efficiency. This placement of computation close to the data is called data locality. Data locality has a significant impact on the performance of jobs in a large cluster, since higher data locality means less data transfer over the network.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2018
Mode of access: World Wide Web
Subjects--Topical Terms:
573171
Computer science.
Index Terms--Genre/Form:
554714
Electronic books.
Optimizing data locality in analytic workloads over distributed computing environments.
LDR
:02569ntm a2200301K 4500
001
913982
005
20180628100932.5
006
m o u
007
cr mn||||a|a||
008
190606s2017 xx obm 000 0 eng d
035
$a
(MiAaPQ)AAI10625934
035
$a
(MiAaPQ)QueensUCan197415890
035
$a
AAI10625934
040
$a
MiAaPQ
$b
eng
$c
MiAaPQ
100
1
$a
Elshater, Yehia.
$3
1187033
245
1 0
$a
Optimizing data locality in analytic workloads over distributed computing environments.
264
0
$c
2017
300
$a
1 online resource (133 pages)
336
$a
text
$b
txt
$2
rdacontent
337
$a
computer
$b
c
$2
rdamedia
338
$a
online resource
$b
cr
$2
rdacarrier
500
$a
Source: Dissertation Abstracts International, Volume: 75-01C.
502
$a
Thesis (Ph.D.)--Queen's University (Canada), 2017.
504
$a
Includes bibliographical references
520
$a
With the explosion of data that are generated every second, there is an emerging need for big data analytics using scalable systems and platforms for exploration, mining, and decision-making purposes. To gain better business insights, business users are interested in integrating different kinds of analytics to achieve their goals. These analytics may involve accessing the same data for different purposes. Modern data-intensive systems co-locate the computation as close as possible to the data to achieve greater efficiency. This placement of computation close to the data is called data locality. Data locality has a significant impact on the performance of jobs in a large cluster, since higher data locality means less data transfer over the network.
520
$a
In this work, we examine data locality in parallel processing frameworks and propose approaches to optimize it. First, we conduct a literature review of existing systems that maximize data locality while processing big data analytics workflows. Second, we provide the YARN Locality Simulator (YLocSim), a tool that simulates the interactions between YARN components in a real cluster and reports the data locality percentages. This tool gives users better insight into the expected performance of the computing cluster. Third, we develop the YARN Dynamic Replication Manager (YDRM), a new YARN component that interacts with YARN's existing Resource Manager to improve data locality.
533
$a
Electronic reproduction.
$b
Ann Arbor, Mich. :
$c
ProQuest,
$d
2018
538
$a
Mode of access: World Wide Web
650
4
$a
Computer science.
$3
573171
655
7
$a
Electronic books.
$2
local
$3
554714
690
$a
0984
710
2
$a
ProQuest Information and Learning Co.
$3
1178819
710
2
$a
Queen's University (Canada).
$3
1148613
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10625934
$z
click for full text (PQDT)