Chen, Han.
Sorting Large Data Sets with FPGA-Accelerated Samplesort.
Record Type:
Language materials, printed : Monograph/item
Title/Author:
Sorting Large Data Sets with FPGA-Accelerated Samplesort.
Author:
Chen, Han.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2020
Description:
120 p.
Notes:
Source: Dissertations Abstracts International, Volume: 82-02, Section: B.
Contained By:
Dissertations Abstracts International, 82-02B.
Subject:
Electrical engineering.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=27994970
ISBN:
9798662596825
Sorting Large Data Sets with FPGA-Accelerated Samplesort.
Chen, Han.
Sorting Large Data Sets with FPGA-Accelerated Samplesort.
- Ann Arbor : ProQuest Dissertations & Theses, 2020 - 120 p.
Source: Dissertations Abstracts International, Volume: 82-02, Section: B.
Thesis (Ph.D.)--State University of New York at Stony Brook, 2020.
This item must not be sold to any third party vendors.
Sorting is a fundamental operation in many applications such as databases, search, and social networks. Although field-programmable gate arrays (FPGAs) have been shown to be very efficient at sorting data sets that fit on chip, systems that sort larger data sets by shuffling data on and off chip are typically bottlenecked by costly merge operations or data transfer time. This thesis proposes a new technique for sorting large data sets, which uses a variant of the samplesort algorithm on a server with a PCIe-connected FPGA. Samplesort avoids merging by randomly sampling values to determine how to partition data into non-overlapping buckets that can be independently sorted. The key to this design is a novel parallel multi-stage hardware partitioner, which is a scalable high-throughput solution that greatly accelerates the samplesort partitioning step. Using samplesort for FPGA-accelerated sorting provides several advantages over other sorting algorithms, while also presenting a number of new challenges that are addressed with cooperation between the FPGA and the software running on the host CPU. To apply this sorting system in different scenarios, this thesis includes an automation tool for design space exploration and for generating the optimal design. Based on the automation tool, we prototype this design on an Amazon Web Services FPGA instance. Experimental results demonstrate that the prototype system sorts 2^30 key-value records with a throughput of 7.2 GB/s, limited only by the on-board DRAM capacity and available PCIe bandwidth. When sorting 2^30 records, the system exhibits a 37.4x speedup over the widely used GNU parallel sort on an 8-thread state-of-the-art CPU. This thesis explores further extensions to this system to allow sorting terabytes of data distributed in NVMe SSDs and sorting petabytes of data using multiple servers in the data warehouse.
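The samplesort idea summarized in the abstract (sample values, derive splitters, partition into non-overlapping buckets, sort each bucket independently) can be illustrated with a minimal software-only sketch. This is not the thesis's FPGA design or its hardware partitioner; the function name, the bucket count, the oversampling factor, and the fixed seed are all assumptions chosen for illustration.

```python
import bisect
import random


def samplesort(records, num_buckets=8, oversample=4, seed=0):
    """Illustrative samplesort: sample, pick splitters, partition, sort buckets.

    All parameters are hypothetical defaults, not values from the thesis.
    """
    if len(records) <= num_buckets:
        return sorted(records)

    rng = random.Random(seed)

    # Randomly sample a small set of values and sort that sample.
    sample_size = min(len(records), num_buckets * oversample)
    sample = sorted(rng.sample(records, sample_size))

    # Choose num_buckets - 1 evenly spaced splitters from the sorted sample.
    step = len(sample) / num_buckets
    splitters = [sample[int(i * step)] for i in range(1, num_buckets)]

    # Partition: bucket i receives values between splitter i-1 and splitter i,
    # so the buckets cover non-overlapping value ranges.
    buckets = [[] for _ in range(num_buckets)]
    for record in records:
        buckets[bisect.bisect_right(splitters, record)].append(record)

    # Each bucket is sorted independently; concatenating them in order
    # yields the sorted result without any merge step.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

Because the buckets hold disjoint value ranges, the per-bucket sorts are independent of one another, which is the property the abstract relies on to avoid merging.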
ISBN: 9798662596825
Subjects--Topical Terms:
596380
Electrical engineering.
Subjects--Index Terms:
Cloud
Sorting Large Data Sets with FPGA-Accelerated Samplesort.
LDR
:02944nam a2200337 4500
001
1037983
005
20210910100647.5
008
211029s2020 ||||||||||||||||| ||eng d
020
$a
9798662596825
035
$a
(MiAaPQ)AAI27994970
035
$a
AAI27994970
040
$a
MiAaPQ
$c
MiAaPQ
100
1
$a
Chen, Han.
$3
1335303
245
1 0
$a
Sorting Large Data Sets with FPGA-Accelerated Samplesort.
260
1
$a
Ann Arbor :
$b
ProQuest Dissertations & Theses,
$c
2020
300
$a
120 p.
500
$a
Source: Dissertations Abstracts International, Volume: 82-02, Section: B.
500
$a
Advisor: Milder, Peter.
502
$a
Thesis (Ph.D.)--State University of New York at Stony Brook, 2020.
506
$a
This item must not be sold to any third party vendors.
520
$a
Sorting is a fundamental operation in many applications such as databases, search, and social networks. Although field-programmable gate arrays (FPGAs) have been shown to be very efficient at sorting data sets that fit on chip, systems that sort larger data sets by shuffling data on and off chip are typically bottlenecked by costly merge operations or data transfer time. This thesis proposes a new technique for sorting large data sets, which uses a variant of the samplesort algorithm on a server with a PCIe-connected FPGA. Samplesort avoids merging by randomly sampling values to determine how to partition data into non-overlapping buckets that can be independently sorted. The key to this design is a novel parallel multi-stage hardware partitioner, which is a scalable high-throughput solution that greatly accelerates the samplesort partitioning step. Using samplesort for FPGA-accelerated sorting provides several advantages over other sorting algorithms, while also presenting a number of new challenges that are addressed with cooperation between the FPGA and the software running on the host CPU. To apply this sorting system in different scenarios, this thesis includes an automation tool for design space exploration and for generating the optimal design. Based on the automation tool, we prototype this design on an Amazon Web Services FPGA instance. Experimental results demonstrate that the prototype system sorts 2^30 key-value records with a throughput of 7.2 GB/s, limited only by the on-board DRAM capacity and available PCIe bandwidth. When sorting 2^30 records, the system exhibits a 37.4x speedup over the widely used GNU parallel sort on an 8-thread state-of-the-art CPU. This thesis explores further extensions to this system to allow sorting terabytes of data distributed in NVMe SSDs and sorting petabytes of data using multiple servers in the data warehouse.
590
$a
School code: 0771.
650
4
$a
Electrical engineering.
$3
596380
653
$a
Cloud
653
$a
FPGA
653
$a
Hardware partitioning
653
$a
Sorting
690
$a
0544
710
2
$a
State University of New York at Stony Brook.
$b
Electrical Engineering.
$3
1187031
773
0
$t
Dissertations Abstracts International
$g
82-02B.
790
$a
0771
791
$a
Ph.D.
792
$a
2020
793
$a
English
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=27994970