A Two-Directional BigData Sorting Architecture on FPGAs

A Two-Directional BigData Sorting Architecture on FPGAs

Table of Contents


Sorting is pivotal data analytics and becomes challenging with intensive computation on drastically growing data volume. Sorting on FPGA has shown superior throughput, but the limited in-system memory causes vast data transferring to/from external storage when handling a large dataset. We propose a two-directional sorting (2DSort) architecture which sorts data sequences on both horizontal and vertical directions. 2DSort significantly reduces the costly data transmission by tracking the data on FPGA and writes back the data to external storage when the final sorting position is determined. 2DSort shows an average 38.5% runtime enhancement in comparison to the state-of-the-art bigdata sorting engine on FPGA.

  • Author Keywords

    • Arithmetic and logic structures,
    • data-path design,
    • logic arrays,
    • system architectures,
    • integration and modeling,
    • reconfigurable hardware
  • IEEE Keywords

    • Sorting,
    • Field programmable gate arrays,
    • Random-access memory,
    • Distributed databases,
    • Computer architecture,
    • Throughput,
    • Data communication


sorting is one of the pivotal operations in various domains of data processing [1-2]. FPGA (Field Programmable Gate Array) which implements customized logic has been demonstrated to enable superior sorting throughput while attaining higher energy efficiency than programmable processors [3, 4]. However, the growing sizes of modern datasets can easily exceed the in-system storage of an FPGA and pose challenges for sorting. To handle a large dataset, the sorting engine in FPGA [4] performs multiple rounds of merge sort. The subsequences of the main dataset are iteratively loaded from the system flash storage and merge-sorted into longer sequences. The sorted data sequence will be kept in the on-board DRAM temporarily and will be written back to the flash before loading the next group of sub-sequences. The process will be repeated until the data are all sorted. In this way, the dataset can be sorted in a pipeline scheme and the sorting system does not need to keep the whole dataset to the limited in-system storage at the same time. Although merge-sort can handle large datasets, the sorting flow would need to frequently move the sorted sub-sequences to external flash storage before allocating the next group of sub-sequences to the DRAM on FPGA board. The enormous and costly data movement to and from the external flash has outweighed the sorting operations in modern data processing systems, and become a major performance bottleneck [4]. The end-to-end sorting runtime is mainly determined by the data transmission instead of the sorting logic on FPGA. The sorting runtime will be significantly increased when dealing with the growing size of the future dataset. In this paper, we propose 2DSort, a two-directional sorting architecture on FPGA, to significantly reduce the amount of time-consuming data transmission between the FPGA and system flash. As illustrated in Fig. 1, 2DSort adopts a two-phase sorting scheme. In Phase I, the un-sorted data are merge-sorted into data pages in the horizontal direction. In Phase II, 2DSort tracks characteristics of data blocks from different pages and performs vertical sorting to determines the global rank of data. These data with global rank will be written back to the flash storage and will not be accessed again. When compared with the state-of-the-art FPGA accelerator [4], the proposed scheme can attain an average of 38.5% runtime enhancement when sorting 512GB of data. The rest of the paper is organized as follows. Section II introduces the existing works on accelerating large scale sorting. The detailed architecture and implementation of the proposed 2DSort system are described in Section III. We present the performance evaluation and comparison in Section IV. Section V will conclude this work.


We propose a two-directional sorting (2DSort) architecture to significantly reduces the data transmission by tracking the sorting process and data on FPGA. Data will be written back to flash when the final sorting position is determined. The experiments show that 2DSort has attained an average 38.5% runtime enhancement with only minor design overhead.

About KSRA

The Kavian Scientific Research Association (KSRA) is a non-profit research organization to provide research / educational services in December 2013. The members of the community had formed a virtual group on the Viber social network. The core of the Kavian Scientific Association was formed with these members as founders. These individuals, led by Professor Siavosh Kaviani, decided to launch a scientific / research association with an emphasis on education.

KSRA research association, as a non-profit research firm, is committed to providing research services in the field of knowledge. The main beneficiaries of this association are public or private knowledge-based companies, students, researchers, researchers, professors, universities, and industrial and semi-industrial centers around the world.

Our main services Based on Education for all Spectrum people in the world. We want to make an integration between researches and educations. We believe education is the main right of Human beings. So our services should be concentrated on inclusive education.

The KSRA team partners with local under-served communities around the world to improve the access to and quality of knowledge based on education, amplify and augment learning programs where they exist, and create new opportunities for e-learning where traditional education systems are lacking or non-existent.

FULL Paper PDF file:

A Two-Directional BigData Sorting Architecture on FPGAs



B. Lai, C. Chen, Y. Hsin and B. Lin,




A Two-Directional BigData Sorting Architecture on FPGAs

Publish in

in IEEE Computer Architecture Letters, vol. 19, no. 1, pp. 72-75, 1 Jan.-June 2020,



PDF reference and original file: Click here

+ posts

Somayeh Nosrati was born in 1982 in Tehran. She holds a Master's degree in artificial intelligence from Khatam University of Tehran.

Website | + posts

Professor Siavosh Kaviani was born in 1961 in Tehran. He had a professorship. He holds a Ph.D. in Software Engineering from the QL University of Software Development Methodology and an honorary Ph.D. from the University of Chelsea.

Website | + posts

Nasim Gazerani was born in 1983 in Arak. She holds a Master's degree in Software Engineering from UM University of Malaysia.