The Creation of the Chunk List

Now that I’ve finished my course in data structures, I’ve been unsure of what to learn next. I thought that I could potentially create a new data structure of my own — one that may be faster or more efficient than others; how to do something like that however, was what I was unsure about.

Recently I’ve decided to focus my attention on concurrent programming, as I means to brush up on my existing knowledge and bridge any gaps of things I don’t know yet. That same night I got back into researching multithreading, I had the idea to use it in a data structure.

Chunking data is obviously no new concept; however, I had never found any data structures that used chunking as the basis of their implementation. I figured that by using chunking alongside concurrency, I could create an extremely fast run-time in regards to particular methods as searching and/or sorting.

By using chunking and concurrency to my advantage, I came up with the chunk list — a dynamic list-based data structure that would separate large amounts of data into specifically sized chunks, each of which should be able to be searched at the exact same time by searching each chunk on a separate thread.

As a result of implementing this concept into its own class, I was able to create something that almost consistently gives around 3.5x faster results than a regular ArrayList. However, should speed be a particular issue even after implementation, users can modify the size of the chunks and benchmark the speed of using smaller or larger chunks, depending on the amount of data being stored. Generally a chunk size of 1/10th the size of the data being stored gives the fastest and most consistent results.

To find out more, check out the GitHub repository here containing an implementation I did in C#, as well as a presentation discussing the usage, implementation, and run-time analysis of the data structure. (This presentation can also be found on my website on the portfolio page, or here)