Sort Sort Goroutine!
Koichi Toda
Software Engineer | Full Remote Worker | Microservices | Golang | Python | Changing the world one technology at a time. | MBTI type = INFJ
The sorting process is probably one of the top five functions that people in the entire world use without even realizing it. Ranking scores, sorting fees and names, ranking and sorting are probably fundamental to people's lives.If this could be made faster, everyone would surely benefit.
For example, anyone can easily implement a sorting process using the classic QuickSort algorithm, which is synonymous with stable and fast performance.
Golang has a sort package as a standard feature.
If you write as follows...
package main
import (
"fmt"
"sort"
)
func main() {
l := []int{5, 1, 3, 4, 2}
sort.Ints(l)
fmt.Println(l)
}
You can easily get the sorted results ([1 2 3 4 5]).
Ideas for speed up
Using the standard sorting function is convenient, but is there any way to make sorting even faster?
Now that multi-core environments are common and golang is available to everyone, why not use goroutine to speed up the process by making full use of parallel processing?
If the original sequence of numbers is spliced into two parts, each executed with goroutine, the execution time could easily be cut in half!
That is the beginning of my idea.
Turning ideas into reality
That is true, but here is the big problem!
How can we sort results that are divided in half into a single sorted result at the end? Just concatenating them does not result in a sorted result. In the end, the sorted numbers must be sorted again in the end to make a single sorted sequence of numbers.
In this way, it gradually becomes clear that although it may seem simple at first glance, the parallel execution of the sorting process is fraught with difficulties.
Even if we can perform parallel processing, is that capability wasted in sorting?
What should we do?
I have devised one plan. When splitting a column of numbers into two, one should be a column of numbers with only numbers below a certain threshold, and the other should be a column of numbers with only numbers greater than the threshold.
Then wouldn't the result of sorting these two columns of numbers just concatenate them into a sorted column of numbers?
So I actually did it.
The first step is to find the threshold value in a sequence of numbers, which generally means finding the median. However, there are pitfalls in trying to find the median rigorously. As a little research shows, in order to find the median, one usually has to sort. This is a real downfall. In this case, I decided to simply find the maximum value of a sequence of numbers and use the integer value that is half of this value as the threshold value.
func getMax(l []int) int {
max := 0
for _, v := range l {
if max < v {
max = v
}
}
return max
}
(In this case, since we used a completely random sequence of numbers, we were able to achieve our intent even with roughly half of the maximum value, but the degree of scatter and bias of the numbers will differ depending on the type of data being handled. We understood that depending on the sorting target, it would be a matter of performance to devise a heuristic or other method to find the median value).
领英推荐
Anyway, having obtained the threshold value, we decided to divide the sequence of numbers based on this value.
Since appending each number one by one to two slices would take a lot of processing time and would be inefficient in terms of memory, this time we adjusted the slice so that the columns of numbers below the threshold value and the columns of numbers above the threshold value are separated and lined up at a specific separation position in the first and second parts of one slice prepared in advance. The following is an example of the adjustment.
This single slice and the two slices bordering the separation position were then passed to goroutine to be sorted. The sort result is overwritten by the (originally single) slice passed to the goroutine.
func divideList(l []int) ([]int, int) {
divl := make([]int, len(l))
i, j := 0, len(l)-1
max := getMax(l)
for _, v := range l {
if v <= max/2 {
divl[i] = v
i++
} else {
divl[j] = v
j--
}
}
return divl, i
}
Then, to start the goroutine, the first thing that comes to mind is to use channels to wait for each goroutine to complete its processing (in fact, my first version used channels).
I finally decided to use WaitGroup, which simplifies the code.
func sortList(l []int, wg *sync.WaitGroup) {
defer wg.Done()
sort.Ints(l)
}
func BenchmarkSortUsingGorutine(b *testing.B) {
size := 1000000
b.ResetTimer()
for i := 0; i < b.N; i++ {
var wg sync.WaitGroup
list := randomList(size, 100)
l, sep := divideList(list)
wg.Add(2)
go sortList(l[:sep], &wg)
go sortList(l[sep:], &wg)
wg.Wait()
}
}
The machine I ran on was a quad-core, and when I ran two goroutiles, the performance increased as in this case, but when I ran four goroutines, the performance was worse instead. This means that the results also depend on the execution environment (number of cores), and a machine with more cores may be able to run more goroutines to speed up the process.
Execution Results
goroutine_sort$ go test -bench . -benchtime 1000x
goos: darwin
goarch: amd64
pkg: github.com/Koichi-Toda/goroutine_sort
cpu: Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz
BenchmarkSortUsingGorutine-8 ? ? ? ? ? ? ? 1000 ? 71709987 ns/op
BenchmarkSortBasic-8 ? ? ? ? ? ? ? ? ? ? ? 1000 106246242 ns/op
PASS
ok? github.com/Koichi-Toda/goroutine_sort 276.921s
When run, the using goroutine version was approximately 30% faster than the standard sort.
More speed
Can we make it a little faster?
In the first method described above, where one list of numbers is separated into two lists, data copying to memory cells would occur for the size of the list of numbers. It would be possible to assume that a large number (ideally 50% in this example) of these would originally satisfy the threshold condition without changing the position of the list of numbers.
To account for this, we have rewritten the system to find the position of the list of numbers that have a value higher than the threshold and a value lower than the threshold, and only swap these when this is discovered. This should dramatically reduce the number of moves that need to be made. (This processing scheme mimics the actual swapping process in QuickSort.)
func divideListEnhanced(l []int) ([]int, int) {
i, j := 0, len(l)-1
pivot := getMax(l) / 2
for {
for l[i] <= pivot {
i++
}
for l[j] > pivot {
j--
}
if i >= j {
return l, i
}
// swap
l[i], l[j] = l[j], l[i]
i++
j--
}
}
In fact, when I measured it, I found that this enhanced version was a clear improvement (but not as much of a performance improvement as I thought it would be, making it only a small improvement).
goroutine_sort$ go test -bench . -benchtime 1000x
goos: darwin
goarch: amd64
pkg: github.com/Koichi-Toda/goroutine_sort
cpu: Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz
BenchmarkSortUsingGorutine-8 ? ? ? ? ? ? ? 1000 ? 72925503 ns/op
BenchmarkSortUsingGorutineEnhanced-8 ? ? ? 1000 ? 68815445 ns/op
BenchmarkSortBasic-8 ? ? ? ? ? ? ? ? ? ? ? 1000 104637312 ns/op
PASS
ok? github.com/Koichi-Toda/goroutine_sort 274.991sx
Conclusion
I believe that it is an important act of pure experience to try your best to devise a smaller and simpler theme from time to time.
By trying out golang, you will realize how good it is as a language, and you will also have a beautiful experience that will make you fall in love with golang even more.
Source code https://github.com/Koichi-Toda/goroutine_sort