Course: Advanced Go Programming: Data Structures, Code Architecture, and Testing

Concurrent processing

- [Instructor] Concurrency is a way for us to make the most of our computing resources and aggregate multiple data subsets, or windows, as we called them in the previous video. Let's have a closer look at how this can be implemented in Go using goroutines and channels. As we slice and subset our data, we might want to speed up processing by taking advantage of concurrency. This is one of Go's great strengths.

In this example, we create a slice of numbers, loop over each number, and start a new goroutine to process it. Once processing is completed, each goroutine sends its computed value to the results channel. After the for loop, once all the goroutines are started, the main function receives all the computed values from the results channel and prints them. A channel delivers values in the order in which they were sent, but the goroutines can finish in any order, so the results are not guaranteed to match the order of the numbers in the slice.

While the concurrent solution we've just seen is correct and will make use of concurrency to calculate each element of the slice, let's consider its use of resources. We'll start as many goroutines as we have elements in the slice. Each of these goroutines will then only be able to complete once the main goroutine has received its value. This makes the process slow and unsuitable for the large data streams that we have been discussing so far.

Let's have a look at an alternative solution which addresses these shortcomings. Begin by creating a work function which takes in three channels: one for input, one for the results, and a done channel. Use unidirectional channel types, receive-only for the in and done channels and send-only for the out channel, so that the compiler rejects any misuse of the channels inside the work function. This function makes use of a for loop together with the select keyword, which makes it possible for a single goroutine to process multiple inputs. As soon as a message is received on the done channel, shut down the goroutine using the return statement. Otherwise, the function will receive values from the in channel and process them, sending each result to the out channel continuously. The only return statement is in the case that receives from the done channel.

Let's have a look at the correct invocation of the workers from the main function. Initialize the in and results channels as buffered channels with a capacity equal to the worker count. This allows values to be sent through these channels immediately while they have spare capacity, reducing the blocking of the workers. We start the correct number of goroutines by using a for loop and passing them the channels as parameters. Next, loop through the numbers and send each of them to the in channel. We do this in another goroutine. Once all the values are sent, close the done channel, signaling to our workers that they should shut down. Keep in mind that a worker returns as soon as it observes the closed done channel, so any values still waiting in the in channel at that moment may be left unprocessed. Finally, once everything is completed, the main goroutine is ready to receive all the computed values and print them to the terminal as we have done before.

The pattern demonstrated by this simple example is fairly common. Reusing goroutines makes it easier for us to control the scale of our processing logic, which is essential for large data streams. It's also common to use another signal channel which signals to the workers when it is time to shut down. You'll practice this pattern in the second challenge.
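The two approaches described in this video can be sketched roughly as follows. This is a minimal illustration, not the course's actual code: the names `process`, `perElement`, and `pool`, the squaring logic, and the worker count of 3 are all assumptions made for the example. The sketch also closes the done channel only after the main goroutine has received every result, a deliberate choice here so that no buffered input is dropped when the workers shut down.

```go
package main

import (
	"fmt"
	"sort"
)

// process is a hypothetical stand-in for the real per-element computation.
func process(n int) int { return n * n }

// perElement sketches the first approach: one goroutine per input value.
func perElement(numbers []int) []int {
	results := make(chan int) // unbuffered: each sender blocks until its value is received
	for _, n := range numbers {
		go func(n int) { results <- process(n) }(n)
	}
	out := make([]int, 0, len(numbers))
	for range numbers {
		out = append(out, <-results) // arrival order is not the input order
	}
	return out
}

// work is a reusable worker. The unidirectional channel types let the
// compiler reject a send on in or done, or a receive from out.
func work(in <-chan int, out chan<- int, done <-chan struct{}) {
	for {
		select {
		case <-done:
			return // the only exit: done was closed
		case n := <-in:
			out <- process(n)
		}
	}
}

// pool sketches the second approach: a fixed number of reused workers.
func pool(numbers []int, workerCount int) []int {
	in := make(chan int, workerCount)
	results := make(chan int, workerCount)
	done := make(chan struct{})

	for i := 0; i < workerCount; i++ {
		go work(in, results, done)
	}

	// Send the inputs from a separate goroutine so that the caller can
	// receive results while values are still being fed in.
	go func() {
		for _, n := range numbers {
			in <- n
		}
	}()

	out := make([]int, 0, len(numbers))
	for range numbers {
		out = append(out, <-results)
	}

	// Assumption of this sketch: signal shutdown only after all results
	// have arrived, so no worker returns while inputs are still pending.
	close(done)
	return out
}

func main() {
	numbers := []int{1, 2, 3, 4, 5, 6, 7, 8}
	a := perElement(numbers)
	b := pool(numbers, 3)
	// Sort before printing because results arrive in arbitrary order.
	sort.Ints(a)
	sort.Ints(b)
	fmt.Println(a)
	fmt.Println(b)
}
```

Both functions return the same set of squared values, only in an unpredictable order, which is why `main` sorts them before printing. The difference is in resource use: `perElement` starts `len(numbers)` goroutines, while `pool` never runs more than `workerCount` workers regardless of input size.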
