Multiprocessing
- [Instructor] You're probably already great at multiprocessing in Python and don't even know it. Here, I'll show you. I have a file called 1000seconds.py. All it does is call time.sleep for a thousand seconds. So I'm going to open a second tab and run it there as well. Now we have two tabs running this program. Great. Two Python processes running independently on my machine: multiprocessing in Python. All right, so that's how you do that, on to the next subject then. Just kidding. Yes, I do have two separate Python processes running, but I had to start them by hand. How do we write a program to start, stop, and manage these for us? Well, conveniently, there's a module that's very similar to the threading module we used previously. Let's check it out. That module is called multiprocessing. From multiprocessing import Process.
Okay, so before I run this, there is a small hitch with using the official Python multiprocessing module. On some operating systems, notably the ones that start each new process as a fresh interpreter, such as Windows and macOS, you can't use it to spin up a new process that runs a function defined right there in the notebook, as opposed to one imported at the top from a module, like this: import myFunction. That's going to make life difficult for example purposes, where we want to define and run functions in the same Jupyter notebook. Fortunately, there is a third-party module that solves this, called multiprocess, and you can install it with pip install multiprocess. The multiprocess module offers the same API and is used exactly the same way as multiprocessing, but it isn't picky about where the function is defined. So play around with it and use either of them for these examples, but I'm going to use multiprocess, and I'm also going to import the time module.
This Process class is so similar to the Thread class we used earlier that I can actually just copy this code. So I'm going to do that and paste it down here. Now, instead of threading.Thread, we want Process.
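To make the start, wait, and check cycle concrete, here is a minimal sketch that uses the standard multiprocessing module. The function name `nap` and the one-second sleep are made up for illustration; they stand in for what 1000seconds.py does. The sketch is meant to be run as a script, where the `if __name__ == "__main__":` guard keeps child processes from re-running the launch code on platforms that start a fresh interpreter. In a notebook, swapping the import for `from multiprocess import Process` should behave the same way.

```python
import os
import time
from multiprocessing import Process


def nap(seconds):
    """Sleep like 1000seconds.py does, then report which process did the sleeping."""
    time.sleep(seconds)
    pid = os.getpid()
    print(f"process {pid} slept for {seconds} seconds")
    return pid  # Process discards the return value of its target


if __name__ == "__main__":
    # Same constructor shape as threading.Thread(target=..., args=...).
    p1 = Process(target=nap, args=(1,))
    p2 = Process(target=nap, args=(1,))

    p1.start()  # each start() launches a separate Python process
    p2.start()
    print("children:", p1.pid, p2.pid, "parent:", os.getpid())

    p1.join()  # block until each child has exited
    p2.join()
    print("exit codes:", p1.exitcode, p2.exitcode)
```

Each child reports a different process ID from the parent, which is the same situation as the two terminal tabs, only started and waited on by the program itself.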
So replace both of those, and instead of t1 and t2, let's call them p1 and p2. This should work exactly the same now. And there we go. Oh, there aren't any results. But of course, remember, processes don't share memory. Each one gets its own copy of this dictionary in its own separate memory space, and we have no way of accessing it unless the process records it somewhere, like a file system or a database. One thing we can do, though, is print the computed value from within the function itself. So rather than returning this or saving it in the results, we just print, and there we go. But you see it printed them right next to each other. That's not 14, it's a 1 and a 4. What happens if we add another line in here, like "Finished computing"? So it prints the 1 and the 4 very quickly, then both processes print "Finished computing", and then we have this extra, weird new line.
Well, what happens if we add 10 processes to the mix? Again, I'm going to use the same pattern that we used previously with the threads. So processes is equal to a list of Process objects, one for each n in range(0, 10), then p.start() for p in processes, and p.join() for p in processes, and we don't need to bother printing the results. Oh, whoops, bad syntax there. Okay. And you can see this output starts to look a little bit funky. New lines aren't where you expect them to be. There's overlap between the separate function calls.
If we copy our threads code over, we can do a similar thing with that. Let me just copy this, and I'm going to import the threading module at the top. Okay. We're going to use the same long square function, so it should print things out. Oh, and let's just do zero up to 10 there. So that actually looks a little bit nicer than with the multiple processes. So what's going on? We usually talk about threads and processes as computing things in parallel. And with processes, that's true.
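The empty results dictionary can be reproduced with a sketch like the following. It assumes a helper named `long_square` that takes a number and a dictionary; both the name and the body are guesses at the code used in the video, which the transcript does not show. It uses the standard multiprocessing module and is meant to be run as a script; in a notebook, `from multiprocess import Process` is the swap described earlier.

```python
import time
from multiprocessing import Process


def long_square(num, results):
    """Store num squared in whichever dictionary this process can see, and print it."""
    time.sleep(0.5)
    results[num] = num ** 2
    print(f"{num} squared is {num ** 2}")


results = {}  # this dictionary lives in the parent process

if __name__ == "__main__":
    # One process per number, built with the same pattern used for threads.
    processes = [Process(target=long_square, args=(n, results)) for n in range(0, 10)]

    for p in processes:
        p.start()
    for p in processes:
        p.join()

    # Each child wrote to its own copy, so the parent's dictionary is untouched.
    print(results)
```

The final print shows an empty dictionary, because every child filled in a private copy. The order and layout of the lines printed by the children can differ from run to run.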
Modern computers have multiple processor cores, and you're literally asking for multiple cores to process your tasks in parallel. However, with threads in standard Python, only one thread runs Python code at any given moment: the interpreter will execute a bit of thread A, then thread B, then thread C, then thread A again, picking them up in roughly round-robin fashion. It will also go to work on a different thread if one of them is hanging around waiting for something, like a time.sleep. So threading emulates parallel computing, and it can be very powerful if your programs have periods of downtime where they're waiting for something.
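One way to see the benefit of threads during downtime is to time them. The sketch below is illustrative only: the names `wait_then_square` and `run_threads` and the half-second delay are assumptions, not code from the video. Each thread spends its time in time.sleep, so the waits can overlap.

```python
import threading
import time


def wait_then_square(num, results, delay):
    """Wait, then store the square in the shared dictionary."""
    time.sleep(delay)  # while this thread waits, other threads get to run
    results[num] = num ** 2  # threads share memory, so the caller sees this write


def run_threads(count, delay):
    """Start `count` waiting threads and return their results and the elapsed time."""
    results = {}
    threads = [
        threading.Thread(target=wait_then_square, args=(n, results, delay))
        for n in range(count)
    ]
    started = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - started
    return results, elapsed


results, elapsed = run_threads(10, 0.5)
print(results)
print(f"{elapsed:.2f} seconds for ten half-second waits")
```

Run one after another, ten half-second waits would take about five seconds. With threads, the total should be close to half a second, and unlike the process version, the results dictionary is filled in because all threads share the same memory.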