1 00:00:05,940 --> 00:00:12,480 Let's take a moment to look at a crate that we can use to help us parallelize our programs. 2 00:00:12,660 --> 00:00:21,720 Rayon is a lightweight data parallel laser crate that makes it super easy to convert sequential computations 3 00:00:21,720 --> 00:00:27,780 into parallel computations while guaranteeing safety from data races. 4 00:00:28,230 --> 00:00:34,350 A unique characteristic that Rayon provides is that it will automatically spawn our threads for us. 5 00:00:34,500 --> 00:00:36,960 But how does it know how many to spawn? 6 00:00:37,170 --> 00:00:45,090 Well, by default, Rayon uses the same number of threads as the number of logical CPUs available. 7 00:00:45,120 --> 00:00:47,220 Assuming Hyper-threading is enabled. 8 00:00:47,250 --> 00:00:51,180 Otherwise, it will just use the number of physical CPUs. 9 00:00:51,880 --> 00:00:58,660 So let's do a quick example of creating a program that computes the factorial of a number and compare 10 00:00:58,660 --> 00:01:02,530 the performance between single threaded and multithreaded. 11 00:01:02,800 --> 00:01:09,610 So the first thing we need to do in our cargo file is add a couple of dependencies and I've already 12 00:01:09,610 --> 00:01:10,420 added them. 13 00:01:10,420 --> 00:01:18,430 And so we need Ray on a version one and then obviously it'll take our latest version and then also NUM 14 00:01:18,430 --> 00:01:23,110 and the latest version under 0.4. 15 00:01:24,130 --> 00:01:29,950 And then inside Main, we can start importing in these dependencies. 16 00:01:29,950 --> 00:01:32,680 So we'll go ahead and get them all imported in. 17 00:01:32,830 --> 00:01:41,380 So we'll start out with Ray on Prelude and then Asterisk to take everything in and now we'll use NUM 18 00:01:42,580 --> 00:01:47,980 and we want big unsigned INT and one. 19 00:01:49,330 --> 00:01:55,180 And then lastly, we want standard time and instant. 20 00:01:55,180 --> 00:01:59,080 And this is going to be what we use to. 21 00:02:00,120 --> 00:02:04,920 Measure the time it takes for our functions to run. 22 00:02:05,700 --> 00:02:12,660 So we'll start off with creating our single threaded factorial function, and it's going to accept one 23 00:02:12,660 --> 00:02:22,380 parameter of an unsigned 32 bit integer, and we're going to resign a big unsigned integer. 24 00:02:22,380 --> 00:02:30,390 And the reason we have to use a big unsigned integer is because if we used the largest unsigned integer 25 00:02:30,390 --> 00:02:38,520 that Russell provides us, which is a unsigned 128, our factorial would actually only be able to go 26 00:02:38,520 --> 00:02:41,910 up to 35. 27 00:02:42,720 --> 00:02:45,510 I found that out by just playing around with it a little bit. 28 00:02:45,900 --> 00:02:49,860 So we needed to use a big we need to use a big unsigned integer. 29 00:02:49,860 --> 00:02:54,870 That way we can use much larger values when trying to compute our factorial. 30 00:02:55,710 --> 00:02:59,490 So when computing a factorial, we need to do a couple of checks. 31 00:02:59,490 --> 00:03:05,910 We need a check if the number passed in is equal to zero or if it equals to one. 32 00:03:05,910 --> 00:03:09,240 And if it does, then we want to return. 33 00:03:10,060 --> 00:03:13,240 A big unsigned integer one. 34 00:03:16,140 --> 00:03:26,130 Otherwise now we can actually compute our factorial and so we'll do from one to num, including our 35 00:03:26,130 --> 00:03:26,790 number. 36 00:03:27,120 --> 00:03:35,490 We want to map that value, that element into a big unsigned integer. 37 00:03:37,000 --> 00:03:40,000 And then we want to call the reduced method on this. 38 00:03:40,300 --> 00:03:45,460 And I will explain to you what reduces in a second. 39 00:03:47,470 --> 00:03:48,970 I just want to get this set up. 40 00:03:48,970 --> 00:03:51,580 That way we can look at it. 41 00:03:54,790 --> 00:03:56,350 And what do we have here? 42 00:03:56,650 --> 00:03:58,330 No function. 43 00:03:59,280 --> 00:04:01,570 How could this need to be capitalized? 44 00:04:01,600 --> 00:04:02,260 There we go. 45 00:04:02,770 --> 00:04:06,880 All right, so let's let's break down this code. 46 00:04:07,480 --> 00:04:14,860 So again, we created one to NUM, which is also inclusive, since we have this equal sign and then 47 00:04:14,860 --> 00:04:19,810 we map it into a big unsigned integer. 48 00:04:21,050 --> 00:04:23,210 And then we call the reduced function on it. 49 00:04:23,300 --> 00:04:28,730 And the reducing function is a closure with two arguments. 50 00:04:28,730 --> 00:04:36,590 So we have argument one here and argument to here and argument one is going to be the accumulator and 51 00:04:36,590 --> 00:04:42,530 then X is also is going to be the element that we're passing in from here. 52 00:04:42,620 --> 00:04:50,180 So what the accumulator is going to do is we're going to say, Hey, take the accumulator and then multiply 53 00:04:50,180 --> 00:04:57,490 it by X, and then now that's our accumulator, and then just continuously do this until we equal NUM 54 00:04:57,710 --> 00:05:00,320 and then reduce returns and option. 55 00:05:00,320 --> 00:05:04,490 So we have to unwrap it in order to get the value inside of it. 56 00:05:05,720 --> 00:05:14,960 So let's create a test program of this or a simple test just to make sure that it is working correctly. 57 00:05:15,410 --> 00:05:18,650 And the factorial of three is six. 58 00:05:18,650 --> 00:05:21,590 So if this runs correctly, we should see six. 59 00:05:22,330 --> 00:05:24,910 Which is what we have printed out so great. 60 00:05:24,910 --> 00:05:28,150 So we know that it's working correctly, but just to verify. 61 00:05:28,180 --> 00:05:31,630 Let's also do four and we see 24. 62 00:05:31,630 --> 00:05:32,470 So awesome. 63 00:05:32,470 --> 00:05:34,510 So everything's working properly. 64 00:05:35,020 --> 00:05:40,030 So now let's take this function and let's use rayon. 65 00:05:40,030 --> 00:05:44,050 That way we can have the code running in parallel. 66 00:05:45,070 --> 00:05:56,350 So we'll just call it multi factor multithreaded factorial and our setup is going to be extremely similar. 67 00:05:56,350 --> 00:06:05,260 So we're going to take in an unsigned 32 bit integer and now we're going to also return a big unsigned 68 00:06:05,260 --> 00:06:05,890 integer. 69 00:06:05,890 --> 00:06:08,680 So let's do our check again 70 00:06:12,670 --> 00:06:13,750 return. 71 00:06:15,560 --> 00:06:18,530 One and then otherwise. 72 00:06:20,730 --> 00:06:25,050 We want one to numb, inclusive. 73 00:06:25,440 --> 00:06:34,860 And now where rayon comes in is we're going to call in pas etre and this is going to create an end to 74 00:06:34,860 --> 00:06:35,850 iterator for us. 75 00:06:35,850 --> 00:06:44,430 But PAS stands for parallel, so we're going to create a parallel iterator and now we're going to map 76 00:06:45,900 --> 00:06:48,360 big unit from like we did above. 77 00:06:49,810 --> 00:06:52,420 And now we're going to call reduce again. 78 00:06:52,690 --> 00:06:57,550 And before I fill in what reduces, let's look at the function signature up here. 79 00:06:57,550 --> 00:07:06,370 And we see that we have a disappeared, a mute, mutable self and then F of type F. 80 00:07:06,490 --> 00:07:11,440 Well, if we look at this reduced function, we actually see the function signatures different. 81 00:07:11,450 --> 00:07:17,560 So we see that we're going to take two parameters in this case. 82 00:07:18,130 --> 00:07:20,250 So we need to modify it that way. 83 00:07:20,260 --> 00:07:24,580 And then this is going to allow us for parallel execution. 84 00:07:24,580 --> 00:07:30,190 So this reduced function actually comes inside the Rayon Library. 85 00:07:30,520 --> 00:07:36,490 So let's set up this reduced function to allow it to be happy. 86 00:07:36,490 --> 00:07:41,260 So we're going to have a big unit of one. 87 00:07:42,310 --> 00:07:45,490 We're going to have our accumulator. 88 00:07:45,490 --> 00:07:48,430 We're going to have our X value the same as above. 89 00:07:48,430 --> 00:07:51,790 And then we're going to have accumulator times X. 90 00:07:52,360 --> 00:07:57,810 So what reduces and let's see if it gives us a good example here. 91 00:07:57,820 --> 00:08:06,430 So we have to include an identity parameter, which is what we're doing with big unit and then one. 92 00:08:08,190 --> 00:08:11,970 And then the identity iterator is going to be where? 93 00:08:13,230 --> 00:08:14,640 We start out at. 94 00:08:14,640 --> 00:08:18,900 So we'll have one and that's where our accumulator is going to start at. 95 00:08:18,900 --> 00:08:28,230 So it's just a different function signature provided to us by the Reon Crate to essentially do this 96 00:08:28,230 --> 00:08:28,770 up here. 97 00:08:28,770 --> 00:08:36,120 It's just typed out slightly different and in this case it's just going to return our item to us so 98 00:08:36,120 --> 00:08:38,160 we do not have to unwrap this. 99 00:08:39,980 --> 00:08:49,100 So now let's look at this and make sure that our factory rules are computing the same value. 100 00:08:50,220 --> 00:08:52,560 So we see 24 and both. 101 00:08:52,560 --> 00:08:58,170 And then just to make sure, let's do three again and make sure each one prints out six. 102 00:08:58,260 --> 00:08:58,980 So great. 103 00:08:58,980 --> 00:09:07,890 So both of our factorial functions are printing out the correct values as we expect. 104 00:09:07,890 --> 00:09:10,710 So now let's create a benchmark for this. 105 00:09:11,070 --> 00:09:15,180 And this is where our time incident comes into play. 106 00:09:15,180 --> 00:09:19,710 So now we're going to create an instant and assign it now. 107 00:09:22,190 --> 00:09:28,130 And now we're going to calculate our factorial of 50,000. 108 00:09:29,690 --> 00:09:34,520 And now we want to print out how much time it takes for that computation. 109 00:09:34,520 --> 00:09:38,210 So we'll give it a little bit of styling. 110 00:09:38,210 --> 00:09:40,040 So we want it to print out. 111 00:09:42,510 --> 00:09:53,520 Two decimals into the second as seconds are just two decimals and we'll say now dot elapsed to get the 112 00:09:53,520 --> 00:09:57,450 time differentiation so from time now to time. 113 00:09:57,570 --> 00:10:03,990 Now here as well and then the difference between the two so the time elapsed and now we want to do this 114 00:10:03,990 --> 00:10:10,320 again but we want to do it with our multi factorial. 115 00:10:12,900 --> 00:10:25,230 And so if we run this now, the factorial of 50,000 single threaded is going to take 4.9 2 seconds and 116 00:10:25,230 --> 00:10:33,150 our paralyzed factorial actually took 169 and a half milliseconds. 117 00:10:33,270 --> 00:10:40,800 So that just shows you how much faster it is to run programs in a parallel manner. 118 00:10:40,800 --> 00:10:48,030 So if we run it again and this time 40,000, we see that we're seeing magnitudes higher of performance 119 00:10:48,030 --> 00:10:53,430 speed ups, which is why I wanted to show you just how awesome this real crate is. 120 00:10:53,430 --> 00:11:03,150 So hopefully this was a very insightful lecture for you into the power that parallel ising our applications 121 00:11:03,600 --> 00:11:05,220 can provide for us.