We just saw some scenarios in which this whole event-based communication architecture seems to totally fall apart. In this video I'm going to answer a couple of questions you might have about all this stuff right from the get-go. We'll then take a pause, and we'll go through some possible ways of solving these problems in the video after this one. I'm going to try to keep this video a little bit shorter.

OK, so the first big question you might have. Remember, we're making use of async communication between our different services; that's why we're dealing with these events. We said earlier in the course that when we're working with microservices, we can have them communicate asynchronously with events, or synchronously with direct requests. So at this point it might sound like this async communication style is just awful, based on all the problems I just mentioned, and you might think it would be a lot easier if we just stuck to some kind of synchronous communication approach. Well, it turns out all the same stuff happens with synchronous communication. So at that point you might say, "The heck with this microservices stuff, let's just go back to building monolith-style apps." Well, it turns out all of this actually happens with monolithic applications too.

So let me show you a diagram just to expand on this first point. We're going to imagine that we're building the same kind of money-depositing application, but now with a more monolith-style approach where we're not exchanging events or anything like that. In this scenario, imagine a user makes three requests in a row: one request to deposit $70, one to deposit $40, and then one to withdraw $100. Now, even if we're using a monolith-style application, it's still incredibly likely that we're running multiple instances, or copies, of that application. So whenever a user makes these three requests, they'll probably hit a load balancer of sorts, and the requests will essentially be randomly assigned across the different instances. We can imagine that the first request goes to instance A, the second to instance B, and the third down to instance C. Each of these instances is then essentially going to race to process its incoming request.

So we might be in that same kind of scenario where maybe instance A and instance B have a lot of incoming traffic right now for whatever crazy reason, or maybe they're provisioned onto a virtual machine with lower specs than instance C down here. For whatever reason, we can very easily imagine that instance C might process the "withdraw $100" request first. So it would reach into the database and take a look at the user's balance: oh, it's zero right now. Well, once again, we're in some huge error. So just by going back to a monolith-style approach, we still deal with these concurrency issues.
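To make that race concrete, here's a minimal sketch of the kind of read-modify-write logic each instance of the monolith might be running. This is hypothetical illustration code, not anything from the course: the Db interface, handleTransaction, and all of the names are stand-ins. The point is just that between the read and the write, another instance can read the same stale balance.

```ts
// Hypothetical request handler shared by every instance of the monolith.
// `Db` stands in for a real database client; the names are illustrative only.
interface Db {
  getBalance(userId: string): Promise<number>;
  setBalance(userId: string, balance: number): Promise<void>;
}

async function handleTransaction(
  db: Db,
  userId: string,
  amount: number // positive for a deposit, negative for a withdrawal
): Promise<void> {
  // 1. Read the current balance.
  const balance = await db.getBalance(userId);

  // 2. Validate a withdrawal against the balance we just read.
  if (amount < 0 && balance + amount < 0) {
    throw new Error('Insufficient funds');
  }

  // 3. Write the new balance back.
  // The race: if instance C runs steps 1 through 3 for "withdraw $100" before
  // instances A and B have written their deposits, the balance it read in
  // step 1 is still $0, so the withdrawal is wrongly rejected. Two instances
  // can also read the same stale balance and overwrite each other's writes.
  await db.setBalance(userId, balance + amount);
}
```

Nothing about this depends on events or NATS; it's just a concurrent read-modify-write against shared state, which is exactly what the monolith's instances are doing here.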
It just turns out that when we start using this microservices, event-based approach, the same concurrency issues become a bit more prominent, because we're now adding the extra latency of the NATS server, and we're also talking about the possibility of automatic retries, or redeliveries, of these different events. Because the system becomes more complex, because there are additional communication hops inside of here, the whole concurrency issue becomes a little more prominent. Well, to be honest, a lot more prominent. But it's still an issue even if we went back to an old-style approach.

So next up, you might come up with an immediate solution. If I asked you, "Hey, solve this problem," here's one possible answer you might land on. It's a very common solution people come up with, and then they very quickly realize, "Oh wait, that won't quite work out." When we were going through all those different scenarios where everything would fail, a lot of the issues seemed to stem from the fact that we had two separate copies of a service processing events, because then there was a scenario where maybe one was slower than the other, or had communication issues with its file storage, or whatever else. So a very common proposal is: let's just run one copy of the service, one single instance. Think about what would happen if we did that. We throw the second instance out, and now we're just running one copy of the accounts service. As we start to publish events (deposit, deposit, withdraw), they'll all go to that one instance, which will process them one after another. But even then, there's still a possibility of failure here.

For example, on that first event we could have some issue opening up the file because of some temporary problem with the hard drive, so we might fail to process that first event entirely, and it essentially gets thrown back over. Well, it doesn't actually get thrown back over, but we never acknowledge it, and so NATS figures it has to deliver it again. In the meantime we might successfully process the $40 deposit and then the $100 withdrawal, and boom, we're right back to the same issue as before.
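For reference, here's a rough sketch of what that single-copy listener could look like, assuming the node-nats-streaming client used in this course. The subject name, queue group, client IDs, and the applyTransaction helper are all hypothetical stand-ins rather than the course's actual code. The detail that matters is manual ack mode: if processing fails we never call msg.ack(), so NATS redelivers the event after the ack wait expires, and by then later events may already have been applied.

```ts
import nats, { Message } from 'node-nats-streaming';

// Hypothetical processing step: in the video's example this would read the
// current balance from a file, apply the amount, and write the file back.
async function applyTransaction(data: { type: string; amount: number }): Promise<void> {
  // ...open the file, update the balance, save it...
}

const stan = nats.connect('ticketing', 'accounts-service', {
  url: 'http://localhost:4222',
});

stan.on('connect', () => {
  const options = stan
    .subscriptionOptions()
    .setManualAckMode(true) // we decide when an event counts as processed
    .setAckWait(5 * 1000);  // if we never ack, NATS redelivers after ~5 seconds

  const subscription = stan.subscribe(
    'transaction:created', // hypothetical subject name
    'accounts-service-queue-group',
    options
  );

  subscription.on('message', async (msg: Message) => {
    const data = JSON.parse(msg.getData() as string); // e.g. { type: 'deposit', amount: 70 }

    try {
      await applyTransaction(data);
      msg.ack(); // only acknowledge on success
    } catch (err) {
      // Temporary failure (say, the file cannot be opened): we do NOT ack.
      // NATS will redeliver this event later, quite possibly after the
      // "deposit $40" and "withdraw $100" events have already been processed,
      // so even a single instance can end up applying events out of order.
    }
  });
});
```

So a single copy narrows the window for trouble, but it doesn't close it, and as we're about to see, it creates a much bigger problem of its own.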
So this is not foolproof. At least it kind of addresses the issues we ran into when we were running multiple instances, but the remaining possibility of failure is not even the real big issue here. The big issue is that if we're only going to run one copy of the service, then all of a sudden we've got a processing bottleneck inside of our application. If we can only run one copy of the service, we are very severely constrained in how quickly our application can process data and in how we can scale it up. Remember, there are generally two ways to scale up an application. We can scale it vertically, where we increase the specs dedicated to it (the amount of CPU and processing power the service gets), or horizontally, where we create more copies of the service. So if we decide right from the outset that we're only ever going to run one copy of the accounts service, all of a sudden we cannot scale horizontally, and that can end up being a huge, catastrophic issue down the line as our app gets more popular and gets more traffic. So solution number one is not going to work; it's not an option. We always have to assume that we're going to be running multiple copies of any given service.

Possible solution number two is also not going to work. Now, I don't really have a distinct plan here; this is more just a note I want to throw out very quickly. For possible solution number two, you might try to figure out every possible concurrency issue inside of your app, and you might decide, "Hey, we're going to find all of these possible issues and write code to handle every single last one." Well, I just want you to get this in your head right now; I really want you to internalize this: inside of any application, there is a possibly infinite number of concurrency issues.
There are a lot of different things that can possibly go wrong, and you really cannot feasibly sit down and write code to handle every last issue. Now, of course, there are exceptions to this. If you're building some kind of spaceship, something that is absolutely critical in nature and always has to work no matter what, then of course there are exceptions. But if you're building some kind of Twitter clone or something like that, does it really matter if, say, two tweets end up out of order? Does it matter if two tweets are duplicated, or two forum posts, or two blog posts? Does that kind of stuff really matter at the end of the day? And are you going to dedicate a huge amount of engineering time and money to solving that problem? Well, I can tell you right now: you're probably not. So there's a certain point where you start to identify possible issues and say, "You know what, realistically, that is just not likely to happen." You know, it's not likely for a user to try to create five tweets at the same time while also deleting two of them and editing the other three, or something like that. So you really have to sit down and make an engineering judgment: is this actually worth trying to solve? Because a lot of the time you're inevitably going to say no, it's not. So this isn't really a solution per se; I just want to throw it out there right away. Because even in the app that we're going to build, you might start to say, "Hey, Stephen, wait a minute: in our ticketing application, if a user creates a ticket and then does this, and then this, and then this, they can end up with an issue if the timing is just right and some service fails at the same time." OK, I accept that there might be holes, there might be gaps. But again, at the end of the day, we probably cannot write code to capture every single case, because, number one, it just might not matter, and number two, it might take too much engineering time to fix.

OK, so again, I tried to keep this a short video. It was eight minutes, so I still kind of failed. Oh well. All right, let's pause right here. We'll come back in the next video and take a look at some possible strategies to solve these big issues.