1 00:00:01,680 --> 00:00:06,210 At this point, we've been running these two listeners for the last couple of videos or so. 2 00:00:06,210 --> 00:00:11,340 And in the last couple of videos, you might have noticed that sometimes you published an event 3 00:00:11,580 --> 00:00:14,940 and did not see it immediately appear inside one of these listeners. 4 00:00:14,940 --> 00:00:17,040 Let's see if I can replicate that really quickly. 5 00:00:17,190 --> 00:00:19,230 So I'm going to first publish a normal event. 6 00:00:19,860 --> 00:00:20,550 OK, I see 7 00:00:20,550 --> 00:00:22,050 number 70 right there. 8 00:00:22,050 --> 00:00:27,990 I'm now going to restart both these listeners very, very quickly and then publish an event very quickly 9 00:00:27,990 --> 00:00:28,750 after that. 10 00:00:28,800 --> 00:00:33,600 And I bet you we are not going to see event number 71 up here right away. 11 00:00:33,600 --> 00:00:37,500 If I do see 71 up here right away, I'm going to restart the publisher yet again and then see 12 00:00:37,500 --> 00:00:40,060 if event 72 also shows up really quickly. 13 00:00:40,170 --> 00:00:48,260 Let's just give it a shot. I'm going to restart, restart, publish, and you'll notice I do not see 71. 14 00:00:48,310 --> 00:00:54,240 Now if we wait for about 30 seconds or so, we might see 71 up here in one of these windows. I 15 00:00:54,390 --> 00:00:55,560 can even do another publish. 16 00:00:55,590 --> 00:00:55,850 OK. 17 00:00:55,860 --> 00:00:59,390 There's 72, but without a doubt 71 has been skipped. 18 00:00:59,400 --> 00:01:02,360 So where did event 71 go off to? 19 00:01:02,370 --> 00:01:04,620 Why are we not seeing it appear anywhere? 20 00:01:04,620 --> 00:01:06,950 Well, we're going to do a little bit of digging. 
21 00:01:06,990 --> 00:01:11,040 I want you to get kind of a natural understanding of what's going on internally with NATS Streaming Server, 22 00:01:11,070 --> 00:01:12,750 and hey, there's 71 right there. 23 00:01:12,800 --> 00:01:14,740 Why did it take so long, right? 24 00:01:14,760 --> 00:01:18,930 So again, I want to make sure you have a little bit better idea of what's going on internally 25 00:01:18,930 --> 00:01:20,160 with NATS Streaming Server. 26 00:01:20,220 --> 00:01:24,920 So it will require us to do a little bit of debugging. Here's the first thing I want to begin with. 27 00:01:25,080 --> 00:01:30,990 You might recall that when we put our NATS deployment file together inside of our k8s directory a little 28 00:01:30,990 --> 00:01:35,390 bit ago, on our service we actually exposed two different ports. 29 00:01:35,880 --> 00:01:39,510 So one was for connecting clients; that was port 4222. 30 00:01:39,510 --> 00:01:43,640 There was another port named monitoring on 8222. 31 00:01:43,890 --> 00:01:47,370 So we can access the NATS Streaming Server on this monitoring port, 32 00:01:47,370 --> 00:01:51,210 and it's going to give us a lot of information about all the different subscriptions that have been 33 00:01:51,210 --> 00:01:55,650 created, all the different clients, some incoming traffic statistics, stuff like that. 34 00:01:56,160 --> 00:02:00,480 So we're going to take a look at the information that is served to us by this monitoring stuff to get 35 00:02:00,480 --> 00:02:07,840 a better idea of why we are sometimes missing events. To do so, the first thing we have to do is expose 36 00:02:08,230 --> 00:02:13,510 port 8222 on our local machine so that we can access it and take a look at that monitoring information. 37 00:02:14,490 --> 00:02:15,600 So, inside my terminal. 
38 00:02:15,600 --> 00:02:21,150 Once again, I'm going to go back to that window where we had previously set up the port forwarding for port 39 00:02:21,150 --> 00:02:22,460 4222. 40 00:02:22,650 --> 00:02:26,980 We're going to repeat that same process but for port 8222 instead. 41 00:02:27,010 --> 00:02:29,670 So I'm going to open up another terminal window. 42 00:02:29,760 --> 00:02:31,770 We're going to do that same port forwarding stuff. 43 00:02:31,980 --> 00:02:37,930 So remember, we're going to first do a kubectl get pods. I'm going to find my NATS deployment pod. 44 00:02:38,070 --> 00:02:40,400 I'm going to copy the name. 45 00:02:40,620 --> 00:02:44,950 We'll then do a kubectl port-forward. 46 00:02:45,270 --> 00:02:51,530 I'm going to paste in the pod name and then 8222:8222. 47 00:02:54,490 --> 00:03:00,670 We can now open our browser and navigate to localhost:8222, and that should print out some monitoring 48 00:03:00,700 --> 00:03:03,630 information about our running NATS server. 49 00:03:03,820 --> 00:03:07,290 Now we're going to go to a very specific address inside of our browser. 50 00:03:07,300 --> 00:03:10,830 I'm going to put it into my editor here really quickly just so you can read it very easily. 51 00:03:10,900 --> 00:03:18,850 We're going to go to localhost:8222/streaming. So I'm going to take that URL, put it into 52 00:03:18,850 --> 00:03:22,830 my browser, and I'll see something like this right here. 53 00:03:22,850 --> 00:03:26,480 So this is the NATS Streaming Server monitoring page. 
54 00:03:26,480 --> 00:03:31,970 We can click around to these different pages and see some stats or information about our running streaming 55 00:03:31,970 --> 00:03:38,680 server. Now you'll notice that whenever I click one of these tabs or one of these little links, I see some 56 00:03:38,680 --> 00:03:44,590 nicely formatted JSON data. You might see just a big blob of text. I see this nicely formatted because 57 00:03:44,590 --> 00:03:49,720 I'm running a Chrome extension that automatically formats JSON inside my browser. If you want to run 58 00:03:49,720 --> 00:03:54,100 the same extension, I would encourage you to hop on the Chrome extension store and just search for JSON. 59 00:03:54,130 --> 00:03:59,440 You'll find one very similar very easily. So there's a couple of links on this page that I want to highlight 60 00:03:59,440 --> 00:04:00,420 in particular. 61 00:04:00,430 --> 00:04:05,950 First off is clients. When I go to clients, it's going to print out the information about every client 62 00:04:05,950 --> 00:04:10,910 that is currently connected to my NATS Streaming Server, and in particular it will show the ID of each 63 00:04:10,910 --> 00:04:13,480 of them. At the very bottom of this array, 64 00:04:13,510 --> 00:04:22,090 I see one client that has an ID of ABC, and you might recall that our publisher has a client ID of 65 00:04:22,120 --> 00:04:28,730 ABC, so this entry right here represents our publisher that is currently connected to NATS Streaming Server. 66 00:04:29,660 --> 00:04:36,290 These two right here represent the two listeners that we currently have running. We can also take a look 67 00:04:36,290 --> 00:04:38,070 at channels over here. 68 00:04:38,240 --> 00:04:42,260 You'll notice that on the channels tab is a list of all the active channels currently inside of our 69 00:04:42,260 --> 00:04:43,080 cluster. 
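If you would rather read these pages from a script than click around in the browser, here is a minimal sketch. It assumes the port-forward to 8222 is still running. The page names `clientsz` and `channelsz`, the `clients` / `id` field names, and the sample payload are assumptions based on what the monitoring page typically serves, so check them against the links and JSON you actually see. The two listener IDs in the sample are made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the port-forward from the video is active, so the
# monitoring port is reachable on localhost:8222.
MONITORING_BASE = "http://localhost:8222"


def monitoring_url(page="", **params):
    """Build a URL under /streaming, optionally with a query string."""
    url = f"{MONITORING_BASE}/streaming"
    if page:
        url += f"/{page}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(page, **params):
    """Fetch one monitoring page and decode its JSON body (needs the port-forward)."""
    with urlopen(monitoring_url(page, **params), timeout=5) as response:
        return json.load(response)


def client_ids(clients_payload):
    """Pull the client IDs out of a clients payload (assumed shape)."""
    return [client["id"] for client in clients_payload.get("clients", [])]


# Hypothetical payload: two listeners with made-up IDs, then the publisher.
SAMPLE_CLIENTS = {
    "clients": [{"id": "3f9a1c2e"}, {"id": "b07d44e1"}, {"id": "ABC"}]
}

if __name__ == "__main__":
    print(monitoring_url("clientsz"))
    print(client_ids(SAMPLE_CLIENTS))
```

With the port-forward running, `client_ids(fetch_json("clientsz"))` would list the live clients instead of the sample ones.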
70 00:04:43,090 --> 00:04:49,630 So right now we have just one channel of ticket:created. Now there's a little bit more information 71 00:04:49,630 --> 00:04:55,810 we can extract about these channels, and unfortunately it is not super well documented. 72 00:04:55,900 --> 00:04:59,500 In other words, there's not really any link on this page to help you understand how to get here. 73 00:04:59,590 --> 00:05:04,150 So I'm going to click on the channels link and then inside the address bar I'm going to put in a 74 00:05:04,150 --> 00:05:08,400 question mark and then subs=1. 75 00:05:08,500 --> 00:05:11,290 Like so. Now I'm going to navigate there. 76 00:05:11,320 --> 00:05:15,550 So this is going to print out a lot more information about each of the channels that NATS Streaming Server 77 00:05:15,550 --> 00:05:17,090 is currently running. 78 00:05:17,110 --> 00:05:22,000 So now we see, OK, we've got a channel with the name of ticket:created and we can see that there are 79 00:05:22,000 --> 00:05:24,610 two subscriptions for this channel. 80 00:05:24,610 --> 00:05:32,830 So there are two subscriptions, one for each of the two listeners that we are currently running. 81 00:05:32,900 --> 00:05:38,980 They are both members of the same queue group because they have the same queue group name. There is an ack_wait 82 00:05:38,980 --> 00:05:43,000 property, which is the number of seconds that NATS Streaming Server is going to wait after sending 83 00:05:43,000 --> 00:05:49,690 a message for that client or the subscription to ack, or acknowledge, the message. 84 00:05:49,690 --> 00:05:53,350 And then there's some other information in here that you can kind of figure out on your own if you 85 00:05:53,350 --> 00:05:54,740 want to. 
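To make the subscription details easier to scan, here is a small sketch that boils a channels response (the one requested with `subs=1`) down to a per-channel summary. The sample payload is hypothetical: the field names `channels`, `subscriptions`, `client_id`, `queue_name`, `ack_wait`, and `is_offline` are assumed to match the JSON on the page, and the client IDs, queue group name, and the 30 second `ack_wait` are made up for illustration.

```python
# Hypothetical response shape for the channels page with subs=1.
SAMPLE_CHANNELS = {
    "channels": [
        {
            "name": "ticket:created",
            "subscriptions": [
                {
                    "client_id": "3f9a1c2e",
                    "queue_name": "example-queue-group",
                    "ack_wait": 30,
                    "is_offline": False,
                },
                {
                    "client_id": "b07d44e1",
                    "queue_name": "example-queue-group",
                    "ack_wait": 30,
                    "is_offline": False,
                },
            ],
        }
    ]
}


def summarize_subscriptions(payload):
    """Map each channel name to counts and settings of its subscriptions."""
    summary = {}
    for channel in payload.get("channels", []):
        subs = channel.get("subscriptions") or []
        summary[channel["name"]] = {
            # How many subscriptions the server currently lists.
            "total": len(subs),
            # How many of those the server has flagged as offline.
            "offline": sum(1 for sub in subs if sub.get("is_offline")),
            # Distinct queue group names; one name means one shared group.
            "queue_groups": sorted(
                {sub["queue_name"] for sub in subs if sub.get("queue_name")}
            ),
            # Distinct ack_wait values, in seconds.
            "ack_wait_seconds": sorted(
                {sub["ack_wait"] for sub in subs if "ack_wait" in sub}
            ),
        }
    return summary


if __name__ == "__main__":
    for name, info in summarize_subscriptions(SAMPLE_CHANNELS).items():
        print(name, info)
```

Watching the `total` count over time is a quick way to spot a subscription that belongs to a client that is no longer running.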
86 00:05:54,750 --> 00:05:59,940 So how is this going to help us diagnose this issue with restarting these listeners and then 87 00:05:59,940 --> 00:06:03,150 publishing events and not seeing the event immediately appear? 88 00:06:03,150 --> 00:06:05,370 Well, let me show you what's going on. 89 00:06:06,320 --> 00:06:10,940 I want you to take note that inside of my subscription list right now I have two running subscriptions. 90 00:06:11,420 --> 00:06:17,560 I'm then going to go back over. I'm going to restart one of the listeners. It gets restarted, and now I'm going 91 00:06:17,570 --> 00:06:23,790 to go back over and refresh this page, and as soon as I do you'll now see that there are three running 92 00:06:23,790 --> 00:06:28,550 subscriptions. 93 00:06:28,630 --> 00:06:34,850 It looks like none of them are offline, so they all have is_offline of false. So right now, 94 00:06:34,850 --> 00:06:39,580 when I restarted that listener, I without a doubt stopped a listener. I stopped it. 95 00:06:39,590 --> 00:06:44,290 I restarted one, so the copy of that listener, that client that I was running with the subscription, it 96 00:06:44,330 --> 00:06:46,790 was closed down and I created a new one. 97 00:06:46,970 --> 00:06:53,180 But for a brief period of time NATS Streaming Server is just going to assume that maybe there is some momentary 98 00:06:53,240 --> 00:06:56,900 interruption in connection or communication with that client. 99 00:06:56,900 --> 00:07:00,830 So for a very brief period of time NATS Streaming Server says, oh, you know what? 100 00:07:00,890 --> 00:07:05,630 That client, that subscription just went offline, but I'm sure it'll be back in just a moment or 101 00:07:05,630 --> 00:07:06,200 two. 102 00:07:06,260 --> 00:07:11,570 And so it's going to sit around and wait and wait and wait and wait until eventually it decides, you 103 00:07:11,570 --> 00:07:14,180 know what, that connection that just went offline. 
104 00:07:14,210 --> 00:07:15,630 They're probably not coming back. 105 00:07:15,680 --> 00:07:20,800 And so after some period of time the subscription will eventually be removed from this list. 106 00:07:20,840 --> 00:07:22,550 So it's been about 30 seconds or so. 107 00:07:22,550 --> 00:07:24,290 That's how long I've been yapping on. 108 00:07:24,390 --> 00:07:30,230 So I'm going to now refresh the page, and now it's back to two. So that's what's going on behind the scenes. 109 00:07:30,230 --> 00:07:35,600 That's why we are seeing some messages being temporarily lost. NATS Streaming Server thinks that that client 110 00:07:35,600 --> 00:07:39,580 is still around even though we just completely killed it. 111 00:07:39,590 --> 00:07:41,980 So there's some period of time where NATS Streaming Server is going to say, you know what? 112 00:07:41,990 --> 00:07:44,570 I think that thing might be coming back at any point in time. 113 00:07:44,570 --> 00:07:48,680 I'm just going to hold onto this event and send it to them when they come online. 114 00:07:48,800 --> 00:07:53,100 And that's why it appears that we are losing some messages for some period of time. Now, 115 00:07:53,120 --> 00:07:56,570 as you can imagine, that is not super desirable. 116 00:07:56,570 --> 00:08:01,850 So how can we help NATS Streaming Server understand that when one of these clients goes offline, it's not coming 117 00:08:01,850 --> 00:08:02,750 back? 118 00:08:02,750 --> 00:08:04,850 Well, there's two things we can do. 119 00:08:04,910 --> 00:08:10,610 The first thing we can do, we kind of already did back inside of our NATS deployment file inside the 120 00:08:10,610 --> 00:08:16,650 k8s directory. You might recall that we had that long list of arguments inside of here. 
121 00:08:16,660 --> 00:08:25,120 We had arguments of hbi, hbt, and hbf. hb stands for heartbeat. A heartbeat is like a little request that 122 00:08:25,130 --> 00:08:30,470 NATS Streaming Server is going to send to all of its different connected clients every so many seconds. 123 00:08:30,490 --> 00:08:35,200 This is purely a little health check. It makes sure that each of these clients is still up and running. 124 00:08:36,560 --> 00:08:43,070 hbi is how often NATS Streaming Server is going to make a heartbeat request to each of its clients, hbt 125 00:08:43,070 --> 00:08:50,060 is how long each client has to respond, and hbf is the number of times that each client can fail before 126 00:08:50,060 --> 00:08:54,250 NATS Streaming Server is going to assume that that connection is dead and gone. 127 00:08:54,260 --> 00:09:00,560 So when we restart one of these listeners, NATS Streaming Server is then, at some point in the next couple of 128 00:09:00,560 --> 00:09:05,360 seconds, like zero to five seconds, going to take whatever listener we killed and send 129 00:09:05,360 --> 00:09:10,980 it a heartbeat request. We're then going to have to wait for that killed process to fail the health check 130 00:09:10,980 --> 00:09:16,010 or the heartbeat twice in a row before NATS Streaming Server eventually says, OK, that thing must really 131 00:09:16,010 --> 00:09:19,160 be dead, I'm going to take it off this list of subscriptions. 132 00:09:19,160 --> 00:09:21,790 So that's the first thing we can do to kind of fix this problem. 133 00:09:21,800 --> 00:09:28,040 We can implement tighter heartbeat checks, but even then we still have to wait for 10 seconds or so for this 134 00:09:28,040 --> 00:09:29,420 thing to actually be cleaned up. 135 00:09:29,810 --> 00:09:35,870 So there must be another way to kind of tell NATS Streaming Server, hey, our client's going dead, don't consider 136 00:09:35,870 --> 00:09:38,830 us to be receiving any messages anymore. 
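To put rough numbers on that wait, here is a back-of-the-envelope sketch. It uses a deliberately simplified model that is an assumption, not the server's actual algorithm: the first heartbeat after the client dies goes out somewhere between 0 and `hbi` seconds later, each unanswered heartbeat costs `hbt` seconds, and the client is dropped after `hbf` failures in a row. The values 5, 5, and 2 are inferred from the narration ("zero to five seconds", "twice in a row"), not read from the deployment file.

```python
def detection_window(hbi, hbt, hbf):
    """Estimate (shortest, longest) seconds before a dead client is dropped.

    Simplified model, stated as an assumption:
      - the next heartbeat is sent between 0 and hbi seconds after the
        client dies,
      - each unanswered heartbeat takes hbt seconds to time out,
      - the client is removed after hbf consecutive failures.
    """
    # Best case: a heartbeat goes out the instant the client dies.
    shortest = hbt * hbf
    # Worst case: a full heartbeat interval passes before the first check.
    longest = hbi + hbt * hbf
    return shortest, longest


if __name__ == "__main__":
    # Values inferred from the narration; treat them as illustrative.
    low, high = detection_window(hbi=5, hbt=5, hbf=2)
    print(f"dead client dropped after roughly {low} to {high} seconds")
```

Under this model the shortest wait with those values is 10 seconds, which lines up with the "10 seconds or so" mentioned above. Shrinking any of the three values narrows the window but never brings it to zero.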
137 00:09:38,840 --> 00:09:43,850 Well, let's take a look at that second method of helping NATS understand that a client is going offline 138 00:09:44,210 --> 00:09:44,960 in the next video.