1
00:00:01,110 --> 00:00:05,910
Our application is now working pretty well, but at the end of the last video we saw a little issue, and this

2
00:00:05,910 --> 00:00:09,570
issue might be something that's been kind of nagging in the back of your head throughout this entire

3
00:00:09,570 --> 00:00:10,360
course.

4
00:00:10,410 --> 00:00:14,750
What happens to your application when a service goes down for some period of time?

5
00:00:14,790 --> 00:00:18,030
We just saw that in the last video with the moderation service going down.

6
00:00:18,090 --> 00:00:23,250
We created a comment while the moderation service was down and now this comment is going to be forever

7
00:00:23,250 --> 00:00:25,690
stuck in the pending state.

8
00:00:25,710 --> 00:00:27,210
So how do we deal with this?

9
00:00:27,210 --> 00:00:31,870
Let's take a look at a diagram to better understand what the real issue here is. OK.

10
00:00:31,890 --> 00:00:34,390
So in this diagram it's kind of hard to understand what's going on.

11
00:00:34,410 --> 00:00:36,870
This is kind of a time sequence diagram.

12
00:00:36,870 --> 00:00:41,880
So our application's timeline starts at the very first point in time, right up here, and we have

13
00:00:41,880 --> 00:00:49,020
time flowing down, or vertically. In our application we've got the event bus. The green lines throughout

14
00:00:49,050 --> 00:00:53,790
all these different services indicate times in which the service was successfully running.

15
00:00:53,850 --> 00:00:58,190
So we're going to say that our event bus was running the whole time, as were posts, comments, and query.

16
00:00:58,200 --> 00:01:02,940
Those are all running for the entire time that our application has been up and running.

17
00:01:02,940 --> 00:01:08,700
And during that time maybe we've had a couple of events flow out. Event one went out to posts,

18
00:01:08,700 --> 00:01:13,470
comments, query, and moderation successfully, because all those services were up at the time that event

19
00:01:13,500 --> 00:01:14,960
was sent out.

20
00:01:14,970 --> 00:01:20,310
Well, let's imagine that for some period of time, indicated by this red line right here, maybe the moderation

21
00:01:20,310 --> 00:01:22,230
service was down.

22
00:01:22,230 --> 00:01:27,780
So events two and three might have been emitted from the event bus, gone over to posts successfully,

23
00:01:28,050 --> 00:01:33,540
gone to comments and query successfully, but then failed to be delivered over to the moderation service,

24
00:01:34,480 --> 00:01:39,370
and then maybe sometime later the moderation service came back up and it received event four.

25
00:01:39,560 --> 00:01:41,970
But events two and three have now been lost to time.

26
00:01:41,970 --> 00:01:48,240
We don't currently have any way to tell moderation about these events and how they just occurred.

27
00:01:48,250 --> 00:01:52,630
Now, this is the scenario in which a service goes down for some little period of time, but there's another

28
00:01:52,630 --> 00:01:56,380
scenario that you've probably been wondering about, again, throughout the first hours of

29
00:01:56,380 --> 00:02:02,110
this course already: what happens if we're using this kind of event-style approach and we don't bring

30
00:02:02,140 --> 00:02:07,230
up, or we don't even create, a service until maybe sometime in the future?

31
00:02:07,240 --> 00:02:12,700
So in this diagram let's imagine that we did not initially create our query service and so maybe we've

32
00:02:12,700 --> 00:02:18,430
been running posts, comments, and moderation for days, maybe even years, and they have tons of data inside

33
00:02:18,430 --> 00:02:20,820
them and they've already received tons of events.

34
00:02:21,010 --> 00:02:25,060
But maybe we only created that query service like one year down the line.

35
00:02:25,060 --> 00:02:29,240
And so we're creating this query service, but it's already missed out on all these different events.

36
00:02:29,260 --> 00:02:31,630
How do we somehow get query into sync?

37
00:02:31,660 --> 00:02:36,790
How do we get all the existing posts and comments and results of moderation over to the query service?

38
00:02:37,850 --> 00:02:38,630
Well, as usual,

39
00:02:38,630 --> 00:02:41,360
there are a couple of different ways to handle all these different cases.

40
00:02:41,420 --> 00:02:44,970
So let's take a look at some possible solutions.

41
00:02:45,110 --> 00:02:47,070
So here's option number one.

42
00:02:47,080 --> 00:02:51,110
For option number one, we're going to imagine that we've had our posts service and comments service running for the

43
00:02:51,110 --> 00:02:57,080
entire lifetime of our application, and then maybe one year into the future we create the query service.

44
00:02:57,460 --> 00:03:02,780
The query service, of course, needs to know about all of the posts that exist and all of the comments.

45
00:03:02,780 --> 00:03:08,870
So we somehow need to get query into sync with everything else. One way, option number one, would be

46
00:03:08,870 --> 00:03:11,570
to use synchronous requests.

47
00:03:11,570 --> 00:03:17,420
So the instant we launch query maybe we'd have some code inside of the query service to make a direct

48
00:03:17,420 --> 00:03:21,980
network request over to posts and say, "Give me a list of all your posts."

49
00:03:21,980 --> 00:03:27,920
It could get all the different posts that exist inside the posts service with a plain network request. Once

50
00:03:27,920 --> 00:03:32,360
it's stored all these posts, it could then turn around to the comments service and also make a sync request

51
00:03:32,360 --> 00:03:37,280
directly over and say, "Give me all the comments you have," and then it could store all those comments with

52
00:03:37,280 --> 00:03:39,510
the associated posts.
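To make option number one a little more concrete, here's a minimal TypeScript sketch. Everything in it is an assumption for illustration: the `Post` and `Comment` shapes, the `bootstrapQueryStore` function, and the two `fetchAll...` callbacks, which stand in for direct network requests to bulk endpoints on posts and comments. It is not the course's actual code.

```typescript
// Hypothetical data shapes; the real services may store different fields.
interface Post { id: string; title: string }
interface Comment { id: string; postId: string; content: string }
interface PostWithComments extends Post { comments: Comment[] }

// Stand-in for a "give me everything you have" bulk endpoint.
// In a real deployment this would be a direct request to another service.
type BulkSource<T> = () => T[];

function bootstrapQueryStore(
  fetchAllPosts: BulkSource<Post>,
  fetchAllComments: BulkSource<Comment>,
): Record<string, PostWithComments> {
  const store: Record<string, PostWithComments> = {};

  // Step 1: ask posts for every post it has and store them.
  for (const post of fetchAllPosts()) {
    store[post.id] = { ...post, comments: [] };
  }

  // Step 2: ask comments for every comment and attach each one to its post.
  for (const comment of fetchAllComments()) {
    const post = store[comment.postId];
    if (post) {
      post.comments.push(comment);
    }
  }
  return store;
}

// Usage with in-memory stubs instead of real network calls.
const queryStore = bootstrapQueryStore(
  () => [{ id: "p1", title: "First post" }],
  () => [
    { id: "c1", postId: "p1", content: "Nice!" },
    { id: "c2", postId: "missing", content: "Comment for an unknown post" },
  ],
);
```

The callbacks are injected so the sketch runs without a network. In a real service each one would be a direct request to another service, which is exactly the coupling this option brings with it.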

53
00:03:39,510 --> 00:03:44,850
Now, the downside to this approach is pretty clear: we are falling back to synchronous requests.

54
00:03:44,850 --> 00:03:50,130
The real downside here is that we would have to have some code inside of posts and comments just to

55
00:03:50,130 --> 00:03:54,200
service or kind of handle this new service that is coming online.

56
00:03:54,300 --> 00:03:59,760
Right now we do not have any real bulk kind of endpoint where we can say, "Give me all the posts," for all

57
00:03:59,760 --> 00:04:04,200
eternity, per se. Well, OK, technically our blog application does right now, but you can imagine that

58
00:04:04,200 --> 00:04:08,760
in a production application we would probably not have an endpoint that just says Give me all posts

59
00:04:08,760 --> 00:04:14,470
that exist. We probably would not implement something like that. So if we did not already have some endpoint

60
00:04:14,470 --> 00:04:19,980
like that, we would have to implement it for both posts and comments. Now, one thing to be clear about:

61
00:04:20,130 --> 00:04:25,590
after getting all the posts and comments and synchronizing all this data then from that point in time

62
00:04:25,620 --> 00:04:31,230
into the future, the query service would start to receive any events that are emitted from posts and

63
00:04:31,230 --> 00:04:31,930
comments.

64
00:04:32,040 --> 00:04:36,690
So eventually query would kind of revert back to our expected behavior but just to get it online and

65
00:04:36,690 --> 00:04:40,120
up and running it could make these initial synchronous requests.

66
00:04:40,300 --> 00:04:46,010
So that's option number one. Option number two is the one exception to that rule that we discussed way

67
00:04:46,040 --> 00:04:50,860
earlier on inside the course around every service having its own private database.

68
00:04:50,860 --> 00:04:55,280
So with option number two which is kind of similar to option number one we can say that the instant

69
00:04:55,280 --> 00:05:01,220
that query came online we could give it direct access to the data store or essentially the database

70
00:05:01,280 --> 00:05:02,620
for all the posts that we have.

71
00:05:03,110 --> 00:05:04,980
And same thing for comments as well.

72
00:05:05,030 --> 00:05:09,410
So rather than relying upon synchronous network requests directly to the service we could just say you

73
00:05:09,410 --> 00:05:11,990
know what, query needs all the information in a database.

74
00:05:12,050 --> 00:05:14,090
Let's just give it access to the database.

75
00:05:14,120 --> 00:05:18,890
Now the upside to this is that the query service could run its own queries and figure out all the different

76
00:05:18,890 --> 00:05:24,930
data that it needs to get out of the posts database and comments database. After synchronizing all this data,

77
00:05:24,930 --> 00:05:28,420
once again the query service could start listening to different events.

78
00:05:28,610 --> 00:05:33,420
And the downside to this approach is that, once again, we are making sync requests, which means we are

79
00:05:33,420 --> 00:05:38,790
going to have to implement some code inside of query to work directly with whatever this database is

80
00:05:38,790 --> 00:05:40,220
or these databases are.

81
00:05:40,380 --> 00:05:46,920
Let's imagine that maybe the posts data store over here was MySQL and maybe the comments data

82
00:05:46,920 --> 00:05:48,890
store was MongoDB.

83
00:05:48,960 --> 00:05:53,700
Now all of a sudden we need to write some code inside of query to interface with a MySQL database

84
00:05:53,940 --> 00:05:56,100
and a MongoDB database.

85
00:05:56,100 --> 00:05:58,830
That's a lot of extra code to potentially have to write.

86
00:05:59,070 --> 00:06:04,870
So onto option number three which as you can guess is what we are going to be doing.

87
00:06:04,890 --> 00:06:05,220
All right.

88
00:06:05,340 --> 00:06:10,380
So option number three really falls into the same category that we've gone through several times in

89
00:06:10,380 --> 00:06:13,980
the course where I'm going to tell you this and you're going to say no way not possible.

90
00:06:13,980 --> 00:06:15,690
Totally inefficient can't do it.

91
00:06:15,690 --> 00:06:16,580
Not going to happen.

92
00:06:16,680 --> 00:06:21,090
But once again, yes, this is how people actually implement microservices.

93
00:06:21,090 --> 00:06:21,900
People do this.

94
00:06:21,900 --> 00:06:25,220
I'm not making this stuff up. With option number three,

95
00:06:25,380 --> 00:06:29,610
we're going to once again say that we are implementing or bringing query online at some point in time in

96
00:06:29,610 --> 00:06:30,050
the future.

97
00:06:30,060 --> 00:06:37,410
Like right here. Now, in theory, query could work if it had access to all the events that had been emitted

98
00:06:37,410 --> 00:06:38,530
in the past.

99
00:06:38,850 --> 00:06:43,230
So the posts service, for example, is going to emit events, maybe one, two, and three, and each of these might

100
00:06:43,230 --> 00:06:45,210
be a post creation event.

101
00:06:45,210 --> 00:06:50,020
Those are all events that query cares about. If query just had access to those events,

102
00:06:50,070 --> 00:06:51,990
it could work just fine.

103
00:06:51,990 --> 00:06:54,390
And that's what the strategy is going to rely upon.

104
00:06:54,510 --> 00:07:01,050
We're going to say that whenever any of our services emits any event whatsoever over to the event bus,

105
00:07:01,590 --> 00:07:04,860
the event bus will send that event out to all the other services.

106
00:07:04,980 --> 00:07:07,420
But it's going to do something else at the same time.

107
00:07:07,470 --> 00:07:10,040
It's going to store that event internally.

108
00:07:12,270 --> 00:07:17,190
So we can imagine that after event one gets emitted, the event bus is going to store that event internally

109
00:07:17,190 --> 00:07:22,350
in some kind of data structure. Probably not in memory, probably in some kind of database or something

110
00:07:22,350 --> 00:07:28,670
similar, because this data store is going to grow to be very, very large over time. We can then imagine

111
00:07:28,940 --> 00:07:31,150
that maybe the posts service emits event

112
00:07:31,160 --> 00:07:33,110
number two.

113
00:07:33,360 --> 00:07:38,790
And so we would add that on. Now the event bus has a record of event one and event two.

114
00:07:38,880 --> 00:07:41,310
And then finally event three as well.

115
00:07:41,310 --> 00:07:48,320
And so we can imagine that at that point in time the event bus now knows about events 1, 2, and 3.

116
00:07:48,430 --> 00:07:53,200
Again, I want to repeat: these events are still going out to all the other services as they currently

117
00:07:53,200 --> 00:07:53,390
are.

118
00:07:53,410 --> 00:07:55,740
We are just adding one extra little step here.

119
00:07:55,860 --> 00:08:01,470
The event bus is going to store all these events. Now the upside to this approach is that if query comes

120
00:08:01,470 --> 00:08:07,470
along, or online, at some point in time in the future, query can say over to the event bus: hey, give me

121
00:08:07,470 --> 00:08:12,660
access to all the events you have stored, all of them. Just throw them over and I'll decide whether or not

122
00:08:12,690 --> 00:08:13,840
I care about them.

123
00:08:13,890 --> 00:08:22,330
And so we can send over event one, and event two, and event three, and now the query service will be totally

124
00:08:22,330 --> 00:08:27,370
up to speed, and we can use all the same code that we've probably already written to handle these

125
00:08:27,370 --> 00:08:30,770
exact events, so we don't have to write any extra code.
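Here's a rough TypeScript sketch of that idea, just to show the shape of it. The `EventBus` class, its `publish`, `subscribe`, and `allEvents` methods, and the `PostCreated` event type are all hypothetical names made up for this example. The store is a plain in-memory array only to keep the sketch short; as mentioned above, a real event bus would probably keep its events in some kind of database.

```typescript
// Hypothetical event shape used only for this sketch.
interface BusEvent { type: string; data: unknown }
type Listener = (event: BusEvent) => void;

class EventBus {
  // In-memory for brevity; a real bus would persist events in a database.
  private readonly stored: BusEvent[] = [];
  private readonly listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(event: BusEvent): void {
    // The one extra step: store the event before sending it out as usual.
    this.stored.push(event);
    for (const listener of this.listeners) {
      listener(event);
    }
  }

  // Hand back a copy of every event ever published, oldest first.
  allEvents(): BusEvent[] {
    return [...this.stored];
  }
}

const bus = new EventBus();

// Posts emits three events long before query exists.
bus.publish({ type: "PostCreated", data: { id: "1" } });
bus.publish({ type: "PostCreated", data: { id: "2" } });
bus.publish({ type: "PostCreated", data: { id: "3" } });

// Query comes online later and reuses its normal event handler for the replay.
const queryPostIds: string[] = [];
const handleEvent: Listener = (event) => {
  if (event.type === "PostCreated") {
    queryPostIds.push((event.data as { id: string }).id);
  }
};

for (const pastEvent of bus.allEvents()) {
  handleEvent(pastEvent);
}
bus.subscribe(handleEvent);

// From here on, query receives new events the usual way.
bus.publish({ type: "PostCreated", data: { id: "4" } });
```

Notice that the replay and the live subscription go through the same `handleEvent` function, which is the point of this option: the query service needs no special code path to catch up.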

126
00:08:30,880 --> 00:08:37,200
The only extra burden we really have here is storing all these events inside of our event bus data store.

127
00:08:37,210 --> 00:08:43,390
Now, admittedly, this data store could grow to be extremely large over time, but as you saw from that calculation

128
00:08:43,390 --> 00:08:49,180
we did a little bit ago, where we said, hey, we could pay like 15 dollars a month for 100 million products,

129
00:08:49,180 --> 00:08:54,220
well, even though it might cost some amount of money, it probably costs way less than you might think.

130
00:08:55,270 --> 00:08:56,830
So that's what we are going to do inside this course.

131
00:08:56,830 --> 00:08:58,700
We are going to be using option number three.

132
00:08:58,810 --> 00:09:04,060
We're going to say that whenever we emit any event, we're going to store it with our event bus so

133
00:09:04,060 --> 00:09:09,220
that if we ever bring a service online in the future we can get access to all those events that occurred

134
00:09:09,220 --> 00:09:11,620
in the past.

135
00:09:11,620 --> 00:09:15,800
This also solves the issue with a service going down for some period of time.

136
00:09:16,000 --> 00:09:21,630
Let's imagine... I don't have a diagram prepared for this, but I'll just improvise really quick. So let's

137
00:09:21,630 --> 00:09:29,970
imagine that moderation misses out on events 2 and 3, but it did receive event 1. When moderation comes

138
00:09:29,970 --> 00:09:31,170
back online right here,

139
00:09:31,170 --> 00:09:35,820
it can take a look at what the last event was that it received. Maybe moderation would know that it

140
00:09:35,820 --> 00:09:42,680
had received event one, and it could say to the event bus: hey, give me all the events, give me everything

141
00:09:42,680 --> 00:09:50,350
that occurred during the time that I was down. And so the event bus can say: OK, you received event one,

142
00:09:50,440 --> 00:09:52,470
we'll give you everything since then.

143
00:09:52,630 --> 00:09:58,870
So we can take event two, throw it over to moderation, and event three, throw it over.

144
00:09:59,060 --> 00:10:01,040
And now moderation is all caught up.

145
00:10:01,040 --> 00:10:06,430
It's received events two and three even though it was down for some period of time. You can see that

146
00:10:06,430 --> 00:10:10,600
this approach actually ends up working out pretty well. Not only does it solve the issue of bringing

147
00:10:10,600 --> 00:10:15,670
services online in the future, but it also solves this problem of events possibly being missed while

148
00:10:15,700 --> 00:10:19,130
a service is experiencing some amount of downtime.
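The improvised downtime example can be sketched in TypeScript as well. This assumes the event bus gives every stored event an increasing sequence number, and that a service remembers the last number it handled. Both are assumptions for this sketch, as are the `SequencedEventLog` class and its `append` and `since` methods. How a real service would track its last received event isn't covered here.

```typescript
// Hypothetical stored-event shape with a sequence number added by the bus.
interface StoredEvent { seq: number; type: string; data: unknown }

class SequencedEventLog {
  private readonly events: StoredEvent[] = [];

  // Store an event and stamp it with the next sequence number (1, 2, 3, ...).
  append(type: string, data: unknown): StoredEvent {
    const stored: StoredEvent = { seq: this.events.length + 1, type, data };
    this.events.push(stored);
    return stored;
  }

  // Everything that was stored after the given sequence number.
  since(lastSeenSeq: number): StoredEvent[] {
    return this.events.filter((event) => event.seq > lastSeenSeq);
  }
}

const log = new SequencedEventLog();

// Moderation's side: record what it handled and the last sequence number seen.
const moderationHandled: number[] = [];
let lastSeenSeq = 0;
const deliverToModeration = (event: StoredEvent): void => {
  moderationHandled.push(event.seq);
  lastSeenSeq = event.seq;
};

// Event 1 arrives while moderation is up.
deliverToModeration(log.append("CommentCreated", { id: "c1" }));

// Events 2 and 3 are stored by the bus, but moderation is down and misses them.
log.append("CommentCreated", { id: "c2" });
log.append("CommentCreated", { id: "c3" });

// Moderation comes back and asks for everything since the last event it saw.
for (const missed of log.since(lastSeenSeq)) {
  deliverToModeration(missed);
}

// Event 4 arrives normally after the catch-up.
deliverToModeration(log.append("CommentCreated", { id: "c4" }));
```

With this shape, a brand new service is just the special case where the last seen number is 0, so it gets every stored event.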

149
00:10:19,130 --> 00:10:19,420
All right.

150
00:10:19,490 --> 00:10:21,100
So let's take a pause right here.

151
00:10:21,110 --> 00:10:22,580
I apologize for the long video.

152
00:10:22,580 --> 00:10:23,600
We'll continue in just a moment.