1 00:00:00,500 --> 00:00:06,540 In the last video we said that we could fix the problem of getting duplicate votes on a single survey 2 00:00:06,540 --> 00:00:12,910 from a single user by recording the list of recipients in a subdocument collection. 3 00:00:12,910 --> 00:00:18,600 And we said that every one of these subdocuments, which we refer to as a recipient, would 4 00:00:18,600 --> 00:00:23,130 have an email property and a clicked property as well. 5 00:00:23,130 --> 00:00:27,990 Now I want to talk a little bit more about subdocument collections, how they work, and how they 6 00:00:27,990 --> 00:00:30,590 differ from normal collections. 7 00:00:30,600 --> 00:00:34,710 First off, I'm going to give you a better visual of what's going on here. 8 00:00:34,780 --> 00:00:38,530 This is a diagram of our overall surveys collection. 9 00:00:38,640 --> 00:00:45,360 Remember that the instant we create a new survey model, as we just did right here, we get a new collection 10 00:00:45,360 --> 00:00:53,590 inside of our Mongo database that stores a list of surveys, and we refer to this as the surveys collection. 11 00:00:53,610 --> 00:00:59,130 So inside of that surveys collection we have a bunch of different instances of surveys, or a bunch 12 00:00:59,130 --> 00:01:03,300 of different records that each represent a single survey. 13 00:01:03,300 --> 00:01:10,980 Now, inside of a survey we can store a list of records that we refer to as a subdocument collection. 14 00:01:10,980 --> 00:01:13,680 So each one of these is a recipient. 15 00:01:14,520 --> 00:01:18,020 Now you might ask, why would we use subdocument collections at all? 16 00:01:18,210 --> 00:01:24,330 Well, the answer is that we use them whenever we want to make a very, very clear association between two 17 00:01:24,330 --> 00:01:25,550 given records.
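As a rough illustration of the structure described so far, here is a minimal TypeScript sketch of one survey document with its embedded recipients. Only the `email` and `clicked` properties come from the video; the `title` field, the type names, and the `markClicked` helper are assumptions made up for this example. It models the data with plain objects and is not Mongoose code.

```typescript
// A recipient exists only inside its parent survey (a subdocument).
interface Recipient {
  email: string;
  clicked: boolean; // has this recipient already responded?
}

// One survey document. `title` is a hypothetical field for illustration.
interface Survey {
  title: string;
  recipients: Recipient[]; // the subdocument collection
}

// Hypothetical helper: record a click at most once per recipient.
// Returns true if the click was recorded, and false for an unknown
// email or a duplicate click.
function markClicked(survey: Survey, email: string): boolean {
  for (const recipient of survey.recipients) {
    if (recipient.email === email) {
      if (recipient.clicked) {
        return false; // duplicate vote, ignore it
      }
      recipient.clicked = true;
      return true;
    }
  }
  return false; // email is not on this survey's recipient list
}

const survey: Survey = {
  title: "Example survey",
  recipients: [
    { email: "a@example.com", clicked: false },
    { email: "b@example.com", clicked: false },
  ],
};

const firstClick = markClicked(survey, "a@example.com"); // recorded
const secondClick = markClicked(survey, "a@example.com"); // rejected as a duplicate
```

Because the `clicked` flag lives on the recipient inside the survey, checking for a duplicate never has to look outside that one survey.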
18 00:01:25,770 --> 00:01:32,010 So a recipient right here, like this very particular recipient with an email and a clicked property, 19 00:01:32,580 --> 00:01:41,010 will only ever belong to its parent survey, and this recipient really has absolutely no use whatsoever 20 00:01:41,250 --> 00:01:45,330 in the context of any other survey inside the surveys collection. 21 00:01:45,330 --> 00:01:49,760 So in other words, the recipient is only useful as a child of this survey. 22 00:01:49,770 --> 00:01:53,110 It's not useful in any other regard whatsoever. 23 00:01:53,400 --> 00:01:58,080 And I would kind of have to stretch my mind to think of a case where we would care at all about this 24 00:01:58,080 --> 00:02:01,240 recipient outside of just belonging to this given survey. 25 00:02:01,560 --> 00:02:06,810 So we usually make use of subdocument collections like this when we want to make some type of ownership 26 00:02:06,810 --> 00:02:10,330 relationship very, very clear. 27 00:02:10,350 --> 00:02:13,840 Now when I say that, your immediate response might be: 28 00:02:14,070 --> 00:02:14,860 OK, Stephen. 29 00:02:14,880 --> 00:02:22,620 So every survey has a list of recipients, and we record those as a subdocument collection. 30 00:02:23,010 --> 00:02:29,680 But you also just told us two minutes ago, Stephen, that our users have a list of surveys. 31 00:02:29,730 --> 00:02:33,790 So in that case, why do our surveys have a reference? 32 00:02:33,840 --> 00:02:37,890 Why does a user have references to these other surveys they've created? 33 00:02:38,130 --> 00:02:46,050 And why does the user not have surveys listed as a subdocument collection, something like this? 34 00:02:46,050 --> 00:02:50,510 So in this kind of case, I know this diagram is a little bit hard to pick up right here. 35 00:02:50,640 --> 00:02:55,680 But in this case I'm saying, why don't we have just a single collection, a users collection.
36 00:02:55,680 --> 00:03:01,470 And then inside of there every user has a subdocument collection of surveys, and then every survey has 37 00:03:01,470 --> 00:03:04,080 a subdocument collection of recipients. 38 00:03:04,080 --> 00:03:08,840 So why don't we go all the way with the nesting? Why are we kind of setting up this nesting 39 00:03:08,840 --> 00:03:10,850 on one level but not the other? 40 00:03:11,280 --> 00:03:14,910 Well, I've got to tell you, there is a very practical reason for this. 41 00:03:14,910 --> 00:03:16,370 Very practical. 42 00:03:17,190 --> 00:03:25,840 So in the MongoDB world we refer to each of the records inside of a collection as a document. 43 00:03:25,860 --> 00:03:31,110 So looking at the users collection right here, we have a document, a document, a document, and then inside 44 00:03:31,110 --> 00:03:37,380 of the surveys collection we have a document that is a survey, and that includes everything inside of 45 00:03:37,380 --> 00:03:37,800 it. 46 00:03:37,800 --> 00:03:40,360 So really this is one whole document. 47 00:03:40,410 --> 00:03:42,170 This is one whole document. 48 00:03:42,230 --> 00:03:44,530 So I missed the little drag right there. 49 00:03:44,640 --> 00:03:49,710 So here's one whole document, and then another whole document right here as well. 50 00:03:50,100 --> 00:03:51,970 So that is relevant. 51 00:03:52,070 --> 00:03:56,230 And let me tell you why this is going to be quite interesting. 52 00:03:56,700 --> 00:04:03,810 So whenever we use MongoDB, the size limit for a single record is four megabytes. 53 00:04:03,840 --> 00:04:04,300 That's it. 54 00:04:04,320 --> 00:04:10,170 You can only ever stuff four megabytes' worth of data into a single document. 55 00:04:10,170 --> 00:04:15,510 So in other words, we are saying that this survey right here and all of the data inside of it can only 56 00:04:15,510 --> 00:04:19,140 ever be up to four megabytes large.
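To make the two layouts discussed above concrete, here is a hedged TypeScript sketch of their shapes. The type names and the `id` and `userId` fields are hypothetical, chosen only for this illustration. What the sketch shows is which top-level document ends up holding the recipients in each layout, since that is the document the size limit applies to.

```typescript
interface Recipient {
  email: string;
  clicked: boolean;
}

// Layout A (the one described in the video): surveys live in their own
// collection and point back at their owner; recipients are embedded.
interface User {
  id: string; // hypothetical identifier
}
interface Survey {
  userId: string; // a reference to the owning user, not a nested copy
  recipients: Recipient[];
}

// Layout B (the fully nested alternative): a single users collection,
// with every survey and every recipient inside the user document.
interface NestedUser {
  id: string;
  surveys: { recipients: Recipient[] }[];
}

// Number of recipients carried by ONE survey document in layout A.
function recipientsInSurveyDoc(survey: Survey): number {
  return survey.recipients.length;
}

// Number of recipients carried by ONE user document in layout B:
// every survey the user ever created counts toward the same document.
function recipientsInNestedUserDoc(user: NestedUser): number {
  let total = 0;
  for (const s of user.surveys) {
    total += s.recipients.length;
  }
  return total;
}

const alice: User = { id: "u1" };
const r: Recipient = { email: "a@example.com", clicked: false };

const surveyOne: Survey = { userId: alice.id, recipients: [r, r] };
const surveyTwo: Survey = { userId: alice.id, recipients: [r, r, r] };
const nestedAlice: NestedUser = {
  id: "u1",
  surveys: [{ recipients: [r, r] }, { recipients: [r, r, r] }],
};

const perSurveyCounts = [
  recipientsInSurveyDoc(surveyOne),
  recipientsInSurveyDoc(surveyTwo),
];
const nestedCount = recipientsInNestedUserDoc(nestedAlice);
```

In layout A each survey document carries only its own recipients, while in layout B the single user document carries the sum across all of that user's surveys.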
57 00:04:19,500 --> 00:04:21,770 So let's kind of do a little bit of math here. 58 00:04:21,990 --> 00:04:23,740 Let's do a little bit of math. 59 00:04:23,940 --> 00:04:31,140 If we take my personal email address as a benchmark, and we say that my email is like the average length 60 00:04:31,140 --> 00:04:37,650 of an email address, well, my email is 20 bytes large, and I've got a little calculator over here to prove 61 00:04:37,650 --> 00:04:37,800 it. 62 00:04:37,800 --> 00:04:41,000 So I put in my email and it says 20 characters. 63 00:04:41,010 --> 00:04:42,270 Yep, that's 20 bytes. 64 00:04:42,270 --> 00:04:48,550 So my email address, if we assume that it's an average-length email, is 20 bytes long. 65 00:04:49,020 --> 00:04:57,090 If you take that 20 bytes right here and multiply it by 200,000, you get up to about 4 66 00:04:57,150 --> 00:04:58,320 megabytes. 67 00:04:58,320 --> 00:05:07,260 So in other words, a single survey can only actually store about 200,000 email addresses. 68 00:05:07,260 --> 00:05:14,100 Now, thinking about a survey, that's definitely OK. I could imagine that we would kind of 69 00:05:14,100 --> 00:05:18,810 top out around 200,000 emails for an individual survey. 70 00:05:18,960 --> 00:05:23,370 That's definitely something I could palate. It's something that I could approach my customers with if I was 71 00:05:23,370 --> 00:05:28,710 trying to sell this application and say, hey, just so you know, we've got a hard limit on surveys of 200,000 72 00:05:28,710 --> 00:05:31,260 recipients, and I think that's a very reasonable limit. 73 00:05:31,260 --> 00:05:33,220 You know, that's pretty darn large. 74 00:05:33,240 --> 00:05:38,770 However, I want you to now imagine what would happen if we changed this schema up a little bit. 75 00:05:38,910 --> 00:05:45,420 And we said that every user's list of surveys was stored in the user record, where the user document 76 00:05:45,540 --> 00:05:47,100 has a subdocument collection.
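The email arithmetic from a moment ago can be written out as a small TypeScript sketch. The function name and constants are made up for this example. The 4-megabyte figure is the one quoted in the video; MongoDB's documentation currently lists the maximum BSON document size as 16 megabytes, so the limit is passed in as a parameter rather than fixed. The result is an upper bound only, because a stored recipient also carries field names and a `clicked` value, not just the raw email characters.

```typescript
// Upper bound on how many emails of a given size fit under a
// per-document byte limit, ignoring all per-field storage overhead.
function maxEmailsPerDocument(limitBytes: number, bytesPerEmail: number): number {
  if (bytesPerEmail <= 0) {
    throw new Error("bytesPerEmail must be positive");
  }
  return Math.floor(limitBytes / bytesPerEmail);
}

// Assumption from the video: an average email address is 20 bytes.
const AVERAGE_EMAIL_BYTES = 20;

// The video's limit, read as 4,000,000 bytes.
const videoEstimate = maxEmailsPerDocument(4000000, AVERAGE_EMAIL_BYTES);

// The same 4 megabytes read as 4 * 1024 * 1024 bytes.
const binaryFourMb = maxEmailsPerDocument(4 * 1024 * 1024, AVERAGE_EMAIL_BYTES);

// The 16-megabyte limit listed in current MongoDB documentation.
const sixteenMb = maxEmailsPerDocument(16 * 1024 * 1024, AVERAGE_EMAIL_BYTES);
```

Here `videoEstimate` works out to 200,000, matching the figure in the video, `binaryFourMb` to 209,715, and `sixteenMb` to 838,860. Whichever limit applies, the shape of the argument is the same: one document has a fixed byte budget.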
77 00:05:47,100 --> 00:05:53,070 So in other words, if we were in this type of situation, well, if we still had that four megabyte limit, 78 00:05:53,160 --> 00:05:58,470 and we do, we have it whenever we use Mongo, it's always a limit of four megabytes, 79 00:05:58,690 --> 00:06:07,780 then we are now saying that a user can only ever send out 200,000 emails' worth of surveys. 80 00:06:07,860 --> 00:06:14,310 So if this right here was a survey that contained 100,000 emails, or 100,000 recipients, then all of a sudden 81 00:06:14,940 --> 00:06:21,780 the user only has an additional hundred thousand recipients to work with before we can no longer physically 82 00:06:21,870 --> 00:06:24,800 associate any data with their account. 83 00:06:24,810 --> 00:06:30,540 And so, as you can imagine, this would lead to some really, really crazy bugs inside of our application, 84 00:06:30,960 --> 00:06:35,810 because we would have these super power users in our application who are sending out surveys nonstop, 85 00:06:36,120 --> 00:06:40,230 and they're giving us a ton of money because they're sending out all these surveys, and then all of a 86 00:06:40,230 --> 00:06:47,220 sudden one day they say, hey, I can't seem to get any more surveys created, and then we check our database 87 00:06:47,220 --> 00:06:55,020 logs and we find out that, oh boy, there is actually a limit to how many emails this single user can send 88 00:06:55,020 --> 00:06:55,570 out. 89 00:06:55,740 --> 00:06:58,330 And that would be very close to a worst-case scenario. 90 00:06:58,380 --> 00:07:02,430 I will tell you right now, that would be a very bad thing to have happen. 91 00:07:02,490 --> 00:07:10,120 So, answering the question of why we set up a subdocument collection for the recipient list right here 92 00:07:10,200 --> 00:07:13,950 but did not set one up similarly for the list of surveys: 93 00:07:14,010 --> 00:07:17,770 it's all about the physical limitations of MongoDB.
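The shared-budget problem described above can be sketched in a few lines of TypeScript. The function names are invented for this example, and the capacity constant simply reuses the video's estimate of roughly 200,000 recipients per document; it is an illustration of the reasoning, not a measurement of real MongoDB behavior.

```typescript
// Assumed per-document capacity, taken from the video's estimate.
const RECIPIENTS_PER_DOCUMENT = 200000;

// Fully nested layout: all of a user's surveys sit inside the one user
// document, so every survey draws on the same budget.
function remainingForNestedUser(surveySizes: number[]): number {
  let used = 0;
  for (const size of surveySizes) {
    used += size;
  }
  return Math.max(0, RECIPIENTS_PER_DOCUMENT - used);
}

// Separate surveys collection: each survey is its own document and
// gets the full budget to itself.
function remainingForSeparateSurvey(currentRecipients: number): number {
  return Math.max(0, RECIPIENTS_PER_DOCUMENT - currentRecipients);
}

// One 100,000-recipient survey already sent.
const nestedLeft = remainingForNestedUser([100000]);

// Two such surveys: the nested user document is now full.
const nestedExhausted = remainingForNestedUser([100000, 100000]);

// A brand-new survey in its own collection is unaffected by earlier ones.
const separateLeft = remainingForSeparateSurvey(0);
```

Under these assumptions `nestedLeft` is 100,000, `nestedExhausted` is 0, and `separateLeft` is 200,000, which is the difference between a limit per survey and a limit per customer.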
94 00:07:18,430 --> 00:07:19,020 OK. 95 00:07:19,200 --> 00:07:22,980 So in this video we spoke about subdocument collections. 96 00:07:22,980 --> 00:07:28,920 We said that they are a great way to associate records or indicate ownership of records, but 97 00:07:28,920 --> 00:07:33,750 that we need to be a little bit careful with them and be very aware of the data that we are storing inside 98 00:07:33,750 --> 00:07:34,380 of Mongo, 99 00:07:34,500 --> 00:07:41,480 and make sure that we have some reasonable idea of how many records can fit into a subdocument collection. 100 00:07:41,490 --> 00:07:46,290 So with that in mind, let's continue in the next video, and we're going to talk a little bit more about 101 00:07:46,290 --> 00:07:49,750 how we set up a subdocument collection of recipients. 102 00:07:49,800 --> 00:07:52,120 So I'll see you in just a minute.