1 00:00:02,210 --> 00:00:07,520 Back I am in the shell, still connected to my locally running mongod process. 2 00:00:07,520 --> 00:00:14,570 Now I want to introduce you to capped collections and explain what capped collections are. Capped collections 3 00:00:14,600 --> 00:00:20,720 are a special type of collection which you have to create explicitly where you limit the amount of data 4 00:00:20,960 --> 00:00:25,930 or documents that can be stored in there and old documents will simply be deleted 5 00:00:25,940 --> 00:00:28,670 when well this size is exceeded, 6 00:00:28,760 --> 00:00:34,600 so it's basically a store where oldest data is automatically deleted when new data comes in. 7 00:00:34,610 --> 00:00:40,700 This can be efficient for high throughput application logs where you only need the most recent logs 8 00:00:41,000 --> 00:00:47,360 or as a caching service where you cache some data and if the data then was deleted because it 9 00:00:47,360 --> 00:00:53,370 hasn't been used in a while, well then you're fine with that and you can just re-add it. To create such a 10 00:00:53,690 --> 00:00:54,610 collection, 11 00:00:54,650 --> 00:00:57,050 first of all you need to use a database of course, 12 00:00:57,050 --> 00:00:59,360 I'll use a new database, performance 13 00:00:59,660 --> 00:01:06,130 and then you create it with the create collection command, we saw that when we added validation earlier 14 00:01:06,160 --> 00:01:07,340 in the course. 15 00:01:07,400 --> 00:01:12,860 Now first of all, you define a name and I'll just name it capped but of course, the name is totally up to 16 00:01:12,860 --> 00:01:13,060 you, 17 00:01:13,060 --> 00:01:17,180 this collection can be anything, can also be users, whatever you like, 18 00:01:17,180 --> 00:01:19,240 so I'll name it capped. The important part 19 00:01:19,240 --> 00:01:20,620 here are now the options, 20 00:01:20,660 --> 00:01:26,750 so the document you pass as a second argument to create collection. Here, the option you want to set is 21 00:01:26,780 --> 00:01:28,320 capped to true 22 00:01:28,640 --> 00:01:35,990 and this will turn this into a capped collection. Now by default, it will have a size limit of 4 bytes but 23 00:01:35,990 --> 00:01:43,860 you can set size to any other value and it will then automatically be turned into a multiple of 256 bytes. 24 00:01:43,880 --> 00:01:47,310 So you could for example set this to 10000 bytes, 25 00:01:47,330 --> 00:01:52,750 it will be converted a bit and that will be your size and you can also add a Max key, 26 00:01:52,780 --> 00:01:55,720 the size is required, Max is optional 27 00:01:55,910 --> 00:02:01,970 and Max allows you to also define the amount of data that can be stored in there, measured in documents, 28 00:02:01,970 --> 00:02:05,850 so I could cap this to three documents at most. 29 00:02:05,850 --> 00:02:12,060 With that, I hit enter and I get this collection and now I can access capped as a collection 30 00:02:12,200 --> 00:02:19,760 and in that capped collection, we can now of course insert a document just as before. We can insert a document 31 00:02:20,300 --> 00:02:21,070 named Max and 32 00:02:21,080 --> 00:02:22,250 this worked. 33 00:02:22,460 --> 00:02:26,230 I can now also add Manu and this worked 34 00:02:26,480 --> 00:02:29,640 and I can also add Anna and this worked. 35 00:02:29,810 --> 00:02:33,500 Now I have three documents in there which is our max size. 36 00:02:33,680 --> 00:02:40,220 I can now of course also find my documents like this and I get back Max, Manu and 37 00:02:40,220 --> 00:02:47,840 Anna. The important thing is for a capped collection, the order in which we retrieve the documents is always 38 00:02:47,840 --> 00:02:51,590 the order in which they were inserted. For a normal collection, 39 00:02:51,620 --> 00:02:54,170 that may be the case but it's not guaranteed, 40 00:02:54,260 --> 00:03:00,260 so for a normal collection, you always need to sort if you want to sort for example by ID. 41 00:03:00,410 --> 00:03:06,020 You can do that and then you'll sort by that index but by default, in a capped collection, you'll always get 42 00:03:06,020 --> 00:03:14,060 the insertion order. If you want to change the order and sort and reverse, 43 00:03:14,060 --> 00:03:19,820 there is a special key you can use and that is $natural, the natural order by which it is 44 00:03:19,820 --> 00:03:25,390 sorted and then for example use -1 and then it's sorted the other way around, 45 00:03:25,430 --> 00:03:29,430 just as a little side note, this is important to understand or to keep in mind. 46 00:03:29,690 --> 00:03:35,120 You also can create indexes in there and you got an index on ID by default 47 00:03:35,180 --> 00:03:40,770 but of course you don't have to. Now with that, I found my documents, 48 00:03:40,780 --> 00:03:42,890 now let's insert another document, 49 00:03:43,060 --> 00:03:48,250 let's insert Maria and keep in mind that our size limit was three. 50 00:03:48,250 --> 00:03:54,190 Now if I insert Maria, I don't get an error because the idea of a capped collection is not that it gives 51 00:03:54,190 --> 00:04:00,880 me an error when I insert too many documents but that it clears the oldest document which in my case 52 00:04:01,120 --> 00:04:04,790 should be Max because that was the first document I added. 53 00:04:04,810 --> 00:04:11,180 So if I now output all entries again, indeed we see Max is gone, there's only Manu, 54 00:04:11,250 --> 00:04:12,970 Anna and Maria. 55 00:04:13,360 --> 00:04:17,630 So this is a capped collection and there are some use cases where this can be a good idea, 56 00:04:17,860 --> 00:04:21,970 why would you use a capped collection instead of a normal collection then? 57 00:04:22,090 --> 00:04:28,840 Well because of the automatic clear up. You can keep this collection fairly small and therefore, efficient 58 00:04:28,870 --> 00:04:30,260 to work with 59 00:04:30,580 --> 00:04:33,940 and you don't have to worry about manually deleting old data, 60 00:04:33,940 --> 00:04:39,970 so for use cases where you need to get rid of old data anyways with new data coming in or where you 61 00:04:39,970 --> 00:04:45,460 need high throughput and it's ok if you lose old data at some point, like for caching, then the capped 62 00:04:45,460 --> 00:04:50,190 collection is something you should keep in mind as a tool to improve performance. 63 00:04:50,200 --> 00:04:56,950 Obviously this is not a solution for storing anything like products or users or blog posts, 64 00:04:57,010 --> 00:05:03,330 that would be horrible because you would simply lose some users, some products or some blog posts.