1 00:00:02,290 --> 00:00:04,490 So why would we use an index 2 00:00:04,600 --> 00:00:11,170 and what is an index to begin with? An index can speed up our find, update or delete queries, 3 00:00:11,200 --> 00:00:16,900 so all the queries where we are looking for certain documents that should match some criteria. Consider 4 00:00:16,900 --> 00:00:21,700 this find query, here I'm looking for all products where the seller is Max. 5 00:00:21,700 --> 00:00:27,040 Now let's say this is my collection of documents, the products collection. By default, if I don't 6 00:00:27,040 --> 00:00:30,410 have an index set on seller, 7 00:00:30,480 --> 00:00:33,730 MongoDB will go ahead and do a so-called collection scan. 8 00:00:33,880 --> 00:00:39,520 That simply means that MongoDB, to fulfill this query, will go through the entire collection, 9 00:00:39,520 --> 00:00:43,760 look at every single document and see if seller equals Max, 10 00:00:43,810 --> 00:00:49,990 and as you can imagine, for very large collections with thousands or millions of documents, this can take 11 00:00:49,990 --> 00:00:53,740 a while and we'll see this in practice with an example in a second. 12 00:00:54,160 --> 00:01:00,970 So this is the default approach MongoDB takes, or rather the only approach it can take when you have no index 13 00:01:00,970 --> 00:01:06,250 set up, in order to retrieve maybe two documents out of your thousand documents. 14 00:01:06,250 --> 00:01:12,700 Now you can create an index though. An index is not a replacement for a collection but an addition, you 15 00:01:12,700 --> 00:01:13,360 could say, 16 00:01:13,630 --> 00:01:18,850 so you would create an index for the seller key of the products collection here and that index then 17 00:01:18,850 --> 00:01:27,190 exists in addition to the collection, and the index is essentially an ordered list of all the values 18 00:01:27,460 --> 00:01:31,060 that are stored in the seller key for all the documents.
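To make the collection scan concrete, here is a minimal Python sketch of the idea. It is a toy model and not MongoDB code: the sample documents, the `collection_scan` helper and the `examined` counter are all made up for illustration. It only shows that without an index every document has to be checked, even when just two of them match.

```python
# Toy model of a collection scan. The documents and the helper below are
# hypothetical; they only mimic what the database has to do conceptually.
products = [
    {"_id": 1, "name": "Book", "seller": "Anna"},
    {"_id": 2, "name": "Lamp", "seller": "Max"},
    {"_id": 3, "name": "Desk", "seller": "Chris"},
    {"_id": 4, "name": "Chair", "seller": "Max"},
    {"_id": 5, "name": "Mug", "seller": "Zoe"},
]


def collection_scan(collection, field, value):
    """Look at every single document and keep the ones that match."""
    examined = 0
    matches = []
    for doc in collection:
        examined += 1  # no index, so no document can be skipped
        if doc.get(field) == value:
            matches.append(doc)
    return matches, examined


matches, examined = collection_scan(products, "seller", "Max")
print([doc["name"] for doc in matches], "documents examined:", examined)
```

Two documents come back, but all five were examined. With a million documents the same query would examine a million documents.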
19 00:01:31,060 --> 00:01:35,920 So it's not an ordered list of the documents, just of the values for the field for which you created that 20 00:01:35,920 --> 00:01:37,060 index, 21 00:01:37,060 --> 00:01:42,610 and it's not just an ordered list of the values, of course. Every value, every item in the index has a 22 00:01:42,610 --> 00:01:44,830 pointer to the full document 23 00:01:44,830 --> 00:01:46,210 it belongs to. 24 00:01:46,270 --> 00:01:52,660 Now this allows MongoDB to do a so-called index scan to fulfill this query, which means it sees that 25 00:01:52,660 --> 00:02:00,580 for seller, such an index exists and it therefore simply goes to that seller index and can quickly jump 26 00:02:00,580 --> 00:02:03,970 to the right values because there, unlike for the collection, 27 00:02:03,970 --> 00:02:08,140 it knows that the values are sorted by that key, 28 00:02:08,140 --> 00:02:12,370 so it doesn't have to look at the first three records if it's only looking for records starting with 29 00:02:12,370 --> 00:02:15,670 M or, to be precise, records equal to Max. 30 00:02:15,730 --> 00:02:22,110 So it can very efficiently go through that index and then find the matching products because of that ordering 31 00:02:22,300 --> 00:02:29,140 and because of the pointer every element in this index has. So MongoDB finds the value for this query 32 00:02:29,260 --> 00:02:31,960 and then finds the related documents it can return, 33 00:02:32,170 --> 00:02:38,200 so it's this direct access that MongoDB can use here and that speeds up your queries. 34 00:02:38,220 --> 00:02:43,620 So this is, in the end, how an index works and it also answers the question why you would use one, because 35 00:02:43,620 --> 00:02:47,770 creating such indexes can drastically speed up your queries.
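The ordered list with pointers can be sketched in a few lines of Python as well. Again this is only a model under stated assumptions: MongoDB's real indexes are B-tree structures, while this sketch uses a sorted Python list and the standard `bisect` module, and the names `build_index` and `index_scan` plus the sample documents are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample data; the list position of a document plays the
# role of the "pointer" an index entry holds.
products = [
    {"_id": 1, "name": "Book", "seller": "Anna"},
    {"_id": 2, "name": "Lamp", "seller": "Max"},
    {"_id": 3, "name": "Desk", "seller": "Chris"},
    {"_id": 4, "name": "Chair", "seller": "Max"},
    {"_id": 5, "name": "Mug", "seller": "Zoe"},
]


def build_index(collection, field):
    """Return the sorted field values and, in parallel, their pointers."""
    entries = sorted((doc[field], pos) for pos, doc in enumerate(collection))
    keys = [key for key, _ in entries]
    pointers = [pos for _, pos in entries]
    return keys, pointers


def index_scan(collection, keys, pointers, value):
    """Binary-search the sorted values, then follow the pointers."""
    lo = bisect_left(keys, value)   # first entry equal to value
    hi = bisect_right(keys, value)  # one past the last entry equal to value
    return [collection[pos] for pos in pointers[lo:hi]]


keys, pointers = build_index(products, "seller")
found = index_scan(products, keys, pointers, "Max")
print(keys)
print([doc["name"] for doc in found])
```

Because `keys` is sorted, the lookup jumps straight to the `Max` entries instead of walking past `Anna` and `Chris`, and the pointers then give direct access to the two full documents.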
36 00:02:47,810 --> 00:02:53,250 However, you also shouldn't overdo it with the indexes because you could of course think, well ok, I got 37 00:02:53,250 --> 00:02:56,840 my products collection and every product has ID, name, age and hobby 38 00:02:57,000 --> 00:03:01,920 and then I could simply create indexes for all fields and I will have the best performance ever, right? 39 00:03:02,070 --> 00:03:05,320 Because no matter what I look for, I got an index for that. 40 00:03:05,760 --> 00:03:09,200 Well, this might speed up your find queries, 41 00:03:09,220 --> 00:03:14,340 that is correct, indexes on all fields would speed up find queries because you can query for every field 42 00:03:14,340 --> 00:03:18,970 efficiently, but an index does not come for free. 43 00:03:19,200 --> 00:03:26,130 You will pay some performance cost on inserts because that extra index that has to be maintained needs 44 00:03:26,130 --> 00:03:27,910 to be updated with every insert. 45 00:03:27,930 --> 00:03:28,760 Makes sense, right, 46 00:03:28,770 --> 00:03:32,920 because we have an ordered list of elements with pointers to the documents. 47 00:03:33,090 --> 00:03:39,330 So if you add a new document, you also have to add a new element to the index and that might sound simple 48 00:03:39,420 --> 00:03:44,700 and it won't take super long, but if you've got 10 indexes for your documents in your collection 49 00:03:44,940 --> 00:03:50,400 and you have to update all 10 indexes for every insert, then you might quickly run into issues because 50 00:03:50,400 --> 00:03:57,000 you'll have to do a lot of work for all these fields, for every insert and for every update too. Therefore, 51 00:03:57,210 --> 00:04:03,640 indexes don't come for free and you really have to find out which indexes make sense and which indexes 52 00:04:04,100 --> 00:04:07,360 don't, and this is exactly what I will also do in this module with you.
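The write cost can be modeled the same way. This is a rough sketch, not a benchmark and not how MongoDB is implemented: the `Collection` class, its `index_writes` counter and the `name`, `seller` and `price` fields are all assumptions made for illustration. It simply counts how many index entries have to be placed for a batch of inserts.

```python
from bisect import insort


class Collection:
    """Toy collection that keeps one sorted index per indexed field."""

    def __init__(self, indexed_fields):
        self.documents = []
        self.indexes = {field: [] for field in indexed_fields}
        self.index_writes = 0

    def insert(self, doc):
        self.documents.append(doc)
        pos = len(self.documents) - 1
        # Every index needs its own new entry, kept in sorted order.
        for field, entries in self.indexes.items():
            insort(entries, (doc[field], pos))
            self.index_writes += 1


lean = Collection(["seller"])
heavy = Collection(["name", "seller", "price"])

for i in range(1000):
    doc = {"name": f"product-{i}", "seller": f"seller-{i % 10}", "price": i * 1.5}
    lean.insert(doc)
    heavy.insert(doc)

print("index writes with 1 index:  ", lean.index_writes)
print("index writes with 3 indexes:", heavy.index_writes)
```

The extra work grows with the number of indexes: one additional ordered insert per index for every document, which is the cost to weigh against faster finds.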
53 00:04:07,440 --> 00:04:12,360 I will walk you through all the different index types that exist in MongoDB and I will also show 54 00:04:12,360 --> 00:04:16,480 you how you can measure whether an index makes sense or does not make sense.