1 00:00:02,230 --> 00:00:08,440 That's it for this module and I want to conclude it with a wrap up and the first important thing in this 2 00:00:08,440 --> 00:00:13,930 wrap up is that I reiterate how you should think about data modeling and structuring. 3 00:00:14,170 --> 00:00:17,860 Consider the format in which you'll fetch your data, 4 00:00:17,890 --> 00:00:21,060 how does your application or your data scientists, 5 00:00:21,070 --> 00:00:23,580 how do they need the data? 6 00:00:23,800 --> 00:00:29,460 You want to store your data in a way that it's easy to fetch especially if you're building an app or 7 00:00:29,660 --> 00:00:35,660 if you're having a use case where you fetch a lot because that's the next important question, 8 00:00:35,770 --> 00:00:39,430 how often do you fetch data and how often do you change it? 9 00:00:39,460 --> 00:00:45,190 Do you need to optimize for writes or for reads? Often it's for reads but it could be different 10 00:00:45,190 --> 00:00:48,270 for you. If you write your data a lot, 11 00:00:48,310 --> 00:00:51,080 you should definitely avoid duplicates, 12 00:00:51,190 --> 00:00:53,020 make sure you got no duplicates, 13 00:00:53,050 --> 00:00:59,490 if you read a lot, maybe some duplicates are ok if these duplicates don't change that often, 14 00:00:59,500 --> 00:01:04,930 basically what I covered when walking three different relations setup options. 15 00:01:04,930 --> 00:01:08,960 Also keep in mind how much data will you save and how big is it, 16 00:01:08,980 --> 00:01:16,390 do you store all citizens of New York City? Maybe embedding the data is not your best choice then. How 17 00:01:16,390 --> 00:01:20,570 is your data related in general, one-to-one, one-to-many, many-to-many? 18 00:01:20,590 --> 00:01:26,420 This often gives an indication for whether you want to go with embedded documents or references and 19 00:01:26,410 --> 00:01:28,840 in general, will duplicates hurt you, 20 00:01:28,930 --> 00:01:32,990 do you update your data a lot in which you have to update a lot of duplicates, 21 00:01:33,100 --> 00:01:38,740 do you maybe have snapshot data like with orders and products where you don't care about updates to 22 00:01:38,740 --> 00:01:41,920 the most recent data to your products data. 23 00:01:42,490 --> 00:01:49,870 And finally, will you hit any data or storage limits, like do you need to embed 100 levels deep which is 24 00:01:49,870 --> 00:01:50,350 the limit 25 00:01:50,350 --> 00:01:56,380 mongodb imposes? It's not very likely but that is something you should at least be aware of and which 26 00:01:56,380 --> 00:02:01,380 you should keep in mind. With that, here's the general summary. In this module, 27 00:02:01,380 --> 00:02:07,690 we had a detailed look at how you model schemas, that this can make sense even though mongodb does 28 00:02:07,690 --> 00:02:13,700 not enforce a schema onto you because your application typically needs data in a specific structure. 29 00:02:13,810 --> 00:02:19,340 Important factors are the factors I just mentioned, frequency, relations, size and so on. 30 00:02:20,460 --> 00:02:27,480 You also learned about data types there by the way, text, integers, long integers, decimals and I will also come 31 00:02:27,480 --> 00:02:30,540 back to the numbers later in this course. 32 00:02:30,540 --> 00:02:33,590 We also had a detailed look at relations in this module, 33 00:02:33,600 --> 00:02:38,920 you basically have two options, embedding or references and you want to use embedded documents 34 00:02:38,940 --> 00:02:41,470 if you've got a one-to-one or one-to-many relationship 35 00:02:41,580 --> 00:02:48,030 and if it makes sense for your app and the way you interact with your data with all the points I highlighted 36 00:02:48,030 --> 00:02:54,270 throughout this module and a few seconds ago. References makes sense for many-to-many relationships 37 00:02:54,270 --> 00:03:01,350 or if you want to fetch data separated or if you got very large amounts of nested data, like New York 38 00:03:01,350 --> 00:03:08,150 City with all its citizens. Exceptions are always possible and there is no hard rule, 39 00:03:08,160 --> 00:03:14,100 like many-to-many should always use references as you saw with the example of products and orders but 40 00:03:14,670 --> 00:03:20,160 keeping an eye on your app requirements and how your app works and interacts with your data gives you 41 00:03:20,160 --> 00:03:23,820 a good clue and indication of how you want to structure your data 42 00:03:23,820 --> 00:03:30,270 typically. Last but not least in this module, we had a look at schema validation which allows you to enforce 43 00:03:30,270 --> 00:03:36,980 a certain set of rules on incoming data so that no invalid data ends up in your database. 44 00:03:36,990 --> 00:03:40,050 You learned how you can define such rules and how that works with 45 00:03:40,110 --> 00:03:41,280 json schema 46 00:03:41,580 --> 00:03:46,540 and you also learned that you can configure how your rules are applied, 47 00:03:46,710 --> 00:03:54,230 if actions get blocked or if they just get a warning logged to the log files. This is it for this module 48 00:03:54,310 --> 00:04:00,510 and with that you hopefully got a good understanding of how you have to think about structuring data 49 00:04:00,720 --> 00:04:06,170 and how you can then implement your structure, relate your documents and validate incoming data.