This is the aggregation function as we executed it in the last lecture with group. Now as you saw there, we lost all the existing data, but that made sense because we grouped our data together, and if you do such a grouping, you will typically be fine with losing the data.

Now of course, when we ran that method here, when we ran our pipeline like this, what we got was a bunch of outputs in a totally unsorted order. Of course we can also sort, and here you already see the advantage of the aggregation pipeline. You can sort at any place, but we probably want to sort by total persons now, and this is something we can only do after having grouped. We can't run the sort on our input data, because that would just be the person documents, and there we can sort on things like the age and so on, but we can't sort on the amount of persons in a state, because that is a result we only derive here.

So what we can do is add a new pipeline stage, the sort stage. The sort stage also takes a document as an input to define how the sorting should happen, and you can simply sort as you sorted before. So you can say: I now want to sort by total persons, referring to the field which we introduced in the last pipeline stage. This is not a field that exists in our input dataset, but that does not matter, because as you learned, each pipeline stage passes some output data to the next stage, and that output data is the only data the next stage has. So this sort stage does not have access to the original data as we fetched it from the collection; it only has access to the output data of our group stage. There, we will have a total persons field, and we can now sort by this in descending order to have the highest values first.

If we now copy that over into our shell, we indeed see that we have some sorted results. So you see we got a bunch of results: that is the first one with 33 persons in this state, then we got 22 persons in this state, and here we got a bit of a longer state name, hence the different output, but there we got 24 persons. So this looks alright to me; we see the sorting works, and the interesting thing here really is that the sorting was done on the output of our previous stage, so on the output of the group stage.
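As a reference, here is a minimal sketch of the pipeline at this point. The collection and field names (persons, location.state, totalPersons) are assumptions based on the dataset used in the earlier lectures; adjust them to your own data.

```js
// Sketch: group person documents by state, count them,
// then sort by that derived count in descending order.
// Names (persons, location.state, totalPersons) are assumed
// from the earlier lectures.
db.persons.aggregate([
  {
    $group: {
      _id: { state: "$location.state" }, // grouping key
      totalPersons: { $sum: 1 }          // derived field: documents per state
    }
  },
  {
    // $sort only sees the output of $group, so it can sort on
    // totalPersons even though the original documents never had it.
    $sort: { totalPersons: -1 } // -1 = descending, highest values first
  }
]).pretty();
```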
And I hope this already shows you how much power you have with these tools, because this is essentially a kind of operation we couldn't do with the normal find method: there, we can't group and then sort on the result of our group. With just find, we would have to do that in the client-side code, whereas with aggregate, we can run it on the MongoDB server and then simply get back the data we need in our client to work with.
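To make that contrast concrete, here is a rough sketch of what the equivalent client-side work would look like with find() alone. It is illustrative only: it pulls every document over the wire and then groups and sorts in application code, which is exactly what the server-side pipeline spares us.

```js
// Hypothetical client-side equivalent using find() only:
// every document must be transferred to the client first,
// then grouped and sorted in application code.
const counts = {};
db.persons.find({}, { "location.state": 1 }).forEach(doc => {
  const state = doc.location.state;
  counts[state] = (counts[state] || 0) + 1;
});
const sorted = Object.keys(counts)
  .map(state => ({ _id: { state: state }, totalPersons: counts[state] }))
  .sort((a, b) => b.totalPersons - a.totalPersons);
// With aggregate, all of this runs on the MongoDB server instead.
```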