First time here - please go easy… ;)
I'm starting off with MongoDB for the first time - using the offical PHP driver to interact with an application. Here's the first problem I've ran into with regards to the aggregation framework. I have a collection of documents, all of which contain an array of numbers, like in the following shortened example...
{
"_id": ObjectId("51c42c1218ef9de420000002"),
"my_id": 1,
"numbers": [
482,
49,
382,
290,
31,
126,
997,
20,
145
],
}
{
"_id": ObjectId("51c42c1218ef9de420000006"),
"my_id": 2,
"numbers": [
19,
234,
28,
962,
24,
12,
8,
643,
145
],
}
{
"_id": ObjectId("51c42c1218ef9de420000008"),
"my_id": 3,
"numbers": [
912,
18,
456,
34,
284,
556,
95,
125,
579
],
}
{
"_id": ObjectId("51c42c1218ef9de420000012"),
"my_id": 4,
"numbers": [
12,
97,
227,
872,
103,
78,
16,
377,
20
],
}
{
"_id": ObjectId("51c42c1218ef9de420000016"),
"my_id": 5,
"numbers": [
212,
237,
103,
93,
55,
183,
193,
17,
346
],
}
Using the aggregation framework and PHP (which I think is the correct way), I'm trying to work out the average amount of times a number doesn't appear in a collection (within the numbers array) before it appears again. For example, the average amount of times the number 20 doesn't appear in the above example is 1.5 (there's a gap of 2 collections, followed by a gap of 1 - add these values together, divide by number of gaps). I can get as far as working out if the number 20 is within the results array, and then using the $cond operator, passing a value based on the result. Here’s my PHP…</p>
$unwind_results = array(
'$unwind' => '$numbers'
);
$project = array (
'$project' => array(
'my_id' => '$my_id',
'numbers' => '$numbers',
'hit' => array('$cond' => array(
array(
'$eq' => array('$numbers',20)
),
0,
1
)
)
)
);
$group = array (
'$group' => array(
'_id' => '$my_id',
'hit' => array('$min'=>'$hit'),
)
);
$sort = array(
'$sort' => array( '_id' => 1 ),
);
$avg = $c->aggregate(array($unwind_results,$project, $group, $sort));
What I was trying to achieve, was to setup up some kind of incremental counter that reset everytime the number 20 appeared in the numbers array, and then grab all of those numbers and work out the average from there…But im truly stumped.
I know I could work out the average from a collection of documents on the application side, but ideally I’d like Mongo to give me the result I want so it’s more portable.
Would Map/Reduce need to get involved somewhere?
Any help/advice/pointers greatly received!