I’ve been struggling with a map-reduce problem and am hoping someone can give me a push in the right direction.
Given the following data set:
[
{ "key1": "category1"
, "key2": "category2"
, "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua"
}
, { "key1": "category1"
, "key2": "category2"
, "text": "Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."
}
, { "key1": "category3"
, "key2": "category4"
, "text": "Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur."
}
, { "key1": "category3"
, "key2": "category4"
, "text": "Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
}
]
I would like a result similar to the following:
[
{ "key": [ "category1", "category2" ]
, "value":
{ "word_count": 36
, "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. "
}
}
, { "key": [ "category3", "category4" ]
, "value":
{ "word_count": 33
, "text": "Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
}
}
]
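To make the target concrete, here is a rough plain-JavaScript sketch of the grouping I am after. It is not a CouchDB reduce function, only an illustration of the output shape. The names `groupDocs` and `countWords` are made up for this example, and it assumes documents shaped like the sample above, grouped on key1 and key2.

```javascript
// Plain JavaScript, not a CouchDB view: it only illustrates the desired grouping.
// countWords and groupDocs are hypothetical helper names.
function countWords(text) {
  // Split on runs of whitespace and drop empty pieces.
  return text.split(/\s+/).filter(Boolean).length;
}

function groupDocs(docs) {
  var groups = {}; // lookup from serialized key to result entry
  var order = [];  // remembers the first-seen order of the keys
  docs.forEach(function (doc) {
    var id = JSON.stringify([doc.key1, doc.key2]);
    if (!groups[id]) {
      groups[id] = {
        key: [doc.key1, doc.key2],
        value: { word_count: 0, text: "" }
      };
      order.push(id);
    }
    var value = groups[id].value;
    value.word_count += countWords(doc.text);
    // Join the texts of one group with a single space.
    value.text += (value.text ? " " : "") + doc.text;
  });
  return order.map(function (id) { return groups[id]; });
}
```

Calling `groupDocs` on the four sample documents should give two entries, one per key pair, with the word counts summed and the texts joined by a space.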
I’ve got a map function that may or may not be in the right direction:
function (doc) {
  var text = doc.text;
  var wordCount = text.split(' ').length;
  emit([doc.network_type_code, doc.foreign_user_id, doc.time_posted], {
    ids: [doc._id],
    wordCount: wordCount,
    text: text
  });
}
But I have no idea where to go with the reduce function. Please help, and thank you in advance.