Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


mongodb - Can't get allowDiskUse:True to work with pymongo

I'm running into the "aggregation result exceeds maximum document size (16MB)" error when running a MongoDB aggregation through pymongo.

I was able to overcome it at first using the limit() option. However, at some point I got the "Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." error.

OK, so I'll use the {'allowDiskUse': True} option. It works when I use it on the command line, but when I tried to use it in my Python code:

result = work1.aggregate(pipe, 'allowDiskUse:true')

I get a "TypeError: aggregate() takes exactly 2 arguments (3 given)" error, in spite of the definition given at http://api.mongodb.org/python/current/api/pymongo/collection.html#pymongo.collection.Collection.aggregate: aggregate(pipeline, **kwargs).

I then tried to use runCommand, or rather its pymongo equivalent:

db.command('aggregate','work1',pipe, {'allowDiskUse':True})

but now I'm back to the "aggregation result exceeds maximum document size (16MB)" error.

In case you need it, here is the pipeline:

pipe = [{'$project': {'_id': 0, 'summary.trigrams': 1}}, {'$unwind': '$summary'}, {'$unwind': '$summary.trigrams'}, {'$group': {'count': {'$sum': 1}, '_id': '$summary.trigrams'}}, {'$sort': {'count': -1}}, {'$limit': 10000}]

Thank you


1 Answer


So, in order:

  • aggregate is a method. It takes 2 positional arguments (self, which is implicitly passed, and pipeline) and any number of keyword arguments, which must be passed as foo=bar; if there's no = sign, it's not a keyword argument. Your string 'allowDiskUse:true' is therefore treated as a third positional argument, which is what the TypeError is complaining about. This means you need to call result = work1.aggregate(pipe, allowDiskUse=True).

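  • Your error about maximum document size is inherent to Mongo. No single BSON document can be larger than 16 megabytes, and when aggregate returns its results inline as one document, the whole result array has to fit within that limit. I can't tell you exactly why you hit it because you have given us neither your data nor your code, but it probably means that the document you're building as an end result is too large. Try decreasing the $limit parameter, maybe? Start by setting it to 1, run a test, then increase it and look at how big the result gets when you do that. As far as I know, with MongoDB 2.6 or later you can also ask for the results as a cursor (in PyMongo 2.x by passing cursor={} to aggregate), in which case the 16 MB limit applies to each result document rather than to the result set as a whole.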
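
The difference between a positional and a keyword argument can be tried without a database. The following is only a minimal sketch: StandInCollection is a hypothetical stand-in that copies the aggregate(pipeline, **kwargs) signature quoted in the question and does not talk to MongoDB, and the one-stage pipeline is a placeholder.

```python
class StandInCollection:
    """Hypothetical stand-in mimicking the aggregate(pipeline, **kwargs) signature.

    This is NOT pymongo; it only echoes back what it was called with.
    """

    def aggregate(self, pipeline, **kwargs):
        # Return the arguments so the effect of each call style is visible.
        return {"pipeline": pipeline, "options": kwargs}


work1 = StandInCollection()
pipe = [{"$limit": 1}]  # placeholder pipeline for illustration

# A bare string is a second positional argument, which the signature rejects.
try:
    work1.aggregate(pipe, "allowDiskUse:true")
    positional_call_failed = False
except TypeError:
    positional_call_failed = True

# name=value makes it a keyword argument, collected into **kwargs.
result = work1.aggregate(pipe, allowDiskUse=True)

# An options dict can be unpacked into keyword arguments with **.
options = {"allowDiskUse": True}
same_result = work1.aggregate(pipe, **options)
```

With a real pymongo collection the call has the same shape, work1.aggregate(pipe, allowDiskUse=True), and the ** form is handy if the options already live in a dict such as {'allowDiskUse': True}.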

