One big GridFS collection in MongoDB, or several smaller ones?

We are using MongoDB to store video data before and after conversion. Encoding one input file produces six or more output files (several quality grades in different formats). Input and output files share the same unique file identifier.

The question is which approach is better from a performance and scalability standpoint: store all output files in one large GridFS collection, with composite keys combining the original file ID, quality grade, and format, or keep a separate collection for each quality/format pair?

To me, both approaches have their pros and cons:

  • A single large output collection needs less configuration at read time, but a more complex query to locate a file;
  • Multiple output collections allow simpler and faster queries, but require extra configuration to decide which collection a given query should be routed to.
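To make the comparison concrete, here is a minimal sketch of the single-collection variant in Python, assuming the pymongo driver. All renditions go into one GridFS bucket, and the composite key (file ID, quality, format) lives in both the filename and the metadata so a single indexed query finds any variant. The helper names and the `renditions` bucket name are hypothetical, not part of any existing codebase.

```python
# Sketch of the single-bucket, composite-key layout (names are hypothetical).
# With pymongo the helpers below would be used roughly like this:
#   from pymongo import MongoClient
#   import gridfs
#   bucket = gridfs.GridFSBucket(MongoClient().videodb, bucket_name="renditions")
#   bucket.upload_from_stream(variant_filename(fid, q, f), data,
#                             metadata=variant_metadata(fid, q, f))
# An index on the metadata fields keeps lookups cheap:
#   db["renditions.files"].create_index([("metadata.file_id", 1),
#                                        ("metadata.quality", 1),
#                                        ("metadata.format", 1)])

def variant_filename(file_id, quality, fmt):
    """Composite key: original file ID + quality grade + container format."""
    return f"{file_id}/{quality}.{fmt}"

def variant_metadata(file_id, quality, fmt):
    """Structured fields so variants can be queried without parsing names."""
    return {"file_id": file_id, "quality": quality, "format": fmt}

def variant_query(file_id, quality, fmt):
    """Filter for GridFSBucket.find() locating one specific rendition."""
    return {"metadata.file_id": file_id,
            "metadata.quality": quality,
            "metadata.format": fmt}

print(variant_filename("abc123", "720p", "mp4"))  # abc123/720p.mp4
```

With the multi-collection variant, each quality/format pair would instead be its own bucket (e.g. `GridFSBucket(db, bucket_name="720p_mp4")`), and the application picks the bucket up front rather than passing the extra filter fields.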

It may come down to personal preference ... but what about scalability and performance? Has anyone dealt with a setup like this before? Any advice?

What is the best strategy given a large number of large files? Which approach will be easier to scale and partition in the future? Are there performance penalties in the long run?
