Copying files on a distributed system so that all servers have a copy of all files

Full disclosure: this is an interview question.

There are M machines. We need to copy the datasets on these M machines to each other so that every machine has a copy of all datasets. What is the optimal algorithm for this?

I know I can solve this problem in O(MN) (where N is the average number of datasets on each machine) by iterating through each server. Is there a better approach?



1 answer


How about a self-replication system?

http://en.wikipedia.org/wiki/Self-replication#A_self-reproducing_computer_program

E.g., if you have M = 100 machines, then for each dataset you will have:



Tick 1: 1 machine with the data
Tick 2: 2 machines with the data
Tick 3: 4 machines with the data
Tick 4: 8 machines with the data
Tick 5: 16 machines with the data
Tick 6: 32 machines with the data
Tick 7: 64 machines with the data
Tick 8: all 100 machines with the data (doubling 64 would cover up to 128)

      

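I think this is better than O(MN): every machine that already holds a dataset can copy it onward in parallel, so each dataset reaches all M machines in roughly log2(M) ticks.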
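The doubling idea can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not in the question: every machine holding the dataset can send it to exactly one new machine per tick, all transfers take one tick, and tick 1 is the starting state with a single holder. The function name `replication_schedule` is made up for illustration.

```python
def replication_schedule(machines: int) -> list[tuple[int, int]]:
    """Return (tick, holders) pairs for one dataset under the doubling model.

    Assumption: each holder copies the dataset to one new machine per tick,
    so the number of holders doubles until it is capped at `machines`.
    """
    if machines < 1:
        raise ValueError("machines must be at least 1")

    tick, holders = 1, 1  # tick 1: only the original machine has the data
    schedule = [(tick, holders)]
    while holders < machines:
        # Every current holder sends to one machine that lacks the data.
        holders = min(machines, holders * 2)
        tick += 1
        schedule.append((tick, holders))
    return schedule


if __name__ == "__main__":
    for tick, holders in replication_schedule(100):
        print(f"Tick {tick}: {holders} machine(s) with the data")
```

Under these assumptions the number of ticks grows with log2(M) rather than with M. The total number of copies made per dataset is still M - 1; the gain is that they happen in parallel.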
