Understanding server-side file serving for fast downloads

The theme of my project is to implement a distributed server that offers multiple files for multiple clients to download. The server hosts the files, and we want it to use better algorithms so that clients can download data from it quickly.

My project implementation idea:

Just as a client can use a download manager to fetch a file, there must be some server-side manager, code, or algorithm that pushes the file out quickly so that the client can download it. The client should not have to do anything other than choose a file to download!

How would I write code for such a server on the back end, similar to the multi-threaded downloaders that clients use on the front end?

How does the server seed (serve) the file to the client if, in Java, the client only sends the file path as a String to the server to request the download?

Or if I am missing something / my idea is completely wrong, please enlighten me with an alternative process / algorithm that I should implement on the server side. Please remember that the whole purpose of asking this question is the server-side seeding algorithm or equivalent algorithms / methods.

1 answer


I am assuming your server has a good internet connection with a wide upstream. If so, the limiting factor when multiple clients download multiple files is the bandwidth of those clients. In that case, you are likely to serve each file as fast as the client's downstream allows, so all you need is a ready-made HTTP server library to serve the downloads.
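As a minimal sketch of that "ready-made library" route, the JDK's built-in com.sun.net.httpserver.HttpServer can serve files from a directory. The class name DownloadServer, the "files" directory, port 8080, and the pool size of 8 are assumptions made up for illustration. The sketch also shows one way to handle the path String the client sends: resolve it against a root directory and reject anything that escapes that root.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.Executors;

// Hypothetical example class, not part of any library.
class DownloadServer {

    // Resolve the client-supplied path against the root directory.
    // Returns null when the result would lie outside the root (e.g. "/../secret").
    static Path resolveSafely(Path root, String requested) {
        String relative = requested.replaceFirst("^/+", "");
        Path base = root.toAbsolutePath().normalize();
        Path target = base.resolve(relative).normalize();
        return target.startsWith(base) ? target : null;
    }

    static void serve(Path root, HttpExchange exchange) throws IOException {
        try {
            Path file = resolveSafely(root, exchange.getRequestURI().getPath());
            if (file == null || !Files.isRegularFile(file)) {
                exchange.sendResponseHeaders(404, -1); // -1: no response body
                return;
            }
            exchange.getResponseHeaders().set("Content-Type", "application/octet-stream");
            exchange.sendResponseHeaders(200, Files.size(file));
            try (OutputStream out = exchange.getResponseBody()) {
                Files.copy(file, out); // stream the file to the client
            }
        } finally {
            exchange.close();
        }
    }

    static HttpServer start(Path root, int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> serve(root, exchange));
        server.setExecutor(Executors.newFixedThreadPool(8)); // assumed pool size
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(Paths.get("files"), 8080); // assumed directory and port
    }
}
```

With this running, a client would only need to request something like http://localhost:8080/some/file.bin; any ordinary download manager or browser can then do the rest.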

The case where your backend implementation really matters, and can actually improve download performance, is when many users connect to your server and download many files. In that case, consider the following points first:

  • TCP has a slow-start phase. When you first open a connection, the download speed ramps up gradually until it reaches the maximum. To minimize this cost when downloading multiple files, the connection opened for one file should be reused for the next file.

  • Downloading multiple files at the same time (on the client side) is impractical when bandwidth is the limiting factor, because the client has to run many TCP connections, and the data will either be fragmented when written to disk or (if the space is allocated in advance) keep the disk busy seeking between sectors.

  • Your server should generally use a non-blocking IO library (like java.nio) and refrain from creating a thread for each connection, as this results in thrashing, which dramatically decreases server performance.
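To make the first and last points a bit more concrete, here is a rough sketch of per-connection state that a java.nio selector loop could drive. It queues several files for ONE connection (so the connection is reused) and sends them with FileChannel.transferTo, stopping whenever the socket cannot take more data. The class name ConnectionTransfers and its methods are hypothetical; the selector loop itself, request parsing, and the framing that tells the client where one file ends (Content-Length in HTTP, for example) are all left out.

```java
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical per-connection state: files waiting to be sent over one connection.
class ConnectionTransfers implements AutoCloseable {
    private final Queue<Path> pending = new ArrayDeque<>();
    private FileChannel current; // file being sent right now, or null
    private long position;       // bytes of the current file already sent

    void enqueue(Path file) {
        pending.add(file);
    }

    // Intended to be called when the selector reports the socket as writable.
    // Sends as much as the channel accepts and returns true once every
    // queued file has been sent completely.
    boolean writeMore(WritableByteChannel out) throws IOException {
        while (true) {
            if (current == null) {
                Path next = pending.poll();
                if (next == null) {
                    return true; // nothing left to send
                }
                current = FileChannel.open(next, StandardOpenOption.READ);
                position = 0;
            }
            long remaining = current.size() - position;
            if (remaining == 0) {
                current.close(); // this file is done, move on to the next one
                current = null;
                continue;
            }
            long sent = current.transferTo(position, remaining, out);
            if (sent == 0) {
                return false; // socket buffer full: wait for the next writable event
            }
            position += sent;
        }
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path demo = Files.createTempFile("demo", ".txt");
        Files.write(demo, "served via transferTo\n".getBytes(StandardCharsets.UTF_8));
        try (ConnectionTransfers transfers = new ConnectionTransfers()) {
            transfers.enqueue(demo);
            // System.out stands in for a non-blocking SocketChannel here.
            transfers.writeMore(Channels.newChannel(System.out));
            System.out.flush();
        }
    }
}
```

The idea behind keeping position per connection is that a single thread can serve many clients: each time a socket becomes writable it pushes a little more of that client's file and moves on, instead of one thread blocking per download.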

If you have a really large number of clients downloading from your server at the same time, the limit you are likely to hit would be either:

  • Your provider's upstream bandwidth limit

  • Your disk's read speed (an SSD reads at roughly 500 MB/s, as far as I know)

Your server might try to keep the most requested files in memory and serve content from there (DDR3 RAM reaches about 17 GB/s). I doubt there are so few files on your server that you could cache them all in RAM.



So, the main engineering challenge lies in smartly choosing which content should be cached and which shouldn't. This can be done on a priority basis by assigning higher priorities to specific files or by a metric that encodes the likelihood that one file will be downloaded in the next few minutes. Or just the files that most clients are downloading at this point in time.
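One possible shape for the "most requested files" policy is a byte-budgeted cache that counts requests per file and only evicts files that are strictly less popular than the newcomer. This is just a sketch: the class name PopularityCache and its methods are hypothetical, the counters never decay (a real "next few minutes" metric would age them), and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory cache that prefers the most requested files.
class PopularityCache {
    private final long capacityBytes;
    private long usedBytes = 0;
    private final Map<String, Long> hits = new HashMap<>();     // request count per file
    private final Map<String, byte[]> cached = new HashMap<>(); // file contents held in RAM

    PopularityCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // Record one request and return the cached content, or null on a miss.
    byte[] lookup(String name) {
        hits.merge(name, 1L, Long::sum);
        return cached.get(name);
    }

    // Offer content that was just read from disk after a miss.
    // Evicts only files with fewer requests than the new one; returns
    // false (and evicts nothing) if that would not free enough room.
    boolean offer(String name, byte[] content) {
        if (cached.containsKey(name)) return true;
        if (content.length > capacityBytes) return false;
        long myHits = hits.getOrDefault(name, 0L);

        List<String> byHits = new ArrayList<>(cached.keySet());
        byHits.sort(Comparator.comparingLong(k -> hits.getOrDefault(k, 0L)));

        List<String> victims = new ArrayList<>();
        long free = capacityBytes - usedBytes;
        for (String candidate : byHits) {
            if (free >= content.length) break;
            if (hits.getOrDefault(candidate, 0L) >= myHits) break; // the rest are at least as popular
            victims.add(candidate);
            free += cached.get(candidate).length;
        }
        if (free < content.length) return false;

        for (String victim : victims) {
            usedBytes -= cached.remove(victim).length;
        }
        cached.put(name, content);
        usedBytes += content.length;
        return true;
    }

    public static void main(String[] args) {
        PopularityCache cache = new PopularityCache(10);
        cache.lookup("a.bin");
        System.out.println("admitted a.bin: " + cache.offer("a.bin", new byte[6]));
        cache.lookup("b.bin");
        System.out.println("admitted b.bin: " + cache.offer("b.bin", new byte[6]));
    }
}
```

The priority-based variant mentioned above would fit the same structure: replace the request count with a manually assigned priority, or with a weighted mix of the two.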

With these considerations in mind, you can push the limits of a single download server up to a certain point. Beyond that point, the only further improvement comes from distributing or copying your files to multiple servers.
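How a request then finds the right server depends on which of the two you do. A small sketch, with hypothetical class, method, and host names: if files are distributed (each file lives on one server), hashing the file name always maps a file to the same server; if files are copied to every server, a simple round-robin spreads the load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper that picks which server should handle a download.
class MirrorSelector {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    MirrorSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Distributed files: the same file name always maps to the same server.
    String serverFor(String fileName) {
        return servers.get(Math.floorMod(fileName.hashCode(), servers.size()));
    }

    // Copied files: every server has everything, so just take turns.
    String nextReplica() {
        return servers.get(Math.floorMod(next.getAndIncrement(), servers.size()));
    }

    public static void main(String[] args) {
        // Host names are made up for the example.
        MirrorSelector selector = new MirrorSelector(
                Arrays.asList("dl1.example.com", "dl2.example.com", "dl3.example.com"));
        System.out.println(selector.serverFor("movie.mkv"));
        System.out.println(selector.nextReplica());
    }
}
```

Note that plain modulo hashing reshuffles most files whenever a server is added or removed, so it only suits a fixed set of servers.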

If you are heading in a direction where you have to serve millions of clients at the same time, you should consider buying such a service from a CDN. They specialize in fast delivery and have servers in most autonomous systems (AS), so each client can download its files from a regional CDN server.


I know I have not provided a complete algorithm or implementation, but I did not set out to answer this question in full. I just wanted to give you some important guidance and thoughts on this topic. Hopefully you can use at least some of them for your project.
