How can I optimize this export function for a LAMP web application?

I have a page that lets users create projects and view them. They can import resources (PDFs, images, etc.) that are stored alongside their projects. Now I want to build a feature that lets a user export all of their material, plus that of everyone in the same group, all neatly bundled (ribbon tied) into a zip file.

I am currently using Archive::Zip, keeping the CRC32 checksums to verify file integrity, and running the packaging as a daily cron job to reduce user wait times. But if any of the files change, I have to rebuild the whole archive from scratch.

My initial benchmark shows that a 103 MB archive takes up to 47 seconds to build. This includes generating the XML that is bound to XSL, copying images, HTML for iframes, and so on.

My plan is to keep a table (or a text file) with the CRC32 checksum or last-modified date of every file in the temporary storage area, and compare against this list each time the user clicks export. For any changed files, I would delete the stale entry from the cached zip and add the updated version (a rough sketch of this is below). Alternatively, I could keep all the unchanged files, copy the new files in over the old ones, and rebuild the archive on every click.
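Roughly, here is a minimal sketch of the first option. I am on a LAMP stack, so this uses PHP's built-in ZipArchive for illustration rather than Archive::Zip itself, and the paths and manifest format are placeholders, not my real layout:

    <?php
    // Sketch: compare stored mtimes against the source tree and patch the
    // cached zip in place. Paths and manifest format are placeholders.

    $zipPath      = '/var/cache/exports/user42.zip';
    $manifestPath = '/var/cache/exports/user42.manifest'; // file => mtime map
    $sourceDir    = '/var/www/projects/user42';

    // Load the stored manifest (entry name => last-modified timestamp).
    $old = file_exists($manifestPath)
        ? unserialize(file_get_contents($manifestPath))
        : [];

    $zip = new ZipArchive();
    if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
        die("Cannot open $zipPath");
    }

    $new  = [];
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($sourceDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iter as $file) {
        $path  = $file->getPathname();
        $local = substr($path, strlen($sourceDir) + 1); // name inside the zip
        $mtime = $file->getMTime();
        $new[$local] = $mtime;

        // Re-add only files that are new or changed since the last run.
        if (!isset($old[$local]) || $old[$local] !== $mtime) {
            $zip->deleteName($local); // harmless if the entry doesn't exist
            $zip->addFile($path, $local);
        }
    }

    // Drop entries for files deleted from the source directory.
    foreach (array_diff_key($old, $new) as $local => $mtime) {
        $zip->deleteName($local);
    }

    $zip->close();
    file_put_contents($manifestPath, serialize($new));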

My questions:

  • Is this considered premature or a bad optimization technique?
  • How do I optimize this correctly?
  • Are there books or other resources where I can learn these optimization techniques?


1 answer


How about this idea:

  • Set a flag whenever a user's files change (a file is added, deleted, or modified).
  • Run your nightly compression only for users whose flag is set, then clear the flag.
  • If a user requests an export while their flag is set, you will have to compress again before the export can complete (there is no way around this); a sketch of this check follows the list.
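For example, the export-time check could look something like this minimal sketch, assuming a users table with a boolean export_dirty column; the column name and the rebuildArchive() helper are made-up stand-ins for your schema and zip code:

    <?php
    // Minimal sketch of the dirty-flag check. `export_dirty` and
    // rebuildArchive() are illustrative names, not your actual schema.

    function rebuildArchive(int $userId): void {
        // ... re-run the archive build for this user (as in the question) ...
    }

    function exportArchive(PDO $db, int $userId): string {
        $stmt = $db->prepare('SELECT export_dirty FROM users WHERE id = ?');
        $stmt->execute([$userId]);

        if ($stmt->fetchColumn()) {
            // Files changed since the nightly run: rebuild before serving.
            rebuildArchive($userId);
            $db->prepare('UPDATE users SET export_dirty = 0 WHERE id = ?')
               ->execute([$userId]);
        }
        return "/var/cache/exports/user{$userId}.zip";
    }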

To improve the user experience further, you can also separate the export request from the export operation. For example, when a user whose flag is set requests an export, tell them it will be ready shortly and set a second flag. Then modify the nightly job (the second step above) to also build the export package whenever this second flag is set.

This gives the user immediate feedback that something is happening, while deferring the heavy lifting to the background.
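A minimal sketch of that request handler follows; the second flag column, export_requested, is again an invented name:

    <?php
    // Sketch of the decoupled request handler: it only records the request
    // and returns immediately; a background job does the real work.

    function requestExport(PDO $db, int $userId): void {
        $stmt = $db->prepare('SELECT export_dirty FROM users WHERE id = ?');
        $stmt->execute([$userId]);

        if ($stmt->fetchColumn()) {
            // Archive is stale: queue a rebuild instead of making the user wait.
            $db->prepare('UPDATE users SET export_requested = 1 WHERE id = ?')
               ->execute([$userId]);
            echo 'Your export is being prepared and will be ready shortly.';
        } else {
            // Archive is current: hand out the cached file right away.
            echo "Your export is ready: /exports/user{$userId}.zip";
        }
    }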

Alternatively, you don't need to tie the export to the nightly compression at all. You can still compress every night, but also run additional compression/export jobs throughout the day as needed. Either way, it pays to separate the request from the operation.
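For instance, a small worker script run from cron every few minutes could drain the queue of requested exports; the connection details and flag columns below are illustrative, as above:

    <?php
    // Sketch of a worker run from cron every few minutes alongside the
    // nightly job, e.g.:  */5 * * * * php /usr/local/bin/export_worker.php

    function rebuildArchive(int $userId): void {
        // ... rebuild this user's archive and refresh the cached zip ...
    }

    // Connection details are placeholders.
    $db = new PDO('mysql:host=localhost;dbname=app', 'appuser', 'secret');

    // Pick up every user who asked for an export since the last run.
    $pending = $db->query('SELECT id FROM users WHERE export_requested = 1')
                  ->fetchAll(PDO::FETCH_COLUMN);

    foreach ($pending as $userId) {
        rebuildArchive((int) $userId);
        $db->prepare('UPDATE users SET export_dirty = 0, export_requested = 0 WHERE id = ?')
           ->execute([$userId]);
        // Notify the user here (email, on-site message, etc.).
    }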



Answering your specific questions.

1/ I don't consider this premature or poor optimization. The code is functionally complete, in that it does everything you ask of it, so this is a good time to optimize. In addition, you have identified the bottleneck, so you are optimizing the right area.

2/ See the text above. You optimize correctly by doing exactly what you did: identify the bottleneck and focus on improving it. Given that you are unlikely to squeeze much more performance out of the compression itself, the decoupling "trick" I suggested is a good one. Like progress bars and splash screens, this usually has more to do with users' perception of speed than with speed itself.

3/ Books? Don't worry, there are thousands of resources on the net. Keep asking on SO and reading the answers. Eventually your brain will be as full as mine, and every new piece of code will make you temporarily forget your wife's name :-).
