Large-File Upload Solution Required

Hi all!

I have a project where the client needs to be able to upload very large files to the website - i.e., possibly the occasional 200 MB file.

This type of upload cannot be handled simply by changing 'php.ini' configs - we require a more flexible solution. It seems that online services that handle this kind of large-file upload all use the same two-sided strategy:

  • Browser side: a (usually JavaScript) library checksums the large file, ‘breaks it down’ into smaller numbered chunks, and sends them one by one over several separate POST requests.
  • Server side: a server-side script receives the incoming POST requests with the file parts, reassembles them in order and checks the file’s integrity (see the rough sketch below).
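
To make the idea more concrete, here is a very rough sketch of what I imagine the server-side script could look like. The field names (chunk, chunkIndex, totalChunks, fileName) are placeholders I made up for illustration, not the API of any particular library:

    <?php
    // receive-chunk.php - sketch of the server-side half of a chunked upload.
    // Each POST carries one chunk as multipart form data, plus its index,
    // the total number of chunks and the original file name.

    $tmpDir      = __DIR__ . '/upload-tmp';
    $chunkIndex  = (int) ($_POST['chunkIndex'] ?? 0);
    $totalChunks = (int) ($_POST['totalChunks'] ?? 1);
    $fileName    = basename($_POST['fileName'] ?? 'upload.bin'); // basename() guards against path traversal

    if (!is_dir($tmpDir)) {
        mkdir($tmpDir, 0775, true);
    }

    // Store each chunk under its own index so it can be reassembled in order later.
    move_uploaded_file(
        $_FILES['chunk']['tmp_name'],
        $tmpDir . '/' . $fileName . '.part' . $chunkIndex
    );

    // Once every part has arrived, concatenate them in numeric order.
    if (count(glob($tmpDir . '/' . $fileName . '.part*')) === $totalChunks) {
        $out = fopen($tmpDir . '/' . $fileName, 'wb');
        for ($i = 0; $i < $totalChunks; $i++) {
            $part = $tmpDir . '/' . $fileName . '.part' . $i;
            fwrite($out, file_get_contents($part));
            unlink($part);
        }
        fclose($out);
    }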

I was wondering whether anyone has used - or has developed - such a solution, and how much Kirby’s undocumented ‘upload class’ could help. Any help would be greatly appreciated, as this is looking like a mammoth task, which may prevent me from taking on the project…

I have two suggestions:

1.) Instead of having the user upload the file directly, have them submit a link to Google Drive or Dropbox, and have the system download the file from there.
2.) Have them SFTP the file directly into the content directory.

I’m unsure why you think adjusting the PHP and webserver limits on upload size (and timeouts) isn’t sufficient, or what you mean by requiring more flexibility. Webservers routinely serve and receive files over 200 MB in size.
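
On a server you control, that usually comes down to raising a handful of directives, roughly along these lines (the values here are only illustrative):

    ; php.ini - illustrative values only
    upload_max_filesize = 256M
    post_max_size = 260M        ; must be at least as large as upload_max_filesize
    max_execution_time = 300
    max_input_time = 300
    memory_limit = 512M

plus the corresponding webserver limit (e.g. client_max_body_size in nginx or LimitRequestBody in Apache).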

In any case, there is no need to checksum files in transit when using TCP; TCP already checksums each segment to protect against corruption.

m.

@mrunkel thank you for the input. Let me clarify some points. The solution has to enable the client to upload files directly through their website - no S/FTP or third-party services like Dropbox. The site will initially be hosted on shared hosting, so tweaking all the necessary php.ini settings to accept 200 MB uploads is unfortunately out of the question.

If we search for ‘chunk upload’, we will see that many techniques have been developed over the last ten years for splitting large files, sending them via HTTP, and re-assembling them on the server. What my short research seems to show, however, is that while the individual pieces do mostly arrive ‘in order’, a number of issues can arise when re-assembling the file - such as not all pieces actually having arrived, or the server-side code assembling them in the wrong sequence. So many developers suggest getting a checksum of the large file before splitting it and comparing it with a checksum of the assembled result at the end - which has nothing to do with the TCP checks on each individual segment.
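
To illustrate what I mean, the final server-side step would be something along these lines - the variable names are mine, and the checksum would be computed in the browser before splitting and sent along with the chunks:

    // After the chunks have been concatenated into a single file:
    $assembledPath    = '/path/to/the/reassembled/file';       // produced by the reassembly step
    $expectedChecksum = $_POST['checksum'] ?? '';               // e.g. a SHA-256 hex digest computed in the browser
    $actualChecksum   = hash_file('sha256', $assembledPath);    // checksum of the reassembled file

    if (!hash_equals($expectedChecksum, $actualChecksum)) {
        // A chunk is missing, duplicated or out of order - discard the file and ask the client to retry.
        unlink($assembledPath);
        http_response_code(422);
        exit('Checksum mismatch - upload failed');
    }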

There may be different techniques for accomplishing this, and I’m certainly happy to hear any suggestions anyone might have.