The upload form I am working on for my company's new quote request page allows files to be uploaded. One thing I had not considered was file name and path length. Are there some best practices to handle these? Also unfortunately I am developing the site on a WAMP stack but the live server is Unix, so things are not identical in both environments.

We're not really concerned about it since very few people actually upload anything and when someone does it's always some reasonable length, but of course we'd like to be prudent and make sure everything is accounted for.

Currently I am checking if the file name is longer than 255 characters and just truncating it (while keeping the extension intact), but this doesn't work since move_uploaded_file() uses the full file path when creating the new file - which is longer than 255 characters. So it seems like you have to know the full length of the path + file name and deal with that - except that means the file name length can't truly use all 255 characters. Any best practices regarding this? My online searching just comes up with results about checking if a file name equals something specific or a specific extension.

Thanks!

    First thing I'd suggest is front-end verification. If your form field has a character limit and you have some JS checking the inputs prior to form submission, it's perhaps less likely to be a problem?

    Of course, management probably wants to know how long it will take to whip up the JS 😉

    From the notes at [man]move_uploaded_file[/man]:

    Yousef Ismaeil Cliprz wrote:
    <?php

    /**
     * Check $_FILES[][name] length.
     *
     * @param (string) $filename - Uploaded file name.
     * @return bool True if the name is longer than 225 characters.
     * @author Yousef Ismaeil Cliprz.
     */
    function check_file_uploaded_length ($filename)
    {
        return mb_strlen($filename, "UTF-8") > 225;
    }

    Looks fairly helpful?

      You should be careful using any uploaded file names to store user input. This opens the door for hackers to over-write important files. Keep in mind that when uploading files, each is given a (very short) filename usually something like /tmp/fsfasdfsfsafsf

      Do you really need all the path info provided with the original file?

        sneakyimp wrote:

        You should be careful using any uploaded file names to store user input. This opens the door for hackers to over-write important files.

        Not to mention uploaders overwriting each others' files by using the same name.

          Thanks for the replies guys!

          dalecosp;11048869 wrote:

          First thing I'd suggest is front-end verification. If your form field has a character limit and you have some JS checking the inputs prior to form submission, it's perhaps less likely to be a problem?

          I do have client-side verification for this, as well as file size and extension (there are server-side checks, too, of course).

          dalecosp;11048869 wrote:

          Looks fairly helpful?

          It is, and I have something similar. The issue is that when you move the file, the destination path has a cap as well - which includes the new file name. My intention is to keep as much of the original file name as possible, mostly for reference, so that when a potential client uploads 5 files they can be easily referenced. We are prepending their name and company (if entered) to the file name to avoid overwriting existing files (though I believe my manager moves them every now and then). If neither the name nor the company is entered, then a Unix timestamp is prepended.
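
          The cap on the destination path can be treated as a budget: subtract the directory and separator first, then give whatever remains to the prefix, the original name, and the extension. A minimal sketch, assuming a 255-byte limit per name and a 260-byte limit per path (common defaults on Windows, but not values confirmed in this thread); fit_upload_name() and its parameters are hypothetical:

          ```php
          <?php
          // Hypothetical helper. The default limits are assumptions, not values from this thread:
          // 255 bytes is a common per-name limit, 260 is the classic Windows path limit.
          function fit_upload_name(string $dir, string $prefix, string $original,
                                   int $maxName = 255, int $maxPath = 260): string
          {
              $dir = rtrim($dir, '/\\');
              // Room left for the file name once the directory and one separator are counted.
              $budget = min($maxName, $maxPath - strlen($dir) - 1);

              $ext    = pathinfo($original, PATHINFO_EXTENSION);
              $stem   = pathinfo($original, PATHINFO_FILENAME);
              $suffix = ($ext === '') ? '' : '.' . $ext;

              // Whatever is left after the prefix and extension goes to the original stem.
              $room = $budget - strlen($prefix) - strlen($suffix);
              if ($room < 1) {
                  throw new LengthException('No room left for the file name');
              }
              // mb_strcut() cuts by bytes but never splits a multibyte character.
              return $prefix . mb_strcut($stem, 0, $room, 'UTF-8') . $suffix;
          }
          ```

          Passing the same limits on the WAMP box and on the Unix server should keep the two environments consistent, since the stricter limit then applies everywhere.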

          sneakyimp;11048941 wrote:

          You should be careful using any uploaded file names to store user input. This opens the door for hackers to over-write important files. Keep in mind that when uploading files, each is given a (very short) filename usually something like /tmp/fsfasdfsfsafsf

          I am not sure what you mean by "to store user input". The files are stored in a specific folder. I'm not sure how a hacker would be able to overwrite important files. Am I missing something?

          sneakyimp;11048941 wrote:

          Do you really need all the path info provided with the original file?

          No, but I have to provide the destination path.

          Weedpacket;11048945 wrote:

          Not to mention uploaders overwriting each others' files by using the same name.

          We are prepending their name and company name to the file. If neither is entered, then a Unix timestamp is prepended. My manager is not really concerned if a file happens to be overwritten because so few people use the file upload feature. shrug

          In any case, my manager just instructed me to cap it at 100 characters and be done with it, so that is what I have done.

            Why not just a table with 2 columns path_on_disk, original_upload_name and then you can be sure to have unique paths on disk (by checking if the path exists in the db before moving). Also, then you're not concatenating a bunch of things together and hoping the result isn't too long.
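
            A rough sketch of that two-column table, assuming PDO with an in-memory SQLite database purely for illustration (the thread does not say which database is available). The column names come from the description above; the uploads table name and register_upload() are hypothetical. Here the primary key enforces uniqueness, instead of a separate "does it exist" query:

            ```php
            <?php
            // Sketch only: SQLite and the "uploads" table name are assumptions.
            function register_upload(PDO $db, string $originalName): string
            {
                $insert = $db->prepare(
                    'INSERT INTO uploads (path_on_disk, original_upload_name) VALUES (?, ?)'
                );
                for ($attempt = 0; $attempt < 5; $attempt++) {
                    // Generated name on disk; the client's name is only stored as data.
                    $path = bin2hex(random_bytes(8));
                    try {
                        $insert->execute([$path, $originalName]);
                        return $path;
                    } catch (PDOException $e) {
                        // Most likely a primary key collision; try another name.
                    }
                }
                throw new RuntimeException('Could not allocate a unique path');
            }

            $db = new PDO('sqlite::memory:');
            $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
            $db->exec('CREATE TABLE uploads (
                path_on_disk TEXT PRIMARY KEY,
                original_upload_name TEXT NOT NULL
            )');
            ```

            The returned value is what you would append to the upload folder when calling move_uploaded_file(); its length is fixed, so the path length no longer depends on what the user sent.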

            As for the security/hacker risk, if I upload a file with name ../../../../../../usr/sbin/some_app and you don't do any cleaning of that path, simply prepending your specific path in front does not fix the problem this causes. If that's all you do, you end up with /path/to/your/upload/folder/../../../../../../../path/to/executable see the problem?
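
            One way to defuse that kind of name is to throw away everything except the last path component before using it. A minimal sketch; safe_upload_name() is a hypothetical helper, and the allowed character set is a deliberately conservative assumption (it also replaces non-ASCII characters):

            ```php
            <?php
            // Hypothetical helper: reduce a client-supplied name to a bare file name.
            function safe_upload_name(string $clientName): string
            {
                // Treat backslashes as separators too; basename() on Unix only splits on "/".
                $name = basename(str_replace('\\', '/', $clientName));
                // Keep a conservative character set; every other byte becomes "_".
                $name = preg_replace('/[^A-Za-z0-9._-]/', '_', $name);
                // Refuse names that are empty or consist only of dots.
                if ($name === '' || trim($name, '.') === '') {
                    throw new InvalidArgumentException('Unusable file name');
                }
                return $name;
            }

            echo safe_upload_name('../../../../../../usr/sbin/some_app'), "\n"; // some_app
            echo safe_upload_name('..\\..\\boot.ini'), "\n";                     // boot.ini
            ```

            The result still comes from the client, so it is better treated as a label than as proof of what the file contains.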

              Derokorian;11049039 wrote:

              Why not just a table with 2 columns path_on_disk, original_upload_name and then you can be sure to have unique paths on disk (by checking if the path exists in the db before moving). Also, then you're not concatenating a bunch of things together and hoping the result isn't too long.

              More work than my manager wants to invest in it. We have had the current request form up for about 2.5 years and there are something like 30 files. It's just not a concern to him. If it were my own personal file upload then I would definitely implement measures to avoid duplicate/overwriting file names.

              Derokorian;11049039 wrote:

              As for the security/hacker risk, if I upload a file with name ../../../../../../usr/sbin/some_app and you don't do any cleaning of that path, simply prepending your specific path in front does not fix the problem this causes. If that's all you do, you end up with /path/to/your/upload/folder/../../../../../../../path/to/executable see the problem?

              Are forward slashes even valid in file names? Forgive my ignorance but how would someone even submit such a thing?

                Why not add a random string to it?

                $length = 10;
                $random = substr(str_shuffle("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"), 0, $length);
                

                I did something slightly similar to this when making a photo album upload; it's fairly easy to implement.
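
                A variation on the same idea, for what it's worth: str_shuffle() is not meant for unpredictable values and never repeats a character, so a sketch using random_bytes() may be a safer default. random_suffix() is a hypothetical name, and holiday.jpg is just a placeholder:

                ```php
                <?php
                // Hypothetical helper: $bytes = 5 yields 10 hexadecimal characters.
                function random_suffix(int $bytes = 5): string
                {
                    return bin2hex(random_bytes($bytes));
                }

                // Placeholder file name, prefixed with the random string.
                $stored = random_suffix() . '_' . 'holiday.jpg';
                ```

                The prefix has a fixed length, so it is easy to account for when capping the total name length.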

                  Bonesnap;11049075 wrote:

                  More work than my manager wants to invest in it. We have had the current request form up for about 2.5 years and there are something like 30 files. It's just not a concern to him. If it were my own personal file upload then I would definitely implement measures to avoid duplicate/overwriting file names.

                  Just make sure you let him know it's his fault if stuff goes wrong. Is anyone really submitting super-long file names?

                  Bonesnap;11049075 wrote:

                  Are forward slashes even valid in file names? Forgive my ignorance but how would someone even submit such a thing?

                  The problem here is not what a file system permits as a valid filename, but what filename a hacker might send as part of a forged request to your system. I believe that $_FILES['file_input_name']['name'] and $_FILES['file_input_name']['type'] are typically sent by one's browser to the server. A hacker can just spoof this information using cURL or something and concoct some kind of 'name' value that is full of slashes or backslashes or your momma's old stripper name. This information is not trustworthy and you should never rely on it to store the data on your server's file system without validating it first.

                  If your user audience is small or limited somehow and people have to login to upload files to your server, it's no big deal. If your server is exposed to the entire internet, you should be very careful about what you let people upload to your machine.

                  If, by appending or prepending additional strings to your filename, you are somehow exceeding some limit (perhaps your file system limits filenames to 512 chars) then you should reconsider doing that at all. I would recommend using something like [man]uniqid[/man] to generate a unique filename, maybe add a userid or something so you know who it belongs to, then keeping a database (this could be a flat file even) in which you store metadata about the file, possibly including its original filename and ext, which user id owns it, etc, etc. The downside is that A3ADFEA34321897.jpg or whatever is not a useful filename if your database gets destroyed (this is why I try to store a userid in the filename), but it solves other problems.
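
                  Roughly what that could look like with a flat file as the "database". This is only a sketch: stored_name(), log_upload(), the userid_uniqid.ext layout, and the one-JSON-object-per-line log are all assumptions made for illustration:

                  ```php
                  <?php
                  // Hypothetical helpers; the name layout and the log format are assumptions.
                  function stored_name(int $userId, string $originalName): string
                  {
                      // The extension comes from the client, so a real form should still
                      // check it against an allow-list before trusting it.
                      $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
                      // uniqid() is time-based rather than random, so add random bytes as well.
                      $id = uniqid() . bin2hex(random_bytes(4));
                      return $userId . '_' . $id . ($ext === '' ? '' : '.' . $ext);
                  }

                  function log_upload(string $logFile, string $storedName,
                                      string $originalName, int $userId): void
                  {
                      $row = [
                          'stored'   => $storedName,
                          'original' => $originalName,
                          'user'     => $userId,
                          'time'     => time(),
                      ];
                      // One JSON object per line; LOCK_EX avoids interleaved writes.
                      file_put_contents($logFile, json_encode($row) . "\n", FILE_APPEND | LOCK_EX);
                  }
                  ```

                  The original name is kept only as data in the log, so its length and its characters never touch the file system.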

                  Seems pretty simple to me: If there is some constraint against long filenames, you must either remove that constraint or you must shorten your file names.

                    sneakyimp;11049125 wrote:

                    Just make sure you let him know it's his fault if stuff goes wrong. Is anyone really submitting super-long file names?

                    Haha he knows. And no, all uploaded files are of reasonable length. It's more of an afterthought, really. Our previous incarnation of this system had very few file checks other than extension. We just wanted to have something in place in case someone submitted something that was very long.

                    sneakyimp;11049125 wrote:

                    The problem here is not what a file system permits as a valid filename, but what filename a hacker might send as part of a forged request to your system. I believe that $_FILES['file_input_name']['name'] and $_FILES['file_input_name']['type'] are typically sent by one's browser to the server. A hacker can just spoof this information using cURL or something and concoct some kind of 'name' value that is full of slashes or backslashes or your momma's old stripper name. This information is not trustworthy and you should never rely on it to store the data on your server's file system without validating it first.

                    If your user audience is small or limited somehow and people have to login to upload files to your server, it's no big deal. If your server is exposed to the entire internet, you should be very careful about what you let people upload to your machine.

                    If, by appending or prepending additional strings to your filename, you are somehow exceeding some limit (perhaps your file system limits filenames to 512 chars) then you should reconsider doing that at all. I would recommend using something like [man]uniqid[/man] to generate a unique filename, maybe add a userid or something so you know who it belongs to, then keeping a database (this could be a flat file even) in which you store metadata about the file, possibly including its original filename and ext, which user id owns it, etc, etc. The downside is that A3ADFEA34321897.jpg or whatever is not a useful filename if your database gets destroyed (this is why I try to store a userid in the filename), but it solves other problems.

                    Seems pretty simple to me: If there is some constraint against long filenames, you must either remove that constraint or you must shorten your file names.

                    I have implemented a quick and dirty string replace for both forward and backslashes. My manager doesn't want to implement a more robust system about storing metadata of files, etc. because so few people use that part of the system. It's just a "nice to have" for those people who have a project outline, business plan, wireframes, etc. and want to send that along with a quote request.

                    I am shortening the file names; I am truncating them at 100 characters. We have never had a file come close to that, but that was the number my manager came back to me with, so that's what I put in.
