Path Traversal in Plain English
September 23, 2014
It is sad that the most dangerous vulnerabilities on the internet tend to be the easiest to detect. Today, I’d like to talk about one such kind of vulnerability: path traversal. A path traversal vulnerability allows an attacker to access files on your web server to which they should not have access. They do this by tricking either the web server or the web application running on it into returning files that exist outside of the web root folder.
Let’s say you have a website running on http://www.example.com. Let’s also suppose that the web server you are using makes it super easy to add pages to your site; all you have to do is add them to the web root folder, /var/www, on the server’s filesystem and the rest is taken care of. If you add the file /var/www/products/table.html, then anyone can access that page by visiting http://www.example.com/products/table.html. This web server, unfortunately, is super old and vulnerable to path traversal. This allows an attacker to use special character sequences like ../, which on Unix-like systems refers to the parent of the current directory, to traverse up the directory chain and access files outside of /var/www, for example by requesting the relative path ../../configuration.yml.
When receiving this request, the web server appends the relative path specified by the user, ../../configuration.yml, to the directory that holds the web pages, /var/www/, to obtain the full path /var/www/../../configuration.yml. In Unix-like systems, each ../ cancels out the directory immediately to the left of it, so if we reduce the path to its simplified form, the final path becomes /configuration.yml.
And now, the hacker has just obtained sensitive information, maybe even your database credentials, and can use this information to steal your users’ information or cause further damage.
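The cancelling-out step is easy to reproduce by hand. Here is a minimal sketch of that reduction in Ruby; reduce_path is a hypothetical helper written for illustration, not the code any particular web server runs.

```ruby
# Hypothetical helper: reduce an absolute Unix-style path by letting each
# ".." cancel out the directory immediately to its left.
def reduce_path(path)
  stack = []
  path.split('/').each do |segment|
    case segment
    when '', '.'
      next                 # empty segments and "." change nothing
    when '..'
      stack.pop            # drop the previous directory (no-op at the root)
    else
      stack.push(segment)
    end
  end
  '/' + stack.join('/')
end

puts reduce_path('/var/www/products/table.html')     # /var/www/products/table.html
puts reduce_path('/var/www/../../configuration.yml') # /configuration.yml
```

On a Unix-like system, Ruby's built-in File.expand_path gives the same result for these absolute paths.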
The same type of situation could arise even if your web server is up-to-date and not vulnerable, yet you introduce a path traversal vulnerability in the application itself. Say your application is a little fancier than static pages now, and each page includes a link to download a PDF for more information, with the name of the PDF to serve carried in the link itself. Using the same ../ technique, an attacker can escape out of the directory containing the PDFs and access anything they want on the system.
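To make the application-level version concrete, here is a rough sketch of what such a download handler might compute. The directory PDF_DIR and the function pdf_path_for are assumptions made up for this example; your application's names and layout will differ.

```ruby
# Hypothetical directory holding the downloadable PDFs.
PDF_DIR = '/var/www/pdfs'

# Vulnerable on purpose: the user-supplied file name is joined onto the
# PDF directory and resolved without any validation.
def pdf_path_for(file_param)
  File.expand_path(File.join(PDF_DIR, file_param))
end

# On a Unix-like system:
puts pdf_path_for('table.pdf')            # /var/www/pdfs/table.pdf
puts pdf_path_for('../../../etc/passwd')  # /etc/passwd
```

The second call shows the problem: three ../ sequences are enough to climb out of /var/www/pdfs and land on a system file.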
Often, building a web application on a web server whose filesystem contains no sensitive files is not possible or too impractical. Tinfoil Security, for example, relies on the existence of many configuration files, not to mention the website’s source code itself, on the web server to run properly. Your application is likely to require similar configuration files on the filesystem in order to work. These files could contain the credentials for the site’s database, which an attacker can use to gain access to all of your customers’ information. Path traversal can also be used to reveal your source code, which could lead an attacker to discover even more sensitive information (if you store credentials in source code constants; you don’t do that, do you?) or other vulnerabilities. Worse yet, if the flaw lets an attacker reach system programs (such as a deletion program) and force them to run, the damage to your system could be irrecoverable.
Usually I skip straight to the solutions, but I think it is interesting to study some of the attempts we’ve seen in the past that try (and fail) to prevent path traversal.
Doing a search and removal for ../ in the given path.
The idea behind this technique is that if you prevent a user from using ../ in the path, they’ll never be able to traverse out of the /var/www directory and into more private directories. However, if the filter runs before the path is URL-decoded, it can easily be bypassed with URL encoding. The URL encoding of ../ is %2E%2E%2F, so a request that uses %2E%2E%2F in place of each ../ would break through this defense.
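Here is a small sketch of that bypass. It assumes a hypothetical server flow that strips ../ from the raw path first and URL-decodes afterwards; strip_dot_dot, resolve_naively, and FILTER_WEB_ROOT are names invented for this example.

```ruby
require 'uri'

FILTER_WEB_ROOT = '/var/www'

# Naive defense: remove every literal "../" from the raw request path.
def strip_dot_dot(raw_path)
  raw_path.gsub('../', '')
end

# Assumed (flawed) order of operations: filter first, decode second.
def resolve_naively(raw_path)
  filtered = strip_dot_dot(raw_path)
  decoded  = URI.decode_www_form_component(filtered)
  File.expand_path(File.join(FILTER_WEB_ROOT, decoded))
end

# On a Unix-like system:
puts resolve_naively('../../configuration.yml')
# /var/www/configuration.yml  (the filter caught the literal ../)
puts resolve_naively('%2E%2E%2F%2E%2E%2Fconfiguration.yml')
# /configuration.yml          (the encoded form slipped past it)
```

The filter never sees a literal ../ in the second request, so it has nothing to remove, and the decoding step then recreates the traversal sequence.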
Doing a check to make sure the path ends in .html or some other known extension.
This is also easy to bypass. If an attacker inserts a null byte right before the expected extension, the suffix check will succeed, but a filesystem API that treats a null byte as the end of the string will use the specified path only up to the null byte and stop reading there. Since an attacker can’t stick a raw null byte into a URL, they again rely on URL encoding (%00) to help them.
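The null byte trick can be sketched like this. Because the truncation happens in C-style filesystem APIs, the example only simulates it; extension_ok? and path_seen_by_c_api are hypothetical helpers, and the request path is made up for illustration.

```ruby
require 'uri'

ALLOWED_EXTENSION = '.html'

# Naive defense: accept any decoded path that ends in the expected extension.
def extension_ok?(decoded_path)
  decoded_path.end_with?(ALLOWED_EXTENSION)
end

# Simulation of a C-style API that stops reading at the first null byte.
def path_seen_by_c_api(decoded_path)
  decoded_path.split("\0", 2).first
end

request_path = '../../configuration.yml%00.html'
decoded = URI.decode_www_form_component(request_path)

puts extension_ok?(decoded)       # true
puts path_seen_by_c_api(decoded)  # ../../configuration.yml
```

As far as I know, current Ruby versions refuse to open a path containing a null byte (they raise an ArgumentError), so this particular trick mostly bites older runtimes and code that hands paths to lower-level APIs.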
That said, there are a lot of right ways to mitigate and help prevent path traversal. Each of these measures helps on its own, but I recommend doing as many of them as you can.
To prevent path traversal in your web server, update your web server and operating system to the latest versions available. This vulnerability has been known for a while, and it is likely your web server’s latest version is not vulnerable. You don’t want to be stuck running an old, vulnerable web server, because then none of the below solutions will help you.
When making calls to the filesystem, you should not rely on user input for any part of the path.
If you must somehow open paths depending on user input, you should have the user input be an index into a list of known, safe files. For example, ‘1’ could map to table.html, and ‘2’ could map to another known file.
Run your web server from a disk separate from your system disk (the disk that holds critical operating system files), and, if possible, don’t store any sensitive files on the web server disk.
Use filesystem permissions judiciously. Run the web server as a non-superuser whose permissions allow it to read only the files it needs to run. It should not be able to write to any files, since all user data should be stored in a separate database.
If you really, really need to allow users to specify a path, relative or otherwise, then normalize the path (this is how Java does it, and it works pretty well) and check that its prefix matches the directory they should be allowed to access.
> FILE_PREFIX = '/var/www/public/'
=> "/var/www/public/"
> user_input = '../../../etc/passwd'
=> "../../../etc/passwd"
> full_path = File.expand_path(FILE_PREFIX + user_input)  # normalize: resolve every ../
=> "/etc/passwd"
> is_valid = full_path.start_with?(FILE_PREFIX)  # still inside the allowed directory?
=> false
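For the index-based approach mentioned earlier, where user input only selects from a list of known files, a minimal sketch might look like the following. The table SAFE_FILES, the function path_for_index, the directory, and the second file name are all assumptions for illustration.

```ruby
# Hypothetical directory and allowlist; only these files can ever be served.
PRODUCT_DIR = '/var/www/products'

SAFE_FILES = {
  '1' => 'table.html',
  '2' => 'chair.html'   # made-up second entry, purely for illustration
}.freeze

# The user input is only ever used as a lookup key, never as part of a path.
def path_for_index(index)
  name = SAFE_FILES[index]
  return nil if name.nil?   # unknown index: serve nothing
  File.join(PRODUCT_DIR, name)
end

puts path_for_index('1')                            # /var/www/products/table.html
puts path_for_index('../../etc/passwd').inspect     # nil
```

Since the path is built entirely from values you wrote yourself, there is nothing for an attacker to traverse with.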
If you have an existing web application, and you want to know if you’re vulnerable to path traversal, checking is easy, but extremely tedious. For each parameter, URL, or cookie, you could insert relative paths to files known to exist on your web server’s machine, such as ../../../../../../etc/passwd on Unix-like machines. You’d also have to see if you’re vulnerable to tricks that defeat ../ removal (by using %2E%2E%2F) and file extension checking (by sticking a null byte, %00, before the valid extension).
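If you do want to try this by hand against your own site, the three kinds of probe strings described above can be generated with a few lines of Ruby. This is only a sketch: traversal_probes and its parameters are hypothetical, and it builds the strings without sending any requests.

```ruby
# File that is known to exist on Unix-like machines.
PROBE_TARGET = 'etc/passwd'

# Build one probe of each kind: plain ../, URL-encoded ../, and a
# null byte followed by an extension the application expects.
def traversal_probes(depth: 6, extension: '.html')
  plain     = ('../' * depth) + PROBE_TARGET
  encoded   = ('%2E%2E%2F' * depth) + PROBE_TARGET
  null_byte = plain + '%00' + extension
  [plain, encoded, null_byte]
end

traversal_probes.each { |probe| puts probe }
```

Each string would then be substituted into every parameter, URL segment, and cookie in turn, which is exactly where the tedium comes from.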
As you can imagine, this can get tedious and impractical, so I recommend using an automated web security scanner like Tinfoil Security. Tinfoil is designed specifically to handle vulnerability tests like this, and it will crawl your entire site looking for path traversal vulnerabilities, among many others.