Clickjacking in Plain English

In another blog post, I discussed a vulnerability called Cross-Site Request Forgery (CSRF) that allows an attacker to force a victim to perform actions on their logged-in sites inadvertently. The defense I recommended was called the Synchronizer Token Pattern, as the defense relies on using randomly-generated tokens to which attackers would not have access. Hackers are both clever and resourceful, however, and if your website is vulnerable to a technique called clickjacking, attackers can force victims of your site to perform unwanted actions even if proper CSRF protections are in place.

Clickjacking is a sneaky trick that relies on the ability to place a victim website in an iframe on another website. Through the clever use of an invisible iframe and precise overlaying, an attacker may be able to convince a user to click on a what looks like an innocent button on their site, but is actually a button or link on the invisible iframe. The invisible button in the iframe is typically positioned so it is exactly in the location where the user is likely to click, but some attackers take this further and use Javascript to have the invisible iframe follow the mouse pointer around the page, making the attack possible no matter where the user decides to click.

This attack is best understood visually. Let’s suppose, only for illustration, that eBay is vulnerable to clickjacking—meaning that an attacker could embed an eBay page in an iframe on their own site. A product listing page looks like the following.

An attacker can construct a website that contains a really enticing button, just begging to be clicked.

Then, they can precisely overlay the eBay iframe so that the “Buy It Now” button sits perfectly on top of the “FREE” button. The following uses a semi-transparent iframe for illustration purposes.

Lastly, they can make the iframe invisible, yet still technically “on top of” the site’s content, so when the user thinks they’ve just clicked on a link, they actually became proud owners of a new car.

The Danger

In the above example, an attacker can trick an unsuspecting user into buying a Ford. Even if the “Buy It Now” button is inside of a CSRF-protected form, the request goes through, the website has no way to distinguish an intentional click from one contrived by an attacker. This same attack can be used to cause the user harm, such as tricking them into deleting all of their emails by lining up the “Delete All” button under the “FREE” button. An attacker could also trick the user into opening themselves up to further attack by disabling defenses and enabling your webcam. If that’s not scary enough, or you already cover your webcam with black tape, an attacker can construct a precise combination of malicious text fields and buttons and trick a victim into resetting their password for another service.

The Answer

To prevent this attack, you need to prevent the ability for your site to be embedded as a frame in another site. There are two ways to solve this. In short, there is the established way and the new way. Since we’re in the middle of a transition phase—the new way is still very new and not widely supported—I recommend doing both, to prepare yourself for the inevitable day when the old way is deprecated.

The Established Way

The common way this is done is by adding an HTTP header that serves just this purpose: X-Frame-Options. As you may know, the X- prefix indicates that this is a non-standard header; however, this header is understood by all major browsers, including Chrome, Safari, Opera, Firefox, and Internet Explorer 8 and newer. Browsers refuse to load content in an iframe if that content includes this header and the header disallows it. The header has two possible values: DENY, and SAMEORIGIN. DENY is self-explanatory, and SAMEORIGIN allows the page to be embedded only within the same domain as the embedding site. There does exist a third option called ALLOW-FROM uri, but neither Chrome nor Safari plan to support it. Therefore, I do not recommend using ALLOW-FROM at all.

The New Way

There’s a new all-encompassing HTTP header that’s on its way to becoming a standard. It’s called Content-Security-Policy (CSP). The Content-Security-Policy header enforces whitelists for trusted content that is allowed to be loaded along with your webpage. You can use it to mitigate XSS by whitelisting JavaScript sources and disabling inline JS entirely, you can use it to mitigate unvalidated redirects with a whitelist of trusted redirect sources, and you can even have the browser report security violations back to your web application so you can fix your potential security holes. Even though it’s still in draft form, there is already a second version of CSP that includes a directive called frame-ancestors, a whitelist of trusted sources that are allowed to include your page in a frame.

If you want to prevent framing of your page entirely, use the CSP header like this:

Content-Security-Policy: frame-ancestors 'none'

To allow framing only within your own site, you can replace 'none' with 'self'. Alternatively, you can supply a list of allowed, trusted domains. Like all aspects of CSP, it’s very flexible.

Content-Security-Policy: frame-ancestors tinfoilsecurity.com example.com

You can use an automated web security scanner like Tinfoil to crawl your site and ensure that no page is vulnerable to clickjacking. Tinfoil provides the best web application security solution on the market, and it detects clickjacking vulnerabilities on your website along with many other types of web vulnerabilities.


Angel Irizarry

Angel Irizarry is the Software Samurai of Tinfoil Security, and a self-proclaimed software purist. All he needs to do his best work is a plain Linux machine with Git and Emacs installed. He loves everything about front-end development, like making pages interactive and super fast, even if that means digging in and optimizing some SQL. When he's not writing code, which isn't very often, you'll find him on his iPad scouring his RSS feeds for news and rumors of cool new gadgets.

Tags: plain english clickjacking


The 'Shellshock' Bash Bug in Plain English

I’ve seen a ton of scary articles about a newly discovered Bash vulnerability that has been affectionately named Shellshock by the security community. People are saying Shellshock is bigger than Heartbleed, and that it can affect not just millions of web servers, but also routers, smartphones, and even light bulbs.

 

These articles all follow the same basic template. They say that there is a bug in Bash that can allow a remote attacker to execute any code they want on a vulnerable machine. Then they say that millions of computers run Bash, and as a result we are all doomed. The ones that lean more on the technical side present you with a snippet of Bash code that, when you run it, prints out something menacing like “You’ve been hacked!” if your version of Bash is vulnerable. As a developer, it’s been frustrating that these articles, in an effort to not confuse and lose the attention of their reader base, have shied away from going into the technical details of the bug. I think the details of Shellshock are instructional, and they’re way more interesting to read about than statistics on the pervasiveness of Bash. Yes, you should all patch your Bash right away. But let’s talk about the bug itself.

For fun, here’s that line of code that you can run to see if your version of Bash is vulnerable. Bash comes pre-installed in almost all Linux distributions, and it is the default shell in OS X Terminal. Windows users are safe, unless you manually installed Bash using Cygwin.

env foo='() { :; }; echo "Vulnerable!"' bash -c ':'

If your Bash version is vulnerable to Shellshock, it will print “Vulnerable!”, but why? That’s definitely not the intended behavior of this code. We are setting an environment variable foo to be the string '() { :; }; echo "Vulnerable!"' and then invoking a sub-shell that, in this case, does nothing. The end result should be that nothing is printed on the screen.

The problem stems from the funky way that Bash stores functions in environment variables. Let’s say you open up Bash and define a simple “Hello world” function.

$ function foo {
>   echo "Hello world";
> }
$

You can run the function all you want in this shell, but you can’t call it in programs that run inside it, which are also known as sub-shells.

$ foo
Hello world
$ bash -c 'foo'
bash: foo: command not found
‚Äč$

Let’s say you really want to run a sub-shell that uses the function foo. The standard way to do this is to use the export command to turn foo into an environment variable. The -f flag tells export that you are referring to a function.

$ export -f foo
$

You can also use the export command to make plain-old string variables into environment variables.

$ export bar="I am a string"
$

Using the env command, you can see all of the current environment variables as a list where each element is of the form <variable>=<value>. For functions, Bash uses special characters to distinguish it from the rest of the variables.

$ env
…
foo=() {  echo "hello world"
}
bar=I am a string
…

I’ve highlighted the special function characters so you can see them. This is where the vulnerability comes in. All Bash environment variables are strings, even when they represent functions. Bash uses the characters () { to distinguish a function string from a regular string. When a sub-shell is invoked, a copy of each environment variable is created and made available to the sub-shell. When Bash gets to an environment variable that starts with () {, it realizes this is a function string and evaluates the line in order to turn it into a real function. Unfortunately, up until a few days ago, Bash would just evaluate the entire string as code, blindly, with the same user permissions as given to the sub-shell, and without actually checking if the string is only a function definition. Therefore, it would not only run the function definition, but potentially any code, malicious or otherwise, that followed it. Let’s come back to the original one-line test.

env foo='() { :; }; echo "Vulnerable!"' bash -c ':'

Here, I’m using the () { characters to denote a function definition. However, I’m also ending the function definition and following it with more code. When I invoke the sub-shell using the bash command, the string inside of foo gets evaluated, and the echo is executed!

It gets worse. Exploiting this vulnerability on the web is shockingly (pun-intended) easy. Many web servers invoke Bash scripts in response to requests. One of the many ways that they can do this is by using the Common Gateway Interface, or CGI. It’s common for the web server to pass HTTP request information into the shell script, and the common way to do this is with environment variables. Things like the user agent string, cookies, and the GET parameters are stored in environment variables before running the sub-shell. Since users have access to all of these pieces of information, a malicious user could change their user agent string to be, say, '() { :; }; <malicious code>' and can force the web server to run any code they want.

Since this vulnerability was announced to the public last week, the Bash source code has gotten lots of new attention from security researchers. Many similar bugs in Bash have popped up, most of them similar to the original, and all allow unintended code execution. The latest version of Bash, version 4.3, has been patched three times in the last week to fix the discovered Shellshock variants, and there will likely be more variants discovered in the coming weeks. The best thing you can do is update Bash on all of your machines, even if they aren’t running network services. In addition, we’ll be updating the Tinfoil scanner in the next few days to scan for all of the known variants of Shellshock on your website. Tinfoil includes a free 30-day trial once you sign up, and in addition to the Shellshock update coming shortly, it scans for many more common web vulnerabilities.


Angel Irizarry

Angel Irizarry is the Software Samurai of Tinfoil Security, and a self-proclaimed software purist. All he needs to do his best work is a plain Linux machine with Git and Emacs installed. He loves everything about front-end development, like making pages interactive and super fast, even if that means digging in and optimizing some SQL. When he's not writing code, which isn't very often, you'll find him on his iPad scouring his RSS feeds for news and rumors of cool new gadgets.

Tags: plain english shellshock


Path Traversal in Plain English

It is sad that the most dangerous vulnerabilities on the internet tend to be the easiest to detect. Today, I’d like to talk about one such kind of vulnerability: path traversal. A path traversal vulnerability allows an attacker to access files on your web server to which they should not have access. They do this by by tricking either the web server or the web application running on it into returning files that exist outside of the web root folder.

Let’s say you have a website running on http://www.example.com. Let’s also suppose that the web server you are using makes it super easy to add pages to your site; all you have to do is add them to the web root folder, /var/www, on the server’s filesystem and the rest is taken care of. If you add the file /var/www/products/table.html, then that page can be accessed by anyone if they visit http://example.com/products/table.html. This web server, unfortunately, is super old and vulnerable to path traversal. This allows an attacker to use special character sequences, like ../, which in Unix directories points to its parent directory, to traverse up the directory chain and access files outside of /var/www, like this.

http://www.example.com/../../private/configuration.yml

When receiving this request, the web server appends the relative path specified by the user, ../../configuration.yml, to the directory that holds the web pages, /var/www/, to obtain the full path /var/www/../../configuration.yml. In Unix-like systems, each ../ cancels out the directory immediately to the left of it, so if we reduce the path to its simplified form, the final path becomes /private/configuration.yml.

And now, the hacker has just obtained sensitive information, maybe even your database credentials, and can use this information to steal your users’ information or cause further damage.

The same type of situation could arise even if your web server is up-to-date and not vulnerable, yet you introduce a path traversal vulnerability in the application itself. Say your application is a little fancier than static pages now, and each page includes a link to download a PDF for more information. These PDF links look something like this:

http://www.example.com/download?file=table.pdf

Using the same ../ technique, an attacker can escape out of the directory containing the PDFs and access anything they want on the system.

http://www.example.com/download?file=../../private/configuration.yml

The Danger

Often, building a web application on a web server whose filesystem contains no sensitive files is not possible or too impractical. Tinfoil Security, for example, relies on the existence of many configuration files, not to mention the website’s source code itself, on the web server to run properly. Your application is likely to require the existence of similar configuration files filesystem in order to work. These files could contain the credentials for the site’s database, which an attacker can use to gain access to all of your customers’ information. Path traversal can also be used to reveal your source code, which could lead an attacker to discover even more sensitive information (if you store credentials in source code constants. You don’t do that, do you?) or other vulnerabilities. Worse yet, since attackers have full access to your filesystem, they can access system programs (such as a deletion program) and force them to run, causing potentially irrecoverable damage on your system.

Lazy Solutions

Usually I skip straight to the solutions, but I think it is interesting to study some of the attempts we’ve seen in the past that try (and fail) to prevent path traversal.

  • Doing a search and removal for ../ in the given path.

    The idea behind this technique is that if you prevent a user from using ../ in the path, they’ll never be able to traverse out of the /var/www directory and into more private directories. However, this can easily be bypassed with URL encoding. The URL encoding for ../ is %2E%2E%2F, so the following would break through this defense.

    http://www.example.com/%2E%2E%2Fprivate/configuration.yml
  • Doing a check to make sure the path ends in .html or some other known extension. This is also easy to bypass. If you stick a null byte right before also inserting the expected extension, the suffix check will succeed, but the file system will use the specified path only up to the null byte and stop reading there. Since an attacker can’t stick a regular null byte into a URL, they again rely on URL encoding to help them.

    http://www.example.com/../../private/configuration.yml%00.html

The Answer

That said, there are a lot of right ways to mitigate and help prevent path traversal. Any of these solutions work in isolation, but I recommend doing as many of these as you can.

  • To prevent path traversal in your web server, update your web server and operating system to the latest versions available. This vulnerability has been known for a while, and it is likely your web server’s latest version is not vulnerable. You don’t want to be stuck running an old, vulnerable web server, because then none of the below solutions will help you.

  • When making calls to the filesystem, you should not rely on user input for any part of the path.

  • If you must somehow open paths depending on user input, you should have the user input be an index into one of a list of known, safe files. For example, ‘1’ could map to table.html, and ‘2’ could map to chair.html.

  • Run your web server from a separate disk from your system disk (the disk that holds critical operating system files), and, if possible, don’t store any sensitive files in the web server disk.

  • Use filesystem permissions judiciously. Use a non-superuser to run the web server whose permissions only allow them to read only the files it needs to run. It should not be able to write to any files, since all user data should be stored in a separate database.

  • If you really, really need to allow users to specify a path, relative or otherwise, then normalize the path (this is how Java does it, and it works pretty well) and check that its prefix matches the directory they should be allowed to access.

    > FILE_PREFIX = '/var/www/public/'
    => "/var/www/public/"
    > user_input = '../../../etc/passwd'
    => "../../../etc/passwd"
    > full_path = normalize(FILE_PREFIX + user_input)
    => "/etc/passwd"
    > is_valid = full_path.start_with?(FILE_PREFIX)
    => false
    

If you have an existing web application, and you want to know if you’re vulnerable to path traversal, checking is easy, but extremely tedious. For each parameter, URL, or cookie, you could insert a relative paths to files known to exist on your web server’s machine, such as ../../../../../../etc/passwd on Unix-like machines. You’d also have to see if you’re vulnerable to tricks such as ../ removal (by using %2E%2E%2F) and file extension checking (by sticking a null byte, %00, before the inserting the valid extension).

As you can imagine, this can get tedious and impractical, so I recommend using an automated web security scanner like Tinfoil Security. Tinfoil is designed specifically to handle vulnerability tests like this, and it will crawl your entire site looking for path traversal vulnerabilities, among many others.


Angel Irizarry

Angel Irizarry is the Software Samurai of Tinfoil Security, and a self-proclaimed software purist. All he needs to do his best work is a plain Linux machine with Git and Emacs installed. He loves everything about front-end development, like making pages interactive and super fast, even if that means digging in and optimizing some SQL. When he's not writing code, which isn't very often, you'll find him on his iPad scouring his RSS feeds for news and rumors of cool new gadgets.

Tags: plain english path traversal


Cross-Site Request Forgery (CSRF) in Plain English

Welcome back to my weekly series where I explain different types of website attacks in plain English. So far, I’ve tackled two of the most common vulnerabilities on the web today: SQL injection and Cross-Site Scripting. Today, I’d like to talk about another common vulnerability that the Tinfoil scanner finds all too often: Cross-Site Request Forgery.

Cross-Site Request Forgery (CSRF or XSRF) is another example of how the security industry is unmatched in its ability to come up with scary names. The attack itself is quite simple. A CSRF vulnerability allows an attacker to force a logged-in user to perform an important action without their consent or knowledge. It is the digital equivalent of an attacker forging the signature of a victim on an important document. Furthermore, the attack leaves no evidence behind, since a forged request contains all of the information and comes from the same IP address as a real request from a victim.

The Danger

The most important actions that one can perform on a website also tend to be the ones that require one to log in to the website. Banks need to be able to identify a user to know the bank account from which to withdraw. E-commerce sites need a user’s identity so she can be associated with a credit card number, billing address, and shopping cart. Video-sharing sites need to be able to associate unique upvotes with users. Using CSRF, an attacker could force a victim to send the attacker some money, or buy something from them, or upvote their videos.

As an example, my banking website, example.com, does not protect itself against CSRF. You, an unsuspecting example.com user, also happened to be logged in to example.com. Now, malicious user Mallory sends you (and millions of other example.com users, of course) an HTML e-mail including the following tag.

<img src="https://www.example.com/transfer?amount=1000&amp;destination=mallory">

If you have a webmail client that loads images automatically, the transfer request will be made from your browser using your IP address and your example.com session cookies, exactly as if you made the request yourself. My bank website, therefore, treats this like a legitimate request and sends $1000 from your account to Mallory’s account. All evidence suggests you legitimately made this transaction from your logged-in browser.

If all actions on my site are vulnerable to CSRF, this could even lead to further damage. If the attacker can forge a password reset request, or an e-mail change request, the attacker could subsequently gain full control of the victim’s account. If the victim is an administrative user, the entire website would be under the attacker’s control.

The Answer

There are many ways to protect your website from CSRF, but in this post I will only discuss the most common and most effective solution. It is the solution used by many popular web frameworks, including Ruby on Rails. It’s called the Synchronizer Token Pattern.

For each user session, logged-in or otherwise, the Ruby on Rails server generates a unique token and stores that in the session cookie, which is additionally digitally signed server-side to detect tampering. The server then places this token as a hidden field into every form on every page that it renders. If a user submits the form normally, say, by clicking the “Submit” button, the token will be sent to the server as a form parameter, as well as in the cookie. The server then checks to see that the token in the cookie matches the token in the form parameter. If they don’t match, the request is assumed to be forged, the action is not performed, and the user is forcibly logged out. This only works on POST requests, so it is also up to you to ensure that all of the important actions that can be performed on your site are POST requests.

With this protection in place, if a malicious user attempts to use the same <img> tag trick from above to forge an important action, it won’t work for several reasons. Firstly, the request would be made via GET instead of POST, and the application just won’t accept it because Rails was told that this important action is only to be performed over POST. The attacker can, however, get around that with some clever uses of Javascript.

<img src="https://www.example.com/transfer?amount=1000&amp;destination=mallory" onload="...">

Using an onload handler, the attacker can dynamically create a form and submit it via POST. However, this runs into our main CSRF protection. Because the attacker does not know the secret token that needs to be sent with the request, the request will fail. Unless the website is vulnerable to other kinds of attacks, such as Cross-Site Scripting, the attacker has no way to obtain the secret token, and CSRF is prevented.

As always, if you have any questions about CSRF or other vulnerabilities on your website, feel free to get in touch with us. If you are looking for an automated way to detect CSRF vulnerabilities on your website, check out Tinfoil Security. Tinfoil provides the best web application security solution on the market, and it detects CSRF vulnerabilities on your website along with many other types of web vulnerabilities.


Angel Irizarry

Angel Irizarry is the Software Samurai of Tinfoil Security, and a self-proclaimed software purist. All he needs to do his best work is a plain Linux machine with Git and Emacs installed. He loves everything about front-end development, like making pages interactive and super fast, even if that means digging in and optimizing some SQL. When he's not writing code, which isn't very often, you'll find him on his iPad scouring his RSS feeds for news and rumors of cool new gadgets.

Tags: plain english csrf xsrf


Cross-Site Scripting (XSS) in Plain English

Welcome to my weekly series where I explain different types of website attacks in plain English, steering clear of heavy security jargon commonly found in articles of this nature. Today, I’d like to tackle Cross-Site Scripting, more commonly known by the much scarier acronym XSS.

Modern websites are far more complex than the static pages that used to rule the internet. These days, it is more accurate call them web applications, due to the growing trend of replacing server-side logic with client-side Javascript. While Javascript as a programming language has evolved over the years, the ways that Javascript code is meant to be added to a web page have not. This is why we can still use <script> and </script> tags inside of HTML documents and put any Javascript we want inside of them, and this is the main reason why XSS is still rampant today.

XSS allows malicious users to inject client-side code (mainly Javascript) into web pages to be run by other unsuspecting users. It may be easier to understand with an example. Suppose I’m a web developer creating a hot new search engine: example.com. At its basic level, the search engine requires two pages. The first page, http://www.example.com, only contains a search box.

<form action="/search" method="get">
  <input type="text" name="query" />
</form>

The second page contains the list of search results. As a friendly reminder to the user, it also includes their search term. The server-side code that generates that piece of HTML, here implemented using Sinatra, may look something like this.

require 'sinatra'

get '/search' do
  html = ""
  # ...
  html += "Here are the results found for: #{params[:query]}"
  # ...
  return html
end

The Danger

Using typical string interpolation here presents a problem to the user’s browser because it cannot differentiate between HTML intended by my code and any HTML entities that may exist inside the query parameter. As a result, it is easy for an attacker to exploit this by typing the following into the search box:

<script>alert('hacked!');</script>

Our original intent was to remind the user of what her search term was, so we want everything inside the paragraph tags to be treated as plain text:

...
<p>Here are the results found for: <script>alert('hacked!');</script></p>
...

Unfortunately, the script tags here get parsed just like any other script tag, and the Javascript code between them gets executed. The browser does not know the difference between the script tag inserted via user input and a script tag inserted by us.

...
<p>Here are the results found for: <script>alert('hacked!');</script></p>
...

At this point you might be thinking, “So what? Javascript is client-side, so the attacker only managed to accomplish hacking himself.” Unfortunately, this is not the whole story. At this point the attacker’s URL bar reads http://www.example.com?query=<script>alert('hacked!');</script>, and she could easily copy this URL and paste it somewhere in an effort to get potential victims to click on it. She could post it to public forums, send e-mails to example.com users that include this link (with a tempting title like “Check out these cat pictures!”) or embed this page on her own site using an invisible iframe. In any case, the malicious Javascript code then runs on the unsuspecting victim’s computer. Notice how this differs from another popular attack, SQL injection, in that XSS is aimed at users of the website, not the website itself.

The worst part is that because Javascript is designed to be a powerful tool to manipulate a web page, this kind of attack can be devastating. An attacker can use XSS to steal users’ cookies and use those to impersonate them at example.com, steal their credit card information, or even trick them into installing and downloading malware. Anything that HTML and Javascript can do, the attacker can do.

The Answer

The main defense against XSS is to escape all user input. Escaping user input is the technique of replacing certain characters with other equivalent characters to remove ambiguity for a browser’s parsers. Doing this properly is a solid defense against XSS, because escaped characters signal to a parser that they are to be treated as text and never as code. To do this properly, we have to identify which characters are safe to display without being mistaken for characters can switch out of the current context. Every character not in this safe list needs to be escaped, so that the browser does not treat them as executable code.

Unfortunately, there is no single tool or algorithm to do this, due to the variety of contexts in which one could insert user input, and the different requirements each of those contexts have for properly escaping text. Typically, however, modern web programming frameworks have libraries devoted to escaping user input in a variety of contexts. I recommend strictly using those libraries and not implementing your own. If you’re curious about how these libraries work, in the following sections I discuss the most common contexts in which you would want to insert user input, and the proper ways to use escaping to prevent XSS.

Between Opening and Closing HTML Content Tag

Inside standard content elements is the safest place to insert user input. HTML content elements include tags such as <p>, <div>, and <li>, essentially any element meant to contain other content elements or plain text. In this case, we want to use HTML escaping to ensure user input is never mistaken for an HTML tag or attribute. This means that we have to convert certain dangerous characters into the form &X;, where X is either a number (preceded by a #) or, in certain cases, a name. These constructs are called HTML entities, and they tell the HTML parser that they should be interpreted and displayed as text, and never treated as HTML tags. Below is a complete list of the characters that need to be escaped.

Dangerous Character Named HTML Entity Numerical HTML Entity (in hex)
& &amp; &#38;
< &lt; &#60;
> &gt; &#62;
&quot; &#34;
'   &#39;

 

In our search engine example above, we wanted to place user input inside of <p> tags, even if the input is an attempt at XSS. This can safely be accomplished by using the HTML escaping technique. The raw HTML with proper escaping looks like this:

...
<p>Here are the results found for: &lt;script&gt;alert(&#39;hacked!&#39;);&lt;/script&gt;</p>
...
HTML Attribute Values

While it is possible to allow user input in HTML tag attributes, it is significantly more dangerous than allowing user input between content tags. Because HTML attribute values don’t have to be quoted, there are many more ways for attackers to escape out of them and inject malicious code. In the following contrived example, we construct a page uses a get parameter to set the width of an image.

require 'sinatra'
get '/image' do
  html = ""
  # ...
  html += "<img src=image.jpg height=300 width=#{params[:w]}>"
  # ...
  return html
end

Here, if an attacker constructs the URL http://example.com/image?w=400%20onload=alert('hacked!'), the resulting HTML will cause the malicious Javascript to run with the image is loaded.

...
<img src=image.jpg height=300 width=400 onload=alert('hacked!')>
...

To ensure safety, we have to escape all non-alphanumeric characters in the user input using HTML entities, not just the five characters listed in the previous table. A complete list HTML entities can be found here. In the above example, properly escaped user input would look like this:

...
<img src=image.jpg height=300 width=400&#32;onload&#61;alert&#40;&#39;hacked&#33;&#39;&#41;>
...
JSON String Values

If you want to allow user input to be embedded in your JavaScript code, the only safe place is inside of a quoted string, either as a regular string variable or within a JSON string value. Even here, it is still dangerous to allow user input to be inserted unescaped, as the example below illustrates.

<script>
var string = "</script><script>alert('hacked!');"
</script>

Even though the red </script> is inside of a Javascript string, it closes the Javascript context and starts a new one. This is because browsers have their HTML parsers run before the Javascript parsers, so HTML elements get highest priority. Even my text editor gets this wrong.

The best solution here is to escape every non-alphanumeric character using unicode escaping. The following table has some examples.

Dangerous Character Unicode escape
< \u003C
> \u003E
" \u0022

 

There are other dangerous places to allow user input to be inserted, such as CSS property values and URL get parameters, but the solutions for all of them are the same: always escape user input in every context. Rather than trying to remember all of the escaping rules for each context, it’s much safer to use a library for the job. Read the documentation of your favorite web framework and use its built-in tools to ensure you don’t make any mistakes.

As you’ve seen in the examples above, it is all too easy to expose your site to XSS, and these types of vulnerabilities can be incredibly hard to detect for even trained human eyes. As an added level of security, I highly recommend using an automated tool to scan for and detect XSS vulnerabilities in your site. Tinfoil provides the best web application security solution on the market, and it detects XSS vulnerabilities on your website along with many other types of web vulnerabilities.


Angel Irizarry

Angel Irizarry is the Software Samurai of Tinfoil Security, and a self-proclaimed software purist. All he needs to do his best work is a plain Linux machine with Git and Emacs installed. He loves everything about front-end development, like making pages interactive and super fast, even if that means digging in and optimizing some SQL. When he's not writing code, which isn't very often, you'll find him on his iPad scouring his RSS feeds for news and rumors of cool new gadgets.

Tags: XSS plain english


Tinfoil Security Blog

Tinfoil Security provides the simplest security solution. With Tinfoil Security, your site is routinely monitored and checked for vulnerabilities using a scanner that's constantly updated. Using the same techniques as malicious hackers, we systematically test all the access points, instantly notifying you when there's a threat and giving you step-by-step instructions, tailored to your software stack, to eliminate it. You have a lot to manage; let us manage your website's security.