API Security Scanning: How is it done the right way?

We’re excited to announce our API Security Scanner has been officially launched and is now publicly available! It’s a much needed tool we’ve been building and rigorously testing for the past year and a half, and we can’t wait to start sharing it with the world. Before we go into the details on how the scanner works, it’s important to start by discussing the problem of API security in general, and why such a tool is needed in the first place.

First, when we say API, it’s worth clarifying that we’re talking about web-based APIs such as REST APIs, web services, mobile-backend APIs, and the APIs that power IoT devices. We are not targeting lower-level APIs like libraries or application binary interfaces. This is an important distinction to make, because the sorts of security vulnerabilities that affect web-based APIs are going to mirror the same categories of vulnerabilities we’ve spent the past seven years defending against, with our web application security scanner.

Just as web applications can be vulnerable to issues like Cross-Site Scripting (XSS) or SQL injection, APIs can also fall prey to similar attacks. As always, it isn’t quite that simple, and the nuances of how these vulnerabilities are actually exploited and detected can vary dramatically between the two types of applications. In the case of XSS, for example, the difference between a vulnerable API and a secure API depends not only on the presence of attacker controlled sinks in an HTTP response, but also on the content-types of the responses in question, how those responses are consumed by a client, and whether sufficient content-type sniffing mitigations have been enforced.

Also worthy of consideration is how APIs handle authentication, especially as compared to web applications. In the case of web applications, authentication is more or less a solved problem. For the most part, the user visits a page with a login form, enters their credentials, submits the form, and gets back a cookie. There are minor variations to this -- sometimes people store the session in local storage or session storage, for example -- but for the most part, every web application authenticates in pretty much the same way. APIs, on the other hand? Not so much. At an absolute minimum, you need to account for protocols like OAuth2 (and all of its associated grant types!), OpenID Connect, and increasingly, JSON Web Tokens (JWT). Beyond that, it’s also common to layer on other security requirements, like client certificates, or signed requests. Existing web application security scanners have no concept of any of these standards, and even if you managed to get a scanner to authenticate to your API, you’re not going to have much luck coercing it into properly signing your requests.

Lastly, unlike web applications, APIs aren’t discoverable. Unless you’re one of the dozen companies in the world with a HATEOAS based API, it simply isn’t possible for a security scanner to load up your API, follow all of the links, and automatically discover all of the endpoints in that API, let alone the parameters expected by those endpoints, and any constraints required of them. Without some way of programmatically acquiring this information, API security scanning simply can’t be automated in the same way that web scanning has been.

These are all solvable problems, but they mean that a dynamic security scanner needs to be built from the ground up to understand APIs, how APIs are used, and more importantly, how APIs are attacked. This means that simply repurposing an existing web-application security scanner won’t be sufficient (which is what most other solutions currently do). With this point in mind, our API scanner is an entirely new scanning engine (written in Elixir!), built off of everything we’ve learned over the past seven years of attacking web applications.

To handle the previously mentioned authentication issues, we’ve devised a clever system using something we like to call authenticators. Essentially, we’ve distilled API authentication down to its primitives: whether that’s as simple as adding a header or a parameter to a request, or performing an entire OAuth2 handshake and storing the received bearer token for later. From there, our scanner is able to chain together all of these authenticators together, incrementally transforming unauthenticated requests into authenticated requests. Furthermore, because our scanner has such a nuanced understanding of all the discrete steps of an authentication workflow, it becomes possible to detect when any of those steps have failed, and also when any of them aren’t being honored by the server. This uniquely enables us to fuzz the individual steps of an authentication flow, providing us a powerful tool for determining authorization and authentication bypasses.

To address the discoverability issues inherent with APIs, we approached the problem the same way humans do: with documentation! As a developer looking to use a third-party API, your first stop is always the documentation for that API. Historically, this documentation has almost always been presented as unstructured text, and in a form not conducive to being parsed by software. With standards like Swagger, RAML, and API Blueprint becoming more widespread over recent years, the idea of programmatically specifying an API’s behavior is becoming increasingly popular, and this offers an exciting opportunity for API security scanning. In our experience, we’ve found that Swagger in particular is beginning to win out as the de facto standard for API documentation, and so we’ve designed the first version of our API scanner to ingest Swagger documents, and use them to build a map of an API for scanning.

Reading in documentation like this nicely solves the issue of being unable to crawl an API, but it also allows us to scan APIs with a level of intelligence that black-box dynamic web application scanning has never had access to. In most variants of web application scanning, the scanning engine crawls the application to determine all available input vectors: forms, links, buttons, really anything that might trigger some login on the client or server. From there, these inputs are fuzzed to look for security vulnerabilities. The issue, then, is that because this is entirely black box scanning, it becomes difficult for a scanner to ensure it is generating good payloads to send to the web application. By this we mean payloads that, while still being malicious, conform to the format and structure expected by the application. We could send a server every variation of SQL we can think of, but if the server is blocking our requests because they fail the first level of input validation, then we’re never going to make any progress. Our web application scanner actually addresses this very problem by examining the context in which parameters are used, in order to infer their expected structure. By sidestepping this problem entirely with API scanning, we’ve found that we’re able to more easily achieve an even higher level of coverage typically reserved for highly-skilled, manual penetration testing.

By parsing Swagger documentation, though, this problem can be cleverly avoided. Now, in addition to knowing the endpoints to scan, and the parameters on those endpoints, we’re also aware of the types of those parameters and whatever other constraints are specified in the Swagger documentation. It becomes possible for us to know that a given parameter needs to be a string, resembling an email address, of a specific length, and possibly excluding certain characters. Given all of this information, we can begin intelligently generating attack payloads that conform to various subsets of these constraints, allowing us to audit for holes in the server’s intended validation logic, while also giving a suitable jumping off point for intentionally trying to bypass that validation logic with cleverly constructed payloads.

It’s been a long road to get to this point, but we’re proud to have finally built an API security scanner that approaches the problem from a strong foundation, and with careful thought put into what makes API security scanning difficult. We have a lot of enhancements to make, but what we’ve been shipping to customers over the past year has already filled an important gap in their application security program -- especially with our ever present focus on integrating security scanning into the DevOps process. Just as with our web application scanner, our API scanner is designed to be integrated directly into the software development life-cycle, so that developers can find and fix vulnerabilities as early as possible, and often without waiting for a dedicated security engineer to get involved. We facilitate this with first-party integrations for tools like Jenkins, and also by providing a REST API that can drive the entire scanning and reporting process, from start to finish.

Security is much too important to be dealt with as an afterthought. That’s why we always strive to enable our customers push their security up the stack, so they can empower their developers to find and fix vulnerabilities before they become a problem.

Interested in setting up a demo to see for yourself? Find a time that works for you, and schedule a demo right here.

Quinn Wilton

Quinn Wilton is the Grand Magistrate of Security at Tinfoil Security, and the company's resident programming language theorist. When she isn't coding in a functional language like Elixir, she's probably hacking on an interpreter for an esolang of her own, or playing around with dependent types in Idris. Security is always at the forefront of her thoughts, and she enjoys building tools which make it easy for other engineers to write secure code. Her love for security is matched only by her love for bad movies - and does she ever love bad movies.

Tags: XSS security

Cross-Site Scripting (XSS) in Plain English

Welcome to my weekly series where I explain different types of website attacks in plain English, steering clear of heavy security jargon commonly found in articles of this nature. Today, I’d like to tackle Cross-Site Scripting, more commonly known by the much scarier acronym XSS.

Modern websites are far more complex than the static pages that used to rule the internet. These days, it is more accurate call them web applications, due to the growing trend of replacing server-side logic with client-side Javascript. While Javascript as a programming language has evolved over the years, the ways that Javascript code is meant to be added to a web page have not. This is why we can still use <script> and </script> tags inside of HTML documents and put any Javascript we want inside of them, and this is the main reason why XSS is still rampant today.

XSS allows malicious users to inject client-side code (mainly Javascript) into web pages to be run by other unsuspecting users. It may be easier to understand with an example. Suppose I’m a web developer creating a hot new search engine: example.com. At its basic level, the search engine requires two pages. The first page, http://www.example.com, only contains a search box.

<form action="/search" method="get">
  <input type="text" name="query" />

The second page contains the list of search results. As a friendly reminder to the user, it also includes their search term. The server-side code that generates that piece of HTML, here implemented using Sinatra, may look something like this.

require 'sinatra'

get '/search' do
  html = ""
  # ...
  html += "Here are the results found for: #{params[:query]}"
  # ...
  return html

The Danger

Using typical string interpolation here presents a problem to the user’s browser because it cannot differentiate between HTML intended by my code and any HTML entities that may exist inside the query parameter. As a result, it is easy for an attacker to exploit this by typing the following into the search box:


Our original intent was to remind the user of what her search term was, so we want everything inside the paragraph tags to be treated as plain text:

<p>Here are the results found for: <script>alert('hacked!');</script></p>

Unfortunately, the script tags here get parsed just like any other script tag, and the Javascript code between them gets executed. The browser does not know the difference between the script tag inserted via user input and a script tag inserted by us.

<p>Here are the results found for: <script>alert('hacked!');</script></p>

At this point you might be thinking, “So what? Javascript is client-side, so the attacker only managed to accomplish hacking himself.” Unfortunately, this is not the whole story. At this point the attacker’s URL bar reads http://www.example.com?query=<script>alert('hacked!');</script>, and she could easily copy this URL and paste it somewhere in an effort to get potential victims to click on it. She could post it to public forums, send e-mails to example.com users that include this link (with a tempting title like “Check out these cat pictures!”) or embed this page on her own site using an invisible iframe. In any case, the malicious Javascript code then runs on the unsuspecting victim’s computer. Notice how this differs from another popular attack, SQL injection, in that XSS is aimed at users of the website, not the website itself.

The worst part is that because Javascript is designed to be a powerful tool to manipulate a web page, this kind of attack can be devastating. An attacker can use XSS to steal users’ cookies and use those to impersonate them at example.com, steal their credit card information, or even trick them into installing and downloading malware. Anything that HTML and Javascript can do, the attacker can do.

The Answer

The main defense against XSS is to escape all user input. Escaping user input is the technique of replacing certain characters with other equivalent characters to remove ambiguity for a browser’s parsers. Doing this properly is a solid defense against XSS, because escaped characters signal to a parser that they are to be treated as text and never as code. To do this properly, we have to identify which characters are safe to display without being mistaken for characters can switch out of the current context. Every character not in this safe list needs to be escaped, so that the browser does not treat them as executable code.

Unfortunately, there is no single tool or algorithm to do this, due to the variety of contexts in which one could insert user input, and the different requirements each of those contexts have for properly escaping text. Typically, however, modern web programming frameworks have libraries devoted to escaping user input in a variety of contexts. I recommend strictly using those libraries and not implementing your own. If you’re curious about how these libraries work, in the following sections I discuss the most common contexts in which you would want to insert user input, and the proper ways to use escaping to prevent XSS.

Between Opening and Closing HTML Content Tag

Inside standard content elements is the safest place to insert user input. HTML content elements include tags such as <p>, <div>, and <li>, essentially any element meant to contain other content elements or plain text. In this case, we want to use HTML escaping to ensure user input is never mistaken for an HTML tag or attribute. This means that we have to convert certain dangerous characters into the form &X;, where X is either a number (preceded by a #) or, in certain cases, a name. These constructs are called HTML entities, and they tell the HTML parser that they should be interpreted and displayed as text, and never treated as HTML tags. Below is a complete list of the characters that need to be escaped.

Dangerous Character Named HTML Entity Numerical HTML Entity (in hex)
& &amp; &#38;
< &lt; &#60;
> &gt; &#62;
&quot; &#34;
'   &#39;


In our search engine example above, we wanted to place user input inside of <p> tags, even if the input is an attempt at XSS. This can safely be accomplished by using the HTML escaping technique. The raw HTML with proper escaping looks like this:

<p>Here are the results found for: &lt;script&gt;alert(&#39;hacked!&#39;);&lt;/script&gt;</p>
HTML Attribute Values

While it is possible to allow user input in HTML tag attributes, it is significantly more dangerous than allowing user input between content tags. Because HTML attribute values don’t have to be quoted, there are many more ways for attackers to escape out of them and inject malicious code. In the following contrived example, we construct a page uses a get parameter to set the width of an image.

require 'sinatra'
get '/image' do
  html = ""
  # ...
  html += "<img src=image.jpg height=300 width=#{params[:w]}>"
  # ...
  return html

Here, if an attacker constructs the URL http://example.com/image?w=400%20onload=alert('hacked!'), the resulting HTML will cause the malicious Javascript to run with the image is loaded.

<img src=image.jpg height=300 width=400 onload=alert('hacked!')>

To ensure safety, we have to escape all non-alphanumeric characters in the user input using HTML entities, not just the five characters listed in the previous table. A complete list HTML entities can be found here. In the above example, properly escaped user input would look like this:

<img src=image.jpg height=300 width=400&#32;onload&#61;alert&#40;&#39;hacked&#33;&#39;&#41;>
JSON String Values

If you want to allow user input to be embedded in your JavaScript code, the only safe place is inside of a quoted string, either as a regular string variable or within a JSON string value. Even here, it is still dangerous to allow user input to be inserted unescaped, as the example below illustrates.

var string = "</script><script>alert('hacked!');"

Even though the red </script> is inside of a Javascript string, it closes the Javascript context and starts a new one. This is because browsers have their HTML parsers run before the Javascript parsers, so HTML elements get highest priority. Even my text editor gets this wrong.

The best solution here is to escape every non-alphanumeric character using unicode escaping. The following table has some examples.

Dangerous Character Unicode escape
< \u003C
> \u003E
" \u0022


There are other dangerous places to allow user input to be inserted, such as CSS property values and URL get parameters, but the solutions for all of them are the same: always escape user input in every context. Rather than trying to remember all of the escaping rules for each context, it’s much safer to use a library for the job. Read the documentation of your favorite web framework and use its built-in tools to ensure you don’t make any mistakes.

As you’ve seen in the examples above, it is all too easy to expose your site to XSS, and these types of vulnerabilities can be incredibly hard to detect for even trained human eyes. As an added level of security, I highly recommend using an automated tool to scan for and detect XSS vulnerabilities in your site. Tinfoil provides the best web application security solution on the market, and it detects XSS vulnerabilities on your website along with many other types of web vulnerabilities.

Angel Irizarry

Angel Irizarry is the Software Samurai of Tinfoil Security, and a self-proclaimed software purist. All he needs to do his best work is a plain Linux machine with Git and Emacs installed. He loves everything about front-end development, like making pages interactive and super fast, even if that means digging in and optimizing some SQL. When he's not writing code, which isn't very often, you'll find him on his iPad scouring his RSS feeds for news and rumors of cool new gadgets.

Tags: XSS plain english

Building A Browser Extension? Be Careful Not To Accidentally XSS the Whole Internet.

Update/TL;DRIf you didn’t generate it, assume it’s malicious.

Sometimes, the state of your website’s security can be affected by resources and services outside your control. The topic of today? Browser extensions.

Recently, we disclosed a vulnerability to a well-known company (let’s refer to them as Company) in their browser extension (specifically, in Chrome). To their credit, Company responded rapidly and fixed the issue within 2 days - major props to them for responding so quickly. Before we get into the vulnerability, let’s talk a little about what the extension does.

Chrome extensions? Not much harm can come from that...

From the Company’s feature description, the browser extension automatically takes content you want to share and pops it into a message, ready to go. This sounds great! And for the most part, it is. Occasionally, though, we are reminded that extensions can be dangerous, and now is one of those times. Browser extensions are effectively allowed to run any arbitrary JavaScript they’d like on any page you visit, changing the DOM at will. In Company’s case, they were trying to make their customers’ lives easier by finding any text that looked like a Twitter hashtag or Twitter handle and converting it into a clickable link that automatically searches for that hashtag or handle. Well-intentioned, as is most of what we do as software engineers.

Sounds great. So where’s the vulnerability?

The vulnerability boils down to the following: if a page had a hashtag in its content that, as part of the hashtag, had an HTML-escaped element appended, it would get unescaped and then, by definition, inserted directly into the DOM by Company’s browser extension. Consider the following example:


If this showed up in the text of a page, the extension noticed the #tinfoil hashtag and attempted to convert it into a link. So the above became something like the following:

 <a class="_company_extension" a="" href="#" #tinfoil"="" "javascript:var e = document.createEvent(&quot;CustomEvent&quot;); e.initCustomEvent(&quot;extensionEvent&quot;, true, true, {type: &quot;hash&quot;, value: &quot;#tinfoil&quot;}); document.body.dispatchEvent(e); return false;">#tinfoil</a><script>alert('XSS'​)</script>

You’ll notice that this is malformed HTML to begin with (the #tinfoil attribute of the a tag, for example), but the more important issue is that the Company’s browser extension actually unescapes the escaped HTML for the script tag and, in doing so, inserts it into the DOM. Of course, the link still works, pulling up a search for #tinfoil as it should, so if we weren’t popping up an alert box, the user would be none the wiser.

Uh oh...so what does this mean for me?

Well, effectively this means that if you were using the Company’s browser extension within Chrome, it would execute any malicious JS stored on any website you were visiting. For example, suppose you were logged into a financial service and viewing the discussion forums for help on a topic - someone could have posted this malicious hashtag in response creating a persistent, or stored, XSS. Worse is that even if said financial service had taken the proper precautions to prevent XSS by escaping HTML into its HTML entities (as we recommend), the Company’s browser extension would still have re-encoded it and run the malicious JavaScript. Essentially, Company had accidentally XSS’d the whole internet.

Okay. So what can we learn from this?

User generated content occurs everywhere, and it is always important to escape any input you may receive. The mantra we often use is: “If you didn’t generate it, assume it’s malicious.” In this case, the document returned by the website the user is visiting should be considered potentially malicious. The extension is acting on data it did not create, and as such should treat it more carefully than it would other data, by escaping everything it can. So if you’re building a browser extension, be sure to treat the DOM of whatever page you’re modifying as dangerous, because it is.

We love to talk about these sorts of issues, so feel free to chat with us via email or in our support chat. :)

Michael "Borski" Borohovski

Michael Borohovski is cofounder and CTO at Tinfoil Security. He got his start in security when he was just 13 years old, and has been programming for longer than he can remember. When he's not busy breaking software or building it, he also loves singing, juggling, and magic tricks. Yes, magic tricks.

Tags: XSS browser extensions browser security chrome google google chrome security