
Fuzzing XSS Filters

Author: Eason

0x00 Preface

  This is a learning summary. First of all, I would like to thank several teachers whom I have never met. By studying your masterpieces, I have gradually joined the ranks of XSSers. Although my level is still far behind that of the masters and my scripting foundation is not good enough, I will keep studying hard.

0x01 Brief

  Once upon a time, N years ago, I summed up network security confrontation as "filtering and anti-filtering". Although that understanding now seems too narrow, filtering and breaking through filtering are indeed, to a large extent, very important techniques. The defender's purpose is to filter out harmful content and keep safe content, while the attacker's purpose is to break through the filtering and get harmful content through. Whether it is SQL injection, XSS, overflows or anything else, the key step is breaking through filtering. This article mainly talks about fuzzing stored-XSS filters.

0x02 XSS Filters: Security Domain Boundaries

  As XSS attacks become more and more popular, XSS filtering is becoming more mature, especially for stored XSS: because the attacks are more flexible and varied and the applications are broader, the filtering methods are also more complex. From the earliest blacklist string filtering to today's filters that combine blacklists and whitelists on top of syntax analysis, attacks are becoming harder and harder to pull off.

  Since filtering for reflected XSS is relatively simple, this article only discusses filtering for stored XSS. Currently, most stored-XSS filtering is based on HTML syntax. Why not use simple string filtering? Because HTML supports many encodings, and it is also troublesome to distinguish what is HTML markup and what is plain text content. For example:

<div style=width:expression(alert(0))></div>

should be filtered, but

<div>style=width:expression(alert(0))</div>

is normal content, so HTML syntax analysis must be performed.

  Another issue a filter must consider is user experience. Some very strict filters are relatively safe, but they change normal content too much and make the page look different, which is uncomfortable for normal users. In fact, most XSS filter vulnerabilities are not cases where the XSS cannot be found, but cases where it cannot be completely eliminated once found; in a way, that too is a concession to the user experience of normal users.

  A filter based on HTML syntax analysis divides the input into many different security domains. For example, content outside HTML tags is not filtered; that is the lowest security level. Content inside HTML tags should be divided further: the style sheet is a place where problems easily occur, so it should be scanned intensively, and a whitelist should even be considered. Even within the style sheet, different areas should be distinguished and scanned at different levels. Because different security domains are checked in different ways, the boundary between two security domains is where problems easily arise: once the boundary is drawn wrongly, it may be possible to break through the filter.

0x03 Fuzzer Design: Breaking the Boundary

  When ordinary attempts fail to break through the filter, we should consider using a fuzzer. First we need to design the test model, which is the core issue. I often wonder why fuzzing can succeed at all; it must be because the filter has some hidden vulnerabilities, such as the security domain boundaries mentioned earlier, so we design the fuzzing model to focus on testing those places. Let's take an example to illustrate the security domain boundary problem:

<div style="width:expression(alert(9));">

Look at

expression(alert(9))

  This is the key scanning area, that is, the area with the highest security level; /* */, expression and the like should be filtered there. How does the filter determine this area? First it finds the style attribute in the div tag, takes the content between the double quotes after the =, splits the style name from its value on :, and separates multiple styles with ;. So =, ", : and ; are all key characters here. If the XSSer submits the following content:

<div style="width:expre/*"*/ssion(alert(9));">

The highest security area should also be

expre/*"*/ssion(alert(9))

However, if the filter determines the area by the first closing double quote, then what gets filtered is only:

width:expre/*

  This is obviously incorrect, because the content inside /* */ is a comment, i.e., content that will be discarded. If the area boundary is determined by the first double quote, the expression ends up outside the scanned area and cannot be filtered, so the filter is bypassed. The correct way to determine the boundary here is therefore to close the attribute at the last double quote, and if none is found, to add one automatically; otherwise the high-level security area expands. Expansion does not mean more safety: an enlarged security area confuses the boundaries that follow and also leads to vulnerabilities. Of course, this situation is just something I imagined for the sake of the example; the real situation is unlikely to be this simple. What I want to show here is how important area boundaries are for XSS fuzzing.
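  To make the boundary problem concrete, here is a minimal sketch, my own illustration rather than any real filter's code, that contrasts the two ways of choosing the closing quote:

<?php
// Minimal sketch: how the choice of closing quote changes the region that
// gets the strictest scanning. Not taken from any real filter.
$input = '<div style="width:expre/*"*/ssion(alert(9));">';

// Wrong boundary: stop at the FIRST closing double quote (non-greedy match).
preg_match('/style="(.*?)"/', $input, $m);
echo "scanned (first quote): " . $m[1] . "\n";  // width:expre/*  -- expression escapes the scan

// Better boundary: close at the LAST double quote (greedy match).
preg_match('/style="(.*)"/', $input, $m);
echo "scanned (last quote):  " . $m[1] . "\n";  // width:expre/*"*/ssion(alert(9));
?>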

  Therefore, when we design a fuzzer, we need a template in which the boundaries may get confused, and we fill combinations of elements into it to produce test cases. So the first thing to decide is the template. For example, we can take

<div style="width:expre/*position*/ssion(alert(9));">

as the template, with elements filled in at the /**/ position. This gives us a simple fuzzing model.

Of course, this is not a good template, because it is not complex enough and filters generally already take this case into account.

So let's add something in front of it:

<div id="position1" style="width:expre/*position2*/ssion(alert(9));">

  Now there are two places to fill in elements, which is a little more complicated, and better than the first template. My feeling is that the more complex the template and the more fill positions and fill elements there are, the more likely it is to expose vulnerabilities. But in actual use, because we are doing black-box testing, we also need to consider other issues such as efficiency and recognizability, which I will come back to later.
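  As a rough illustration of this two-position template, here is a minimal sketch that fills both positions from a tiny placeholder pool; the real choice of elements is discussed next:

<?php
// Minimal sketch: generate test cases from the two-position template
//   <div id="position1" style="width:expre/*position2*/ssion(alert(9));">
// The element pool here is only a small placeholder for illustration.
$elements = array('"', "'", ';', ':', '=', '</div>', '*/', '&#');

for ($i = 0; $i < 5; $i++) {
    $p1 = $elements[array_rand($elements)] . $elements[array_rand($elements)];
    $p2 = $elements[array_rand($elements)] . $elements[array_rand($elements)];
    echo '<div id="' . $p1 . '" style="width:expre/*' . $p2 . '*/ssion(alert(9));">' . "\r\n";
}
?>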

With the template decided, the second important issue is what elements to fill it with. First of all, boundary elements are definitely needed, such as the ones in the example above:

=":;

Others that come to mind include:

space, <, >, </div>

and so on. Sometimes it seems impossible for the filter to get a boundary wrong, but in practice a filter's behavior is often unexpected, and that is exactly the point of fuzzing.

  Besides boundary elements, another type of element to consider is whatever the filter itself filters. For example, if the filter filters expression, then we also use expression as a randomly filled element, as well as /* and */, onXXX(), and so on. When the filter deletes or rewrites these pieces, it may shift the boundaries.

  Another kind of element is invisible special characters, such as \t, \r, \n, \0, half of a multi-byte Unicode character, and so on. There are also strings that the filter decodes, such as &#XX, %XX, \XX, etc.
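  Pulling these classes together, a fuzzer's element pool might look roughly like the following sketch (illustrative only; entries such as onload( and the encoded forms stand in for the onXXX() and &#XX/%XX/\XX patterns):

<?php
// Illustrative element pool built from the classes discussed above.
$boundary  = array('=', '"', "'", ':', ';', ' ', '<', '>', '</div>');
$filtered  = array('expression(', '/*', '*/', 'onload(', 'onerror(');  // things the filter acts on
$invisible = array("\t", "\r", "\n", "\0");                            // invisible special characters
$encoded   = array('&#120;', '%78', '\\78');                           // &#XX, %XX, \XX style forms
$elements  = array_merge($boundary, $filtered, $invisible, $encoded);
?>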

  The last thing to consider when determining templates and elements is the browser's undocumented behavior, for example:

<<div/style="width:expression(alert(9))">

  It is parsed normally by IE, even though there are two < here and the div is separated from its attribute by / instead of a space. Some filters may not consider such behavior because it does not conform to the HTML syntax specification. There are many strange browser parsing quirks like this. Those who analyzed the Yahoo Mail cross-site issue last year will still remember the abnormal url() in CSS; I admire the teacher who discovered that method. That problem is not just a filter problem: even IE has trouble dividing the boundaries there. For IE it is only a small bug and cannot be considered a vulnerability, but combined with Yahoo's filter it leads to XSS.

0x04 Practical Exercise: Local Fuzzing Example

  Before actually performing remote black-box fuzzing, let's construct a local fuzzing example to practice and see if the template construction and element selection can achieve the desired effect.

  I chose htmLawed as the fuzzing target. It is an open-source HTML filter written in PHP; if you are interested, you can try it at http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/htmLawedTest.php. First test it manually and see whether you can bypass its CSS filtering by hand. Perhaps an expert could manage the XSS manually, but I tried for a while and failed; I am simply not that smart. Here I am talking about the style-sheet XSS from the example above, so don't just enter a <script> and be surprised that it is not filtered: that is because the filter needs extra parameters for that.

After I downloaded htmLawed, I wrote a test program locally:

<?php
// Local fuzzing harness for htmLawed: randomly fill the template discussed
// above, run the combined test cases through the filter, and write the
// results to pages a browser can step through automatically.
include './htmLawed.php';

// Element pools: $m1 holds quote/space variants, $m2 and $mag hold boundary
// characters, filtered keywords and other fill elements (empty strings are
// included so a position is sometimes left blank).
$m1  = array("'","\""," ","");
$m2  = array("","","\"","'","<","","","","","","","","");
$mag = array("'","\""," ","</div>","/*","*/","\\","\\\"","\\\'",";",":","<",">","=","<div","\r\n","","&#","/","*","expression(","w:expression(alert(9));","style=w:expression(alert(9));","");

for($i=0;$i<10000;$i++)
{
    // Each round bundles 1000 test cases into one input for the filter.
    $fname = "tc\\hush".$i.".html";
    $fp = fopen($fname, "a");
    $mtotran = "";
    for($j=0;$j<1000;$j++)
    {
        // Build one test case: random fills around
        // <div id=... style=...w:exp/*...*/ression(alert(9));...>
        shuffle($mag);
        shuffle($m1);
        shuffle($m2);
        $mstr=$m2[0];
        $mstr.="<div id=";
        $mstr.=$m1[0];
        $mstr.=$mag[0];
        $mstr.=$mag[1];
        shuffle($mag);
        $mstr.=$mag[0];
        $mstr.=$m1[0];
        $mstr.=" style=";
        shuffle($m1);
        $mstr.=$m1[0];
        $mstr.="w:exp/*";
        shuffle($mag);
        $mstr.=$mag[0];
        $mstr.=$mag[1];
        $mstr.="*/ression(alert(9));";
        shuffle($mag);
        $mstr.=$mag[0];
        $mstr.=$mag[1];
        $mstr.=$m1[0];
        $mstr.=">".$j."</div>\r\n";   // $j identifies the case so a hit can be traced back
        fwrite($fp, $mstr);
        $mtotran.=$mstr;
    }
    fclose($fp);

    // Filter the combined cases and wrap the output in a page that refreshes
    // to the next result file, so the browser walks through all of them; a
    // popped alert() means the filter was bypassed somewhere in that batch.
    $outcont = htmLawed($mtotran);
    // print $outcont."\r\n";
    $fp1 = fopen("C:\\Inetpub\\wwwroot\\out\\hush".$i.".html", "a");
    fwrite($fp1, "<HTML>\r\n<HEAD>\r\n<TITLE>".$i."</TITLE>\r\n<meta http-equiv=\"refresh\" content=\"1;url=hush".($i+1).".html\">\r\n</HEAD>\r\n<BODY>\r\n");
    fwrite($fp1, $outcont);
    fwrite($fp1, "</BODY>\r\n</HTML>");
    fclose($fp1);
    print $i."\r\n";
    // break;
}
?>

  Excuse me, I had never written PHP before; I just learned it and threw together a rough program. It simply implements my ideas without considering efficiency or stability. The program is very simple: it randomly fills elements into the simple template mentioned above, generates test cases, and then produces result files after filtering with htmLawed. Put the results on the web server and let the browser run through them automatically to see whether an alert pops up. As I said, this template is not very good and is too simple, so I generated a bit more, 10 million test cases in total, but the alert dialog already popped up on the first file 0_0. Clearly htmLawed is not a thoroughly proven filter.

  I looked through the test cases and found that there are many ways to bypass htmLawed's filtering. Here is a simple example:

<div id= \""&#  style="w:exp/*\\'<div*/ression(alert(9));'=">722</div>
<div id="\">/" style="w:exp/*&#*/ression(alert(9));&#</div>">723</div>

After these two are passed to the filter together, the filtering result is:

div style="w:exp  '<div*/ression(alert(8));'=">722</div>
<div>/" style="w:exp/*&#*/ression(alert(9));&#</div>">723

  An alert pops up, proving that the filter has been bypassed. Analysis showed that the main cause is a problem in the handling of the <div tag: when two <div appear at the same time, the second one is kept and the first discarded, which changes the original security domain boundaries. Combined with the double quote in the second div enclosing the preceding content, the style of the second div, which should have been low-security content outside any tag, had its /**/ and expression left unfiltered; after merging with the preceding style it ended up inside the high-security domain, resulting in XSS. Feeling dizzy yet? This proves that fuzzing can do things the human brain cannot (the average human brain, that is; experts excepted).

0x05 Practice: Remote Fuzzing

  The local fuzzing example just shown is really a black-box test rather than a white-box one, because we do not analyze the filter's source code; we only care about the filtering results. I once read an article by a foreign researcher on XSS fuzzing which suggested that to crack a remote filter system you should first simulate the filter's behavior locally, run that simulation directly in your program, verify the results remotely, and then keep modifying the local simulated filter based on the remote results until it reproduces all the features of the remote filter; finally, fuzz the local filter in-process. This sounds great, because in-process fuzzing is very efficient and can run tens or even hundreds of thousands of cases per second, while remote fuzzing sometimes cannot finish one round per minute; even if hundreds of test cases are sent at a time, the total average time, including generating the cases, sending and receiving, and verifying the results, is considerable. In practice, however, you quickly find that simulating a remote filter locally is mostly a theoretical idea: it is impossible to reproduce all of the filter's features from input and output alone, and the simulation will miss most of the vulnerabilities.

  So if you want to really find vulnerabilities, you still have to rely on remote fuzzing, and that means thinking about test efficiency. I think the quality of a fuzzer depends on two factors: the design of the test model, and the efficiency of the fuzzer itself. Even with poor test cases, a sufficiently efficient fuzzer can still find vulnerabilities.

  How to improve test efficiency is a question I have been thinking about for a while. If the target of our fuzzing is a web mail system, for example, the basic flow of the fuzzer looks like this: generate test cases from the template -> send the test cases -> verify the results. Improving efficiency therefore means working on these three aspects.

  First is the test case generation module. I usually pick elements to fill into the template according to random numbers, so the quality of the random numbers is a key factor: the more uniform they are, the less duplicate data is produced, which improves test efficiency to a certain extent. I found it hard to get well-distributed random combinations in C; the generated values kept repeating, and checking for duplicates wasted time. It felt much better after switching to Python, where string.join and random.sample can be used to generate random string combinations.
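  Since my harness above is in PHP, here is a rough PHP analogue of that random.sample idea, just a sketch: draw distinct elements for each case and skip combinations that have already been generated, so duplicates do not waste sending time.

<?php
// Sketch: sample distinct elements per test case and skip duplicates.
$elements = array('"', "'", ';', ':', '=', '</div>', '/*', '*/', '&#', '<div');
$seen  = array();
$cases = array();
while (count($cases) < 20) {
    $combo = '';
    foreach (array_rand($elements, 3) as $k) {   // 3 distinct indices, like random.sample
        $combo .= $elements[$k];
    }
    if (isset($seen[$combo])) continue;          // drop combinations already used
    $seen[$combo] = true;
    $cases[] = '<div id="' . $combo . '" style="width:expre/*' . $combo . '*/ssion(alert(9));">';
}
?>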

  In fact, the efficiency of test case generation is bounded by the sending efficiency: even if 10,000 test cases can be generated per second, if only 10 can be sent per minute, generating more is wasted effort. So sending efficiency matters more. I cannot think of any clever trick here, only the strategy of combined sending, as with the htmLawed fuzzing above: combine 1,000 or even more test cases into one submission. The benefit is not only efficiency; the extra confusion created by the combination also often produces unexpected filtering results. Combined sending puts efficiency first, but sometimes we also need to send test cases one by one, in order to see precisely how each one is filtered and how it changes as the filter processes it, because every replacement or deletion the filter makes may shift a security boundary.

  The efficiency of the first two modules is secondary, though; the most important part is actually verifying the results, and the best verification method usually depends on the target. You can verify by hand, opening the browser and clicking through the received content, but the efficiency is extremely low and obviously unsuitable for tens or hundreds of thousands of test cases. You can also verify automatically, and there are two common approaches. The first is to have a program act like the browser, fetch the result from the web application, and decide success by looking for a characteristic string. The second is to verify with a real browser: the program only drives IE to open the sent cases in sequence, which can be done by simulating the mouse and keyboard. The first approach is efficient but prone to false positives and false negatives; the htmLawed bypass found above, for example, is very hard to judge from a program. The second approach is very inefficient but misses nothing, so I usually take it. There is in fact a third approach combining the two: first write a program to fetch the content from the web application, then generate local HTML files that open one another in sequence, then open them with IE. That is of course the ideal setup, but designing and writing such a program is also a lot of trouble; for ordinary targets it is not worth the effort, although for some targets, haha, it is worth building. One basic requirement in all cases is being able to map a filtered string back to the original input; as long as that is kept in mind, it is not hard to implement.
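  As a minimal sketch of the first, characteristic-string approach (assuming, hypothetically, that the filtered result pages are reachable over HTTP like the local harness output above):

<?php
// Sketch of the characteristic-string check. The URL is hypothetical, and a
// hit only means "possible bypass": it still needs manual confirmation,
// which is exactly why this method gives false positives and negatives.
for ($i = 0; $i < 100; $i++) {
    $page = @file_get_contents("http://127.0.0.1/out/hush" . $i . ".html");
    if ($page !== false && strpos($page, "expression(alert(9))") !== false) {
        print "possible bypass in hush" . $i . ".html\r\n";
    }
}
?>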

  I have been talking for a long time without giving examples. For certain reasons I will not publish the actual remote fuzzing code and process; smart readers can try it for themselves. I tested some commonly used web mail systems at home and abroad, and the mailboxes in which fuzzing successfully turned up XSS vulnerabilities are:

  In fact, most of the domestic vulnerabilities were not discovered through fuzzing but through manual testing, because the filters of domestic mailboxes are still fairly basic and imperfect, and some of the simplest tricks can fool them. Of course, more vulnerabilities were discovered later when testing with the fuzzer.

0x06 Thinking: How to make a perfect filter

  An article that only covers attack and not defense gets despised by peers, especially people of my generation. So how can fuzz testing help improve filters? If each company did enough fuzz testing on its own products instead of leaving it to hackers, product security would improve greatly; even better, the company has the source code, which can increase fuzzing efficiency by several orders of magnitude. I believe many companies do have tests in this area. So why are there still vulnerabilities? Probably because many companies' developers and testers do not come from a security background, or at least are not well versed in security; seen another way, some internet companies simply do not pay enough attention to the security of their products. And then there is the profit motive: black hats can dig out vulnerabilities that even Microsoft cannot find in its own products, and I do not need to explain that in detail.

  So how should we design filters to minimize vulnerabilities? Let me offer my own views for discussion. First, the security boundaries must be clearly defined, and whitelist filtering should be used wherever possible; many filters already do this. Second, on the basis of clear boundaries, the handling of illegal data must be clearly defined. The safest approach is to discard the whole submission as soon as illegal data is found, but real applications cannot do this because it hurts the user experience too much. Next comes handling the security domain that contained the illegal data: as discussed earlier, whether it is deleted or replaced, there is a risk of shifting other security domain boundaries. This calls for a re-examination mechanism: after illegal data has been processed, every security domain boundary must be determined again, and the cycle repeats until no illegal data is found. Doing so brings another risk, namely DoS attacks, so a trade-off has to be made.
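  A sketch of that re-examination idea, assuming a hypothetical filter_once() that performs a single pass of boundary analysis and filtering:

<?php
// Re-filter until the output stops changing, with an iteration cap so a
// crafted input cannot turn the loop into a DoS. filter_once() is a
// hypothetical stand-in for one filtering pass.
function filter_to_fixed_point($input, $max_rounds = 10) {
    for ($round = 0; $round < $max_rounds; $round++) {
        $output = filter_once($input);
        if ($output === $input) {
            return $output;          // nothing changed: the boundaries are stable
        }
        $input = $output;            // deletions/replacements may have shifted
                                     // boundaries, so scan the result again
    }
    return '';                       // still changing after the cap: discard it
}
?>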

  However, as long as a program is written by people, there will be carelessness and vulnerabilities, so two things must be combined. First, when developing the filter, define clear security specifications and do not take anything for granted; after a discovered vulnerability is patched, check that the fix still meets the original security requirements, because new vulnerabilities are often created while patching old ones. Second, do black-box testing that is as comprehensive as possible, provided that someone who understands security is involved on the testing side. Even these two points cannot completely prevent vulnerabilities, so there must also be a vulnerability discovery mechanism: relying on user reports is one part, and an automated vulnerability monitoring mechanism is the other. That kind of thing is easier said than done, so I will not go on about it.

References:

Blackbox Reversing of XSS Filters (Alexander Sotirov)
Web Application Security Design Ideas (axis)
