The article was last updated: November 01, 2019 09:56:55
When it comes to bypassing XSS filters, most people can rattle off "use encoding" without thinking, but there are many encodings, and how to apply them and why they work is less widely understood. In this article I explore the techniques and principles behind XSS encoding bypasses. If you find any errors, please correct me.
0x01 Understanding how a requested page is decoded
Encoding is basic computer-systems knowledge, and there is enough to it to fill a book, but we all know it to some degree. In general, encoding converts characters into binary numbers, and decoding restores binary numbers to characters. From the browser requesting a URL to the page being displayed, the data also goes through several rounds of encoding and decoding. The following is a brief introduction; for the detailed process, refer to here
- URL encoding: URL encoding allows non-standard characters, such as Chinese characters, to appear in a URL. In essence, each byte of the character's UTF-8 encoding is written as % followed by its two-digit hexadecimal value, which is why it is also called percent-encoding. When the server receives a request, it automatically URL-decodes it once.
- HTML encoding/decoding: When the browser receives the binary data sent by the server, it first decodes it into HTML text, which is the source code we see. Which character encoding it uses depends on the situation, so the page should declare its encoding to keep the browser from decoding with the wrong one and producing garbled characters. For example, the Baidu search homepage declares its encoding as UTF-8.
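The two bullets above can be sketched in Python. This is only an illustration using the standard library; the sample character and the variable names are my own choices, not something from the original post.

```python
from urllib.parse import quote, unquote

text = "中"  # sample non-ASCII character (an assumption for illustration)

# Encoding: the character becomes bytes, here using UTF-8
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.hex())  # e4b8ad

# Percent-encoding: each UTF-8 byte is written as % plus two hex digits
encoded = quote(text)
print(encoded)  # %E4%B8%AD

# Roughly what a server does once when it receives the request
decoded = unquote(encoded)

# Decoding the same bytes with the wrong charset yields garbled text,
# which is why a page should declare its encoding
garbled = utf8_bytes.decode("latin-1")
print(garbled == text)  # False
```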
However, some characters in HTML conflict with markup syntax, such as <, >, and &. After decoding, the browser would mistake them for parts of tags. How do we solve this? To display reserved characters correctly, we use character entities in the HTML source, such as the familiar non-breaking space &nbsp;. A character entity is written as & + a predefined entity name + a semicolon. Not every character has an entity name, but every character has an entity number, so it can also be written as &# + the entity number + a semicolon. For example:
Less-than sign (<): &lt; or &#60;
Greater-than sign (>): &gt; or &#62;
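To see that entity names and entity numbers refer to the same character, here is a small Python sketch. `html.unescape` is from the standard library; `to_numeric_entities` is a hypothetical helper written just for this example.

```python
import html

# Named, decimal, and hexadecimal forms all decode to the same character
print(html.unescape("&lt;"))    # <
print(html.unescape("&#60;"))   # <
print(html.unescape("&#x3c;"))  # <


def to_numeric_entities(text: str) -> str:
    """Hypothetical helper: write every character as &#NUMBER;"""
    return "".join(f"&#{ord(ch)};" for ch in text)


encoded = to_numeric_entities("<>")
print(encoded)  # &#60;&#62;
```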
After the browser has decoded the HTML, it starts parsing it, converting tags into DOM nodes in the content tree. When identifying tags, the HTML parser does not treat entity-encoded content as markup, so an encoded < never opens a tag. Entities are decoded only as the content of a node: its text or its attribute values. In other words, any attribute value in a DOM node can be HTML-entity-encoded and will still be parsed correctly.
This is why PHP's htmlspecialchars() function is used to convert the predefined characters (&, <, >, ", and, depending on the flags, ') into HTML entities. Because entities are decoded only as node content and never as markup, the escaped input cannot create new tags, which is what provides the XSS protection.
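The same idea can be sketched in Python. Note that this uses Python's `html.escape` as a rough analogue of htmlspecialchars(), not PHP itself, and `render_comment` is a hypothetical template helper invented for the example.

```python
import html


def render_comment(user_input: str) -> str:
    """Hypothetical helper: escape the input, then embed it in markup."""
    return "<p>" + html.escape(user_input, quote=True) + "</p>"


payload = "<script>alert(1)</script>"
page = render_comment(payload)
print(page)  # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

# The escaped payload no longer contains a tag opener, so a parser sees
# only text; decoding the entities gives back the original characters.
inner = page[len("<p>"):-len("</p>")]
print(html.unescape(inner) == payload)  # True
```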
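As for JavaScript: if you want characters to still be recognized by the JS engine after encoding, you can write them as Unicode escapes (\uXXXX). This works for identifier names and string contents, but not for syntax characters such as parentheses or quotes.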
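A minimal Python sketch of that escaping follows. Both helpers are hypothetical and only handle characters in the Basic Multilingual Plane; the decoder merely mimics what the JS engine does with \uXXXX sequences.

```python
import re


def js_unicode_escape(identifier: str) -> str:
    """Hypothetical helper: write each character as \\uXXXX."""
    return "".join(f"\\u{ord(ch):04x}" for ch in identifier)


def js_unicode_unescape(source: str) -> str:
    """Hypothetical helper: turn \\uXXXX sequences back into characters."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        source,
    )


# Only the identifier is escaped; the parentheses stay literal
escaped = js_unicode_escape("alert") + "(1)"
print(escaped)                       # \u0061\u006c\u0065\u0072\u0074(1)
print(js_unicode_unescape(escaped))  # alert(1)
```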
0x02 XSS encoding in practice
For a javascript: URL in an href attribute, the whole parsing sequence has three stages: HTML decoding --> URL decoding --> JS decoding
We can apply the following variations; each of them successfully pops the alert box:
- Following the decoding order, the encodings can also be mixed:
How about that? After three layers of encoding, the box still pops up. Feeling a little dizzy? It is best to try this yourself and see whether the alert fires.
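The original payloads for these cases are not reproduced in this text, so the following Python sketch is a reconstruction under assumptions: it builds one possible triple-encoded href by applying the encodings in the reverse of the decoding order, then decodes it again in the order HTML --> URL --> JS. All helper names are hypothetical, and the decoding steps only imitate what a browser does.

```python
import html
import re
from urllib.parse import unquote


def js_unicode_escape(name: str) -> str:
    return "".join(f"\\u{ord(ch):04x}" for ch in name)


def percent_encode_all(text: str) -> str:
    return "".join(f"%{b:02X}" for b in text.encode("utf-8"))


def html_numeric_entities(text: str) -> str:
    return "".join(f"&#{ord(ch)};" for ch in text)


# Encode: JS layer first, then URL, then HTML (reverse of decoding order).
js_layer = js_unicode_escape("alert") + "(1)"
# The "javascript:" scheme itself is left un-percent-encoded (assumption:
# the scheme must be readable before any URL decoding happens).
url_layer = "javascript:" + percent_encode_all(js_layer)
html_layer = html_numeric_entities(url_layer)
link = f'<a href="{html_layer}">test</a>'
print(link)

# Decode in the order described above
step1 = html.unescape(html_layer)           # HTML decoding
scheme, _, rest = step1.partition(":")
step2 = unquote(rest)                       # URL decoding
step3 = re.sub(r"\\u([0-9a-fA-F]{4})",
               lambda m: chr(int(m.group(1), 16)), step2)  # JS decoding
print(scheme, step3)  # javascript alert(1)
```

Paste the printed link into a local test page if you want to check it in a real browser rather than trusting the simulation.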
The question is: since the browser URL-decodes the link in an href, can the entire content of the href be URL-encoded?
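The answer is not preserved in this text, so treat the following as a hedged illustration. URL parsers identify the scheme before any percent-decoding takes place, which suggests that percent-encoding the whole value, including the colon after javascript, leaves no recognizable scheme. The sketch uses Python's `urlsplit` as a stand-in for a browser's URL parser; the two sample strings are my own.

```python
from urllib.parse import urlsplit, unquote

partial = "javascript:%61lert(1)"  # only the part after the scheme is encoded
whole = "javascript%3Aalert(1)"    # the colon is encoded as well

# The scheme is found by looking for a literal colon
print(urlsplit(partial).scheme)      # javascript
print(repr(urlsplit(whole).scheme))  # ''

# Only the part after a recognized scheme is then percent-decoded
body = unquote(urlsplit(partial).path)
print(body)  # alert(1)
```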
XSS payloads that are parsed in a similar way include:
A slight variation gives:
Now let's try our ultimate trick again:
1. JS-encode alert:
<script>\u0061\u006c\u0065\u0072\u0074(1)</script>
Note: alert(1) cannot be HTML-encoded here, because when the HTML parser finds that this DOM node is a script, it hands the content to the JS parser as-is, without entity decoding. But there is a little trick: inside <svg>, the content of <script> is entity-decoded first, so this works:
<svg><script>alert&#40;'xss'&#41;</script>
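The note above about script content can be observed with Python's `html.parser`, used here only as a stand-in for a browser's HTML parser. It decodes entities in attribute values but passes the content of a script element through untouched. It does not implement SVG foreign content, so the svg trick itself is not reproduced. The `Collector` class and the sample document are assumptions made for this sketch.

```python
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Records href values and raw script text as the parser reports them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.hrefs = []
        self.script_text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        for name, value in attrs:
            if name == "href":
                self.hrefs.append(value)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.script_text.append(data)


doc = '<a href="&#106;avascript:alert(1)">x</a><script>&#97;lert(1)</script>'
parser = Collector()
parser.feed(doc)
parser.close()

script_body = "".join(parser.script_text)
print(parser.hrefs)  # ['javascript:alert(1)']  (entity decoded)
print(script_body)   # &#97;lert(1)             (entity left as-is)
```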
And a little detail:
That is about all there is to XSS encoding. In short, whatever you are learning, you need to understand the underlying principle; whenever you look back, you will find something you did not know.