MagicTree Documentation: XPath Crash Course - Learning By Example


Introducing XPath

MagicTree uses the XPath query language to query data in the tree. XPath is defined in the W3C recommendation at http://www.w3.org/TR/xpath

XPath is a syntax for selecting nodes in an XML document. Suppose you have the following document:

<a>AAA
<b>BBB
<c>CCC</c>
</b>
</a>

If you want to select the node <c>CCC</c> you can use the following XPath expression:

/a/b/c

It means:

  1. Start with the document root. (That's the leading slash)

  2. Select all first-level nodes that have "a" as tag (normally there will be at most one of them, because XML documents have only one first-level node)

  3. Select the child nodes of the nodes that were selected in step 2, but only those that have "b" as tag

  4. Select the child nodes of the nodes that were selected in step 3, but only those that have "c" as tag
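The four steps above can be sketched in Python. This is only an illustration of how a path such as /a/b/c is evaluated step by step, not MagicTree's own implementation; the helper name select_path is hypothetical, and the standard xml.etree.ElementTree module is used just to parse the sample document.

```python
import xml.etree.ElementTree as ET

# The sample document from the text above.
DOC = """<a>AAA
<b>BBB
<c>CCC</c>
</b>
</a>"""


def select_path(root, tags):
    """Evaluate an absolute path such as /a/b/c, given as ["a", "b", "c"]."""
    # Steps 1 and 2: start at the document root and keep the first-level
    # element only if its tag matches the first name in the path.
    selected = [root] if root.tag == tags[0] else []
    # Steps 3 and 4: for each further name, replace the selection with the
    # matching children of the nodes selected so far.
    for tag in tags[1:]:
        selected = [child for node in selected for child in node if child.tag == tag]
    return selected


root = ET.fromstring(DOC)
matches = select_path(root, ["a", "b", "c"])
print([(m.tag, m.text) for m in matches])  # [('c', 'CCC')]
```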

You could also have used the following expression:

//c

It means "select nodes anywhere in the document that have "c" as their tag". The double slash stands for "anywhere in the document".

Or, you could have used the following expression:

//*[text()="CCC"]

It means "select nodes with any tag that have "CCC" as text". The star stands for "any tag". The expression in square brackets is called a predicate. It is the condition on which the nodes are selected.
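To try the two alternative expressions outside MagicTree, here is a minimal Python sketch using the standard xml.etree.ElementTree module. ElementTree supports only a subset of XPath: paths are written relative to an element (hence the leading dot), and there is no text() function, so the closest available predicate, [.='CCC'], is used instead (it needs Python 3.7 or later). It compares an element's complete text content, which gives the same result for this document.

```python
import xml.etree.ElementTree as ET

# The sample document from the text above.
DOC = """<a>AAA
<b>BBB
<c>CCC</c>
</b>
</a>"""

root = ET.fromstring(DOC)

# //c : elements with tag "c" anywhere below the root.
by_tag = root.findall(".//c")

# //*[text()="CCC"] : approximated with ElementTree's [.='CCC'] predicate,
# which matches elements whose complete text content equals "CCC".
by_text = root.findall(".//*[.='CCC']")

print([e.tag for e in by_tag])   # ['c']
print([e.tag for e in by_text])  # ['c']
```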

Now that you have a taste of what XPath looks like, let's see how it applies to MagicTree.

Using XPath Expressions in Table Queries

XPath expressions are used in the MagicTree Table View and in report templates. To present the tree data in table form, you need to specify an XPath expression for each table column.

Suppose we want to see all HTTP servers that were found on the network we have scanned.

Let's start with a table with one column. In the table at the top of the Table View frame we will enter the column specification. In the Title column we will put "host". This will be the title of the column in the results table.

In the Expression column we will enter the following expression:

//host[descendant::service="http"]

It means "find me all host nodes that have an HTTP service". Now click the "Run" button to execute the query. You will see something like this:



query1.png
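The effect of the predicate in //host[descendant::service="http"] can be imitated in plain Python. The sketch below is an illustration only: the sample XML is a hypothetical, simplified stand-in for a MagicTree tree (the real tree layout may differ), and the helper names label and hosts_with_service are made up for this example. Because xml.etree.ElementTree has no descendant:: axis, the condition is written as an explicit search through each host's descendants.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified tree; not the actual MagicTree data layout.
SAMPLE = """
<testdata>
  <host>192.168.1.1
    <port>80<service>http</service></port>
    <port>443<service>http</service></port>
  </host>
  <host>192.168.1.2
    <port>22<service>ssh</service></port>
  </host>
</testdata>
"""


def label(node):
    """Return the node's own text with surrounding whitespace removed."""
    return (node.text or "").strip()


def hosts_with_service(root, name):
    """Imitate //host[descendant::service="<name>"]."""
    return [
        host
        for host in root.iter("host")
        # The predicate: keep the host if any descendant "service" node matches.
        if any(label(svc) == name for svc in host.iter("service"))
    ]


root = ET.fromstring(SAMPLE)
print([label(h) for h in hosts_with_service(root, "http")])  # ['192.168.1.1']
```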


Now suppose we want not just the hosts, but also the ports where the HTTP service runs. So we want a table with one column containing the IP addresses and another column containing the ports. Let's modify our query. In the first row we will remove the condition from the expression, so it becomes just "//host". Then, by clicking the "+" (plus) button on the right of the query table, we will add another row to the table. In this row, we enter "Port" as the column title and descendant::port[descendant::service="http"] as the expression. Clicking "Run" produces the following results:



query2.png


The query in the second row is executed relative to the results of the query in the first row. So rather than starting from the root of the tree, it starts from each "host" node found by the first query. If the second query does not find any nodes for the given "host", then the results table will not contain any rows for this host. If it finds multiple nodes, then the results table will contain multiple rows for this host. So if host 192.168.1.1 has no HTTP ports, it will not appear in the results. If it has ports 80 and 443 running HTTP services, then the results table will contain two rows: "192.168.1.1", "80" and "192.168.1.1", "443".
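The way the second expression is evaluated relative to each result of the first can be sketched as a pair of nested loops. As before, this is an illustration under assumptions, not MagicTree's code: the sample tree is hypothetical and simplified, and label and two_column_query are made-up helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified tree; not the actual MagicTree data layout.
SAMPLE = """
<testdata>
  <host>192.168.1.1
    <port>22<service>ssh</service></port>
    <port>80<service>http</service></port>
    <port>443<service>http</service></port>
  </host>
  <host>192.168.1.2
    <port>22<service>ssh</service></port>
  </host>
</testdata>
"""


def label(node):
    """Return the node's own text with surrounding whitespace removed."""
    return (node.text or "").strip()


def two_column_query(root, service):
    rows = []
    # First row of the query table: //host
    for host in root.iter("host"):
        # Second row, evaluated relative to each host:
        # descendant::port[descendant::service="<service>"]
        for port in host.iter("port"):
            if any(label(svc) == service for svc in port.iter("service")):
                # One result row per matching port; a host with no
                # matching port contributes no rows at all.
                rows.append((label(host), label(port)))
    return rows


root = ET.fromstring(SAMPLE)
for row in two_column_query(root, "http"):
    print(row)
# ('192.168.1.1', '80')
# ('192.168.1.1', '443')
```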

Using "Leaf" and "Hidden" Flags in Table Queries

Each row of the query table allows specifying two flags: "Leaf" and "Hidden". When the query is executed, the expression specified in each row is applied to the nodes returned by the previous expression. Sometimes this is not what we want. Let's say we want the IP addresses and the operating systems for all HTTP hosts and ports. We will add an expression for the "os" after the "host" row. But then the expression for the port would be based on the nodes returned by "os", and that's not what we want. Marking "os" as "leaf" tells the query processing engine that the expression in the row after the "os" row should be based on "host" rather than on "os". The query below illustrates this idea:



A query using "leaf" flag


The "leaf" flag also has the effect that if no data is returned by a "leaf" expression, the cell is left blank. If the "leaf" flag is not set for an expression and the expression does not return anything for a given row, the whole row is discarded.
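The "leaf" behaviour described above can be modelled with a small recursive evaluator. This is a sketch under assumptions, not MagicTree's query engine: the Column class, run_query, and the sample tree are all hypothetical, each XPath expression is replaced by a plain Python function, and the handling of a leaf expression that returns several nodes (one row per node) is a guess.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical, simplified tree; not the actual MagicTree data layout.
SAMPLE = """
<testdata>
  <host>192.168.1.1
    <os>Linux</os>
    <port>80<service>http</service></port>
  </host>
  <host>192.168.1.2
    <port>8080<service>http</service></port>
  </host>
  <host>192.168.1.3
    <os>Windows</os>
    <port>22<service>ssh</service></port>
  </host>
</testdata>
"""


def label(node):
    """Return the node's own text with surrounding whitespace removed."""
    return (node.text or "").strip()


@dataclass
class Column:
    title: str
    select: Callable[[ET.Element], List[ET.Element]]  # stands in for an XPath expression
    leaf: bool = False


def run_query(root, columns):
    rows = []

    def expand(index, context, cells):
        if index == len(columns):
            rows.append(tuple(cells))
            return
        col = columns[index]
        found = col.select(context)
        if col.leaf:
            # Leaf: an empty result leaves a blank cell, and the next
            # expression is still evaluated relative to the same context.
            for node in found or [None]:
                cell = label(node) if node is not None else ""
                expand(index + 1, context, cells + [cell])
        else:
            # Non-leaf: an empty result discards the row; each match
            # becomes the context for the next expression.
            for node in found:
                expand(index + 1, node, cells + [label(node)])

    expand(0, root, [])
    return rows


def http_ports(host):
    return [p for p in host.iter("port")
            if any(label(s) == "http" for s in p.iter("service"))]


columns = [
    Column("Host", lambda n: list(n.iter("host"))),
    Column("OS", lambda n: list(n.iter("os")), leaf=True),
    Column("Port", http_ports),
]

root = ET.fromstring(SAMPLE)
for row in run_query(root, columns):
    print(row)
# ('192.168.1.1', 'Linux', '80')
# ('192.168.1.2', '', '8080')
```

In this sketch, host 192.168.1.2 keeps its row with a blank OS cell because "OS" is a leaf, while host 192.168.1.3 is dropped because the non-leaf "Port" expression finds nothing for it.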

The "Hidden" flag hides the column produced by the expression marked as "hidden" from the results shown in the table. The expression is processed normally, but its results are not displayed. It can be useful in simplifying regular expressions in queries. It is also often very useful in report templates.

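The "Hidden" flag can be pictured as a final filtering step applied after all expressions have been evaluated. The sketch below is only an illustration with made-up names and data (visible_columns, the column list, and the rows are hypothetical): every column takes part in building the rows, and hidden columns are dropped just before display.

```python
def visible_columns(columns, rows):
    """columns is a list of (title, hidden) pairs; rows holds one cell per column."""
    # Indexes of the columns that are not marked as hidden.
    keep = [i for i, (_, hidden) in enumerate(columns) if not hidden]
    titles = [columns[i][0] for i in keep]
    # Hidden cells were computed like any others; they are only left out here.
    shown = [tuple(row[i] for i in keep) for row in rows]
    return titles, shown


# Hypothetical query: the "Port" step is needed to reach the service,
# but it is marked hidden so it does not appear in the output.
columns = [("Host", False), ("Port", True), ("Service", False)]
rows = [("192.168.1.1", "80", "http"), ("192.168.1.2", "22", "ssh")]

titles, shown = visible_columns(columns, rows)
print(titles)  # ['Host', 'Service']
print(shown)   # [('192.168.1.1', 'http'), ('192.168.1.2', 'ssh')]
```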