XPath is a powerful tool for locating elements on a webpage. Here's a guide with beginner-friendly examples covering operators and functions.
What is XPath?
XPath (XML Path Language) is a query language for navigating and locating elements within HTML or XML documents. It’s commonly used in web automation tools like Playwright and Selenium to interact with webpage elements.
Parent-Child Relationships in XPath
A parent element contains child elements directly inside it. A direct child is immediately nested inside its parent, while a descendant is any element inside the parent, even if it’s nested deeper.
Example HTML:
<div> <!-- Parent -->
<p>This is a child of the div.</p>
<span>This is another child of the div.</span>
<div> <!-- Nested div -->
<p>This is a child of the nested div.</p>
</div>
</div>
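The difference between a direct child and a descendant can be checked with a short script. This is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath and requires paths to start from an element (`./` or `.//`); the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The example markup from above, as a well-formed snippet.
markup = """
<div>
  <p>This is a child of the div.</p>
  <span>This is another child of the div.</span>
  <div>
    <p>This is a child of the nested div.</p>
  </div>
</div>
""".strip()

outer = ET.fromstring(markup)

# "./p" looks only at direct children of the outer <div>.
direct_children = outer.findall("./p")

# ".//p" looks at every descendant, however deeply nested.
all_descendants = outer.findall(".//p")

print(len(direct_children))   # 1
print(len(all_descendants))   # 2
```

The nested paragraph shows up only in the second result because it is a descendant of the outer `<div>` but not one of its direct children.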
Understanding Context in XPath
Context is the starting point of your XPath search. You decide where to begin looking for elements.
Choosing Context:
- Whole Document Context: Use // to search globally.
  Example: //p - Finds all <p> elements in the document.
- Specific Element Context: Start at a specific element using unique attributes.
  Example: //div[@id='outer'] - Selects the <div> with id='outer'; any further steps then search inside it.
- Refining the Context: Use relationships to narrow down (direct children: /, descendants: //).
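To illustrate how the choice of context changes the result, here is a small sketch with Python's standard-library `xml.etree.ElementTree`. The markup (including the `id="outer"` attribute) is made up for the example, and because ElementTree implements only part of XPath, every path starts from an element with `./` or `.//` rather than a bare `//`.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment used only for this illustration.
page = ET.fromstring("""
<body>
  <p>Intro paragraph outside the outer div.</p>
  <div id="outer">
    <p>First paragraph inside.</p>
    <div>
      <p>Nested paragraph inside.</p>
    </div>
  </div>
</body>
""".strip())

# Whole-document context: every <p> below the root.
everywhere = page.findall(".//p")

# Specific element context: first locate the <div id="outer"> ...
outer = page.find(".//div[@id='outer']")

# ... then refine the search from that context.
direct = outer.findall("./p")     # direct children only
nested = outer.findall(".//p")    # all descendants

print(len(everywhere), len(direct), len(nested))   # 3 1 2
```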
XPath Operators and Examples
| Operator | What it Does | Example |
|---|---|---|
| / | Selects a direct child. | /html/body/div |
| // | Selects all matching descendants. | //div |
| @ | Selects an attribute. | //input[@name="search"] |
| . | Refers to the current node. | .//span |
| .. | Refers to the parent node. | //div/../header |
| * | Matches all elements. | //div/* |
| text() | Selects an element's text nodes; used in a predicate to match by text content. | //button[text()="Submit"] |
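
Several of these operators can be tried outside the browser. The sketch below uses Python's standard-library `xml.etree.ElementTree`, which understands `@`, `*`, `.` and `..` but not `text()`, and which needs paths to begin with `./` or `.//`. The form markup is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup for illustration.
form = ET.fromstring("""
<form>
  <div>
    <input name="search"/>
    <button>Submit</button>
  </div>
  <span>Help</span>
</form>
""".strip())

# @ : filter on an attribute value.
search_inputs = form.findall(".//input[@name='search']")

# * : every element that is a direct child of the <div>.
div_children = [child.tag for child in form.findall("./div/*")]

# .. : step up from the <button> to its parent.
button_parent = form.find(".//button/..")

print(len(search_inputs))    # 1
print(div_children)          # ['input', 'button']
print(button_parent.tag)     # div
```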
XPath Functions and Examples
| Function | What it Does | Example |
|---|---|---|
| contains() | Tests whether a string contains a given substring. | //button[contains(text(), "Login")] |
| starts-with() | Tests whether a string, often an attribute value, starts with a given prefix. | //input[starts-with(@name, "user")] |
| normalize-space() | Trims leading and trailing whitespace and collapses repeated whitespace into single spaces before comparing. | //p[normalize-space()="Hello World"] |
| last() | Selects the last element in a set. | //ul/li[last()] |
| position() | Selects elements by position. | //ul/li[position()=2] |
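
The positional functions are easy to experiment with. The following sketch assumes Python's standard-library `xml.etree.ElementTree`, which supports `[last()]` and numeric positions such as `[2]` (in XPath, `[2]` is shorthand for `[position()=2]`) but does not implement `contains()`, `starts-with()` or `normalize-space()`. The list content is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu used only for this illustration.
menu = ET.fromstring(
    "<ul><li>Home</li><li>Products</li><li>Contact</li></ul>"
)

last_item = menu.find("./li[last()]")          # last <li> in the list
second_item = menu.find("./li[2]")             # positions start at 1, not 0
second_to_last = menu.find("./li[last()-1]")   # offset from the end

print(last_item.text)        # Contact
print(second_item.text)      # Products
print(second_to_last.text)   # Products
```

Note that XPath positions are 1-based, which is a frequent source of off-by-one mistakes for people used to 0-based list indexing.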
XPath Axes and Examples
| Axis | What it Does | Example |
|---|---|---|
| ancestor | Selects all ancestors (parent, grandparent, and so on) of the current node. | //span/ancestor::div |
| child | Selects all direct children of the current node. | //div/child::p |
| descendant | Selects all descendants (children, grandchildren, etc.) | //div/descendant::a |
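| following | Selects all nodes that come after the current node in document order, excluding its descendants. | //h2/following::p |
| preceding | Selects all nodes that come before the current node in document order, excluding its ancestors. | //h2/preceding::div |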
| self | Refers to the current node itself. | //div/self::div |
Beginner-Friendly Examples
Here are some simple examples to help you get started:
Using Operators
Select a specific input field by its id:
//input[@id='email']
Finds the <input> element with id="email".
Select all buttons on the page:
//button
Finds all <button> elements.
Select a paragraph by its exact text:
//p[text()='Welcome to our website']
Matches a <p> element whose text is exactly "Welcome to our website".
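Exact text matching is strict about whitespace, which the following sketch tries to show. It uses Python's standard-library `xml.etree.ElementTree` (Python 3.7 or later), which has no `text()` test; its closest equivalent, `[.='...']`, compares the element's complete text content. The sample paragraphs are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical content: the second paragraph has stray spaces around the text.
page = ET.fromstring("""
<body>
  <p>Welcome to our website</p>
  <p> Welcome to our website </p>
  <p>Welcome</p>
</body>
""".strip())

# Only the paragraph whose text is exactly equal to the string matches.
exact = page.findall(".//p[.='Welcome to our website']")

print(len(exact))   # 1
```

The padded paragraph is skipped because the comparison is character for character, which is the situation normalize-space() is meant to handle in full XPath.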
Using Functions
Select elements with partial text using contains:
//a[contains(text(), 'Sign')]
Matches <a> tags where the text includes 'Sign', like "Sign In" or "Read this Sign".
Match elements whose attribute starts with a specific value using starts-with:
//div[starts-with(@class, 'menu')]
Selects all <div> elements where the class attribute starts with "menu".
Count the number of links on the page:
count(//a)
Returns the total number of <a> elements in the document. Note that count() wraps the whole path and returns a number rather than a set of elements.
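When the host tool only accepts expressions that return elements, one alternative is to select the links and count them in the calling language. The sketch below does this with Python's standard-library `xml.etree.ElementTree`, which does not implement `count()`; the navigation markup is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical navigation markup for illustration.
nav = ET.fromstring("""
<nav>
  <a href="/">Home</a>
  <ul>
    <li><a href="/docs">Docs</a></li>
    <li><a href="/blog">Blog</a></li>
  </ul>
</nav>
""".strip())

# Select every <a> at any depth, then count the matches in Python.
links = nav.findall(".//a")
link_count = len(links)

print(link_count)   # 3
```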
Select the last item in a navigation bar:
//nav/ul/li[last()]
Selects the last <li> in a navigation menu.
Using Axes
Find all <div> ancestors of a button:
//button/ancestor::div
Selects every <div> that contains a <button> at any depth, not just its immediate parent.
Find all sibling elements that follow a specific <h1>:
//h1/following-sibling::*
Matches all sibling elements that come after the <h1>.
Find all descendants (children, grandchildren, etc.) of a <div> with a specific class:
//div[@class='container']/descendant::*
Selects all elements inside a <div> whose class attribute is exactly "container".
Testing XPath in the Browser Console
You can test your XPath expressions directly in the browser:
- Open a webpage in your browser.
- Right-click the element you want to inspect and select Inspect.
- Go to the Console tab.
- Use the $x() function to test your XPath.
Example:
$x("//input[@name='search']")
Best Practices for Writing XPath
Follow these tips for effective XPath writing:
- Use Relative XPath: More adaptable to changes in the DOM structure:
//div[@class='content']
instead of:
/html/body/div[2]/div[1]/div
- Leverage Unique Attributes: Use id or name attributes for better precision.
//input[@id='username']
- Combine Multiple Attributes: Refine the selection with "and" or "or".
//button[@type='submit' and contains(text(), 'Login')]
- Avoid Overly Complex XPaths: Simple and direct paths are easier to maintain.
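To make the first tip concrete, the sketch below compares a position-based path with an attribute-based one before and after a small layout change. It assumes Python's standard-library `xml.etree.ElementTree`, so the absolute path is written relative to the `<html>` root (`./body/...`); the two page versions and the `text_at` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a redesign.
original = ET.fromstring("""
<html><body>
  <div>header</div>
  <div>
    <div><div class="content">Article</div></div>
  </div>
</body></html>
""".strip())

# The same page after a banner <div> was added at the top of <body>.
redesigned = ET.fromstring("""
<html><body>
  <div>banner</div>
  <div>header</div>
  <div>
    <div><div class="content">Article</div></div>
  </div>
</body></html>
""".strip())

POSITIONAL = "./body/div[2]/div[1]/div"     # depends on the exact layout
BY_ATTRIBUTE = ".//div[@class='content']"   # depends only on the class


def text_at(doc, path):
    """Return the text of the first match for path, or None if nothing matches."""
    node = doc.find(path)
    return None if node is None else node.text


print(text_at(original, POSITIONAL))       # Article
print(text_at(redesigned, POSITIONAL))     # None
print(text_at(redesigned, BY_ATTRIBUTE))   # Article
```

After the banner is added, the second `<div>` under `<body>` is the header, so the positional path no longer reaches the article, while the attribute-based path is unaffected.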