Saturday, September 6, 2025

RegEx to Match Open HTML Tags Except Self-contained XHTML Tags

In this article, we will learn to create a regular expression pattern that matches open HTML tags while skipping self-contained (self-closing) XHTML tags.

A regular expression (RegEx) can be used to match open tags while excluding XHTML self-contained tags (for example, <br/> and <img />). The pattern matches an opening angle bracket followed by a tag name, but rejects tags that end with />, which are self-contained in XHTML and don’t require a closing tag. The pattern can be tailored to specific requirements and to the structure of the HTML being processed.

Here are some common approaches:

Approach 1: Using Negative Lookahead

A negative lookahead allows us to specify a pattern that should not be present after the current position in the string.
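To make the idea concrete, here is a minimal sketch of a negative lookahead on its own. The sample string and variable names are made up for illustration and are not part of the tag-matching pattern.

```javascript
// Hypothetical sample: match "foo" only when it is NOT followed by "bar".
const notBeforeBar = /foo(?!bar)/g;
const sample = 'foobar foobaz foo';

// The lookahead inspects the text after "foo" without consuming it,
// so each match contains only the three characters "foo".
const positions = Array.from(sample.matchAll(notBeforeBar), (m) => m.index);

console.log(positions); // [ 7, 14 ]
```

The first "foo" (index 0) is skipped because "bar" follows it; the other two are kept.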

Syntax:

Regular Expression Pattern: <([a-zA-Z]+)(?![^>]*\/>)>

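Example: The code below applies this pattern to a sample string. The pattern requires > immediately after the tag name, so it matches only open tags without attributes. Because the regex has no g flag, match() returns just the first match together with its capture group.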

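Javascript

// Match "<", capture the tag name, then require ">".
// The negative lookahead rejects tags that end with "/>".
const regex = /<([a-zA-Z]+)(?![^>]*\/>)>/;
const inputString = 
    '<div><br/><p>Hello</p><span>World</span></div>';

// Without the g flag, match() returns only the first match
const matches = inputString.match(regex);

console.log(matches);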


Output

[
  '<div>',
  'div',
  index: 0,
  input: '<div><br/><p>Hello</p><span>World</span></div>',
  groups: undefined
]
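The example above returns only the first match and does not handle attributes. One way to collect every open tag is sketched below. The pattern is an assumed variant of the one above, not the article's original: \b marks the end of the tag name and [^>]* allows attributes before the closing >. The sample input is hypothetical, and the sketch assumes attribute values never contain a > character.

```javascript
// Assumed variant of the pattern: \b ends the tag name, the negative
// lookahead rejects tags that close with "/>", and [^>]* allows attributes.
const openTagPattern = /<([a-zA-Z][a-zA-Z0-9]*)\b(?![^>]*\/>)[^>]*>/g;

// Hypothetical sample input with attributes and self-contained tags.
const html =
    '<div class="box"><br/><img src="a.png" /><p>Hello</p><span>World</span></div>';

// matchAll() requires the g flag and yields one match object per open tag.
const openTags = Array.from(html.matchAll(openTagPattern), (m) => m[0]);
const tagNames = Array.from(html.matchAll(openTagPattern), (m) => m[1]);

console.log(openTags); // [ '<div class="box">', '<p>', '<span>' ]
console.log(tagNames); // [ 'div', 'p', 'span' ]
```

Closing tags such as </p> are skipped because the character after < must be a letter.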

Approach 2: Using a Whitelist of HTML Tags

Another approach is to create a whitelist of HTML tags that are considered valid open tags and match against that list.

Syntax:

Regular Expression Pattern: <(div|p|span|...)>

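Example: The code below matches against a whitelist of div, p, and span. Self-contained tags such as br are simply left off the list. As before, match() without the g flag returns only the first match.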

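Javascript

// Match only the tag names listed in the alternation
const regex = /<(div|p|span)>/;
const inputString = 
    '<div><br/><p>Hello</p><span>World</span></div>';

// Without the g flag, match() returns only the first match
const matches = inputString.match(regex);

console.log(matches);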


Output

[
  '<div>',
  'div',
  index: 0,
  input: '<div><br/><p>Hello</p><span>World</span></div>',
  groups: undefined
]
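When the whitelist grows, it can be easier to build the pattern from an array than to type the alternation by hand. The following is a sketch under assumptions: the list contents, the variable names, and the sample markup are hypothetical, and the added \b and [^>]* are extensions to the pattern shown above.

```javascript
// Hypothetical whitelist; extend it with whatever tags your input uses.
const allowedTags = ['div', 'p', 'span'];

// Join the names into an alternation. \b keeps "p" from matching "pre",
// and [^>]* tolerates attributes before the closing ">".
const whitelistPattern = new RegExp(`<(${allowedTags.join('|')})\\b[^>]*>`, 'g');

const markup =
    '<div id="a"><br/><pre>x</pre><p>Hello</p><span>World</span></div>';

// Collect the captured tag name of every whitelisted open tag.
const found = Array.from(markup.matchAll(whitelistPattern), (m) => m[1]);

console.log(found); // [ 'div', 'p', 'span' ]
```

Note that this sketch does not reject a whitelisted tag written in self-contained form (for example <div />); combine it with the negative lookahead from Approach 1 if that matters for your input.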

Approach 3: Using DOMParser

DOMParser is a built-in browser API that parses an HTML or XML string into a structured Document Object Model (DOM) representation, making it simple to navigate and manipulate the document’s contents.

Syntax:

parseFromString(string, mimeType);

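Example: The code below parses a sample HTML string and collects its elements. With the text/html MIME type, the parser adds html, head, and body elements automatically, which is why they appear in the output.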

Javascript




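// Example HTML input
const data =
    '<div class="container"><p>Hello, <span>world!</span></p></div>';

// Void elements never take a closing tag in HTML. After parsing, the
// original "/>" syntax is gone, so they are recognized by name instead.
const voidTags = new Set(['area', 'base', 'br', 'col', 'embed', 'hr',
    'img', 'input', 'link', 'meta', 'source', 'track', 'wbr']);

// Create a DOM parser (available in browsers)
const parser = new DOMParser();

// Parse the HTML string into a document
const doc = parser.parseFromString(data, 'text/html');

// Get all elements
const elements = doc.getElementsByTagName('*');

// Keep only elements that are not void elements
const matches = Array.from(elements).filter((element) =>
    !voidTags.has(element.localName));

// Output the matched elements
console.log(matches);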


Output:

(6) 
0: html
1: head
2: body
3: div.container
4: p
5: span
Dominic