Regex Patterns

HTML Tag

Match HTML tags including opening, closing, and self-closing tags with attributes.

What Is This?

This regex pattern matches HTML tags, including opening tags (<div>), closing tags (</div>), and self-closing tags (<br />). Attributes may have double-quoted, single-quoted, or unquoted values, or no value at all. Regex is not powerful enough to parse arbitrary HTML correctly, so use a proper HTML parser for complex parsing tasks.

How to Use

1. The Pattern

Use this pattern for simple HTML tag extraction or validation. For production HTML parsing, use DOMParser (browser), cheerio (Node.js), or htmlparser2. Regex-based HTML parsing fails on nested tags, comments, script/style content, and malformed HTML.

/<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/
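
As a minimal sketch of simple tag extraction, the pattern can be given the global flag and applied to a string. The helper name extractTags and the sample markup are assumptions made for illustration, not part of this reference or of any library.

```javascript
// The pattern from this entry, with the global flag added so every tag is found.
const HTML_TAG_GLOBAL = /<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/g;

// extractTags is a hypothetical helper name used only for this sketch.
function extractTags(html) {
  // With a /g regex, String.prototype.match returns every match, or null if there are none.
  return html.match(HTML_TAG_GLOBAL) || [];
}

// Sample input invented for this example.
const sample = '<p class="intro">Hello<br />world</p>';
console.log(extractTags(sample));
```

The result is a flat list of tag strings in document order. It says nothing about which closing tag belongs to which opening tag.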

Examples

Example

Common HTML tags

Matches:
<div>
</div>
<br />
<img src='test.jpg'>
<a href='test'>

Does not match:
<
>
<>
<?xml version='1.0'?>

Example

Tags with attributes

Matches:
<div class='container'>
<input type='text' name='email' />
<meta charset='utf-8'>

Does not match:
<123>
<>text</>
<!-- comment -->
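
The example lists above can be checked mechanically. The sketch below assumes that "matches" means the whole string is a single tag, so it wraps the pattern in ^ and $ anchors. The names WHOLE_TAG and isSingleTag are hypothetical.

```javascript
// The pattern from this entry, unchanged.
const HTML_TAG = /<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/;

// Anchored copy: the entire input must be exactly one tag.
const WHOLE_TAG = new RegExp("^(?:" + HTML_TAG.source + ")$");

// isSingleTag is a hypothetical helper name used only for this sketch.
function isSingleTag(text) {
  return WHOLE_TAG.test(text);
}

// Strings copied from the "Matches" lists in this entry.
const shouldMatch = [
  "<div>", "</div>", "<br />", "<img src='test.jpg'>", "<a href='test'>",
  "<div class='container'>", "<input type='text' name='email' />", "<meta charset='utf-8'>",
];

// Strings copied from the "Does not match" lists in this entry.
const shouldNotMatch = [
  "<", ">", "<>", "<?xml version='1.0'?>", "<123>", "<>text</>", "<!-- comment -->",
];

console.log("all expected matches pass:", shouldMatch.every(isSingleTag));
console.log("any expected non-match passes:", shouldNotMatch.some(isSingleTag));
```

Anchoring is a design choice for validation. For extraction from a longer document, the unanchored pattern is the one to use.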

Frequently Asked Questions

Why is regex bad for parsing HTML?

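HTML elements nest to arbitrary depth, and pairing each opening tag with its closing tag requires keeping track of that depth, which a regular expression cannot do in general. A famous Stack Overflow answer makes the same point: regex cannot parse HTML because HTML is not a regular language. Use an HTML parser for anything beyond simple tag matching.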
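
To make the nesting problem concrete, here is a small sketch with two naive patterns for "a div element and its content". Both patterns and both helper names are hypothetical, written only for this illustration. The lazy version cuts a nested element short, and the greedy version merges two sibling elements.

```javascript
// Hypothetical naive patterns for "a div element and its content".
const LAZY_DIV = /<div>.*?<\/div>/;   // stops at the first closing tag
const GREEDY_DIV = /<div>.*<\/div>/;  // runs to the last closing tag

// Helper names are assumptions made for this sketch.
function firstDivLazy(html) {
  const m = html.match(LAZY_DIV);
  return m ? m[0] : null;
}

function firstDivGreedy(html) {
  const m = html.match(GREEDY_DIV);
  return m ? m[0] : null;
}

const nested = "<div>outer <div>inner</div> tail</div>";
const siblings = "<div>a</div><div>b</div>";

// Lazy matching ends at the inner closing tag, so the outer element is truncated.
console.log(firstDivLazy(nested));

// Greedy matching swallows both siblings as if they were one element.
console.log(firstDivGreedy(siblings));
```

Neither quantifier can be right for both inputs, because the correct end of an element depends on how many elements were opened inside it.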

What about XHTML?

XHTML is stricter and XML-compliant: every element is explicitly closed and every attribute value is quoted, so the pattern works more reliably on XHTML content. However, most web content is HTML5, which allows optional closing tags, unquoted attribute values, and other constructs that tag-level matching handles poorly.
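
One way to see the difference is to count the tags the pattern finds in an XHTML-style fragment and in an equivalent HTML5 fragment that omits optional closing tags. This is only a sketch: the helper countTags and both sample fragments are assumptions made for illustration.

```javascript
// The pattern from this entry, with the global flag added.
const TAG_PATTERN = /<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/g;

// countTags is a hypothetical helper: it classifies each matched tag by its shape.
function countTags(html) {
  const counts = { open: 0, close: 0, selfClosing: 0 };
  for (const tag of html.match(TAG_PATTERN) || []) {
    if (tag.startsWith("</")) counts.close += 1;
    else if (tag.endsWith("/>")) counts.selfClosing += 1;
    else counts.open += 1;
  }
  return counts;
}

// Sample fragments invented for this example.
const xhtml = "<ul><li>one</li><li>two<br /></li></ul>";
const html5 = "<ul><li>one<li>two<br></ul>";

// In the XHTML fragment, opening and closing counts balance.
console.log(countTags(xhtml));

// In the HTML5 fragment they do not, even though the markup is valid HTML5.
console.log(countTags(html5));
```

A balance check built on tag counts would therefore reject valid HTML5, which is one reason a parser is the safer choice for real-world pages.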