HTML Tag
Match HTML tags including opening, closing, and self-closing tags with attributes.
What Is This?
This regex pattern matches HTML tags: opening tags (<div>), closing tags (</div>), and self-closing tags (<br />). It also supports attributes with quoted or unquoted values. Regex is not powerful enough to parse arbitrary HTML correctly, so use a proper HTML parser for complex parsing tasks.
How to Use
The Pattern
Use this pattern for simple HTML tag extraction or validation. For production HTML parsing, use DOMParser (browser), cheerio (Node.js), or htmlparser2. Regex-based HTML parsing fails on nested tags, comments, script/style content, and malformed HTML.
/<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/
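As a minimal sketch of simple tag extraction, the pattern can be combined with the global flag and String.prototype.matchAll to list every tag in a string. The helper name extractTags and the sample input are assumptions made for illustration, not part of this reference.

```javascript
// The pattern from above, with the g flag added so every tag is found.
const HTML_TAG = /<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/g;

// extractTags is a hypothetical helper name used only in this example.
function extractTags(html) {
  // matchAll requires the g flag and yields one match object per tag;
  // m[0] is the full text of the matched tag.
  return Array.from(html.matchAll(HTML_TAG), (m) => m[0]);
}

const sample = '<p class="intro">Hello<br />world</p>';
console.log(extractTags(sample));
// [ '<p class="intro">', '<br />', '</p>' ]
```

The result is a flat list of tag strings. It says nothing about which closing tag belongs to which opening tag, which is why a parser is still needed for structure.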
Examples
Common HTML tags
Matches: <div>, </div>, <br />, <img src='test.jpg'>, <a href='test'>
Does not match: < >, <>, <?xml version='1.0'?>
Tags with attributes
Matches: <div class='container'>, <input type='text' name='email' />, <meta charset='utf-8'>
Does not match: <123>, <>text</>, <!-- comment -->
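The match and non-match lists above can be checked mechanically. The sketch below wraps the pattern in ^ and $ so that the whole string must be a single tag, which is one reasonable way to use it for validation. The names WHOLE_TAG and isWholeTag are hypothetical.

```javascript
// Reuse the pattern's source so it can be anchored to the whole string.
const TAG_SOURCE = /<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/.source;
const WHOLE_TAG = new RegExp("^(?:" + TAG_SOURCE + ")$");

// isWholeTag is a hypothetical helper: true only if s is exactly one tag.
function isWholeTag(s) {
  return WHOLE_TAG.test(s);
}

const shouldMatch = [
  "<div>", "</div>", "<br />", "<img src='test.jpg'>", "<a href='test'>",
  "<div class='container'>", "<input type='text' name='email' />",
  "<meta charset='utf-8'>",
];
const shouldNotMatch = [
  "< >", "<>", "<?xml version='1.0'?>", "<123>", "<>text</>", "<!-- comment -->",
];

for (const s of shouldMatch) console.log(s, isWholeTag(s));    // each logs true
for (const s of shouldNotMatch) console.log(s, isWholeTag(s)); // each logs false
```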
Frequently Asked Questions
Why is regex bad for parsing HTML?
HTML elements can be nested to arbitrary depth, and a regular expression has no way to track that depth. A famous Stack Overflow answer makes the same point: regex cannot parse HTML because HTML is not a regular language. Use an HTML parser for anything beyond simple tag matching.
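A short sketch of the nesting problem: the two patterns below are naive attempts to capture the content of a <div> element. They are written for this example only and are not part of this reference. The lazy version stops at the first closing tag, and the greedy version runs on to the last one, so each gets a different input wrong.

```javascript
// Two naive "element content" patterns, assumed here purely for illustration.
const lazyDiv = /<div>(.*?)<\/div>/;   // stops at the FIRST </div>
const greedyDiv = /<div>(.*)<\/div>/;  // runs to the LAST </div>

const nested = "<div>outer <div>inner</div> tail</div>";
const siblings = "<div>a</div><div>b</div>";

// Lazy matching cuts the outer element short at the inner closing tag.
console.log(nested.match(lazyDiv)[1]);     // outer <div>inner

// Greedy matching swallows both sibling elements as if they were one.
console.log(siblings.match(greedyDiv)[1]); // a</div><div>b
```

Neither quantifier can pair each closing tag with its own opening tag, because doing so requires counting how many elements are currently open.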
What about XHTML?
XHTML is stricter and XML-compliant: every element is explicitly closed and every attribute value is quoted. If your content is XHTML, the pattern will work more reliably. However, most web content is HTML5, which allows optional closing tags, unquoted attribute values, and other relaxed syntax that a tag-level pattern cannot interpret.
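To illustrate the effect of optional closing tags, the sketch below counts the opening and closing tags that the pattern finds. The helper countTags is hypothetical. Both inputs are valid HTML5, since the </li> end tag may be omitted, but only the XHTML-style input balances.

```javascript
// The pattern from this entry, with the g flag for matchAll.
const TAG = /<\/?[a-zA-Z][\w-]*(?:\s+[a-zA-Z][\w-]*(?:\s*=\s*(?:[^">]+|"[^"]*"|'[^']*'))?)*\s*\/?>/g;

// countTags is a hypothetical helper: tallies opening vs closing tags.
function countTags(html) {
  let open = 0;
  let close = 0;
  for (const [tag] of html.matchAll(TAG)) {
    if (tag.startsWith("</")) {
      close++;
    } else if (!tag.endsWith("/>")) {
      open++; // self-closing tags such as <br /> are not counted
    }
  }
  return { open, close };
}

console.log(countTags("<ul><li>one</li><li>two</li></ul>")); // { open: 3, close: 3 }
console.log(countTags("<ul><li>one<li>two</ul>"));           // { open: 3, close: 1 }
```

A balance check built on tag counts would therefore reject the second input even though browsers parse it without error.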