Friday, August 8, 2008

Regular Expression: Search All HTML Tags

The following pattern matches all HTML tags:

<(.|\n)*?>

// (.|\n) - match any character (.) or a newline (\n); the alternation is needed because . does not match a newline by default
// *? - zero or more occurrences, matched non-greedily: the match stops at the first '>' it finds rather than the last one
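To make the greedy versus non-greedy difference concrete, here is a minimal sketch in standard C++ using std::regex (an assumption; the post itself uses the .NET Regex class). The helper name firstMatch is hypothetical and only serves the illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern`, or an empty string if nothing matches.
std::string firstMatch(const std::string& text, const std::string& pattern) {
    std::smatch m;
    return std::regex_search(text, m, std::regex(pattern)) ? m.str(0)
                                                           : std::string();
}
```

With the input "<b>bold</b> text", the non-greedy pattern "<(.|\n)*?>" matches only "<b>", while the greedy variant "<(.|\n)*>" runs on to the last '>' and matches "<b>bold</b>".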


Using the regular expression in C++/CLI to strip all tags from a string:
aSource = System::Text::RegularExpressions::Regex::Replace( aSource, L"<(.|\n)*?>", System::String::Empty );
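For readers without .NET, the same replacement can be sketched in standard C++ with std::regex_replace. This is an illustrative equivalent under that assumption, not the code from the post, and the function name stripTags is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes everything that looks like an HTML tag by
// replacing each non-greedy "<...>" match with an empty string.
std::string stripTags(const std::string& source) {
    // "\n" in the literal is a real newline character, so the alternation
    // lets a tag span several lines.
    static const std::regex tag("<(.|\n)*?>");
    return std::regex_replace(source, tag, "");
}
```

For example, stripTags("<p>Hello, <b>world</b>!</p>") returns "Hello, world!". The pattern works purely on text and does not parse HTML, so a '>' inside an attribute value ends the match early.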


Ref:
Strip HTML tags from a string using regular expressions - ISerializable - Roy Osherove's Blog
Regular expression - Wikipedia, the free encyclopedia
