Regular Expression (regexp) to remove all HTML tags from a string
To remove all HTML tags from a string, you can use the following regular expression.
This can be used to stop HTML being injected through URL values, or simply to create plain-text content from an HTML page.
<cfset myString = "<a href='/link.cfm'>Link to my page</a>">
<cfset myString = reReplaceNoCase(myString, "</?\w+(((\s|\n)+\w+((\s|\n)*=(\s|\n)*(?:#chr(34)#.*?#chr(34)#|'.*?'|[^'#chr(34)#>\s]+))?)+(\s|\n)*|(\s|\n)*)/?>", "", "All")>
<cfoutput>There should be no hyperlink: #myString#</cfoutput>
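If you need this in more than one place, the same pattern can be wrapped in a user-defined function. The following is only a minimal sketch: the function name stripHtmlTags and the variable names dq, tagPattern and plainText are made up for illustration and are not part of the original snippet. The pattern itself is unchanged; the double quote is built once with chr(34) and concatenated in, which gives the same string as the inline #chr(34)# version above.

```cfml
<!--- Hypothetical helper (name is an assumption): strips HTML tags using the pattern from this post. --->
<cffunction name="stripHtmlTags" returntype="string" output="false">
    <cfargument name="input" type="string" required="true">
    <!--- chr(34) is the double-quote character; built once so the pattern is easier to read. --->
    <cfset var dq = chr(34)>
    <!--- Same regular expression as above, assembled by concatenation instead of inline #chr(34)#. --->
    <cfset var tagPattern = "</?\w+(((\s|\n)+\w+((\s|\n)*=(\s|\n)*(?:" & dq & ".*?" & dq & "|'.*?'|[^'" & dq & ">\s]+))?)+(\s|\n)*|(\s|\n)*)/?>">
    <!--- Replace every match with an empty string, ignoring case. --->
    <cfreturn reReplaceNoCase(arguments.input, tagPattern, "", "all")>
</cffunction>

<!--- Example usage: only the text between the tags should remain. --->
<cfset plainText = stripHtmlTags("<p>Hello <b>world</b></p>")>
<cfoutput>#plainText#</cfoutput>
```

Note that the pattern only matches things that start with a tag name, so HTML comments and the text inside script or style elements are left in the string. If the output is going back into a page, it is probably still worth escaping it (for example with HTMLEditFormat) rather than relying on the regular expression alone.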
Comments
Excellent. I'm totally going to use this. Thanks.
Posted by David McGuigan | 26/01/08 03:31
