XHTML link tag stripping 2006-11-29 - By Peter Hollas
Hi everyone,
Please could someone provide an example stylesheet showing how to strip <a> link elements (along with their link text) out of a source XHTML document whilst retaining the remaining node text from within the body? Preferably the output should have normalised whitespace and a single space separating each extracted piece of text, e.g.
Source:
<html>
<head>
<title>Not wanted</title>
</head>
<body>
<a>Not wanted</a>
<div class="1">This text is wanted <a href="#">Not wanted</a> and so is this</div>
<p>Wanted</p>
</body>
</html>
Output:
<htmltext>This text is wanted and so is this Wanted</htmltext>
I'm sure that the solution is incredibly simple, but after days of trying I keep hitting a brick wall.
Many thanks, Peter.
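One possible approach, offered as a minimal sketch rather than a definitive answer: select every text node under the body that has no <a> ancestor, emit each followed by a space, and then run the collected string through normalize-space(). It assumes XSLT 1.0 and a source with no XML namespace, as in the sample above; a real XHTML document that declares the http://www.w3.org/1999/xhtml namespace would need a prefix bound to it and used in the paths.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit just the result element, without an XML declaration. -->
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- Collect the wanted text nodes, each followed by one space.
         Assumption: elements are in no namespace, as in the sample source. -->
    <xsl:variable name="text">
      <xsl:for-each select="/html/body//text()[not(ancestor::a)]">
        <xsl:value-of select="."/>
        <xsl:text> </xsl:text>
      </xsl:for-each>
    </xsl:variable>

    <!-- normalize-space() trims the ends and collapses each run of
         whitespace into a single space. -->
    <htmltext>
      <xsl:value-of select="normalize-space($text)"/>
    </htmltext>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample source, this should give <htmltext>This text is wanted and so is this Wanted</htmltext>. The head is skipped because only descendants of body are selected, and the explicit space after each text node keeps "and so is this" and "Wanted" from running together when the source has no whitespace between elements.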