Subjects
Home
Xalan extension functions
Fomatting question serializing DOM with pretty print
xalan with pull parser
Cannot find the declaration
Apache Xalan drop support to run on JRE 1 1 x
Why does Doctype change processing of a document
Node set to XML string via Java extensions in Xalan J: possible?
Templates/Transformers + thread safety???
Problem evaluating xpath with muliple prefix with different namespace
remove an arbitrary attribute from xsl output
Xalan3 XSLT 2 0 XPath 2 0 support?
Problem using compiled translets with Xalan !!
Xalan and jstl 1 1 problem with transform tag
NullPointer in DOM2DTM getLocalName
URIResolvers base parameter with xsltc and cascaded imports
Performance problem for Xalan J on intel dual core
Standard libraries in JAXP?
Serializing a DOM tree to XML file, customize entities replacement
Library Conflict Involving BCEL Library
A question on how users are using <xsl:message >
Kevin Cormier as a new Apache Xalan J committer
Struggling to iterate over tokenized string
Xalan count() trouble
Problem with recursive xpath
Error when switching to java 1 5
document( ' ')
Problem with Xalan2 7 0 transformation
cr/lf options
entity encoded XML
can xalan transform 2 xml using one xslt?
Xalan J JIRA defect review Monday October 16, 2006 from 2:00 to 3:30 pm ED
xsl transform with cdata section elements
xslt parameters not expanded
Weird behavior of XPath evaluate()
How to avoid <xsl:message > instruction prints stylesheet file informations ?
Cannot find SimpleTransform subdirectory after installing Xalan J
recover from document not found exceptions
jdk1 5 and Xalan jar differences?
Performance Issue
Error/Bug adding floating point numbers
XPathAPI: eval exp using nodes with default namespace
modifying xalan to output invalid XML
NullPointerException
mege two separate xml nodes into one
Is this a XALAN document identification bug?
is StylesheetRoot really java io Serializable ?
transform() fails for DOMSource but succeeds for StreamSource
Thoughts on Transformer parameter passing
HELP, Xalan and jstl 1 1 problem with transformer
Problem with XPath namespace axis?
string utils:replace deleting search string if replacement string is an HTML
help with enumeration values pls
xalan 2 5 1 vs 2 7 performance question
How to insert/update in XML document
HTML Serialization and Handling of Ampersands in HREF Attributes
XHTML link tag stripping
SystemId Unknown; Line #24; Column #49; java lang NullPointerException
xpath text() help
Apostrophe problem with xalan 2 7 0
How to set variables in XML document?
Links
Home
Oracle database error code ...
 
Search:  
Power your search with and, or, +, -, or "some phrase" operators.
XHTML link tag stripping

XHTML link tag stripping

2006-11-29       - By Robert Houben
Reply:     1     2     3  

I happened to have something floating around that was close to what you
asked for, so I modified it and include it here. It doesn't normalize
the space:



<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" version="1.0" encoding="UTF-8"
indent="no"/>

           

           <!-- Kill these in the output tree -->

    <xsl:template match="a">

    </xsl:template>

     

    <xsl:template match="/">

       <htmltext>

           <xsl:apply-templates />

       </htmltext>

    </xsl:template>



           <!--

                       For all other node types, just copy the node and
it's content.

           -->

           <xsl:template match="*|processing-instruction()|comment()">

              <xsl:choose>

                 <xsl:when test="not(node())"><xsl:apply-templates
select="@*"/><xsl:text> </xsl:text></xsl:when>

                 <xsl:otherwise>

                    <xsl:apply-templates select="@*|node()"/><xsl:text>
</xsl:text>

                 </xsl:otherwise>

              </xsl:choose>

           </xsl:template>

           

           <!--

                       For all other attributes, copy the attribute.

           -->

           <xsl:template match="@*">

              <xsl:apply-templates /><xsl:text> </xsl:text>

           </xsl:template>

</xsl:stylesheet>



HTH,





________________________________

From: Peter Hollas [mailto:peterhollas@(protected)]
Sent: Wednesday, November 29, 2006 3:50 AM
To: xalan-j-users@(protected)
Subject: XHTML link tag stripping



Hi everyone,

Please could someone provide an example stylesheet of how to strip <a>
link tags out of a source XHTML document whilst retaining the remaining
node text from within the body. Preferably the output should have
normalised whitespace and a space seperating each extracted piece of
text. eg.

Source:

<html>
<head>
<title>Not wanted</title>
</head>
<body>
<a>Not wanted</a>
<div class="1">This text is wanted <a href="#">Not wanted</a> and so is
this</div>
<p>Wanted</p>
</body>
</html>


Output:

<htmltext>This text is wanted and so is this Wanted</htmltext>

I'm sure that the solution is incredibly simple, but after days of
trying I keep hitting a brick wall.

Many thanks, Peter.


<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft
-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http:
//www.w3.org/TR/REC-html40">

<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@(protected)
  {font-family:Tahoma;
  panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
  {margin:0in;
  margin-bottom:.0001pt;
  font-size:12.0pt;
  font-family:"Times New Roman";}
a:link, span.MsoHyperlink
  {color:blue;
  text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
  {color:purple;
  text-decoration:underline;}
span.EmailStyle17
  {mso-style-type:personal-reply;
  font-family:Arial;
  color:navy;}
@(protected) Section1
  {size:8.5in 11.0in;
  margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
  {page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
 <o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>

<body lang=EN-US link=blue vlink=purple>

<div class=Section1>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>I happened to have something floating
around that was close to what you asked for, so I modified it and include it
here. It doesn&#8217;t normalize the space:<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&lt;xsl:stylesheet version=&quot;1.0&quot;
xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&gt; <o:p></o:p><
/span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:output
method=&quot;xml&quot;
version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; indent=&quot;no&quot;/&gt;
<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;!-- Kill these in the
output tree --&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:template
match=&quot;a&quot;&gt; <o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp; &lt;/xsl:template
&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <o:p></o:p>
</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:template
match=&quot;/&quot;&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&lt;htmltext&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; &lt;xsl:apply-templates /&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&lt;/htmltext&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp; &lt;/xsl:template
&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;!--<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; For all other node
types, just copy the node and it's content.<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; --&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:template
match=&quot;*|processing-instruction()|comment()&quot;&gt;<o:p></o:p></span><
/font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &lt;xsl:choose&gt;<o:p></o:p></span></font
></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:when
test=&quot;not(node())&quot;&gt;&lt;xsl:apply-templates
select=&quot;@*&quot;/&gt;&lt;xsl:text&gt; &lt;/xsl:text&gt;&lt;/xsl:when&gt;<o
:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:otherwise&gt;<o
:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&lt;xsl:apply-templates select=&quot;@*|node()&quot;/&gt;&lt;xsl:text&gt;
&lt;/xsl:text&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/xsl:otherwise&gt;<o
:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &lt;/xsl:choose&gt;<o:p></o:p></span><
/font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;/xsl:template&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; <o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;!--<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; For all other
attributes, copy the attribute.<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; --&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;xsl:template
match=&quot;@*&quot;&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp; &lt;xsl:apply-templates
/&gt;&lt;xsl:text&gt; &lt;/xsl:text&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp; &lt;/xsl:template&gt;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>&lt;/xsl:stylesheet&gt;<o:p></o:p></span><
/font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>HTH,<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<div>

<div class=MsoNormal align=center style='text-align:center'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>

<hr size=2 width="100%" align=center tabindex=-1>

</span></font></div>

<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2
face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> Peter Hollas
[mailto:peterhollas@(protected)] <br>
<b><span style='font-weight:bold'>Sent:</span></b> Wednesday, November 29, 2006
3:50 AM<br>
<b><span style='font-weight:bold'>To:</span></b> xalan-j-users@(protected)
<br>
<b><span style='font-weight:bold'>Subject:</span></b> XHTML link tag stripping<
/span></font><o:p></o:p></p>

</div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>Hi everyone,<br>
<br>
Please could someone provide an example stylesheet of how to strip &lt;a&gt;
link tags out of a source XHTML document whilst retaining the remaining node
text from within the body. Preferably the output should have normalised
whitespace and a space seperating each extracted piece of text. eg. <br>
<br>
Source:<br>
<br>
&lt;html&gt;<br>
&lt;head&gt;<br>
&lt;title&gt;Not wanted&lt;/title&gt;<br>
&lt;/head&gt;<br>
&lt;body&gt;<br>
&lt;a&gt;Not wanted&lt;/a&gt;<br>
&lt;div class=&quot;1&quot;&gt;This text is wanted &lt;a
href=&quot;#&quot;&gt;Not wanted&lt;/a&gt; and so is this&lt;/div&gt; <br>
&lt;p&gt;Wanted&lt;/p&gt;<br>
&lt;/body&gt;<br>
&lt;/html&gt;<br>
<br>
<br>
Output:<br>
<br>
&lt;htmltext&gt;This text is wanted and so is this Wanted&lt;/htmltext&gt;<br>
<br>
I'm sure that the solution is incredibly simple, but after days of trying I
keep hitting a brick wall. <br>
<br>
Many thanks, Peter.<o:p></o:p></span></font></p>

</div>

</body>

</html>