Sep 20 2007

Trimming comments in HTML documents using Apache Ant

Published by Julien Lecomte at 2:45 pm under Web Development

This short article, explaining how to trim unnecessary code (comments, empty lines) from HTML documents, is a follow-up to an article published a couple of weeks ago on this blog: Building Web Applications With Apache Ant. Basically, the idea is to use Ant’s optional replaceregexp task as shown below:

<target name="-trim.html.comments">

    <fileset id="html.fileset"
        dir="${build.dir}"
        includes="**/*.jsp, **/*.php, **/*.html"/>

    <!-- HTML Comments -->
    <replaceregexp replace="" flags="g"
        match="\&lt;![ \r\n\t]*(--([^\-]|[\r\n]|-[^\-])*--[ \r\n\t]*)\&gt;">
        <fileset refid="html.fileset"/>
    </replaceregexp>

    <!-- Empty lines -->
    <replaceregexp match="^\s+[\r\n]" replace="" flags="mg">
        <fileset refid="html.fileset"/>
    </replaceregexp>

</target>
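
To put the target in context, here is a minimal sketch of a build file that calls it. The project name, the `src.dir` property, the `-copy.files` target and the `build` target are assumptions made up for this illustration; only `build.dir` and `-trim.html.comments` come from the snippet above. Because a target whose name starts with a hyphen cannot be invoked directly from the command line, it has to be reached through another target's `depends` list:

```xml
<project name="example" default="build" basedir=".">

    <!-- Assumed locations; adjust them to your own layout -->
    <property name="src.dir" value="src"/>
    <property name="build.dir" value="build"/>

    <!-- Hypothetical helper: copy the sources so the originals are never modified -->
    <target name="-copy.files">
        <mkdir dir="${build.dir}"/>
        <copy todir="${build.dir}">
            <fileset dir="${src.dir}"/>
        </copy>
    </target>

    <!-- Paste the -trim.html.comments target shown above here -->

    <!-- Public entry point: copy first, then trim the copies -->
    <target name="build" depends="-copy.files, -trim.html.comments"/>

</project>
```

Running `ant build` would then copy the files and trim only the copies under `${build.dir}`, so a bad match never touches the source tree.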

Update: Use this code very carefully, as stripping markup with regular expressions is dangerous territory. (Thanks to my co-worker Ryan Grove for pointing out some of the shortcomings.)
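
As one illustration of the risk (an example chosen here, not necessarily one of the shortcomings Ryan raised): the pattern also matches Internet Explorer conditional comments such as `<!--[if IE]> ... <![endif]-->`, which are functional markup, not dead weight. Below is a rough sketch of a more conservative variant. It assumes Ant is using the `java.util.regex` engine, which supports negative lookahead, and the target name and fileset id are hypothetical. It skips any comment whose body starts with `[if`:

```xml
<target name="-trim.html.comments.keep.conditional">

    <!-- Hypothetical id; same file selection as the original target -->
    <fileset id="html.fileset.conservative"
        dir="${build.dir}"
        includes="**/*.jsp, **/*.php, **/*.html"/>

    <!-- HTML comments, except those opening with [if (conditional comments) -->
    <replaceregexp replace="" flags="g"
        match="&lt;!--(?!\[if)([^\-]|-[^\-])*--[ \r\n\t]*&gt;">
        <fileset refid="html.fileset.conservative"/>
    </replaceregexp>

</target>
```

This only covers the simple `<!--[if ...]> ... <![endif]-->` form. Other constructs, such as the "downlevel-revealed" conditional comment syntax or scripts wrapped in comments, would still be damaged, so check the output before deploying it.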

6 Responses to “Trimming comments in HTML documents using Apache Ant”

  1. Mike Henke on 20 Sep 2007 at 6:32 pm

    Awesome. Keep the Ant scripts coming; they are great.

  2. [...] Ant Homepage Phing Homepage Building Web Applications With Apache Ant - Julien Lecomte Trimming comments in HTML documents using Apache Ant - Julien Lecomte Improve Your Build Process with Ant - ONLamp Ant sucks for FTP deployment - What [...]

  3. [...] desconocido wrote an interesting post today! Here’s a quick excerpt: from HTML documents, is an update to an article published a couple of weeks ago on this blog: Building Web Applications With Apache Ant. Basically, the idea is to use Ant’s optional replaceregexp task as shown below: … [...]

  4. Steve on 28 Sep 2007 at 11:15 am

    How do you prevent the regex from removing JavaScript that is enclosed in HTML comments in order to hide it from browsers that have JavaScript disabled?
    I fiddled around a lot but never managed to get Ant to ignore comments that end with //-->. It always results in an infinite loop. :-(

  5. Julien Lecomte on 28 Sep 2007 at 11:53 am

    @Steve

    Do not wrap inline JavaScript code inside HTML comments. Nobody uses Netscape 1 anymore…

  6. Jaime Bueza on 18 Feb 2008 at 5:58 pm

    Here’s one for commenting out console.log calls within a combined, YUI-compressed file.

    <replaceregexp file="my_combined_file.js" match="(console\.log\(.*\))" flags="g" replace="\/\/\1"></replaceregexp>
