User:OrenBochman/ParserNG/Sanitizer Antlr

From mediawiki.org

Sanitizer Filter Antlr

"To make ANTLR generate lexers that behave like the UNIX utility sed (copy standard in to standard out except as specified by the replace patterns), use a filter rule that does the input to output copying:" (ANTLR documentation [1])

class cfgSed extends Lexer;
options {
  k=2;
  filter=IGNORE;
  charVocabulary = '\3'..'\177';
}

FORM  : "<FORM>"
        { System.out.print("<!-- &lt;FORM&gt; -->"); // filter output: the tag, escaped inside a comment
          // getLine()/getColumn() are inherited from the ANTLR 2 lexer base class
          System.err.println(String.format("munged illegal tag <FORM> at %d,%d", getLine(), getColumn())); // error message
        }
        ;
STYLE : "<STYLE>"
        { System.out.print("<!-- &lt;STYLE&gt; -->");
          System.err.println("munged illegal tag <STYLE>");
        }
        ;

protected
IGNORE
  :  ( "\r\n" | '\r' | '\n' )
     {newline(); System.out.println("");}
  |  c:. {System.out.print(c);}
  ;

The grammar is based on [2].
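
The filter behaviour the grammar aims for can also be sketched in plain Java, without ANTLR, which may help when checking the expected output by hand. This is only an illustration under stated assumptions: the class `SedFilterSketch`, its `filter` method and the `errors` list are hypothetical names, not part of ANTLR or of the grammar above. It assumes the intended replacement is the tag escaped inside an HTML comment, it normalises every line break to `\n`, and it reports the position where the tag starts (the generated lexer may report a different position).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated lexer: copies input to output,
// replacing the illegal tags, in the manner of the sed-like filter rule.
class SedFilterSketch {
    // Tags handled by the FORM and STYLE rules of the grammar.
    private static final String[] ILLEGAL_TAGS = {"<FORM>", "<STYLE>"};

    // Messages that the grammar actions would write to System.err.
    final List<String> errors = new ArrayList<>();

    String filter(String input) {
        StringBuilder out = new StringBuilder();
        int line = 1;
        int column = 1;
        int i = 0;
        while (i < input.length()) {
            String tag = matchTag(input, i);
            if (tag != null) {
                // Replace the tag with an escaped copy inside a comment (assumed intent).
                String name = tag.substring(1, tag.length() - 1);
                out.append("<!-- &lt;").append(name).append("&gt; -->");
                errors.add(String.format("munged illegal tag %s at %d,%d", tag, line, column));
                i += tag.length();
                column += tag.length();
                continue;
            }
            char c = input.charAt(i);
            if (c == '\r' || c == '\n') {
                // Like the IGNORE rule, treat "\r\n" as a single line break.
                if (c == '\r' && i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append('\n');
                line++;
                column = 1;
            } else {
                // Everything else is copied through unchanged.
                out.append(c);
                column++;
            }
            i++;
        }
        return out.toString();
    }

    // Returns the illegal tag starting at pos, or null if there is none.
    private static String matchTag(String input, int pos) {
        for (String tag : ILLEGAL_TAGS) {
            if (input.startsWith(tag, pos)) {
                return tag;
            }
        }
        return null;
    }
}
```

Calling `new SedFilterSketch().filter(text)` returns the filtered text, and the `errors` field then holds one message per munged tag. Unlike the lexer, the sketch matches tags case-sensitively by plain string comparison and does no lookahead tuning, so it is only a reference for the expected output, not a replacement for the grammar.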