Parsing HTML to a string in Java


I have an email's content (in HTML format) and want to save it as a string, so that it can be parsed for the required details and used to prepare XML output.

I am using James and want this done in Java. How can I dump an HTML page into a string? I assume double quotes, spaces, and backslashes won't be a problem while parsing?

For now I am testing the mail server on my local system. I send a mail from user1@localhost to user2@localhost in HTML format; at the other end I want to parse the HTML page and create an XML document with the desired values.

You can try this example. It dumps an HTML page and writes the data to a data.html file. With the code below you can instead append the result to a StringBuffer and replace the HTML special characters.

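    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;

    public class UrlReadPageDemo {
        public static void main(String[] args) {
            try {
                URL url = new URL("http://example.com");

                // try-with-resources closes both streams even if an exception is thrown
                try (BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
                     BufferedWriter writer = new BufferedWriter(new FileWriter("data.html"))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        System.out.println(line); // echo each line to the console
                        writer.write(line);       // and copy it to data.html
                        writer.newLine();
                    }
                }
            } catch (MalformedURLException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }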
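To keep the page in memory instead of writing data.html, the same read loop can append to a StringBuilder. The following is only a minimal sketch: the class name HtmlToXmlSketch and the helpers readAll and escapeXml are hypothetical names introduced for illustration, and a StringReader stands in for the mail body. Double quotes, spaces, and backslashes in the content need no special handling when they are read at runtime (they only need escaping inside Java source literals); what does need escaping is any text that is placed into the XML output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class; the names here are illustrative, not from the original answer.
class HtmlToXmlSketch {

    // Reads everything from the reader into one String, ending every line with '\n'.
    static String readAll(Reader source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Replaces the five characters that have a predefined entity in XML.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in the mail scenario the Reader would wrap the message body.
        String html = readAll(new StringReader("<p class=\"note\">C:\\temp & more</p>"));
        // Wrap the escaped HTML in an illustrative <body> element.
        System.out.println("<body>" + escapeXml(html.trim()) + "</body>");
    }
}
```

Escaping by hand is enough for a few values; for a larger XML document, building it with the JDK's DOM or StAX writers would handle the escaping automatically.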

