Parsing HTML to a string in Java
I have an email's content (in HTML format). I want to save it as a string, parse out the required details, and prepare XML output.
I am using James and want this done in Java. How can I dump the HTML page into a string? Do you think double inverted commas, spaces, or backslashes will cause problems while parsing?
Right now I am testing the mail server on my local system. I sent a mail from user1@localhost to user2@localhost in HTML format; at the other end I want to parse the HTML page and create an XML document with the desired values.
You can try this example. It dumps an HTML page and writes the data to a data.html file. With the code below you can instead append the result to a StringBuffer and replace the HTML special characters.
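import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlReadPageDemo {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://example.com");
            // try-with-resources closes both streams even if reading fails
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
                 BufferedWriter writer = new BufferedWriter(new FileWriter("data.html"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line); // echo the page to the console
                    writer.write(line);       // and copy it to data.html
                    writer.newLine();
                }
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}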
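As a rough sketch of the "append to a buffer and replace special characters" idea, something like the following might work. The class name HtmlText and its two methods are made up for illustration; they are not part of James or the JDK. It assumes the HTML is available through a java.io.Reader, and that "special chars" means the five characters that are reserved in XML. At runtime a String holds double quotes, spaces, and backslashes as ordinary characters; backslash escaping is only needed when writing string literals in Java source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class; the names are illustrative only.
class HtmlText {

    // Reads everything from the reader into one string, ending each line with '\n'.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            // Assumption: callers prefer an unchecked exception here.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Replaces the five characters reserved in XML with entity references,
    // so the text can be placed inside an XML element or attribute value.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

For example, HtmlText.readAll(new StringReader(mailBody)) would return the whole body as one string (mailBody being whatever string or stream holds the message), and HtmlText.escapeXml(...) could then be applied to any extracted value before it is written into the XML output.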