java - HTML well-formedness parser


Hey guys, I need to determine whether a given HTML document is well formed or not.
I just need a simple implementation using only Java core API classes, i.e. no third-party stuff like JTidy or something.

Actually, what is needed is an algorithm that scans a list of tags. If it finds an open tag and the next tag isn't its corresponding close tag, then the next tag should be another open tag, which in turn should have its close tag as the next tag. If not, it should be yet another open tag with its corresponding close tag next, and so on, with the close tags of the previous open tags coming in reverse order, one after the other, on the list. If the list conforms to this order the method returns true, or else false. I've already written methods to convert a tag to its close tag.
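The rule described above can be sketched with a stack: push each open tag, and on each close tag pop the most recent open tag and compare. This is only a minimal sketch under some assumptions: the class name `TagListChecker` and the helper `isCloseTag` are hypothetical, `convertToCloseTag` is a guess at what the existing helper does, and tags are assumed to carry no attributes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper class; the names are illustrative, not from the original post.
class TagListChecker {

    // Turns "<body>" into "</body>". Assumes the tag has no attributes.
    static String convertToCloseTag(String openTag) {
        return "</" + openTag.substring(1);
    }

    static boolean isCloseTag(String tag) {
        return tag.startsWith("</");
    }

    // Returns true when every open tag is closed in reverse order of opening.
    static boolean isWellFormed(List<String> tags) {
        Deque<String> unclosed = new ArrayDeque<>();
        for (String tag : tags) {
            if (!isCloseTag(tag)) {
                unclosed.push(tag); // remember the open tag
            } else if (unclosed.isEmpty()
                    || !convertToCloseTag(unclosed.pop()).equals(tag)) {
                return false; // close tag does not match the latest open tag
            }
        }
        return unclosed.isEmpty(); // leftover open tags were never closed
    }

    public static void main(String[] args) {
        List<String> good = Arrays.asList("<html>", "<body>", "<h1>", "</h1>", "</body>", "</html>");
        List<String> bad = Arrays.asList("<html>", "<body>", "</html>", "</body>");
        System.out.println(isWellFormed(good)); // true
        System.out.println(isWellFormed(bad));  // false
    }
}
```

The stack keeps the unclosed tags in the order they must be closed, so no look-ahead at the next list element is needed.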

Here is the skeleton code of what I've started working on already. It's not too neat, but it should give you guys a basic idea of what I'm trying to do.

public boolean validateHtml() {
    ArrayList<String> tags = fetchTags();
    // fetchTags returns [<html>, <head>, <title>, </title>, </head>, <body>, <h1>, </h1>, </body>, </html>]

    // I create another ArrayList to store tags that haven't found their corresponding close tag yet
    ArrayList<String> unclosedTags = new ArrayList<String>();

    String temp;

    // stop one short of the end so that tags.get(i + 1) stays in bounds
    for (int i = 0; i < tags.size() - 1; i++) {
        temp = tags.get(i);

        if (!tags.get(i + 1).equals(TagOperations.convertToCloseTag(temp))) {
            unclosedTags.add(temp);
            // TODO: unfinished, not sure yet what to check here
        } else {
            return true; // well formed html (unfinished: returns on the first matching pair)
        }
    }

    return true;
}

Two thoughts. First off, maybe you could get away with using an XML parser on the HTML? That is potentially easier and vastly less time consuming.
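As a rough sketch of the XML-parser idea, the JDK's own `javax.xml.parsers.DocumentBuilder` can be asked to parse the markup and a parse failure treated as "not well formed". The class name `XmlWellFormedCheck` is hypothetical. This assumes the input is XHTML-like: void elements such as `<br>` written without a closing slash, undeclared entities such as `&nbsp;`, or unquoted attributes will be rejected, and a DOCTYPE may cause the parser to try to fetch the DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, shown only to illustrate the XML-parser approach.
class XmlWellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and avoids the default stderr messages.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser could not be set up or read the input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><body><h1>Hi</h1></body></html>")); // true
        System.out.println(isWellFormed("<html><body></html></body>"));            // false
    }
}
```

This stays within the core Java API, so it fits the "no third-party libraries" constraint, at the cost of being stricter than browsers are about HTML.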

I haven't put a whole lot of thought into this, but to me it sounds like recursion and a stack would be the way to go. Something like:

public MyClass(String htmlInput) {
    openedTags = new Stack<String>();
    this.htmlInput = htmlInput;
}

public boolean validate() {
    return validate(this.htmlInput);
}

private boolean validate(String html) {
    boolean result = true;
    String curTag;
    while (htmlLeft) {                            // worker loop; htmlLeft is a placeholder for "unread HTML remains"
        curTag = nextTag();
        if (isOneOffTag(curTag)) {                // matches <tags />
            continue;
        } else if (isOpenTag(curTag)) {           // matches <tags>
            openedTags.push(curTag);
            if (!validate(innerHtml))             // innerHtml is a placeholder for the markup inside this tag
                return false;
        } else if (isCloseTag(curTag)) {          // matches </tags>
            if (openedTags.isEmpty())             // close tag with nothing open
                return false;
            String lastTag = openedTags.peek();
            if (!tagIsSimiliar(curTag, lastTag))
                return false;
            openedTags.pop();
        }
    }
    return result;
}

// Stubs, to be filled in
private String nextTag() { return null; }
private boolean isOpenTag(String tag) { return true; }
private boolean isCloseTag(String tag) { return true; }
private boolean isOneOffTag(String tag) { return true; }
private boolean tagIsSimiliar(String curTag, String lastTag) { return true; }

*Edit 1: I should have pushed onto the stack.

**Edit 2: I suppose the issue here is how to determine where you've left off when returning solely a boolean. This would require some kind of pointer so that you know where you've left off. The idea, though, I believe would still work.
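One way to handle the "where did I leave off" problem is to keep the pointer as a field that every recursive call shares, so a call that returns a boolean still leaves the cursor at the right place. The following is only a sketch under assumptions: the class `RecursiveTagValidator`, the field `pos`, and the helper `closes` are hypothetical names, the input is assumed to be an already extracted list of tags, and tags are assumed to carry no attributes. The call stack plays the role of the explicit stack here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class illustrating recursion with a shared cursor.
class RecursiveTagValidator {
    private final List<String> tags;
    private int pos = 0; // cursor: index of the next unread tag

    RecursiveTagValidator(List<String> tags) {
        this.tags = tags;
    }

    boolean validate() {
        pos = 0;
        // Well formed only if the elements validate and no tag is left unread.
        return validateSiblings() && pos == tags.size();
    }

    // Consumes a run of complete elements; stops at a close tag or at the end of input.
    private boolean validateSiblings() {
        while (pos < tags.size() && !isCloseTag(tags.get(pos))) {
            String open = tags.get(pos++);
            if (isOneOffTag(open)) {
                continue; // <tag /> needs no close tag
            }
            if (!validateSiblings()) {
                return false; // something inside this element is broken
            }
            if (pos >= tags.size() || !closes(tags.get(pos), open)) {
                return false; // missing or mismatched close tag
            }
            pos++; // consume the matching close tag
        }
        return true;
    }

    private static boolean isCloseTag(String tag) { return tag.startsWith("</"); }

    private static boolean isOneOffTag(String tag) { return tag.endsWith("/>"); }

    // True if closeTag is the close tag of openTag, e.g. "</p>" for "<p>".
    private static boolean closes(String closeTag, String openTag) {
        return closeTag.equals("</" + openTag.substring(1));
    }

    public static void main(String[] args) {
        List<String> good = Arrays.asList("<html>", "<body>", "<br />", "</body>", "</html>");
        List<String> bad = Arrays.asList("<html>", "<body>", "</html>", "</body>");
        System.out.println(new RecursiveTagValidator(good).validate()); // true
        System.out.println(new RecursiveTagValidator(bad).validate());  // false
    }
}
```

Because each recursive call advances `pos` past everything it has consumed, the caller can simply check the tag at `pos` for the matching close tag when the call returns.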

