python - Converting untagged corpora to tagged (NLTK)


I have plaintext corpora that I want to tag and save, so that I can use them further. What's the best way to do this?

I have a tagger made, but I can't figure out a way to convert the corpora that isn't messy.

Take a look at other tagged corpora, such as Brown, and output some examples. That will give you an idea of what a tagged corpus should look like. Next, load your corpus (with PlaintextCorpusReader) and iterate over the sentences, tagging each sentence. Write each tagged sentence to a file by making a string out of the tagged sentence, as in ' '.join([tuple2str(t) for t in tagged_sent]) (after from nltk.tag.util import tuple2str). And it's OK if the code is "messy" as long as it does the job correctly. You're not looking for an elegant algorithm here; you're running a specific script to create a custom corpus.
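The loop described above could be sketched roughly as follows. This is only an illustration under some assumptions: the names write_tagged_corpus and format_tagged_sent are made up for this example, the sentences and the tagging function are passed in as parameters so the sketch runs without NLTK or its data installed, and the word/tag formatting is a plain-Python stand-in for the tuple2str call mentioned above.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

TaggedSent = List[Tuple[str, str]]


def format_tagged_sent(tagged_sent: TaggedSent, sep: str = "/") -> str:
    """Render [(word, tag), ...] as 'word/tag word/tag ...'.

    Hypothetical stand-in for ' '.join(tuple2str(t) for t in tagged_sent).
    """
    return " ".join(f"{word}{sep}{tag}" for word, tag in tagged_sent)


def write_tagged_corpus(
    sentences: Iterable[Sequence[str]],
    tag: Callable[[List[str]], TaggedSent],
    out_path: str,
) -> int:
    """Tag each tokenized sentence and write it as one line of out_path.

    `sentences` yields lists of tokens; `tag` maps a token list to a list
    of (word, tag) pairs. Returns the number of sentences written.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for tokens in sentences:
            tagged_sent = tag(list(tokens))
            out.write(format_tagged_sent(tagged_sent) + "\n")
            count += 1
    return count
```

With NLTK, sentences would typically be the result of calling sents() on your PlaintextCorpusReader, and tag would be your tagger's tag method. A file written in this word/tag format, one sentence per line, should then be loadable again with NLTK's TaggedCorpusReader, which uses "/" as its default separator.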

