Parsing a word list in Python
I have a wlist.txt file containing 58k words of the English language. A small excerpt of it looks like this:
aardvark aardwolf aaron aback abacus abaft abalone abandon abandoned abandonment abandons abase abased abasement
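What I want is to have the program search through the list, see whether a word is contained in it, and if so, print the word. The issue is that the code I have written reports that the word is not in the list when I know for sure that it is. My code looks like this. Does anyone notice any bugs?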
match = 'aardvark'
f = 'wlist.txt'
success = False
try:
    for word in open(f):
        if word == match:
            success = True
            break
except IOError:
    print f, "not found!"
if success:
    print "the word has been found with a value of", word
else:
    print "word not found"
Thanks in advance, everyone!
As others have said, the problem stems from the fact that the newline characters are part of the words you are reading in. The best way to get rid of them is to use the strip() method of str.
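To see why the comparison fails, here is a minimal sketch in Python 3 (the question itself uses Python 2 syntax). It uses io.StringIO as a stand-in for wlist.txt and assumes one word per line; the sample data and the names fake_file and raw_lines are illustrative only.

```python
import io

# Stand-in for wlist.txt, assuming one word per line.
fake_file = io.StringIO("aardvark\naardwolf\naaron\n")

match = "aardvark"
raw_lines = list(fake_file)

# Each line keeps its trailing newline, so a direct comparison fails.
print(repr(raw_lines[0]))              # 'aardvark\n'
print(raw_lines[0] == match)           # False

# strip() removes leading and trailing whitespace, including the newline.
print(raw_lines[0].strip() == match)   # True
```

If only the line ending should go and other whitespace must be kept, rstrip('\n') is a narrower alternative.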
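In addition, that is a lot of code to accomplish a simple task. All you need to do is build a set from the word list and check for the occurrence of the word in the set. A set is far better for this task than a list because checking for the occurrence of an element in a set is much faster. This should work: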
try:
    with open('wordlist.txt', 'rU') as infile:
        wordset = set(line.strip() for line in infile)
except IOError:
    print 'error opening file'
aword = 'aardvark'
if aword in wordset:
    print 'found word', aword
else:
    print 'word not found'
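For readers on Python 3, the same idea might look roughly like the sketch below. The 'U' open mode was removed in Python 3.11, so plain 'r' is used instead. The helper load_wordset and the temporary sample file are assumptions made so the example runs on its own; they are not part of the original answer. The sketch also falls back to an empty set when the file cannot be opened, so the lookup that follows does not fail on an undefined name.

```python
import os
import tempfile


def load_wordset(path):
    """Return the set of stripped, non-empty lines in the file at path."""
    with open(path, "r") as infile:
        return set(line.strip() for line in infile if line.strip())


# Write a tiny stand-in word list so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("aardvark\naardwolf\naaron\n")
    sample_path = tmp.name

try:
    wordset = load_wordset(sample_path)
except IOError:
    # Fall back to an empty set so the membership test below still works.
    print("error opening file")
    wordset = set()
finally:
    os.remove(sample_path)

aword = "aardvark"
if aword in wordset:
    print("found word", aword)
else:
    print("word not found")
```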
Note: if aword in wordset is so fast it isn't funny. If you're looking for a word closer to the end of the word list, the set is 60000 times faster than a list on a 267000-word list. And it's still marginally faster even if you're looking for the first word.
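A rough way to check this kind of claim yourself is to time both lookups with timeit. The sketch below uses Python 3 and a synthetic 50000-entry word list, both of which are assumptions for illustration, so the ratio you observe will differ from the figures quoted above and will vary by machine. The helper time_lookup is hypothetical.

```python
import timeit

# Synthetic word list; real timings depend on the data and the machine.
words = ["word%06d" % i for i in range(50000)]
wordset = set(words)

first, last = words[0], words[-1]


def time_lookup(container, target, number=100):
    """Total seconds for `number` membership tests of target in container."""
    return timeit.timeit(lambda: target in container, number=number)


list_last = time_lookup(words, last)
set_last = time_lookup(wordset, last)
list_first = time_lookup(words, first)
set_first = time_lookup(wordset, first)

print("last word:  list %.6fs, set %.6fs" % (list_last, set_last))
print("first word: list %.6fs, set %.6fs" % (list_first, set_first))
```

The list has to be scanned element by element until a match is found, while the set hashes the word and jumps almost directly to it, which is why the gap grows with the position of the word in the list.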