python - What does `features['contains(%s)' % word.lower()] = True` mean in NLTK? -




I've been reading the NLTK documentation recently and don't understand the following code.

import nltk

def dialogue_act_features(post):
    features = {}
    for word in nltk.word_tokenize(post):
        features['contains(%s)' % word.lower()] = True
    return features

This is a feature extractor for NaiveBayesClassifier, but what does

features['contains(%s)' % word.lower()] = True

mean?

I think this line of code is a way to generate a dict, but I have no idea how it works.

Thanks

In code:

>>> import nltk
>>> def word_features(sentence):
...     features = {}
...     for word in nltk.word_tokenize(sentence):
...         features['contains(%s)' % word.lower()] = True
...     return features
...
>>> sent = 'This is a foobar word extractor function'
>>> word_features(sent)
{'contains(a)': True, 'contains(word)': True, 'contains(this)': True, 'contains(is)': True, 'contains(function)': True, 'contains(extractor)': True, 'contains(foobar)': True}

This line is trying to populate/fill the features dictionary:

features['contains(%s)' % word.lower()] = True

Here's a simple illustration of a dictionary in Python (see https://docs.python.org/2/tutorial/datastructures.html#dictionaries for details):

>>> adict = {}
>>> adict['key'] = 'value'
>>> adict['key']
'value'
>>> adict['apple'] = 'red'
>>> adict['apple']
'red'
>>> adict
{'apple': 'red', 'key': 'value'}
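Two dictionary behaviours matter for the feature extractor: assigning to a key that already exists replaces its value instead of adding a second entry, and you can test for a key without raising an error. A minimal sketch (the 'pear' key is just a made-up example of a missing key):

```python
adict = {}
adict['apple'] = 'red'
adict['apple'] = 'green'  # same key again: the old value is replaced

# Membership test on the keys.
has_apple = 'apple' in adict

# .get() returns None (or a default you supply) for a missing key.
missing = adict.get('pear')
missing_default = adict.get('pear', False)

print(adict)
print(has_apple, missing, missing_default)
```

This is why a word that occurs several times in a post still yields a single 'contains(...)' entry.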

And word.lower() lowercases the string, e.g.

>>> s = 'Apple'
>>> s.lower()
'apple'
>>> s = 'APPLE'
>>> s.lower()
'apple'
>>> s = 'AppLe'
>>> s.lower()
'apple'

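And when you write 'contains(%s)' % word, it's trying to create a string made of contains( , then the substituted value, then ). The value for the %s placeholder is supplied outside the string, to the right of the % operator, e.g.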

>>> a = 'apple'
>>> o = 'orange'
>>> '%s' % a
'apple'
>>> '%s and' % a
'apple and'
>>> '%s , %s' % (a,o)
'apple , orange'
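One detail of the % operator that is easy to trip over: with a single %s you can pass the value directly or as a one-element tuple, but the number of values has to match the number of placeholders. A small sketch, reusing the a and o names from above:

```python
a = 'apple'
o = 'orange'

# One placeholder: a bare value and a one-element tuple give the same result.
single = 'contains(%s)' % a
single_tuple = 'contains(%s)' % (a,)

# Two placeholders need a tuple with two values.
pair = '%s , %s' % (a, o)

# Too many values for the placeholders raises TypeError.
error = None
try:
    '%s' % (a, o)
except TypeError as exc:
    error = str(exc)

print(single, single_tuple, pair)
print(error)
```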

The % operator is similar to the str.format() function, e.g.

>>> a = 'apple'
>>> o = 'orange'
>>> '%s , %s' % (a,o)
'apple , orange'
>>> '{} , {}'.format(a,o)
'apple , orange'
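If you are on Python 3.6 or later (an assumption; the thread itself links the Python 2 docs), an f-string is a third way to build the same key. A quick sketch comparing the three spellings:

```python
word = 'Apple'

# Three ways to build the same feature name.
with_percent = 'contains(%s)' % word.lower()
with_format = 'contains({})'.format(word.lower())
with_fstring = f'contains({word.lower()})'

print(with_percent)
print(with_percent == with_format == with_fstring)
```

Which one you use is a matter of style; the resulting dictionary key is identical.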

So when the code does 'contains(%s)' % word, it's trying to produce a string like this:

>>> 'contains(%s)' % a
'contains(apple)'

And when you set that string as a dictionary key, the key looks like this:

>>> adict = {}
>>> key1 = 'contains(%s)' % a
>>> value1 = True
>>> adict[key1] = value1
>>> adict
{'contains(apple)': True}
>>> key2 = 'contains(%s)' % o
>>> value2 = False
>>> adict[key2] = value2
>>> adict
{'contains(orange)': False, 'contains(apple)': True}
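Putting the pieces together, the one-liner from the question can be unrolled into separate steps. The sketch below is only an illustration: word_features_steps is a made-up name, and str.split() stands in for nltk.word_tokenize so that it runs without NLTK installed (the real tokenizer also splits off punctuation, which split() does not).

```python
def word_features_steps(sentence):
    """Unrolled version of the feature extractor, one step per line."""
    features = {}
    # Assumption: whitespace split as a stand-in for nltk.word_tokenize.
    for word in sentence.split():
        lowered = word.lower()            # 'This' -> 'this'
        key = 'contains(%s)' % lowered    # 'this' -> 'contains(this)'
        features[key] = True              # add (or overwrite) the entry
    return features


result = word_features_steps('This is a foobar word extractor function')
repeated = word_features_steps('Apple apple APPLE')

print(result)
print(repeated)
```

The second call shows the effect of word.lower(): all three spellings end up under the single key 'contains(apple)'.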

For more information, see

Python string formatting: % vs. .format
http://www.tutorialspoint.com/python/python_strings.htm
https://docs.python.org/2/library/string.html

python string dictionary nlp nltk
