The nltk.probability.ConditionalFreqDist class is a container for FreqDist instances, with one FreqDist per condition. It is used to count frequencies that depend on a condition, such as another word or a class label. Here it serves as the base for an API-compatible class built on top of Redis using RedisHashFreqDist.
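To make the idea of "one frequency distribution per condition" concrete, here is a minimal standard-library sketch. It is only an analogy: it uses a defaultdict of Counter objects rather than NLTK itself, and the sample pairs are made up for illustration.

```python
from collections import Counter, defaultdict

# Standard-library analogy (not NLTK): one Counter per condition.
cond_counts = defaultdict(Counter)

# Each pair is (condition, sample), e.g. (class label, word). Made-up data.
pairs = [("pos", "great"), ("pos", "great"), ("neg", "awful"), ("pos", "fun")]
for condition, sample in pairs:
    cond_counts[condition][sample] += 1

# Total number of counted samples across every condition.
total = sum(sum(counter.values()) for counter in cond_counts.values())

print(sorted(cond_counts))          # conditions seen so far
print(cond_counts["pos"]["great"])  # count of "great" under condition "pos"
print(total)
```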
The code below defines a RedisConditionalHashFreqDist class that extends nltk.probability.ConditionalFreqDist and overrides the __getitem__() method so that it creates an instance of RedisHashFreqDist instead of a FreqDist.
Code :
from nltk.probability import ConditionalFreqDist
from rediscollections import encode_key

# RedisHashFreqDist is assumed to be defined earlier in this same module.

class RedisConditionalHashFreqDist(ConditionalFreqDist):
    def __init__(self, r, name, cond_samples=None):
        self._r = r
        self._name = name
        ConditionalFreqDist.__init__(self, cond_samples)

        # Find every existing hash under this base name
        for key in self._r.keys(encode_key('%s:*' % name)):
            condition = key.split(':')[1]
            # calls self.__getitem__(condition)
            self[condition]

    def __getitem__(self, condition):
        # Create the Redis-backed distribution on first access
        if condition not in self._fdists:
            key = '%s:%s' % (self._name, condition)
            val = RedisHashFreqDist(self._r, key)
            super(RedisConditionalHashFreqDist, self).__setitem__(
                condition, val)

        return super(
            RedisConditionalHashFreqDist, self).__getitem__(condition)

    def clear(self):
        # Clear every per-condition distribution
        for fdist in self.values():
            fdist.clear()
An instance of this class can be created by passing in a Redis connection and a base name. After that, it works just like a ConditionalFreqDist, as the following code shows:
Code :
from redis import Redis
from redisprob import RedisConditionalHashFreqDist

r = Redis()
rchfd = RedisConditionalHashFreqDist(r, 'condhash')

print(rchfd.N())
print(rchfd.conditions())

rchfd['cond1']['foo'] += 1

print(rchfd.N())
print(rchfd['cond1']['foo'])
print(rchfd.conditions())

rchfd.clear()
Output :
0
[]
1
1
['cond1']
How it works?
- The RedisConditionalHashFreqDist uses name prefixes to reference RedisHashFreqDist instances.
- The name passed into the RedisConditionalHashFreqDist is a base name that is combined with each condition to create a unique name for each RedisHashFreqDist.
- For example, if the base name of the RedisConditionalHashFreqDist is ‘condhash’, and the condition is ‘cond1’, then the final name for the RedisHashFreqDist is ‘condhash:cond1’.
- This naming pattern is used at initialization to find all the existing hash maps using the keys command.
- By searching for all keys matching ‘condhash:*’, the constructor can identify all the existing conditions and create an instance of RedisHashFreqDist for each.
- Combining strings with colons is a common naming convention for Redis keys as a way to define namespaces.
- Each RedisConditionalHashFreqDist instance defines a single namespace of hash maps.
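The naming scheme described above can be sketched without a Redis server. The helper names make_key and conditions_from_keys are hypothetical and are not part of the original module. The key list stands in for the result of a keys lookup, and fnmatch is used only to imitate the ‘condhash:*’ pattern match. Unlike the class shown earlier, this sketch splits on the first colon only, so a condition may itself contain colons.

```python
from fnmatch import fnmatchcase


def make_key(base_name, condition):
    """Join base name and condition with a colon, e.g. 'condhash:cond1'."""
    return "%s:%s" % (base_name, condition)


def conditions_from_keys(keys, base_name):
    """Recover condition names from keys matching '<base_name>:*'."""
    pattern = "%s:*" % base_name
    conditions = []
    for key in keys:
        # A Redis client may return keys as bytes; normalize to str.
        if isinstance(key, bytes):
            key = key.decode("utf-8")
        if fnmatchcase(key, pattern):
            # Split on the first colon only; the rest is the condition.
            conditions.append(key.split(":", 1)[1])
    return conditions


# Hypothetical key listing standing in for the result of a keys lookup.
existing_keys = [b"condhash:cond1", "condhash:cond2", "otherhash:cond1"]

print(make_key("condhash", "cond1"))
print(conditions_from_keys(existing_keys, "condhash"))
```

Keys under a different base name, such as ‘otherhash:cond1’, fall outside the namespace and are ignored.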
RedisConditionalHashFreqDist also defines a clear() method. This is a helper method that calls clear() on all the internal RedisHashFreqDist instances. The clear() method is not defined in ConditionalFreqDist.
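The lazy creation of per-condition distributions and the clear() helper can be tried out with an in-memory stand-in. Everything in this sketch is hypothetical: FakeHashCounts and FakeConditionalCounts are illustrative names, a plain dict replaces Redis, and the sketch uses dict's __missing__ hook instead of overriding __getitem__() as the real class does. It also assumes that clearing a distribution removes its whole hash from the store.

```python
class FakeHashCounts:
    """Hypothetical stand-in for RedisHashFreqDist: counts live in store[key]."""

    def __init__(self, store, key):
        self._store = store
        self._key = key

    def __getitem__(self, sample):
        # Missing samples count as zero.
        return self._store.get(self._key, {}).get(sample, 0)

    def __setitem__(self, sample, count):
        self._store.setdefault(self._key, {})[sample] = count

    def N(self):
        return sum(self._store.get(self._key, {}).values())

    def clear(self):
        # Assumption: clearing drops the whole hash for this key.
        self._store.pop(self._key, None)


class FakeConditionalCounts(dict):
    """Creates one FakeHashCounts per condition on first access."""

    def __init__(self, store, name):
        super().__init__()
        self._store = store
        self._name = name

    def __missing__(self, condition):
        # Build the namespaced key, e.g. 'condhash:cond1'.
        fdist = FakeHashCounts(self._store, "%s:%s" % (self._name, condition))
        self[condition] = fdist
        return fdist

    def N(self):
        return sum(fdist.N() for fdist in self.values())

    def clear(self):
        # Clear each per-condition distribution; conditions stay registered.
        for fdist in self.values():
            fdist.clear()


store = {}
counts = FakeConditionalCounts(store, "condhash")
counts["cond1"]["foo"] += 1
counts["cond2"]["bar"] += 2
print(counts.N(), sorted(store))

counts.clear()
print(counts.N(), store)
```

In this sketch, clear() empties the backing store but leaves the condition entries in the Python-side mapping, which is why the helper iterates over values() rather than discarding the mapping itself.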