Path-based Similarity: A similarity measure based on the length of the shortest path between two synsets in the WordNet taxonomy. The shorter the path, the higher the score.
Leacock-Chodorow (LCH): A similarity measure that extends path-based similarity by incorporating the depth of the taxonomy. It is the negative log of the shortest path (spath) between two concepts (synset_1 and synset_2) divided by twice the total depth of the taxonomy (D): LCH(synset_1, synset_2) = -log(spath(synset_1, synset_2) / (2 * D)).
Code #1 : Introducing Synsets.
from nltk.corpus import wordnet

syn1 = wordnet.synsets('hello')[0]
syn2 = wordnet.synsets('selling')[0]

print("hello name : ", syn1.name())
print("selling name : ", syn2.name())
Output :
hello name : hello.n.01
selling name : selling.n.01
Code #2 : Path Similarity
syn1.path_similarity(syn2)
Output :
0.08333333333333333
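To see where this number comes from, it may help to compute the score by hand. The sketch below assumes that NLTK derives path similarity as 1 / (d + 1), where d is the number of edges on the shortest path between the two synsets. The helper name path_similarity_from_distance is hypothetical, and the distance of 11 is not stated in this article; it is inferred from the score shown above.

```python
def path_similarity_from_distance(distance):
    """Hypothetical helper: path similarity as 1 / (d + 1),
    where d is the shortest path distance in edges."""
    return 1.0 / (distance + 1)

# Assumption: a distance of 11 edges, inferred from the score above.
score = path_similarity_from_distance(11)
print(score)

# Identical synsets are 0 edges apart, so the score is the maximum, 1.0.
print(path_similarity_from_distance(0))
```

With NLTK and the WordNet corpus installed, syn1.shortest_path_distance(syn2) should give the distance directly, so the inferred value can be checked against it.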
Code #3 : Leacock-Chodorow (LCH) Similarity
syn1.lch_similarity(syn2)
Output :
1.1526795099383855
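The LCH score can also be reproduced by hand from the definition. The sketch below rests on two assumptions that this article does not state: that the path length is counted in nodes (shortest distance in edges plus one), and that the maximum depth D of the noun taxonomy is 19. Both values, and the helper name lch_from_distance, are illustrative and inferred from the outputs shown above.

```python
import math

def lch_from_distance(distance, max_depth):
    """Hypothetical helper: -log(path_length / (2 * D)),
    with path_length counted in nodes (edges + 1)."""
    return -math.log((distance + 1) / (2.0 * max_depth))

# Assumptions: 11 edges between the synsets, noun taxonomy depth D = 19.
score = lch_from_distance(11, 19)
print(round(score, 4))
```

Unlike path similarity, the LCH score is not bounded to the range 0 to 1: the shorter the path relative to the taxonomy depth, the larger the value.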