Prerequisite: Flattening Deep Tree
The prerequisite article flattened a deep tree by keeping only the lowest-level subtrees. Here we do the opposite and keep only the highest-level subtrees.
Code #1 : Let's understand shallow_tree()
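from nltk.tree import Tree

def shallow_tree(tree):
    children = []
    for t in tree:
        if t.height() < 3:
            # Subtree sits directly above a word: keep its (word, tag) pairs
            children.extend(t.pos())
        else:
            # Deeper subtree: keep the label, flatten everything beneath it
            children.append(Tree(t.label(), t.pos()))
    return Tree(tree.label(), children)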
Code #2 : Evaluating
from transforms import shallow_tree
from nltk.corpus import treebank

print("Deep Tree : \n", treebank.parsed_sents()[0])

# repr() shows the Tree(...) constructor form used in the output below
print("\nShallow Tree : \n", repr(shallow_tree(treebank.parsed_sents()[0])))
Output :
Deep Tree :
(S
  (NP-SBJ
    (NP (NNP Pierre) (NNP Vinken))
    (, ,)
    (ADJP (NP (CD 61) (NNS years)) (JJ old))
    (, ,))
  (VP
    (MD will)
    (VP
      (VB join)
      (NP (DT the) (NN board))
      (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
      (NP-TMP (NNP Nov.) (CD 29))))
  (. .))

Shallow Tree :
Tree('S', [Tree('NP-SBJ', [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','),
('61', 'CD'), ('years', 'NNS'), ('old', 'JJ'), (',', ',')]),
Tree('VP', [('will', 'MD'), ('join', 'VB'), ('the', 'DT'), ('board', 'NN'),
('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'),
('Nov.', 'NNP'), ('29', 'CD')]), ('.', '.')])
How it works?
- The shallow_tree() function builds a new list of children by iterating over each of the top-level subtrees.
- If the height() of a subtree is less than 3, the subtree is replaced by the list of its part-of-speech tagged children.
- All other subtrees are replaced by a new Tree whose children are the part-of-speech tagged leaves.
- This eliminates all the nested subtrees while still retaining the top-level subtrees.
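The steps above can be tried on a small hand-built tree, without downloading the treebank corpus. This is a minimal sketch: the toy sentence and the variable names toy and shallow are made up for illustration, and shallow_tree() is repeated from Code #1 so the block runs on its own.

```python
from nltk.tree import Tree


def shallow_tree(tree):
    """Keep the top-level subtrees and flatten everything beneath them."""
    children = []
    for t in tree:
        if t.height() < 3:
            # Subtree directly above a word, e.g. (. .): splice in its (word, tag) pair
            children.extend(t.pos())
        else:
            # Deeper subtree: keep the label, replace the contents with tagged leaves
            children.append(Tree(t.label(), t.pos()))
    return Tree(tree.label(), children)


# Hypothetical toy sentence, not taken from the treebank corpus
toy = Tree.fromstring(
    "(S (NP (DT the) (NN cat)) "
    "(VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))) (. .))"
)
shallow = shallow_tree(toy)

print(repr(shallow))
print("top-level labels kept :", [c.label() for c in shallow if isinstance(c, Tree)])
```

The nested PP and NP inside the VP disappear, while the top-level NP and VP labels survive. The final (. .) subtree has a height below 3, so it is spliced in as a plain ('.', '.') tuple rather than wrapped in a Tree.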
Code #3 : Comparing tree heights
print("height of tree : ", treebank.parsed_sents()[0].height())
print("\nheight of shallow tree : ", shallow_tree(treebank.parsed_sents()[0]).height())
Output :
height of tree : 7

height of shallow tree : 3
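To see why the shallow tree ends up with a height of 3, and why shallow_tree() tests for a height below 3, it helps to check how height() counts levels on very small trees. The sketch below uses made-up trees (pre_terminal, phrase and flattened are illustrative names only).

```python
from nltk.tree import Tree

# A tag directly above a word: one level for the word, one for the tag
pre_terminal = Tree("NN", ["board"])

# A phrase whose children are tag subtrees adds one more level
phrase = Tree("NP", [Tree("DT", ["the"]), pre_terminal])

# After flattening, the children are (word, tag) tuples, which count as leaves
flattened = Tree("NP", [("the", "DT"), ("board", "NN")])

print("pre-terminal height :", pre_terminal.height())  # 2
print("phrase height :", phrase.height())              # 3
print("flattened height :", flattened.height())        # 2
```

A subtree with a height below 3 is therefore just a tag sitting on a word. Each flattened top-level subtree has height 2, so the shallow tree as a whole has height 3 no matter how deep the original tree was.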