
NLP | Classifier-based Chunking | Set 2

Using the data from the treebank_chunk corpus, let us evaluate the chunkers prepared in the previous article. Code #1 : 

Python3




# loading libraries
from chunkers import ClassifierChunker
from nltk.corpus import treebank_chunk
 
train_data = treebank_chunk.chunked_sents()[:3000]
test_data = treebank_chunk.chunked_sents()[3000:]
 
# initializing
chunker = ClassifierChunker(train_data)
 
# evaluation
score = chunker.evaluate(test_data)
 
a = score.accuracy()
p = score.precision()
r = score.recall()

print("Accuracy of ClassifierChunker : ", a)
print("\nPrecision of ClassifierChunker : ", p)
print("\nRecall of ClassifierChunker : ", r)


Output :

Accuracy of ClassifierChunker : 0.9721733155838022

Precision of ClassifierChunker : 0.9258838793383068

Recall of ClassifierChunker : 0.9359016393442623
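
Beyond evaluation, the trained chunker can also be applied to new sentences. The sketch below is an illustrative addition rather than part of the original listing: since evaluate() in nltk's ChunkParserI interface works by calling parse() on tagged sentences, the chunker from Code #1 also exposes a parse() method that takes a POS-tagged sentence and returns a chunk Tree.

Python3

# Illustrative usage sketch: chunking a new POS-tagged sentence with the
# chunker trained in Code #1 (parse() takes a list of (word, pos) pairs).
tagged_sentence = [('The', 'DT'), ('little', 'JJ'), ('dog', 'NN'),
                   ('barked', 'VBD'), ('at', 'IN'), ('the', 'DT'),
                   ('cat', 'NN')]
print(chunker.parse(tagged_sentence))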

Code #2 : Let’s compare performance when training on conll_train and evaluating on conll_test. 

Python3




# loading libraries
from chunkers import ClassifierChunker
from nltk.corpus import conll2000

# conll_train and conll_test from the conll2000 corpus (as in the previous article)
conll_train = conll2000.chunked_sents('train.txt')
conll_test = conll2000.chunked_sents('test.txt')

chunker = ClassifierChunker(conll_train)
score = chunker.evaluate(conll_test)

a = score.accuracy()
p = score.precision()
r = score.recall()

print("Accuracy of ClassifierChunker : ", a)
print("\nPrecision of ClassifierChunker : ", p)
print("\nRecall of ClassifierChunker : ", r)


Output :

Accuracy of ClassifierChunker : 0.9264622074002153

Precision of ClassifierChunker : 0.8737924310910219

Recall of ClassifierChunker : 0.9007354620620346

Each word can be passed through the tagger into our feature detector function by creating nested 2-tuples of the form ((word, pos), iob); the chunk_trees2train_chunks() function produces these nested 2-tuples. The following features are extracted (a rough sketch of such a detector follows the list):

  • The current word and part-of-speech tag
  • The previous word, part-of-speech tag, and IOB tag
  • The next word and part-of-speech tag
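
The sketch below shows roughly what such a feature detector looks like; it is an illustrative approximation, and the actual prev_next_pos_iob() in the chunkers module may differ in detail. The detector receives the (word, pos) tokens of the sentence, the index of the current token, and the history of IOB tags predicted so far, and returns a feature dictionary.

Python3

# Rough sketch of a prev_next_pos_iob-style feature detector
# (the exact implementation in the chunkers module may differ).
def prev_next_pos_iob(tokens, index, history):
    word, pos = tokens[index]

    # previous word, POS tag and IOB tag (or start markers at the sentence start)
    if index == 0:
        prevword, prevpos, previob = ('<START>',) * 3
    else:
        prevword, prevpos = tokens[index - 1]
        previob = history[index - 1]

    # next word and POS tag (or end markers at the sentence end)
    if index == len(tokens) - 1:
        nextword, nextpos = ('<END>',) * 2
    else:
        nextword, nextpos = tokens[index + 1]

    return {
        'word': word, 'pos': pos,
        'prevword': prevword, 'prevpos': prevpos, 'previob': previob,
        'nextword': nextword, 'nextpos': nextpos,
    }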

The ClassifierChunker class uses an internal ClassifierBasedTagger with prev_next_pos_iob() as its default feature_detector. The results from the tagger, which are in the same nested 2-tuple form, are then reformatted into (word, pos, iob) 3-tuples and passed to conlltags2tree() to return the final Tree.

Code #3 : Using a different classifier builder 

Python3




# loading libraries
from chunkers import ClassifierChunker
from nltk.corpus import treebank_chunk
from nltk.classify import MaxentClassifier
 
train_data = treebank_chunk.chunked_sents()[:3000]
test_data = treebank_chunk.chunked_sents()[3000:]
 
 
builder = lambda toks: MaxentClassifier.train(
    toks, trace=0, max_iter=10, min_lldelta=0.01)

chunker = ClassifierChunker(train_data, classifier_builder=builder)

score = chunker.evaluate(test_data)

a = score.accuracy()
p = score.precision()
r = score.recall()

print("Accuracy of ClassifierChunker : ", a)
print("\nPrecision of ClassifierChunker : ", p)
print("\nRecall of ClassifierChunker : ", r)


Output :

Accuracy of ClassifierChunker : 0.9743204362949285

Precision of ClassifierChunker : 0.9334423548650859

Recall of ClassifierChunker : 0.9357377049180328

The ClassifierBasedTagger class defaults to using NaiveBayesClassifier.train as its classifier_builder, but any classifier can be used by overriding the classifier_builder keyword argument.
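
For example, a decision tree classifier can be swapped in the same way. The following is a minimal sketch (an illustrative addition, not part of the original listings) that reuses the treebank_chunk split from Code #1 with nltk's DecisionTreeClassifier; the cutoff values are arbitrary choices, not tuned settings.

Python3

# Illustrative sketch: overriding classifier_builder with a decision tree
from chunkers import ClassifierChunker
from nltk.corpus import treebank_chunk
from nltk.classify import DecisionTreeClassifier

train_data = treebank_chunk.chunked_sents()[:3000]
test_data = treebank_chunk.chunked_sents()[3000:]

# cutoff values are arbitrary, chosen only for illustration
builder = lambda toks: DecisionTreeClassifier.train(
    toks, entropy_cutoff=0.05, depth_cutoff=50)

chunker = ClassifierChunker(train_data, classifier_builder=builder)
print(chunker.evaluate(test_data).accuracy())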
