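The nltk.tokenize.stanford.StanfordTokenizer class extracts tokens from a string of characters or numbers through its tokenize() method. It follows the Stanford standard for generating tokens. The class is a wrapper around the Java-based Stanford tokenizer, so Java and the Stanford jar file must be available, for example by passing the path_to_jar argument to the constructor.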
Syntax :
StanfordTokenizer().tokenize(string)
Return : A list of the tokens found in a string of characters or numbers.
Example #1 :
In this example, the tokenize() method of StanfordTokenizer extracts the tokens from a stream of characters or numbers using the Stanford standard.
# Import the StanfordTokenizer class from nltk
from nltk.tokenize.stanford import StanfordTokenizer

# Create a StanfordTokenizer instance
tk = StanfordTokenizer()

# Create a string input
gfg = "Geeks f o r Geeks"

# Use the tokenize method
geek = tk.tokenize(gfg)
print(geek)
Output :
['Geeks', 'f', 'o', 'r', 'Geeks']
Example #2 :
# Import the StanfordTokenizer class from nltk
from nltk.tokenize.stanford import StanfordTokenizer

# Create a StanfordTokenizer instance
tk = StanfordTokenizer()

# Create a string input
gfg = "This is your great author."

# Use the tokenize method
geek = tk.tokenize(gfg)
print(geek)
Output :
['This', 'is', 'your', 'great', 'author', '.']
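To check the outputs above without Java or the Stanford jar installed, the same splitting can be roughly approximated with the standard library. This is only a sketch: approx_tokenize is a hypothetical helper, not part of NLTK, and a simple regular expression does not implement the Stanford tokenization rules. It merely separates runs of word characters from individual punctuation marks, which happens to give the same result for the two example strings.

```python
import re

# Hypothetical helper (not part of NLTK): a rough stand-in for illustration only.
# \w+      matches a run of letters, digits, or underscores (a "word")
# [^\w\s]  matches a single punctuation character
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def approx_tokenize(text):
    """Split text into word and punctuation tokens, skipping whitespace."""
    return _TOKEN_RE.findall(text)


print(approx_tokenize("Geeks f o r Geeks"))
# ['Geeks', 'f', 'o', 'r', 'Geeks']
print(approx_tokenize("This is your great author."))
# ['This', 'is', 'your', 'great', 'author', '.']
```

For inputs with contractions, quotes, or brackets, the real StanfordTokenizer can be expected to behave differently from this sketch.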