Detecting Delimiter in Text using detect_delimiter in Python

26 July 2024

When working with a large corpus of text, we often need to find out which character is acting as the delimiter. Detecting it automatically is a handy utility when the data is too large to inspect by hand. This article shows one way to solve the problem using the Python library detect_delimiter.

Installation

To install the module, run the following command in the terminal.

pip install detect_delimiter

The detect() function works in three steps. First, it checks which of the whitelist characters are present in the input text; if any are found, the one with the highest frequency is returned, ignoring any characters in the blacklist if one is provided. If no whitelist character is present, the frequencies of the remaining characters, excluding blacklisted ones, are computed and the most frequent one is returned as the delimiter. If no delimiter is found even then, default is returned if it was provided; otherwise None is returned.

Syntax: detect(text:str, default=None, whitelist=[',', ';', ':', '|', '\t'], blacklist=None)
text : The input string to test for a delimiter.
default : The value to return when no valid delimiter is found.
whitelist : The first set of characters to be checked; if any of these are found, they are treated as delimiters. Useful when the set of possible delimiters is known in advance. Defaults to [',', ';', ':', '|', '\t'].
blacklist : Characters that must never be reported as delimiters. By default, all digits, letters and the full stop are blacklisted; pass a custom list to override this default with the characters you want to exclude.
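
The lookup order described above can be sketched in plain Python. This is only an illustration of the description, not the library's actual source: the name sketch_detect, the exact contents of the default blacklist, and the tie-breaking behaviour are assumptions made for the example.

```python
from collections import Counter
from string import ascii_letters, digits

# Same default whitelist as listed in the syntax above.
DEFAULT_WHITELIST = [',', ';', ':', '|', '\t']


def sketch_detect(text, default=None, whitelist=None, blacklist=None):
    """Hypothetical re-creation of the three-step lookup, for illustration."""
    if whitelist is None:
        whitelist = DEFAULT_WHITELIST
    if blacklist is None:
        # Assumption: letters, digits and the full stop are skipped by default.
        blacklist = list(ascii_letters) + list(digits) + ['.']

    # Step 1: whitelisted characters that occur in the text take priority.
    found = [ch for ch in whitelist if ch in text and ch not in blacklist]
    if found:
        return max(found, key=text.count)

    # Step 2: otherwise pick the most frequent non-blacklisted character.
    others = [ch for ch in text if ch not in blacklist]
    if others:
        return Counter(others).most_common(1)[0][0]

    # Step 3: nothing qualified, so fall back to the default (None if not given).
    return default


print(sketch_detect("GeeksforLazyroar-is-best-for-Lazyroar"))               # -
print(sketch_detect("GeeksforLazyroar.is.best.for.Lazyroar"))               # None
print(sketch_detect("GeeksforLazyroar.is.best.for.Lazyroar", default='@'))  # @
```

Keeping the whitelist check as a separate first step is what lets a rare but expected delimiter win over a more frequent stray symbol.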

Example 1: Working with detect() and default

This example demonstrates a few cases of detecting the delimiter, along with the use of default.

Python3

from detect_delimiter import detect
 
# simple example
print("The found delimiter [base example] : ")
print(detect("GeeksforLazyroar-is-best-for-Lazyroar"))
 
# simple example without default and no delimiter
# . is not considered as delim
print("The found delimiter [no default] : ")
print(detect("GeeksforLazyroar.is.best.for.Lazyroar"))
 
# simple example with default
# . is not considered as delim
# No delim is found, hence, default is printed
print("The found delimiter [with default] : ")
print(detect("GeeksforLazyroar.is.best.for.Lazyroar", default='@'))

Output :

[Image: Working with detect() and default]

Example 2: Using blacklist and whitelist parameters

Providing the whitelist parameter prioritizes particular delimiters even if they occur less often than a non-whitelisted character. The blacklist parameter makes detect() ignore the listed characters.
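
To see why a whitelist can change the result, it helps to count the candidate characters by hand. The snippet below is a small standalone illustration using only the standard library; it does not call detect_delimiter, and the variable names are made up for the example.

```python
from collections import Counter

text = "GeeksforLazyroar-is-best-for!Lazyroar"

# Count every non-alphanumeric character in the text.
counts = Counter(ch for ch in text if not ch.isalnum())

# Judged on frequency alone, '-' (3 occurrences) beats '!' (1 occurrence).
by_frequency = counts.most_common(1)[0][0]

# Restricting the candidates to a whitelist changes the answer.
whitelist = ['@', '!']
whitelisted = {ch: n for ch, n in counts.items() if ch in whitelist}
by_whitelist = max(whitelisted, key=whitelisted.get)

print(by_frequency)  # -
print(by_whitelist)  # !
```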

Python3

from detect_delimiter import detect
from string import ascii_letters
 
# no whitelist passed, so the default whitelist applies:
# [',', ';', ':', '|', '\t']
# ',' is whitelisted, so it is preferred over the more frequent '-'
print("The found delimiter [default whitelist] : ")
print(detect("GeeksforLazyroar$is-best,for-Lazyroar"))
 
# simple example with whitelist
# ! prioritized
print("The found delimiter [provided whitelist] : ")
print(detect("GeeksforLazyroar-is-best-for!Lazyroar",
             whitelist=['@', "!"]))
 
# simple example with blacklist
# default blacklist overridden
print("The found delimiter [provided blacklist] : ")
print(detect("GeeksforLazyroar-is-best-for!Lazyroar",
             blacklist=['@', "-", 'e']))

Output :

[Image: Examples with blacklist and whitelist parameters]
