Monday, November 18, 2024
Google search engine
HomeLanguagesJavaJava Program to Implement Bitap Algorithm for String Matching

Java Program to Implement Bitap Algorithm for String Matching

The Bitap Algorithm is an approximate string matching algorithm. The algorithm tells whether a given text contains a substring which is “approximately equal” to a given pattern. Here approximately equal states that if the substring and pattern are within a given distance k of each other. The algorithm begins by precomputing a set of bitmasks containing one bit for each element of the pattern. This does most of the bitwise operations, which are quick.

The Bitap Algorithm is also known as shift-or, shift-and, or Baeza Yates Gonnet Algorithm. 

Example:

Input:
Text: neveropen
Pattern: geeks
Output:
Pattern found at index: 0

Input:
Text: Youareawesome
Pattern: Youareamazing
Output:
No Match.

Approach:

  • Input the text pattern as a string.
  • We will convert this String to a simple Char Array
  • If the length is 0 or exceeding 63, we return “No Match”.
  • Now, by taking array R which complements 0.
  • Taking an array “pattern_mask” which complements 0, to 1
  • Taking the pattern as an index of “pattern_mask” then by using the and operator we and it with the result of the complement of 1L(long integer) by shifting it to left side by i times.
  • Now by running the loop till text length.
  • We now or it with R and pattern mask.
  • Now by left shifting the 1L by the length of the pattern and then the result is an and with R
  • If it is equal to zero we print it by I-len+1 else return -1
  • Output is the index of the text, where the pattern matches.

Code:

Java




// Java Program to implement Bitap Algorithm.
 
import java.io.*;
import java.io.IOException;
 
public class GFG {
    public static void main(String[] args)
        throws IOException
    {
 
        System.out.println("Bitap Algorithm!");
 
        String text = "neveropen";
 
        String pattern = "geeks";
 
        // This is an object created of the class for
        // extension of functions.
        GFG g = new GFG();
 
        // Now here we go to findPattern functions , we two
        // arguments.
        g.findPattern(text, pattern);
    }
 
    public void findPattern(String t, String p)
    {
 
        // we convert the String text to Character Array for
        // easy indexing
        char[] text = t.toCharArray();
 
        // we convert the String pattern to Character Array
        // to access each letter in the String easily.
        char[] pattern = p.toCharArray();
 
        // Index shows the function bitap search if they are
        // equal at a particular index or not
        int index = bitap_search(text, pattern);
 
        // If the pattern is not equal to the text of the
        // string approximately Then we tend to return -1 If
        // index is -1 Then we print there is No match
        if (index == -1) {
            System.out.println("\nNo Match\n");
        }
 
        else {
 
            // Else if there is a match
            // Then we print the position of the index at
            // where the pattern and the text matches.
            System.out.println(
                "\nPattern found at index: \n" + index);
        }
    }
 
    private int bitap_search(char[] text, char[] pattern)
    {
 
        // Here the len variable is taken
        // This variable accepts the pattern length as its
        // value
        int len = pattern.length;
 
        // This is an array of pattern_mask of all
        // character values in it.
 
        long pattern_mask[]
            = new long[Character.MAX_VALUE + 1];
 
        // Here the variable of being long type is
        // complemented with 1;
 
        long R = ~1;
 
        // Now if the length of the pattern is 0
        // we would return -1
        if (len == 0) {
            return -1;
        }
 
        // Or if the length of the pattern exceeds the
        // length of the character array Then we would
        // declare that the pattern is too long. We would
        // return -1
       
        if (len > 63) {
 
            System.out.println("Pattern too long!");
            return -1;
        }
 
        // Now filling the values in the pattern mask
        // We would run th eloop until the max value of
        // character Initially all the values of Character
        // are put up inside the pattern mask array And
        // initially they are complemented with zero
       
        for (int i = 0; i <= Character.MAX_VALUE; ++i)
 
            pattern_mask[i] = ~0;
 
        // Now len being the variable of pattern length ,
        // the loop is set  till there Now the pattern being
        // the index of the pattern_mask 1L means the long
        // integer is shifted to left by i times The result
        // of that is now being complemented the result of
        // the above is now being used as an and operator We
        // and the pattern_mask and the result of it
       
        for (int i = 0; i < len; ++i)
            pattern_mask[pattern[i]] &= ~(1L << i);
 
       
        // Now the loop is made to run until the text length
        // Now what we do this the R array is used
        // as an Or function with pattern_mask at index of
        // text of i
 
        for (int i = 0; i < text.length; ++i) {
 
             
          R |= pattern_mask];
 
            // Now  result of the r after the above
            // operation
            // we shift it to left side by 1 time
 
            R <<= 1;
 
            // If the 1L long integer if shifted left of the
            // len And the result is used to and the result
            // and R array
            // If that result is equal to 0
            // We return the index value
            // Index=i-len+1
           
            if ((R & (1L << len)) == 0)
 
                return i - len + 1;
        }
 
        // if the index is not matched
        // then we return it as -1
        // stating no match found.
       
        return -1;
    }
}


Output

Bitap Algorithm!

Pattern found at index: 
0
RELATED ARTICLES

Most Popular

Recent Comments