Thursday, December 26, 2024

Smallest window that contains all characters of string itself

  • Given a string, find the smallest window (substring) that contains every distinct character of the string. For example, if str = “aabcbcdbca”, the smallest such window is “dbca”, so the window length is 4.

Examples: 

Input: aabcbcdbca
Output: dbca
Explanation: 
Possible substrings= {aabcbcd, abcbcd, 
bcdbca, dbca....}
Of the set of possible substrings 'dbca' 
is the shortest substring having all the 
distinct characters of given string. 

Input: aaab
Output: ab
Explanation: 
Possible substrings={aaab, aab, ab}
Of the set of possible substrings 'ab' 
is the shortest substring having all 
the distinct characters of given string.    

Solution: We have to find the smallest window that contains every distinct character of the given string. The window itself may contain repeated characters. 
For example, in “aabcbcdb”, the smallest window that contains all the distinct characters is “abcbcd”.
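The validity condition above can be checked directly with sets. The following is a minimal sketch in Python; the helper name covers_all_distinct is an assumption made for illustration and does not appear in the implementations below.

```python
def covers_all_distinct(text, window):
    # Hypothetical helper: a window is valid when it is a substring of
    # text and its character set includes every distinct character of text.
    return window in text and set(text) <= set(window)


print(covers_all_distinct("aabcbcdbca", "dbca"))   # True
print(covers_all_distinct("aabcbcdb", "abcbcd"))   # True
print(covers_all_distinct("aabcbcdb", "bcd"))      # False, 'a' is missing
```

The check says nothing about minimality; it only confirms that a candidate window qualifies.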

Method 1: This is the Brute Force method of solving the problem using HashMap.

Approach: First find all the distinct characters present in the string. This can be done using a HashMap. Next, generate all possible substrings and check whether each one contains all the required characters (the keys stored in the hash_map). If it does, compare its length with the length of the shortest valid substring found so far.
HashMap: HashMap has been part of Java’s Collections Framework since Java 1.2. It is a hash-table-based implementation of the Map interface and stores data as (Key, Value) pairs; a value is looked up by its key. The name comes from hashing: each key is mapped to an integer hash code, which selects the bucket where the entry is stored, so lookups and insertions take constant time on average. Entries that land in the same bucket are chained in a linked list. HashSet is backed by a HashMap internally.
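As a rough illustration of the first step of this approach, a Python dict can play the role of the hash map. This is only a sketch; the function name distinct_char_counts is an assumption and is not part of the original article.

```python
def distinct_char_counts(text):
    # Map each character to the number of times it occurs.
    # The number of keys is the number of distinct characters.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    return counts


counts = distinct_char_counts("aabcbcdbca")
print(counts)       # {'a': 3, 'b': 3, 'c': 3, 'd': 1}
print(len(counts))  # 4
```

Only the number of keys matters for this problem; the frequencies are a by-product of using a map rather than a set.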

Algorithm : 

  1. Store all distinct characters of the given string in a hash_map.
  2. For each starting index, take a variable count and initialize it with value 0.
  3. Generate the substrings using two pointers.
  4. Now check whether the generated substring is valid: 
    1. As soon as a character of the substring has not been encountered before, increment count by 1.
    2. Use a visited array of size max_chars to record whether the current character has been encountered before.
    3. If count is equal to the size of hash_map, the generated substring is valid.
    4. If it is a valid substring, compare its length with that of the shortest valid substring found so far.

Pseudo Code:

map<char, int> hash_map;
for (i = 0 to str.length() - 1)
    hash_map[str[i]]++;  // finding all distinct characters of string
minimum_size = INT_MAX
Distinct_chars = hash_map.size()
for (i = 0 to str.length() - 1)
    count = 0;
    sub_str = "";
    visited[256] = {0};
    for (j = i to str.length() - 1)
        sub_str += str[j]
        if (visited[str[j]] == 0)
            count++
            visited[str[j]] = 1;
        if (count == Distinct_chars)
            end loop

    if (sub_str.length() < minimum_size &&
        count == Distinct_chars)
        ans = sub_str;
        minimum_size = sub_str.length();

return ans

Implementation:

C++




// C++ program to find the smallest
// window containing all distinct
// characters of the string itself.
#include <bits/stdc++.h>
using namespace std;
  
const int MAX_CHARS = 256;
  
// Function to find smallest window containing
// all distinct characters
string findSubString(string str)
{
    int n = str.length();
  
    // Count all distinct characters.
    int dist_count = 0;
    unordered_map<int, int> hash_map;
    for (int i = 0; i < n; i++) {
        hash_map[str[i]]++;
    }
  
    dist_count = hash_map.size();
    int size = INT_MAX;
    string res;
    // Try every start index i and extend the substring until it is valid
    for (int i = 0; i < n; i++) {
        int count = 0;
        int visited[256] = { 0 };
        string sub_str = "";
        for (int j = i; j < n; j++) {
            if (visited[str[j]] == 0) {
                count++;
                visited[str[j]] = 1;
            }
            sub_str += str[j];
            if (count == dist_count)
                break;
        }
        if (sub_str.length() < size && count == dist_count)
        {
            res = sub_str;
            size=res.length();
        }
    }
    return res;
}
  
// Driver Code
int main()
{
    string str = "aabcbcdbca";
    cout << "Smallest window containing all distinct"
            " characters is: "
         << findSubString(str);
    return 0;
}


Java




import java.io.*;
import java.util.*;
  
// Java program to find the smallest
// window containing all distinct
// characters of the string itself.
class GFG 
{
    
    // Function to find smallest window containing
    // all distinct characters
    public static String findSubString(String str)
    {
        int n = str.length();
        
        // Count all distinct characters.
        int dist_count = 0;
        HashMap<Character, Integer> mp = new HashMap<>();
        for (int i = 0; i < n; i++) 
        {
            if (mp.containsKey(str.charAt(i))) 
            {
                Integer a = mp.get(str.charAt(i));
                mp.put(str.charAt(i),a+1);                   
            }
          else
          {
                 mp.put(str.charAt(i), 1);
            }
        }
        dist_count = mp.size();
        int size = Integer.MAX_VALUE;
        String res = "";
        
        // Try every start index i and extend the substring until it is valid
        for (int i = 0; i < n; i++)
        {
            int count = 0;
            int visited[] = new int[256];
            for(int j = 0; j < 256; j++)
              visited[j] = 0;
            String sub_str = "";
            for (int j = i; j < n; j++)
            {
                if (visited[str.charAt(j)] == 0)
                {
                    count++;
                    visited[str.charAt(j)] = 1;
                }
                sub_str += str.charAt(j);
                if (count == dist_count)
                    break;
            }
            if (sub_str.length() < size && count == dist_count)
            {
                res = sub_str;
                size=res.length();
            }
        }
        return res;
    }
    
  // Driver code
    public static void main (String[] args)
    {
        String str = "aabcbcdbca";
        System.out.println("Smallest window containing all distinct"+
                " characters is: "+ findSubString(str)) ;
    }
}
  
// This code is contributed by Manu Pathria


C#




// Include namespace system
using System;
using System.Collections.Generic;
  
using System.Collections;
  
// C# program to find the smallest
// window containing all distinct
// characters of the string itself.
public class GFG
{
    
    // Function to find smallest window containing
    // all distinct characters
    public static String findSubString(String str)
    {
        var n = str.Length;
        
        // Count all distinct characters.
        var dist_count = 0;
        var mp = new Dictionary<char, int>();
        for (int i = 0; i < n; i++)
        {
            if (mp.ContainsKey(str[i]))
            {
                var a = mp[str[i]];
                mp[str[i]] = a + 1;
            }
            else 
            {
                mp[str[i]] = 1;
            }
        }
        dist_count = mp.Count;
        var size = int.MaxValue;
        var res = "";
        
        // Try every start index i and extend the substring until it is valid
        for (int i = 0; i < n; i++)
        {
            var count = 0;
            int[] visited = new int[256];
            for (int j = 0; j < 256; j++)
            {
                visited[j] = 0;
            }
            var sub_str = "";
            for (int j = i; j < n; j++)
            {
                if (visited[str[j]] == 0)
                {
                    count++;
                    visited[str[j]] = 1;
                }
                sub_str += str[j];
                if (count == dist_count)
                {
                    break;
                }
            }
            if (sub_str.Length < size && count == dist_count)
            {
                res = sub_str;
                size = res.Length;
            }
        }
        return res;
    }
    
    // Driver code
    public static void Main(String[] args)
    {
        var str = "aabcbcdbca";
        Console.WriteLine("Smallest window containing all distinct" + " characters is: " + GFG.findSubString(str));
    }
}
  
// This code is contributed by mukulsomukesh.


Python3




# Python3 code for the same approach
import sys
  
MAX_CHARS = 256
  
# Function to find smallest window containing
# all distinct characters
def findSubString(str):
  
   n = len(str)
  
   # Count all distinct characters.
   dist_count = 0
   hash_map = {}
   for i in range(n):
      if(str[i] in hash_map):
  
         hash_map[str[i]] = hash_map[str[i]] + 1
  
      else:
  
         hash_map[str[i]] = 1
  
   dist_count = len(hash_map)
   size = sys.maxsize
   res = ""
  
   # Try every start index i and extend the substring until it is valid
   for i in range(n):
      count = 0
      visited= [0]*(MAX_CHARS)
      sub_str = ""
      for j in range(i,n):
         if (visited[ord(str[j])] == 0):
            count += 1
            visited[ord(str[j])] = 1
  
         sub_str += str[j]
         if (count == dist_count):
            break
      if (len(sub_str) < size and count == dist_count):
         res = sub_str
         size = len(res)
   return res
  
# Driver Code
str = "aabcbcdbca"
print(f"Smallest window containing all distinct characters is: {findSubString(str)}")
  
# This code is contributed by shinjanpatra.


Javascript




<script>
  
// JavaScript program to find the smallest
// window containing all distinct
// characters of the string itself.
const MAX_CHARS = 256;
  
// Function to find smallest window containing
// all distinct characters
function findSubString(str)
{
    let n = str.length;
  
    // Count all distinct characters.
    let dist_count = 0;
    let hash_map = new Map();
    for (let i = 0; i < n; i++) {
        if(hash_map.has(str[i])){
            hash_map.set(str[i],hash_map.get(str[i])+1);
        }
        else hash_map.set(str[i],1);
    }
  
    dist_count = hash_map.size;
    let size = Number.MAX_VALUE;
    let res = "";
    // Try every start index i and extend the substring until it is valid
    for (let i = 0; i < n; i++) {
        let count = 0;
        let visited= new Array(MAX_CHARS).fill(0);
        let sub_str = "";
        for (let j = i; j < n; j++) {
            if (visited[str.charCodeAt(j)] == 0) {
                count++;
                visited[str.charCodeAt(j)] = 1;
            }
            sub_str += str[j];
            if (count == dist_count)
                break;
        }
        if (sub_str.length < size && count == dist_count)
        {
            res = sub_str;
            size = res.length;
        }
    }
    return res;
}
  
// Driver Code
let str = "aabcbcdbca";
document.write("Smallest window containing all distinct characters is: " + findSubString(str),"<br>");
  
// This code is contributed by shinjanpatra.
</script>


Output

Smallest window containing all distinct characters is: dbca

Complexity Analysis: 

  • Time Complexity: O(N^2). 
    The nested loops visit O(N^2) (start, end) index pairs for a string of length N.
  • Space Complexity: O(N). 
    The candidate substring can hold up to N characters; the hash_map and the visited array hold at most one entry per distinct character.

Method 2: This method uses the Sliding Window technique. The technique shows how a nested for loop can, in some problems, be converted to a single for loop, thereby reducing the time complexity.

Approach: A window of characters is maintained using two pointers, start and end. Advancing start shrinks the window and advancing end grows it. Until the window first contains all the distinct characters of the given string, only end advances. Whenever the window contains all of them, it is shrunk from the left to remove redundant characters, and its length is then compared with the smallest window found so far. 
When no more characters can be removed from the current window, end is advanced to grow it and the shrink step is tried again. The answer is the smallest window seen during this process.
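To make the grow-and-shrink behaviour concrete, the sketch below yields every window the technique considers, after shrinking. It is an illustrative trace in Python, not the article's implementation; the generator name window_candidates and the use of collections.Counter are assumptions.

```python
from collections import Counter


def window_candidates(text):
    # Yield (start, end, window) each time the window covers all
    # distinct characters of text, after shrinking it from the left.
    need = len(set(text))   # number of distinct characters required
    in_window = Counter()   # character frequencies inside the window
    have = 0                # distinct characters currently in the window
    start = 0
    for end, ch in enumerate(text):
        in_window[ch] += 1
        if in_window[ch] == 1:
            have += 1
        if have == need:
            # Drop redundant characters from the left edge.
            while in_window[text[start]] > 1:
                in_window[text[start]] -= 1
                start += 1
            yield start, end, text[start:end + 1]


for start, end, window in window_candidates("aabcbcdbca"):
    print(start, end, window)
# 1 6 abcbcd
# 1 7 abcbcdb
# 1 8 abcbcdbc
# 6 9 dbca
```

The shortest yielded window, “dbca”, is the answer for this input. Note that start only ever moves forward, which is what keeps the total work linear.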

Algorithm : 

  1. Maintain an array (visited) with one slot per possible character (256 characters) and mark each character that occurs in the string; the number of marked slots is the number of distinct characters in the string.
  2. Take two pointers, start and end, which mark the start and end of the window.
  3. Take a counter initialized to 0 to count the distinct characters in the window.
  4. Read the characters of the given string one by one; whenever a character is not yet present in the window, increment the counter by 1.
  5. If the counter is equal to the total number of distinct characters, try to shrink the window.
  6. For shrinking the window: 
    1. While the frequency of the character at the start pointer is greater than 1, the character is redundant: decrement its frequency and increment the pointer.
    2. Now compare the length of the present window with the minimum window length.

Implementation:

C++




// C++ program to find the smallest
// window containing all distinct
// characters of the string itself.
#include <bits/stdc++.h>
using namespace std;
  
const int MAX_CHARS = 256;
  
// Function to find smallest window containing
// all distinct characters
string findSubString(string str)
{
    int n = str.length();
  
    // if string is empty or having one char
    if (n <= 1)
        return str;
  
    // Count all distinct characters.
    int dist_count = 0;
    bool visited[MAX_CHARS] = { false };
    for (int i = 0; i < n; i++) {
        if (visited[str[i]] == false) {
            visited[str[i]] = true;
            dist_count++;
        }
    }
  
    // Slide a window over the string. The window
    // [start, j] is shrunk from the left whenever it
    // contains all distinct characters of the string.
    int start = 0, start_index = -1, min_len = INT_MAX;
  
    int count = 0;
    int curr_count[MAX_CHARS] = { 0 };
    for (int j = 0; j < n; j++) {
        // Count occurrence of characters of string
        curr_count[str[j]]++;
  
        // If any distinct character matched,
        // then increment count
        if (curr_count[str[j]] == 1)
            count++;
  
        // if all the characters are matched
        if (count == dist_count) {
            // Try to minimize the window: while the
            // character at start occurs more than once
            // in the window, it is redundant, so drop
            // it and advance start.
            while (curr_count[str[start]] > 1) {
                curr_count[str[start]]--;
                start++;
            }
  
            // Update window size
            int len_window = j - start + 1;
            if (min_len > len_window) {
                min_len = len_window;
                start_index = start;
            }
        }
    }
  
    // Return substring starting from start_index
    // and length min_len
    return str.substr(start_index, min_len);
}
  
// Driver code
int main()
{
    string str = "aabcbcdbca";
    cout << "Smallest window containing all distinct"
            " characters is: "
         << findSubString(str);
    return 0;
}


Java




// Java program to find the smallest window containing
// all distinct characters of the string itself.
import java.util.Arrays;
public class GFG {
  
    static final int MAX_CHARS = 256;
  
    // Function to find smallest window containing
    // all distinct characters
    static String findSubString(String str)
    {
        int n = str.length();
  
        // if string is empty or having one char
        if (n <= 1)
            return str;
  
        // Count all distinct characters.
        int dist_count = 0;
  
        boolean[] visited = new boolean[MAX_CHARS];
        Arrays.fill(visited, false);
        for (int i = 0; i < n; i++) {
            if (visited[str.charAt(i)] == false) {
                visited[str.charAt(i)] = true;
                dist_count++;
            }
        }
  
        // Slide a window over the string. The window
        // [start, j] is shrunk from the left whenever it
        // contains all distinct characters of the
        // string.
        int start = 0, start_index = -1;
        int min_len = Integer.MAX_VALUE;
  
        int count = 0;
        int[] curr_count = new int[MAX_CHARS];
        for (int j = 0; j < n; j++) {
            // Count occurrence of characters of string
            curr_count[str.charAt(j)]++;
  
            // If any distinct character matched,
            // then increment count
            if (curr_count[str.charAt(j)] == 1)
                count++;
  
            // if all the characters are matched
            if (count == dist_count) {
                // Try to minimize the window: while the
                // character at start occurs more than once
                // in the window, it is redundant, so drop
                // it and advance start.
                while (curr_count[str.charAt(start)] > 1) {
                    curr_count[str.charAt(start)]--;
                    start++;
                }
  
                // Update window size
                int len_window = j - start + 1;
                if (min_len > len_window) {
                    min_len = len_window;
                    start_index = start;
                }
            }
        }
        // Return substring starting from start_index
        // and length min_len
        return str.substring(start_index,
                             start_index + min_len);
    }
  
    // Driver code
    public static void main(String args[])
    {
        String str = "aabcbcdbca";
        System.out.println(
            "Smallest window containing all distinct"
            + " characters is: " + findSubString(str));
    }
}
// This code is contributed by Sumit Ghosh


Python3




# Python program to find the smallest
# window containing all distinct
# characters of the string itself
from collections import defaultdict
  
MAX_CHARS = 256
  
# Function to find smallest window
# containing all distinct characters
  
  
def findSubString(strr):
  
    n = len(strr)
  
    # if string is empty or having one char
    if n <= 1:
        return strr
  
    # Count all distinct characters.
    dist_count = len(set([x for x in strr]))
  
    curr_count = defaultdict(lambda: 0)
    count = 0
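    start = 0
    # start_index must be initialized: if the whole string is the only
    # valid window, the update below never runs.
    start_index = 0
    min_len = n
  
    # Slide a window over the string. The window
    # [start, j] is shrunk from the left whenever it
    # contains all distinct characters of the string.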
    for j in range(n):
        curr_count[strr[j]] += 1
  
        # If any distinct character matched,
        # then increment count
        if curr_count[strr[j]] == 1:
            count += 1
  
        # Try to minimize the window: while the
        # character at start occurs more than once
        # in the window, it is redundant, so drop
        # it and advance start.
        if count == dist_count:
            while curr_count[strr[start]] > 1:
                curr_count[strr[start]] -= 1
                start += 1
  
            # Update window size
            len_window = j - start + 1
  
            if min_len > len_window:
                min_len = len_window
                start_index = start
  
    # Return substring starting from start_index
    # and length min_len
    return str(strr[start_index: start_index +
                    min_len])
  
  
# Driver code
if __name__ == '__main__':
  
    print("Smallest window containing "
          "all distinct characters is: {}".format(
              findSubString("aabcbcdbca")))
  
# This code is contributed by
# Subhrajit


C#




// C# program to find the smallest window containing
// all distinct characters of the string itself.
using System;
  
class GFG {
  
    static int MAX_CHARS = 256;
  
    // Function to find smallest window containing
    // all distinct characters
    static string findSubString(string str)
    {
        int n = str.Length;
  
        // if string is empty or having one char
        if (n <= 1)
            return str;
  
        // Count all distinct characters.
        int dist_count = 0;
        bool[] visited = new bool[MAX_CHARS];
        for (int i = 0; i < n; i++) {
            if (visited[str[i]] == false) {
                visited[str[i]] = true;
                dist_count++;
            }
        }
  
        // Slide a window over the string. The window
        // [start, j] is shrunk from the left whenever it
        // contains all distinct characters of the
        // string.
        int start = 0, start_index = -1,
            min_len = int.MaxValue;
  
        int count = 0;
        int[] curr_count = new int[MAX_CHARS];
        for (int j = 0; j < n; j++) {
            // Count occurrence of characters of string
            curr_count[str[j]]++;
  
            // If any distinct character matched,
            // then increment count
            if (curr_count[str[j]] == 1)
                count++;
  
            // if all the characters are matched
            if (count == dist_count) {
                // Try to minimize the window: while the
                // character at start occurs more than once
                // in the window, it is redundant, so drop
                // it and advance start.
                while (curr_count[str[start]] > 1) {
                    curr_count[str[start]]--;
                    start++;
                }
  
                // Update window size
                int len_window = j - start + 1;
                if (min_len > len_window) {
                    min_len = len_window;
                    start_index = start;
                }
            }
        }
  
        // Return substring starting from start_index
        // and length min_len
        return str.Substring(start_index, min_len);
    }
  
    // Driver code
    public static void Main(String[] args)
    {
        string str = "aabcbcdbca";
        Console.WriteLine(
            "Smallest window containing all distinct"
            + " characters is: " + findSubString(str));
    }
}
  
// This code contributed by Rajput-Ji


Javascript




<script>
  
// JavaScript program to find the smallest
// window containing all distinct
// characters of the string itself.
const MAX_CHARS = 256;
  
// Function to find smallest window containing
// all distinct characters
function findSubString(str)
{
    let n = str.length;
  
    // if string is empty or having one char
    if (n <= 1)
        return str;
  
    // Count all distinct characters.
    let dist_count = 0;
    let visited = new Array(MAX_CHARS).fill(false);
    for (let i = 0; i < n; i++) {
        if (visited[str.charCodeAt(i)] == false) {
            visited[str.charCodeAt(i)] = true;
            dist_count++;
        }
    }
  
    // Slide a window over the string. The window
    // [start, j] is shrunk from the left whenever it
    // contains all distinct characters of the string.
    let start = 0, start_index = -1, min_len = Number.MAX_VALUE;
  
    let count = 0;
    let curr_count = new Array(MAX_CHARS).fill(0);
    for (let j = 0; j < n; j++) {
        // Count occurrence of characters of string
        curr_count[str.charCodeAt(j)]++;
  
        // If any distinct character matched,
        // then increment count
        if (curr_count[str.charCodeAt(j)] == 1)
            count++;
  
        // if all the characters are matched
        if (count == dist_count) {
            // Try to minimize the window: while the
            // character at start occurs more than once
            // in the window, it is redundant, so drop
            // it and advance start.
            while (curr_count[str.charCodeAt(start)] > 1) {
                curr_count[str.charCodeAt(start)]--;
                start++;
            }
  
            // Update window size
            let len_window = j - start + 1;
            if (min_len > len_window) {
                min_len = len_window;
                start_index = start;
            }
        }
    }
  
    // Return substring starting from start_index
    // and length min_len
    return str.substring(start_index, min_len + start_index);
}
  
// Driver code
let str = "aabcbcdbca";
document.write("Smallest window containing all distinct characters is: " +
findSubString(str), "<br>");
  
// This code is contributed by shinjanpatra.
</script>


Output

Smallest window containing all distinct characters is: dbca

Complexity Analysis: 

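  • Time Complexity: O(N). 
    Each of the two pointers moves forward at most N times, so the string is traversed a constant number of times.
  • Space Complexity: O(N). 
    The counting arrays have a fixed size of MAX_CHARS (256), so the auxiliary space is constant; the returned substring can take up to N characters.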
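One way to gain confidence that the linear method agrees with the quadratic one is a randomized comparison. The sketch below reimplements both methods compactly in Python and compares the lengths of their answers on short random strings; the function names brute_force, sliding_window and cross_check, the alphabet “abcd” and the trial count are all assumptions chosen for illustration.

```python
import random


def brute_force(text):
    # Method 1: try every start index and extend until the window is valid.
    need = len(set(text))
    best = text  # the whole string is always a valid window
    for i in range(len(text)):
        seen = set()
        for j in range(i, len(text)):
            seen.add(text[j])
            if len(seen) == need:
                if j - i + 1 < len(best):
                    best = text[i:j + 1]
                break
    return best


def sliding_window(text):
    # Method 2: grow with end, shrink from start while the left
    # character is redundant. Counts never drop to 0, so len(counts)
    # is the number of distinct characters inside the window.
    need = len(set(text))
    counts = {}
    start = 0
    best = text
    for end, ch in enumerate(text):
        counts[ch] = counts.get(ch, 0) + 1
        if len(counts) == need:
            while counts[text[start]] > 1:
                counts[text[start]] -= 1
                start += 1
            if end - start + 1 < len(best):
                best = text[start:end + 1]
    return best


def cross_check(trials=1000, seed=0):
    # Compare answer lengths on random strings over a small alphabet.
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 12)
        text = "".join(rng.choice("abcd") for _ in range(length))
        assert len(brute_force(text)) == len(sliding_window(text)), text
    return True


print(sliding_window("aabcbcdbca"))  # dbca
print(cross_check())                 # True
```

A small alphabet and short strings are used deliberately so that repeated characters and ties between windows occur often.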

Related Article: 

  1. Length of the smallest sub-string consisting of maximum distinct characters
  2. https://www.geeksforgeeks.org/find-the-smallest-window-in-a-string-containing-all-characters-of-another-string/

This article is contributed by Sahil Chhabra.