How to determine a string PARTIALLY contains anoth

2019-08-09 10:16发布

问题:

I'm doing a java project which needs to determine whether a string contains another string following below logic:

  • A parent string "This is a parent string" contais "isstr" should return true. Because all the characters of "isstr" can be found in parent string preserving substring order.
  • contains should be case-insensitive.

Is there anyone could kindly help me how to write the logic in a simple and efficient way, or any library is also much appreciated!

回答1:

public static void main(String[] args) {
    String parentStr = "This is a parent string", childStr = "iSStr";
    //Turn both to lowcase.
    parentStr.toLowerCase(); childStr.toLowerCase();
    Integer childStrIndex = 0;
    //Run over the parent string and if you found a match then keep comparing with the next
    //character in the child string.
    for (int index = 0 ; index < parentStr.length(); index++) {
        Character currChar = parentStr.charAt(index);
        if (childStr.length() <= childStrIndex)
            break;
        if (currChar.equals(childStr.charAt(childStrIndex)))
            childStrIndex++;
    }
    // If at the end you are in the last character of the child string, then is true.
    if (childStrIndex >= childStr.length())
        System.out.print(true);
    else
        System.out.print(false);
}

Hope this helps. BTW this sounds like homework.



回答2:

It could be as simple as this:

public boolean contains(final String base, final String search){
    final String baseLowerCase = base.toLowerCase(Locale.ENGLISH);
    for(final char c : search.toLowerCase(Locale.ENGLISH).toCharArray())
        if(baseLowerCase.indexOf(c) < 0)
            return false;
    return true;
}

For example: contains("This is a parent string", "isstr"); returns true.

What you are pretty much trying to do here is convert the String you're searching for in to a char[] in which you will be iterated upon. Then you want to check to see if the base String contains that char (using String#indexOf(char)). You want to return false on the first occurrence that it does not contain the char (meaning String#indexOf(char) returns a value < 0)



回答3:

Lets say "This is a parent string" is your parent string. and "isstr" is query string.

For case insensitive matching, convert both parent string and query string to lowercase.

You can split parent string into keywords and look for each keyword into the query string.

Reverse the query string ("isstr") and push it onto a stack, since you would like to preserve the order.

Stack<Character> stack = new Stack<Character>();
String reversedQueryString = new StringBuilder(queryString).reverse().toString();
for (char ch: reversedQueryString.toCharArray()) {
    stack.push(ch);
}

Pop out alphabets from the query string, when they match alphabets in the parent string. Stack is useful in this case, as we don't care if the same characters are found again.

String[] keywords = parentString.split(" "); \\ to split on spaces.
for(String keyword : keywords){
    processKeyword(keyword);
}

void processKeyword(String keyword){
    for (char c: keyword.toCharArray()) {
        if(stack.top().equals(c)){
            stackCheck();
        }  
    } 
}

void stackCheck(){
    if(!stack.isEmpty())
       stack.pop();
    else{
       System.out.println("Eureka");

    }
}

This is just an example, your implementation can vary. For example you might want to check for two characters in a keyword to believe its partially contains the query string.



回答4:

The search is going to involve backtracking, so if you are going to implement this by hand then a recursive solution is probably warranted.

But a simple approach would be to pre-process the input string into a regex, and then do a regex "find". For instance, if the search string is "instr" then the regex could be "[iI].*[nN].*[sS].*[tT].*[rR]".

Note that this search is inevitably going to be expensive, and even more so if it is done with a regex. In fact a naive implementation is O(M^N) where M is the length of the input string and N is the length of the search string.