find the count of substring in string

2019-03-15 13:48发布

I have to find the count of a substring in a string using the C language. I'm using the function strstr but it only finds the first occurrence.

My idea of the algorithm is something like searching in the string while strstr does not return null and to substring the main string on each loop. My question is how to do that?

5条回答
何必那么认真
2楼-- · 2019-03-15 14:30
/* 
 * C Program To Count the Occurence of a Substring in String 
 */
#include <stdio.h>
#include <string.h>

char str[100], sub[100];
int count = 0, count1 = 0;

void main()
{
    int i, j, l, l1, l2;

    printf("\nEnter a string : ");
    scanf("%[^\n]s", str);

    l1 = strlen(str);

    printf("\nEnter a substring : ");
    scanf(" %[^\n]s", sub);

    l2 = strlen(sub);

    for (i = 0; i < l1;)
    {
        j = 0;
        count = 0;
        while ((str[i] == sub[j]))
        {
            count++;
            i++;
            j++;
        }
        if (count == l2)
        {
            count1++;                                   
            count = 0;
        }
        else
            i++;
    }    
    printf("%s occurs %d times in %s", sub, count1, str);
}
查看更多
\"骚年 ilove
3楼-- · 2019-03-15 14:36

The results can be different depending whether you allow an overlap or not:

// gcc -std=c99
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static int
count_substr(const char *str, const char* substr, bool overlap) {
  if (strlen(substr) == 0) return -1; // forbid empty substr

  int count = 0;
  int increment = overlap ? 1 : strlen(substr);
  for (char* s = (char*)str; (s = strstr(s, substr)); s += increment)
    ++count;
  return count;
}

int main() {
  char *substrs[] = {"a", "aa", "aaa", "b", "", NULL };
  for (char** s = substrs; *s != NULL; ++s)
    printf("'%s' ->  %d, no overlap: %d\n", *s, count_substr("aaaaa", *s, true),
       count_substr("aaaaa", *s, false));
}

Output

'a' ->  5, no overlap: 5
'aa' ->  4, no overlap: 2
'aaa' ->  3, no overlap: 1
'b' ->  0, no overlap: 0
'' ->  -1, no overlap: -1
查看更多
萌系小妹纸
4楼-- · 2019-03-15 14:39

You could do something like

int count = 0;
const char *tmp = myString;
while(tmp = strstr(tmp, string2find))
{
   count++;
   tmp++;
}

That is, when you get a result, start searching again at the next position of the string.

strstr() doesn't only work starting from the beginning of a string but from any position.

查看更多
Root(大扎)
5楼-- · 2019-03-15 14:39

USE KMP and you can do it in O(n)

int fail[LEN+1];
char s[LEN];
void getfail()
{
    //f[i+1]= max({j|s[i-j+1,i]=s[0,j-1],j!=i+1})
    //the correctness can be proved by induction
    for(int i=0,j=fail[0]=-1;s[i];i++)
    {
        while(j>=0&&s[j]!=s[i]) j=fail[j];
        fail[i+1]=++j;
        if (s[i+1]==s[fail[i+1]]) fail[i+1]=fail[fail[i+1]];//optimizing fail[]
    }
}

int kmp(char *t)// String s is pattern and String t is text!
{
    int cnt=0;
    for(int i=0,j=0;t.s[i];i++)
    {
        while(j>=0&&t.s[i]!=s[j]) j=fail[j];
        if (!s[++j])
        {
            j=fail[j];
            cnt++;
        }
    }
    return cnt;// how many times s appeared in t.
}

You can see wiki pages for details

查看更多
倾城 Initia
6楼-- · 2019-03-15 14:46

Should already processed parts of the string should be consumed or not?

For example, what's the expect answer for case of searching oo in foooo, 2 or 3?

  • If the latter (we allow substring overlapping, and the answer is three), then Joachim Isaksson suggested the right code.

  • If we search for distinct substrings (the answer should be two), then see the code below (and online example here):

    char *str = "This is a simple string";
    char *what = "is";
    
    int what_len = strlen(what);
    int count = 0;
    
    char *where = str;
    
    if (what_len) 
        while ((where = strstr(where, what))) {
            where += what_len;
            count++;
        }
    
查看更多
登录 后发表回答