I have created a function that should find the numerical position of the first character of a substring in a larger string. I am having some problems with the output and I am not too sure why. These problems include -1 being returned every single time instead of the integer position of the substring. I have debugged and cannot trace where the function goes wrong.
This is how the function should perform: If my string is "The dog was fast" and I am searching for the substring "dog", the function should return 4. Thanks to chqrlie for help with the loop.
Here is the function:
int findSubString(char original[], char toFind[]) {
size_t i, j;
int originalLength = 0;
int toFindLength = 0;
originalLength = strlen(original) + 1;
toFindLength = strlen(toFind) + 1;
for (i = 0; i < toFindLength + 1; i++) {
for (j = 0; j < originalLength + 1; j++) {
if (toFind[j] == '\0') {
return i;
}
if (original[i + j] != toFind[j]) {
break;
}
}
if (original[i] == '\0') {
return -1;
}
}
}
The function parameters cannot be modified, this is a requirement. Any help appreciated!
These statements inside the loops
if (toFind[j] == '\0') {
return i;
}
results in undefined behavior because the string toFind
can be shorter than the string original
.
The same is valid for this loop
if (original[i + j] != toFind[j]) {
break;
}
because i + j
can be greater than the length of the string original
.
And there is no need to scan all characters of the string original
if you are going to find a substring inside it.
Also you should check whether the length of the string original
is not less than the length of the string toFind
.
If you want to find only the first character of the string toFind
in the string original
it is enough to use standard C function strchr
. If you want to find the whole string toFind
in the string original
then you could use another C standard function strstr
.
If you want to write the function yourself to find a string in other string then it can look for example the following way
I declared the function like
long long int findSubString( const char original[], const char toFind[] );
however you can write its declaration as you like for example like
int findSubString( char original[], char toFind[] );
But in this case you should declare function local variable success
like
int success = -1;
and output the result using format specifier "%d"
instead of "%lld"
.
Here you are.
#include <stdio.h>
#include <string.h>
#include <stddef.h>
long long int findSubString( const char original[], const char toFind[] )
{
size_t n = strlen( original );
size_t m = strlen( toFind );
long long int success = -1;
if ( !( n < m ) )
{
n = n - m + 1;
for ( size_t i = 0; success == -1 && i < n; i++ )
{
size_t j = 0;
while ( j < m && original[i+j] == toFind[j] ) j++;
if ( j == m ) success = i;
}
}
return success;
}
int main(void)
{
printf( "%lld\n", findSubString( "The dog was fast", "dog" ) );
return 0;
}
Its output is
4
Your loops are reversed. The outer loop should walk positions from zero to originalLength
, inclusive; the nested loop should walk positions from zero to toFindLength
, inclusive.
Both originalLength
and toFindLength
should be set to values returned by strlen
, not strlen
plus one, because null terminator position is not a good start.
Finally, you are returning -1
from inside the outer loop. This is too early - you should be returning -1
only after you are done with the outer loop as well.
Your loop counter tests are incorrect: wrong upper limit and the limits are off by one. Note that the tests are actually not necessary as you exit both loops when hitting the '\0'
terminators.
Here is a simpler version:
int findSubString(const char *original, const char *toFind) {
for (size_t i = 0;; i++) {
for (size_t j = 0;; j++) {
if (toFind[j] == '\0') {
return i;
}
if (original[i + j] != toFind[j]) {
break;
}
}
if (original[i] == '\0') {
return -1;
}
}
}
There is a small advantage at computing the string lengths to reduce the number of comparisons in pathological cases such as findSubString("aaaaaaaaaaa", "aaaaaaaaaaaa");
int findSubString(const char *original, const char *toFind) {
size_t originalLength = strlen(original);
size_t toFindLength = strlen(toFind);
if (toFindLength <= originalLength) {
for (size_t i = 0; i <= originalLength - toFindLength; i++) {
for (size_t j = 0;; j++) {
if (toFind[j] == '\0') {
return i;
}
if (original[i + j] != toFind[j]) {
break;
}
}
}
}
return -1;
}