Convert string to number & vice versa complexity

2019-03-19 03:51发布

问题:

What would be the complexity of converting a string to its equivalent number or vice versa? Does it change depending on programming language?

On the face of it, one needs to traverse the entire string to convert it to a number, so it is O(n), or is some typecasting used?

This doubt came about when I was writing a routine to check if a given number is a palindrome or not. One approach would be to keep dividing the number by the base (here 10), accumulate digits, and put them together at the end. Example: 309/10=rem(9), 30/10=rem(0), 3/10=rem(3). we get 903.

Another approach I took was to convert this number into a string, and since strings have loads of member functions to split, reverse etc., the code was a lot shorter and cleaner, but is this the best way to do this?

回答1:

Numeric strings are numbers formatted in positional notation - so the value of each digit multiplied by a power of the base needs to be taken into account in order to convert the number into a binary format.

So yeah, it's an O(N) operation because the running time increases linearly as more digits are added. However, in practice N may be limited by whatever numeric data-types the language supports (e.g. int32_t, int64_t). But if arbitrary precision number types are used (which some languages, like Python, use by default) then there is no limit to the number of digits (other than available memory obviously).



回答2:

To convert to a number you always have to read all digits. So it's at least O(n).

Now doing something like (pseudocode)

a = 0
foreach digit in string
do
   a = 10 * a + digit
end

Is O(n). So the complexity is O(n)



回答3:

If you're converting a number N to a string. It takes O(log(N)) with base 10. (If you divide by 10 and keep the remainder) If you're converting a string with length N, then it takes O(N). (If you use the algorithm whic keeps adding to your number 10^(N)*digit(N))

If you use functions which are not yours (let's say for string), you can only expect them to be slower.



回答4:

I'm reasonably certain that dealing with pure numeric operators (in c++ and c# I think it would be the "%" modulus operator) is going to be more efficient if coded correctly because at some level you have to check for similar features (does the end match the beginning) and performing a conversion between string and number can only add to the complexity of the operation if you can do the same thing without performing that conversion.

That said, I would not worry about the performance impact of converting between numbers and strings because it's probably insignificant compared to the performance impact of most other areas of a program. Numeric types are limited to 64 bits, which puts a relatively low cap on the number of digits you can plan to parse anyway, unless you are implementing/using custom-coded large-number types.

You don't have to worry about the complexity being O(n) where n is the magnitude of the number. It would be more like O(n) where n is the number of digits (which has the low cap I mentioned) or (as mentioned in another reply) O(log(n)) if n is the magnitude of the number. Relatively insignificant performance impact.

Now if, as you suggest, you have no limit on N (which is not possible because with 2 GB of RAM you can only store numbers with up to 2 billion digits), then we might have to think more about the performance of performing mathematical operators. Consider the performance of the "%" and "/" operator on this large number type. But then realize that in order to convert the number to a string, it's basically using those same operators anyway. Once again, you can't beat handling it as a number directly if you do it right.



回答5:

C# and C/C++ don't have any special information in the strings which represents the (possible) numerical value. Therefore they need to parse the string digit by digit when converting.

However, the number of digits is limited, so we've got just O(1): the conversion time is bounded (usually by the conversion of the biggest number). For a 32-bit int, the conversion has to consider maximum of 10 decimal digits (and possibly a sign).

Conversion from string is actually O(1) as well, because during parsing of it, it's enough to consider only a bounded number of characters (10+1 in case of 32-bit int).

Strictly speaking, we cannot use O-notation for the case of int-to-string conversion, since the maximal value of int is bounded. Anyway, the time needed for conversion (in both directions) is limited by a constant.

As @Charles suggests, other languages (Python) actually can use arbitrary-precision numbers. For parsing such numbers, the time is O(number of digits), which O(string length) and O(log(number)) for both conversions, respectively. With arbitrary-precision numbers, one cannot do it faster, since for both conversions every digit must be considered. For the conversions to/from a limited-precision numbers, the same O(1) reasoning applies. However I didn't profile the parsing in Python myself, so maybe a less efficient algorithm is used there.


EDIT: following the @Steve suggestion, I checked that parsing in C/C++ and C# skips the initial whitespace, so the time for string->int conversion is actually O(input length). In case it's known that the string is trimmed, the conversion is again O(1).