I have input such as "(50.1003781N, 14.3925125E)". These are latitude and longitude.
I want to parse this with
sscanf(string,"(%lf%c, %lf%c)",&a,&b,&c,&d);
but when %lf sees E after the number, it consumes it and tries to read the number in exponential form. Is there a way to disable this?
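A minimal sketch that reproduces the problem, for reference. The function names scan_latlon_naive() and show_naive_scan() are hypothetical, and the exact outcome depends on the C library: the call may report fewer than four converted fields, or it may report four with something other than E stored in the last character.

```c
#include <stdio.h>

/* Naive attempt from the question: returns the number of fields
   that sscanf() managed to convert. */
int scan_latlon_naive(const char *s, double *lat, char *ns, double *lon, char *ew)
{
    return sscanf(s, "(%lf%c, %lf%c)", lat, ns, lon, ew);
}

/* Hypothetical helper: print what the naive scan produced for one input. */
void show_naive_scan(const char *s)
{
    double lat = 0.0, lon = 0.0;
    char ns = '?', ew = '?';
    int n = scan_latlon_naive(s, &lat, &ns, &lon, &ew);

    printf("fields=%d lat=%f ns=%c lon=%f ew=%c\n", n, lat, ns, lon, ew);
}
```

Calling show_naive_scan() on an input ending in W and on one ending in E makes the difference visible: only the W form comes back with the hemisphere letter intact.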
I think you'll need to do manual parsing, probably using strtod(). This shows that strtod() behaves sanely when it comes up against the trailing E (at least on Mac OS X 10.10.3 with GCC 4.9.1, but likely everywhere).
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char latlong[] = "(50.1003781N, 14.3925125E)";
    char *eptr;
    double d;

    errno = 0; // Necessary in general, but probably not necessary at this point
    d = strtod(&latlong[14], &eptr);
    if (eptr != &latlong[14])
        printf("PASS: %10.7f (%s)\n", d, eptr);
    else
        printf("FAIL: %10.7f (%s) - %d: %s\n", d, eptr, errno, strerror(errno));
    return 0;
}
Compilation and run:
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror latlong.c -o latlong
$ ./latlong
PASS: 14.3925125 (E))
$
Basically, you'll skip white space, check for an (, strtod() a number, check for N or S or lower case versions, comma, strtod() a number, check for W or E, check for ), maybe allowing white space before it.
Upgraded code, with a moderately general strtolatlon() function based on strtod() et al. The 'const cast' is necessary in functions such as strtod() which take a const char * input and return a pointer into that string via a char **eptr variable.
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CONST_CAST(type, value) ((type)(value))
extern int strtolatlon(const char *str, double *lat, double *lon, char **eptr);

int strtolatlon(const char *str, double *lat, double *lon, char **eptr)
{
    const char *s = str;
    char *end;

    // Cast to unsigned char: passing a negative char to isspace() is undefined
    while (isspace((unsigned char)*s))
        s++;
    if (*s != '(')
        goto error;
    *lat = strtod(++s, &end);
    if (s == end || *lat > 90.0 || *lat < 0.0)
        goto error;
    int c = toupper((unsigned char)*end++);
    if (c != 'N' && c != 'S') // I18N
        goto error;
    if (c == 'S')
        *lat = -*lat;
    if (*end != ',')
        goto error;
    s = end + 1;
    *lon = strtod(s, &end); // strtod() skips the white space after the comma
    if (s == end || *lon > 180.0 || *lon < 0.0)
        goto error;
    c = toupper((unsigned char)*end++);
    if (c != 'W' && c != 'E') // I18N
        goto error;
    // Note: east is stored as negative here, the reverse of the common
    // east-positive convention; fmt_latlon() uses the same mapping
    if (c == 'E')
        *lon = -*lon;
    if (*end != ')')
        goto error;
    if (eptr != 0)
        *eptr = end + 1;
    return 0;

error:
    if (eptr != 0)
        *eptr = CONST_CAST(char *, str);
    errno = EINVAL;
    return -1;
}

int main(void)
{
    const char latlon1[] = "(50.1003781N, 14.3925125E)";
    const char latlon2[] = " (50.1003781N, 14.3925125E) is the position!";
    char *eptr;
    double d;

    errno = 0; // Necessary in general, but probably not necessary at this point
    d = strtod(&latlon1[14], &eptr);
    if (eptr != &latlon1[14])
        printf("PASS: %10.7f (%s)\n", d, eptr);
    else
        printf("FAIL: %10.7f (%s) - %d: %s\n", d, eptr, errno, strerror(errno));

    printf("Converting <<%s>>\n", latlon2);
    double lat;
    double lon;
    int rc = strtolatlon(latlon2, &lat, &lon, &eptr);
    if (rc == 0)
        printf("Lat: %11.7f, Lon: %11.7f; trailing material: <<%s>>\n", lat, lon, eptr);
    else
        printf("Conversion failed\n");
    return 0;
}
Sample output:
PASS: 14.3925125 (E))
Converting << (50.1003781N, 14.3925125E) is the position!>>
Lat: 50.1003781, Lon: -14.3925125; trailing material: << is the position!>>
That is not comprehensive testing, but it is illustrative and close to production quality. You might need to worry about infinities, for example, in true production code. I don't often use goto, but this is a case where the use of goto simplified the error handling. You could write the code without it; if I had more time, maybe I would upgrade it. However, with seven places where errors are diagnosed and four lines required for reporting the error, the goto provides reasonable clarity without great repetition.
Note that the strtolatlon() function explicitly identifies errors via its return value; there is no need to guess whether it succeeded or not. You can enhance the error reporting if you wish to identify where the error is. But doing that depends on your error reporting infrastructure in a way this does not.
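As one possible way of identifying where the error is, here is a minimal sketch that returns a distinct code for each failed check and reports the position at which parsing stopped. The enum, its constants and the name strtolatlon_diag() are hypothetical, not part of the code above; the checks and the sign convention (south and east negative) simply mirror strtolatlon().

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical error codes, one per check made by the parser. */
enum latlon_err {
    LL_OK = 0, LL_NO_OPEN, LL_BAD_LAT, LL_BAD_NS,
    LL_NO_COMMA, LL_BAD_LON, LL_BAD_EW, LL_NO_CLOSE
};

/* Same grammar as strtolatlon(), but the return value says which check
   failed and *where (if non-null) points at the offending character.
   On success *where points just past the closing parenthesis. */
enum latlon_err strtolatlon_diag(const char *str, double *lat, double *lon,
                                 const char **where)
{
    enum latlon_err rc = LL_OK;
    const char *s = str;
    char *end;
    int c;

    while (isspace((unsigned char)*s))
        s++;
    if (*s != '(') { rc = LL_NO_OPEN; goto done; }
    s++;
    *lat = strtod(s, &end);
    /* The negated range test also rejects NaN. */
    if (end == s || !(*lat >= 0.0 && *lat <= 90.0)) { rc = LL_BAD_LAT; goto done; }
    s = end;
    c = toupper((unsigned char)*s);
    if (c != 'N' && c != 'S') { rc = LL_BAD_NS; goto done; }
    if (c == 'S')
        *lat = -*lat;
    s++;
    if (*s != ',') { rc = LL_NO_COMMA; goto done; }
    s++;
    *lon = strtod(s, &end);   /* strtod() skips white space after the comma */
    if (end == s || !(*lon >= 0.0 && *lon <= 180.0)) { rc = LL_BAD_LON; goto done; }
    s = end;
    c = toupper((unsigned char)*s);
    if (c != 'W' && c != 'E') { rc = LL_BAD_EW; goto done; }
    if (c == 'E')             /* east negative, as in strtolatlon() */
        *lon = -*lon;
    s++;
    if (*s != ')') { rc = LL_NO_CLOSE; goto done; }
    s++;

done:
    if (where != 0)
        *where = s;
    return rc;
}
```

A caller could map the code to a message and use the offset (where - str) to point at the problem column; how that is reported still depends on the surrounding error infrastructure.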
Also, the strtolatlon() function will accept some odd-ball formats such as (+0.501003781E2N, 143925125E-7E). If that's a problem, you'll need to write your own fussier variant of strtod() that only accepts fixed-point notation. On the other hand, there's a meme/guideline "Be generous in what you accept; be strict in what you produce" (often called Postel's law or the robustness principle). That implies that what's here is more or less OK (it might be good to allow optional white space before the N, S, E, W letters, the comma and the close parenthesis). The converse code, latlontostr() or fmt_latlon() (with strtolatlon() renamed to scn_latlon(), perhaps) or whatever, would be careful about what it produces, only generating upper-case letters, and always using the fixed format, etc.
#include <assert.h>
#include <stdio.h>

int fmt_latlon(char *buffer, size_t buflen, double lat, double lon, int dp)
{
    assert(dp >= 0 && dp < 15);
    assert(lat >= -90.0 && lat <= 90.0);
    assert(lon >= -180.0 && lon <= 180.0);
    assert(buffer != 0 && buflen != 0);
    char ns = 'N';
    if (lat < 0.0)
    {
        ns = 'S';
        lat = -lat;
    }
    // Same mapping as strtolatlon(): negative longitude is east
    char ew = 'W';
    if (lon < 0.0)
    {
        ew = 'E';
        lon = -lon;
    }
    int nbytes = snprintf(buffer, buflen, "(%.*f%c, %.*f%c)", dp, lat, ns, dp, lon, ew);
    // snprintf() returns the length it wanted to write; treat truncation as failure
    if (nbytes < 0 || (size_t)nbytes >= buflen)
        return -1;
    return 0;
}
Note that 1 unit at 7 decimal places of a degree (10⁻⁷°) corresponds to about a centimetre on the ground (oriented along a meridian; the distance represented by a degree along a parallel of latitude varies with the latitude, of course).
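For the fussier variant of strtod() mentioned earlier, one possible shape is sketched below. The name strtod_fixed() and the 64-byte scratch buffer are assumptions made for illustration; the function accepts only digits with at most one decimal point, so signs, exponents, hexadecimal, "inf" and "nan" are all rejected. It also assumes the "C" locale, where the decimal point is '.'.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-point-only variant of strtod().
   Accepts forms such as "14", "14.", ".5" and "14.3925125".
   On failure it returns 0.0 and sets *endptr to str. */
double strtod_fixed(const char *str, char **endptr)
{
    const char *p = str;
    size_t ndigits = 0;
    char buf[64];   /* assumed to be large enough for sensible input */

    while (isdigit((unsigned char)*p)) {
        p++;
        ndigits++;
    }
    if (*p == '.') {
        p++;
        while (isdigit((unsigned char)*p)) {
            p++;
            ndigits++;
        }
    }

    size_t len = (size_t)(p - str);
    if (ndigits == 0 || len >= sizeof(buf)) {
        if (endptr != 0)
            *endptr = (char *)str;
        return 0.0;
    }

    /* Convert only the validated prefix, so a following E is never
       seen by strtod() and cannot be taken for an exponent. */
    memcpy(buf, str, len);
    buf[len] = '\0';
    if (endptr != 0)
        *endptr = (char *)p;
    return strtod(buf, 0);
}
```

Unlike strtod(), this sketch does not skip leading white space, so a caller such as strtolatlon() would need to skip the blank after the comma itself before calling it.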
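Process the string first using
char *p;
// string must be a modifiable array, not a string literal.
// Replace E/e with a placeholder that cannot be mistaken for an exponent
// or for a real hemisphere letter (using 'W' here would turn a genuine W into E).
while((p = strchr(string, 'E')) != NULL) *p = '#';
while((p = strchr(string, 'e')) != NULL) *p = '#';
// scan it using your approach
sscanf(string,"(%lf%c, %lf%c)",&a,&b,&c,&d);
// get back the original characters (converted to uppercase).
if (b == '#') b = 'E';
if (d == '#') d = 'E';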
strchr() is declared in the C header <string.h>.
Note: This is really a C approach, not a C++ approach. But, by using sscanf() you are really using a C approach.
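Wrapped up as a function, the substitution idea might look like the sketch below. The name scan_latlon_subst() is hypothetical, and '#' is used as the placeholder (chosen here because it cannot be confused with a real hemisphere letter or with part of a number). The buffer must be writable and is left modified after the call.

```c
#include <stdio.h>
#include <string.h>

/* Sketch of the substitution approach: hide every E/e from %lf, scan,
   then restore the letter in the scanned results.
   Returns the number of fields sscanf() converted (4 on success). */
int scan_latlon_subst(char *buf, double *lat, char *ns, double *lon, char *ew)
{
    /* Replace E and e in place so that %lf stops at the end of the digits. */
    for (char *p = buf; *p != '\0'; p++) {
        if (*p == 'E' || *p == 'e')
            *p = '#';
    }

    int n = sscanf(buf, "(%lf%c, %lf%c)", lat, ns, lon, ew);

    /* Map the placeholder back to an upper-case E where it was scanned. */
    if (n >= 2 && *ns == '#')
        *ns = 'E';
    if (n == 4 && *ew == '#')
        *ew = 'E';
    return n;
}
```

Note that the return value of sscanf() only counts converted fields, so it does not confirm that the closing parenthesis was present; the caller still has to validate the hemisphere letters and the ranges.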