Can anyone please help me import this data into R from a text or .dat file? It is space-delimited, but city names should not be treated as two separate names, e.g. NEW YORK.
1 NEW YORK 7,262,700
2 LOS ANGELES 3,259,340
3 CHICAGO 3,009,530
4 HOUSTON 1,728,910
5 PHILADELPHIA 1,642,900
6 DETROIT 1,086,220
7 SAN DIEGO 1,015,190
8 DALLAS 1,003,520
9 SAN ANTONIO 914,350
10 PHOENIX 894,070
Expanding on @Hugh's answer, I would try the following, although it's not particularly efficient.
A variation on a theme... but first, some sample data:
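A minimal sketch of how the sample data could be set up, using the rows from the question. The names `x` and `tmp` are hypothetical, and the temporary file only stands in for the real text or .dat file.

```r
# Rows copied from the question; `x` is an illustrative name
x <- c(
  "1 NEW YORK 7,262,700",
  "2 LOS ANGELES 3,259,340",
  "3 CHICAGO 3,009,530",
  "4 HOUSTON 1,728,910",
  "5 PHILADELPHIA 1,642,900",
  "6 DETROIT 1,086,220",
  "7 SAN DIEGO 1,015,190",
  "8 DALLAS 1,003,520",
  "9 SAN ANTONIO 914,350",
  "10 PHOENIX 894,070"
)

# Write the rows to a temporary file that stands in for the real input file
tmp <- tempfile(fileext = ".txt")
writeLines(x, tmp)
```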
Step 1: Read the data in with `readLines`.
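For illustration, this step might look like the following. It assumes a file with the layout shown in the question; here a three-row temporary file (hypothetical name `tmp`) stands in for it.

```r
# Hypothetical stand-in for the real file: three rows from the question
tmp <- tempfile(fileext = ".txt")
writeLines(c("1 NEW YORK 7,262,700",
             "2 LOS ANGELES 3,259,340",
             "3 CHICAGO 3,009,530"), tmp)

# readLines() returns a character vector with one element per line
x <- readLines(tmp)
x
```

At this point nothing has been split into columns yet; each element of `x` is still one whole line.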
Step 2: Figure out a regular expression that you can use to insert delimiters. Here, the pattern seems to be (looking from the end of the lines) a set of numbers and commas, preceded by a space, preceded by some words in ALL CAPS. We can capture those groups and insert some tab delimiters (`\t`). The extra backslashes are there to escape them properly.
Step 3: Since we know our `gsub` is working, and we know that `read.delim` has a `text` argument that can be used instead of a `file` argument, we can use `read.delim` directly on the result of `gsub`.
One possible last step would be to convert the third column to numeric.
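Steps 2 and 3, plus the numeric conversion, could be sketched as below. The exact pattern from the original answer is not shown, so the regular expression here is an assumption that fits the sample rows (a leading number, words in capitals, then digits and commas). The column names are made up for illustration.

```r
# Three rows from the question, as they would come out of readLines()
x <- c("1 NEW YORK 7,262,700",
       "2 LOS ANGELES 3,259,340",
       "3 CHICAGO 3,009,530")

# Step 2: capture the three groups and put a tab between them.
# "\\1" is a backreference; the doubled backslash is R string escaping.
y <- gsub("^([0-9]+) ([A-Z ]+) ([0-9,]+)$", "\\1\t\\2\t\\3", x)

# Step 3: read.delim() splits on tabs; `text` replaces the `file` argument.
# The column names are hypothetical.
df <- read.delim(text = y, header = FALSE,
                 col.names = c("rank", "city", "population"),
                 stringsAsFactors = FALSE)

# Last step: drop the thousands separators and convert to numeric
df$population <- as.numeric(gsub(",", "", df$population))
str(df)
```

The pattern assumes city names contain only capital letters and spaces; names with hyphens, periods, or apostrophes would need a wider character class.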
For your particular data, where spaces inside a field occur only between capital letters, consider using a regular expression to replace those spaces with another character.
You can then interpret the remaining spaces as field separators.
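One way this idea might be written, as a sketch rather than the answer's original code: the underscore placeholder and the column names are assumptions, and the lookaround pattern needs `perl = TRUE`.

```r
# Three rows from the question
x <- c("1 NEW YORK 7,262,700",
       "7 SAN DIEGO 1,015,190",
       "10 PHOENIX 894,070")

# Replace any space that sits between two capital letters with "_"
# (assumed placeholder; pick a character that never occurs in the data)
y <- gsub("(?<=[A-Z]) (?=[A-Z])", "_", x, perl = TRUE)

# The remaining spaces are real separators, so read.table()'s default works
df <- read.table(text = y, header = FALSE,
                 col.names = c("rank", "city", "population"),
                 stringsAsFactors = FALSE)

# Restore the spaces in the city names and make the population numeric
df$city <- gsub("_", " ", df$city)
df$population <- as.numeric(gsub(",", "", df$population))
df
```

This relies on the numbers never touching a capital letter directly, which holds for the rows shown in the question.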