I have a R data.frame containing longitude, latitude which spans over the entire USA map. When X number of entries are all within a small geographic region of say a few degrees longitude & a few degrees latitude, I want to be able to detect this and then have my program then return the coordinates for the geographic bounding box. Is there a Python or R CRAN package that already does this? If not, how would I go about ascertaining this information?
相关问题
- how to define constructor for Python's new Nam
- streaming md5sum of contents of a large remote tar
- How to get the background from multiple images by
- Evil ctypes hack in python
- Correctly parse PDF paragraphs with Python
I was able to combine Joran's answer along with Dan H's comment. This is an example ouput:
The python code emits functions for R: map() and rect(). This USA example map was created with:
and then you can apply the rect()'s accordingly from with in the R GUI interpreter (see below).
Here is an example TSV file (site.tsv)
With my data set, the output of my python script, shown on the USA map. I changed the colors for clarity.
Addition on 2013-05-01 for Yacob
These 2 lines give you the over all goal...
If you want to narrow in on a portion of a map, you can use
ylim
andxlim
You will want to use the 'world' map...
It has been a long time since I have used this python code I have posted below so I will try my best to help you.
Here is a complete example. The TSV file is located on pastebin.com. I have also included an image generated from R that contains the output of all of the rect() commands.
A few ideas:
Each of these has different costs in dollars and time (in the learning curve)... and different degrees of geospatial accuracy. You have to pick what suits your budget and/or requirements.
maybe something like
if you use shapely, you could extend my cluster_points function to return the bounding box of the cluster via the .bounds property of the shapely geometry , for example like this:
I'm doing this on a regular basis by first creating a distance matrix and then running clustering on it. Here is my code.
I'm not sure if it completely solves your problem. You might want to test with different k, and also perhaps do a second run of clustering of some of the first clusters in case they are too big, like if you have one point in Minnesota and a thousand in California. When you have the points.to.group$group, you can get the bounding boxes by finding max and min lat lon per group.
If you want X to be 20, and you have 18 points in New York and 22 in Dallas, you must decide if you want one small and one really big box (20 points each), if it is better to have have the Dallas box include 22 points, or if you want to split the 22 points in Dallas to two groups. Clustering based on distance can be good in some of these cases. But it of course depend on why you want to group the points.
/Chris