Count number of points within certain distance ran

2019-07-31 01:45发布

问题:

I have the following, which gives me the number of customers within 10,000 meters of any store location:

SELECT COUNT(*) as customer_count FROM customer_table c
WHERE EXISTS(
SELECT 1 FROM locations_table s
WHERE ST_Distance_Sphere(s.the_geom, c.the_geom) < 10000
)

What I need is for this query to return not only the number of customers within 10,000 meters, but also the following. The number of customers within...

  1. 10,000 meters
  2. more than 10,000, but less than 50,000
  3. more than 50,000, but less than 10,0000
  4. more than 100,000

...of any location.

I'm open to this working a couple of ways. For a given customer, only count them one time (the shortest distance to any store), which would count everyone exactly once. I realize this is probably pretty complex. I'm also open to having people be counted multiple times, which is really the accurate values anyway and think should be much simpler.

Thanks for any direction.

回答1:

You can do both types of queries relatively easily. But an issue here is that you do not know which customers are associated with which store locations, which seems like an interesting thing to know. If you want that, use the PK and store_name of the locations_table in the query. See both options with location id and store_name below. To emphasize the difference between the two options:

  • The first option indicates how many customers are in every distance class for every store location, for all customers for every store location.
  • The second option indicates how many customers are in every distance class for every store location, for the nearest store location for each customer only.

This is a query of O(n x m) running order (implemented with the CROSS JOIN between customer_table and locations_table) and likely to become rather slow with increasing numbers of rows in either table.

Count customers in all distance classes

You should make a CROSS JOIN between the distances of customers from store locations and then group them by the store location id, name and classes of maximum distance that you define. You can create a "table" from your distance classes with the VALUES command which you can then simply use in any query:

SELECT loc_dist.id, loc_dist.store_name, grps.grp, count(*)
FROM (
    SELECT s.id, s.store_name, ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist
    FROM customer_table c, locations_table s) AS loc_dist
JOIN (
    VALUES(1, 10000.), (2, 50000.), (3, 100000.), (4, 1000000.)
  ) AS grps(grp, dist) ON loc_dist.dist < grps.dist
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;

Count customers in the nearest distance class

If you want customers listed in the nearest distance class only, then you should make the same CROSS JOIN on customer_table and locations_table as in the previous case, but then simply select the lowest group (i.e. the closest store) using a CASE clause in the query and GROUP BY store location id, name and distance class as before:

SELECT 
  id, store_name,
  CASE
    WHEN dist <  10000. THEN 1
    WHEN dist <  50000. THEN 2
    WHEN dist < 100000. THEN 3
    ELSE 4
  END AS grp,
  count(*)
FROM (
    SELECT s.id, s.store_name, ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist
    FROM customer_table c, locations_table s) AS loc_dist
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;