I have to implement DBSCAN in Python, and estimating epsilon has been a problem: the method suggested in the original paper assumes a blob-like distribution of the dataset, whereas my data is closer to curve-fittable data with jumps at some intervals. The jumps cause DBSCAN to form different clusters in the intervals between jumps (which is good enough for me), but calculating eps dynamically for different datasets does not produce the desired results, since the points tend to lie on a straight line over many intervals, and changing the 'k' value causes a considerable change in the eps value.
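For context, the eps estimation I'm referring to is the sorted k-distance plot from the original paper. A minimal sketch of that heuristic (the dataset, the value of k, and the crude knee detector are all placeholder assumptions, not my real data):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical curve-like data with a jump, as a stand-in for the real dataset.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = np.sin(x) + np.where(x > 5, 3.0, 0.0) + rng.normal(0, 0.05, x.size)
X = np.column_stack([x, y])

k = 4  # plays the role of MinPts - 1; changing it shifts the knee considerably
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
dists, _ = nn.kneighbors(X)        # column 0 is each point's distance to itself
k_dist = np.sort(dists[:, k])      # sorted k-th nearest-neighbor distances

# Crude knee estimate: index of the largest second difference.
knee = np.argmax(np.diff(k_dist, 2)) + 1
eps = k_dist[knee]
print(eps)
```

On near-collinear data the k-distance curve stays flat for long stretches, so this knee (and therefore eps) moves a lot as k changes, which is exactly the instability described above.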
Answer 1:
Try the OPTICS algorithm; it does not require you to estimate eps.
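A minimal sketch with scikit-learn's `OPTICS` (the toy data and `min_samples` value are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import OPTICS

# Toy curve-like data with a jump, mimicking the dataset described in the question.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 300)
y = 0.5 * x + np.where(x > 5, 4.0, 0.0) + rng.normal(0, 0.1, x.size)
X = np.column_stack([x, y])

# min_samples plays the role of MinPts; no eps needs to be supplied up front.
labels = OPTICS(min_samples=10).fit_predict(X)
print(np.unique(labels))  # cluster ids; -1 marks noise points
```

Internally OPTICS builds a reachability ordering over all distance scales, so the jumps in the data show up as valleys in the reachability plot instead of requiring one global eps.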
I would also suggest recursive regression: use SciPy's scipy.optimize.curve_fit to find the best-fitting curve, then compute the RMS error of all points with respect to that curve. Remove the worst 'n' percent of points and repeat recursively until the RMS error is below your threshold.
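The loop above can be sketched as follows (the quadratic model, the drop fraction, and the threshold are placeholder assumptions; substitute whatever model fits your data):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    # Hypothetical model; replace with the curve shape appropriate to your data.
    return a * x**2 + b * x + c

def trim_fit(x, y, drop_frac=0.05, rms_threshold=0.2):
    """Fit, drop the worst drop_frac of points, repeat until RMS error is small."""
    while True:
        params, _ = curve_fit(model, x, y)
        resid = y - model(x, *params)
        rms = np.sqrt(np.mean(resid**2))
        if rms <= rms_threshold or x.size < 10:
            return params, x, y, rms
        # Keep the (1 - drop_frac) fraction of points with the smallest residuals.
        keep = np.argsort(np.abs(resid))[: int(x.size * (1 - drop_frac))]
        x, y = x[keep], y[keep]

# Demo: a clean quadratic with a few injected outliers.
rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.1, x.size)
y[::20] += 5.0
params, x_kept, y_kept, rms = trim_fit(x, y)
print(rms)
```

Each pass discards the points farthest from the current fit, so the surviving points are the ones that follow the curve between jumps.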