Publication

Proximity curves for potential-based clustering

Csenki, Attila
Torgunov, Denis
Micic, Natasha
Publication Date
2020
Rights
© The Author(s) 2019. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Peer-Reviewed
Yes
Open Access status
openAccess
Accepted for publication
2019
Abstract
The concept of the proximity curve and a new algorithm are proposed for obtaining clusters in a finite set of data points in finite-dimensional Euclidean space. Each point is endowed with a potential constructed by means of a multi-dimensional Cauchy density, contributing to an overall anisotropic potential function. Guided by the steepest descent algorithm, the data points are successively visited and removed one by one; at each stage the overall potential is updated and the magnitude of its local gradient is calculated. The result is a finite sequence of tuples, the proximity curve, whose pattern is analysed to give rise to a deterministic clustering. The finite set of all such proximity curves, in conjunction with a simulation study of their distribution, results in a probabilistic clustering represented by a distribution on the set of dendrograms. A two-dimensional synthetic data set is used to illustrate the proposed potential-based clustering idea. The results are shown to be plausible, since both the ‘geographic distribution’ of the data points and the ‘topographic features’ imposed by the potential function are well reflected in the suggested clustering. Experiments using the Iris data set are conducted for validation purposes on classification and clustering benchmark data. The results are consistent with the proposed theoretical framework and data properties, and open up new approaches and applications for considering data processing from different perspectives and for interpreting the contribution of data attributes to patterns.
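The visit-and-remove procedure summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the isotropic kernel, the `scale` and `step` parameters, the fixed-length descent steps, the rule for choosing the next point, and the names `cauchy_potential_gradient` and `proximity_curve` are all assumptions made for illustration only.

```python
import math


def cauchy_potential_gradient(x, centers, scale=1.0):
    """Gradient at x of U(x) = -sum_p (1 + |x - p|^2 / scale^2) ** (-(d + 1) / 2).

    Assumption: each point contributes an isotropic, unnormalised Cauchy-type
    well; the article's actual potential may differ.
    """
    d = len(x)
    grad = [0.0] * d
    for p in centers:
        diff = [xi - pi for xi, pi in zip(x, p)]
        q = 1.0 + sum(c * c for c in diff) / scale**2
        coeff = (d + 1) * q ** (-(d + 3) / 2) / scale**2
        for k in range(d):
            grad[k] += coeff * diff[k]
    return grad


def proximity_curve(points, start=0, scale=1.0, step=0.05, max_steps=1000):
    """Return a list of (point index, gradient magnitude) in visiting order.

    Hypothetical rule: remove the current point, record the gradient magnitude
    of the remaining potential at its location, follow fixed-length steepest
    descent steps, then jump to the nearest remaining point.
    """
    remaining = {i: tuple(p) for i, p in enumerate(points)}
    current = start
    curve = []
    while True:
        x = list(remaining.pop(current))
        if not remaining:
            # No potential is left once the last point has been removed.
            curve.append((current, 0.0))
            return curve
        centers = list(remaining.values())
        grad = cauchy_potential_gradient(x, centers, scale)
        curve.append((current, math.hypot(*grad)))
        for _ in range(max_steps):
            grad = cauchy_potential_gradient(x, centers, scale)
            norm = math.hypot(*grad)
            if norm == 0.0:
                break
            # Fixed-length step along the negative gradient.
            x = [xi - step * gi / norm for xi, gi in zip(x, grad)]
            if min(math.dist(x, p) for p in centers) <= step:
                break
        current = min(remaining, key=lambda i: math.dist(x, remaining[i]))


# Two small, well-separated groups of synthetic points (illustrative data).
demo_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
               (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
demo_curve = proximity_curve(demo_points)
```

In this toy setting, the recorded gradient magnitude drops sharply at the point where the walk leaves one group for the other, which is the kind of pattern in the curve that a clustering rule could exploit.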
Version
Published version
Citation
Csenki A, Neagu CD, Torgunov D et al (2020) Proximity curves for potential-based clustering. Journal of Classification. 37: 671-695.
Type
Article

Version History

Version   Date                  Summary
2*        2025-04-09 13:40:19   Edited author entries
1         2020-01-11 16:21:33
* Selected version