BRADFORD SCHOLARS

    • Sign in
    View Item 
    •   Bradford Scholars
    • Engineering and Informatics
    • Engineering and Informatics Publications
    • View Item
    •   Bradford Scholars
    • Engineering and Informatics
    • Engineering and Informatics Publications
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Browse

    All of Bradford ScholarsCommunitiesAuthorsTitlesSubjectsPublication DateThis CollectionAuthorsTitlesSubjectsPublication Date

    My Account

    Sign in

    HELP

    Bradford Scholars FAQsCopyright Fact SheetPolicies Fact SheetDeposit Terms and ConditionsDigital Preservation Policy

    Statistics

    Most Popular ItemsStatistics by CountryMost Popular Authors

    Classification of heterogeneous data based on data type impact of similarity

    • CSV
    • RefMan
    • EndNote
    • BibTex
    • RefWorks
    Thumbnail
    View/Open
    Neagu_ACIS_Chapter.pdf (574.6Kb)
    Download
    Publication date
    2019
    Author
    Ali, N.
    Neagu, Daniel
    Trundle, Paul R.
    Keyword
    Heterogeneous datasets
    Similarity measures
    Two-dimensional similarity space
    Classification algorithms
    Rights
    © Springer Nature Switzerland AG 2019. Reproduced in accordance with the publisher's self-archiving policy. The final publication is available at Springer via https://doi.org/10.1007/978-3-319-97982-3_21.
    Peer-Reviewed
    Yes
    
    Metadata
    Show full item record
    Abstract
    Real-world datasets are increasingly heterogeneous, showing a mixture of numerical, categorical and other feature types. The main challenge for mining heterogeneous datasets is how to deal with heterogeneity present in the dataset records. Although some existing classifiers (such as decision trees) can handle heterogeneous data in specific circumstances, the performance of such models may be still improved, because heterogeneity involves specific adjustments to similarity measurements and calculations. Moreover, heterogeneous data is still treated inconsistently and in ad-hoc manner. In this paper, we study the problem of heterogeneous data classification: our purpose is to use heterogeneity as a positive feature of the data classification effort by using consistently the similarity between data objects. We address the heterogeneity issue by studying the impact of mixing data types in the calculation of data objects’ similarity. To reach our goal, we propose an algorithm to divide the initial data records based on pairwise similarity for classification subtasks with the aim to increase the quality of the data subsets and apply specialized classifier models on them. The performance of the proposed approach is evaluated on 10 publicly available heterogeneous data sets. The results show that the models achieve better performance for heterogeneous datasets when using the proposed similarity process.
    URI
    http://hdl.handle.net/10454/16760
    Version
    Accepted Manuscript
    Citation
    Ali N, Neagu D and Trundle P (2019) Classification of heterogeneous data based on data type impact on similarity. In: Lotfi A, Bouchachia H, Gegov A et al (eds) Advances in Computational Intelligence Systems. UK Workshop on Computational Intelligence, 2018. Advances in Intelligent Systems and Computing. Springer: Cham. 840: 252-263.
    Link to publisher’s version
    https://doi.org/10.1007/978-3-319-97982-3_21
    Type
    Conference paper
    Collections
    Engineering and Informatics Publications

    entitlement

     
    DSpace software (copyright © 2002 - 2023)  DuraSpace
    Quick Guide | Contact Us
    Open Repository is a service operated by 
    Atmire NV
     

    Export search results

    The export option will allow you to export the current search results of the entered query to a file. Different formats are available for download. To export the items, click on the button corresponding with the preferred download format.

    By default, clicking on the export buttons will result in a download of the allowed maximum amount of items.

    To select a subset of the search results, click "Selective Export" button and make a selection of the items you want to export. The amount of items that can be exported at once is similarly restricted as the full export.

    After making a selection, click one of the export format buttons. The amount of items that will be exported is indicated in the bubble next to export format.