Contributions to fuzzy object comparison and applications. Similarity measures for fuzzy and heterogeneous data and their applications.
Author: Bashon, Yasmina M.
Ridley, Mick J.
Fuzzy Geometrical Similarity Model
Fuzzy Set-Theoretical Similarity Model
The University of Bradford theses are licenced under a Creative Commons Licence.
Institution: University of Bradford
Department: Department of Computing, School of Computing, Informatics and Media
Abstract: This thesis makes an original contribution to knowledge in the field of data objects' comparison, where the objects are described by attributes of fuzzy or heterogeneous (numeric and symbolic) data types. Many real-world database systems and applications require information management components that provide support for managing such imperfect and heterogeneous data objects. For example, with new online information made available from various sources, in semi-structured, structured or unstructured representations, new information usage and search algorithms must consider that such data collections may contain objects/records with different types of data (fuzzy, numerical and categorical) for the same attributes. New approaches to similarity have been presented in this research to support such data comparison. A generalisation of both geometric and set-theoretical similarity models has made it possible to propose the new similarity measures presented in this thesis, which handle the vagueness (fuzzy data type) within data objects. A framework of new and unified similarity measures for comparing heterogeneous objects described by numerical, categorical and fuzzy attributes has also been introduced. Examples are used to illustrate, compare and discuss the applications and efficiency of the proposed approaches to heterogeneous data comparison.
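To make the idea of comparing heterogeneous objects concrete, the following is a minimal illustrative sketch, not the measures proposed in the thesis. It assumes a simple averaged per-attribute similarity in which fuzzy values are modelled as triangular fuzzy numbers, crisp numbers are treated as degenerate triangles, and categorical values match exactly or not at all. All names (`TriFuzzy`, `attribute_sim`, `object_sim`, `ranges`) are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriFuzzy:
    """Triangular fuzzy number with support [a, c] and peak at b (assumed model)."""
    a: float
    b: float
    c: float


def to_fuzzy(v):
    """Treat a crisp number as a degenerate triangular fuzzy number."""
    return v if isinstance(v, TriFuzzy) else TriFuzzy(v, v, v)


def fuzzy_sim(x, y, value_range):
    """Geometric-style similarity: 1 minus the mean distance between the
    three defining points, normalised by the attribute's value range."""
    if value_range == 0:
        return 1.0
    d = (abs(x.a - y.a) + abs(x.b - y.b) + abs(x.c - y.c)) / 3.0
    return max(0.0, 1.0 - d / value_range)


def attribute_sim(x, y, value_range):
    """Similarity of two attribute values that may differ in data type."""
    # Categorical: exact match only (a string never matches a number).
    if isinstance(x, str) or isinstance(y, str):
        return 1.0 if x == y else 0.0
    # Numeric and fuzzy values share one code path via to_fuzzy().
    return fuzzy_sim(to_fuzzy(x), to_fuzzy(y), value_range)


def object_sim(obj1, obj2, ranges):
    """Unweighted mean of per-attribute similarities (an assumption)."""
    sims = [attribute_sim(obj1[k], obj2[k], ranges.get(k, 0.0)) for k in obj1]
    return sum(sims) / len(sims)


# Two records whose "height" attribute is fuzzy in one and crisp in the other.
a = {"age": 30.0, "colour": "red", "height": TriFuzzy(170.0, 175.0, 180.0)}
b = {"age": 40.0, "colour": "red", "height": 175.0}
ranges = {"age": 50.0, "height": 50.0}
similarity = object_sim(a, b, ranges)
```

Promoting crisp numbers to degenerate fuzzy numbers is one simple way to let the same attribute hold different data types across records; other choices (for example set-theoretical overlap of membership functions) are equally possible.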
Showing items related by title, author, creator and subject.
Neural and Neuro-Fuzzy Integration in a Knowledge-Based System for Air Quality Prediction. Neagu, Daniel; Avouris, N.M.; Kalapanidas, E.; Palade, V. (2002) In this paper we propose a unified approach for integrating implicit and explicit knowledge in neurosymbolic systems as a combination of neural and neuro-fuzzy modules. In the developed hybrid system, the training data set is used for building neuro-fuzzy modules, and represents implicit domain knowledge. The explicit domain knowledge, on the other hand, is represented by fuzzy rules, which are directly mapped into equivalent neural structures. The aim of this approach is to improve the abilities of modular neural structures, which are based on incomplete learning data sets, since the knowledge acquired from human experts is taken into account for adapting the general neural architecture. Three methods to combine the explicit and implicit knowledge modules are proposed. The techniques used to extract fuzzy rules from neural implicit knowledge modules are described. These techniques improve the structure and the behavior of the entire system. The proposed methodology has been applied in the field of air quality prediction with very encouraging results. These experiments show that the method is worth further investigation.
Analogy-based software project effort estimation. Contributions to projects similarity measurement, attribute selection and attribute weighting algorithms for analogy-based effort estimation. Neagu, Daniel; Cowling, Peter I.; Azzeh, Mohammad Y.A. (University of Bradford, Department of Computing, School of Computing, Informatics & Media, 2010-10-01) Software effort estimation by analogy is a viable alternative to other estimation techniques, and in many cases researchers have found that it outperforms other estimation methods in terms of accuracy and practitioners' acceptance. However, the overall performance of analogy-based estimation depends on two major factors: the similarity measure, and attribute selection and weighting. Current similarity measures such as nearest-neighbour techniques have been criticized for inadequacies related to attribute relevancy, noise and uncertainty, in addition to the problem of using categorical attributes. This research focuses on improving the efficiency and flexibility of analogy-based estimation to overcome these inadequacies. In particular, this thesis proposes two new approaches to model and handle uncertainty in the similarity measurement method and, most importantly, to reflect the structure of the dataset in similarity measurement using Fuzzy modelling based on the Fuzzy C-means algorithm. The first proposed approach, called the Fuzzy Grey Relational Analysis method, combines techniques of Fuzzy set theory and Grey Relational Analysis to improve local and global similarity measures and to tolerate the imprecision associated with using different data types (continuous and categorical). The second proposed approach uses Fuzzy numbers and their concepts to develop a practical yet efficient approach to support analogy-based systems, especially at the early phase of software development. Specifically, we propose a new similarity measure and adaptation technique based on Fuzzy numbers.
We also propose a new attribute subset selection algorithm and attribute weighting technique based on the hypothesis of analogy-based estimation, which assumes that projects that are similar in terms of attribute values are also similar in terms of effort values, using row-wise Kendall rank correlation between the similarity matrix based on project effort values and the similarity matrix based on project attribute values. A literature review of related software engineering studies revealed that the existing attribute selection techniques (such as brute-force and heuristic algorithms) are restricted to the choice of performance indicators, such as Mean Magnitude of Relative Error and Prediction Performance Indicator, and are computationally far more intensive. The proposed algorithms provide a sound statistical basis and justification for their procedures. The performance of the proposed approaches has been evaluated using real industrial datasets. Results and conclusions from a series of comparative studies with the conventional estimation by analogy approach using the available datasets are presented. The studies were also carried out to statistically investigate the significant differences between predictions generated by our approaches and those generated by the most popular techniques, such as conventional analogy estimation, neural networks and stepwise regression. The results and conclusions indicate that the two proposed approaches have the potential to deliver comparable, if not better, accuracy than the compared techniques. The results also show that Grey Relational Analysis tolerates the uncertainty associated with using different data types. As well as the original contributions within the thesis, a number of directions for further research are presented. Most chapters in this thesis have been disseminated in international journals and highly refereed conference proceedings.
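The row-wise Kendall rank correlation mentioned in this abstract can be sketched as follows. This is one plausible reading, not the algorithm from the thesis: for each project (row), the similarities to all other projects computed from attributes are rank-correlated with the similarities computed from effort, and the per-row coefficients are averaged. The function names and the use of the tau-a variant are assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total pairs.
    Tied pairs count as neither concordant nor discordant."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    score = 0
    for i, j in combinations(range(n), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            score += 1
        elif s < 0:
            score -= 1
    return score / (n * (n - 1) / 2)


def rowwise_kendall(sim_attr, sim_effort):
    """Mean over rows of the Kendall correlation between the two
    similarity matrices, ignoring each row's diagonal entry."""
    n = len(sim_attr)
    taus = []
    for i in range(n):
        a = [sim_attr[i][j] for j in range(n) if j != i]
        e = [sim_effort[i][j] for j in range(n) if j != i]
        taus.append(kendall_tau(a, e))
    return sum(taus) / n


# Hypothetical similarity matrix for four projects (illustrative values only).
S = [
    [1.0, 0.9, 0.5, 0.2],
    [0.9, 1.0, 0.6, 0.3],
    [0.5, 0.6, 1.0, 0.8],
    [0.2, 0.3, 0.8, 1.0],
]
agreement = rowwise_kendall(S, S)
```

Under this reading, an attribute subset whose similarity matrix yields a higher average correlation with the effort-based matrix would be preferred, which is consistent with the stated hypothesis that projects similar in attributes are also similar in effort.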