Enhancing Fuzzy Associative Rule Mining Approaches for Improving Prediction Accuracy. Integration of Fuzzy Clustering, Apriori and Multiple Support Approaches to Develop an Associative Classification Rule Base

Bilal Sowan PhD thesis (final version).pdf (2.349Mb)
Publication date
2012-02-13
Author
Sowan, Bilal I.
Supervisor
Dahal, Keshav P.
Hossain, M. Alamgir
Keyword
Prediction
Fuzzy associative rule mining
Fuzzy clustering
Associative classification rule base
Data mining
Predictive model
Decision support system
Minimizing prediction error
Prediction performance
Rights

The University of Bradford theses are licensed under a Creative Commons Licence.
Institution
University of Bradford
Department
School of Computing, Informatics & Media
Awarded
2011
Abstract
Building an accurate and reliable prediction model for different application domains is one of the most significant challenges in knowledge discovery and data mining. This thesis focuses on building and enhancing a generic predictive model for estimating a future value by extracting association rules (knowledge) from a quantitative database. This model is applied to several data sets obtained from different benchmark problems, and the results are evaluated through extensive experimental tests. The thesis presents an incremental development process for the prediction model with three stages. Firstly, a Knowledge Discovery (KD) model is proposed by integrating Fuzzy C-Means (FCM) with the Apriori approach to extract Fuzzy Association Rules (FARs) from a database for building a Knowledge Base (KB) to predict a future value. The KD model has been tested with two road-traffic data sets. Secondly, the initial model has been further developed by including a diversification method in order to improve the reliability of the FARs and to find the best and most representative rules. The resulting Diverse Fuzzy Rule Base (DFRB) maintains high-quality and diverse FARs, offering a more reliable and generic model. The model uses FCM to transform quantitative data into fuzzy data, while a Multiple Support Apriori (MSapriori) algorithm is adapted to extract the FARs from the fuzzy data. The correlation values for these FARs are calculated, and an efficient orientation for filtering FARs is performed as a post-processing method. The diversity of the FARs is maintained through the clustering of FARs, based on the concept of the sharing function technique used in multi-objective optimization. The best and most diverse FARs are obtained as the DFRB, which is used within the Fuzzy Inference System (FIS) for prediction. The third stage of development proposes a hybrid prediction model called the Fuzzy Associative Classification Rule Mining (FACRM) model.
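As an illustration only, the first-stage idea (fuzzify quantitative values with FCM-style memberships, then score candidate rules by fuzzy support and confidence) might be sketched as follows. This is not the thesis's implementation: the cluster centres, the labels `low`/`medium`/`high`, the toy traffic-like rows, and the use of the minimum as the conjunction operator are all assumptions made for the example, and a full FCM would learn the centres iteratively rather than take them as given.

```python
# Hypothetical linguistic labels, one per cluster centre (an assumption).
LABELS = ("low", "medium", "high")


def fcm_memberships(value, centers, m=2.0):
    """Standard FCM membership of a scalar in each cluster, for fixed, distinct centres."""
    dists = [abs(value - c) for c in centers]
    # A value lying exactly on a centre belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]


def fuzzify(rows, centers_by_attr, m=2.0):
    """Turn each quantitative row into {(attribute, label): membership}."""
    fuzzy_rows = []
    for row in rows:
        fuzzy = {}
        for attr, value in row.items():
            memberships = fcm_memberships(value, centers_by_attr[attr], m)
            for label, mu in zip(LABELS, memberships):
                fuzzy[(attr, label)] = mu
        fuzzy_rows.append(fuzzy)
    return fuzzy_rows


def fuzzy_support(fuzzy_rows, itemset):
    """Mean over rows of the minimum membership among the itemset's fuzzy items."""
    return sum(min(row[item] for item in itemset) for row in fuzzy_rows) / len(fuzzy_rows)


def fuzzy_confidence(fuzzy_rows, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    return fuzzy_support(fuzzy_rows, antecedent + consequent) / fuzzy_support(fuzzy_rows, antecedent)


# Toy data and centres, invented for this example.
rows = [
    {"speed": 25.0, "flow": 90.0},
    {"speed": 30.0, "flow": 80.0},
    {"speed": 95.0, "flow": 15.0},
]
centers = {"speed": [20.0, 60.0, 100.0], "flow": [10.0, 50.0, 90.0]}

fuzzy_rows = fuzzify(rows, centers)
antecedent = [("speed", "low")]
consequent = [("flow", "high")]
rule_support = fuzzy_support(fuzzy_rows, antecedent + consequent)
rule_confidence = fuzzy_confidence(fuzzy_rows, antecedent, consequent)
print(f"support={rule_support:.3f} confidence={rule_confidence:.3f}")
```

An Apriori-style miner would keep only itemsets whose fuzzy support meets a minimum threshold; a multiple-support variant such as MSapriori would, in the usual formulation, allow that threshold to differ per item.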
This model integrates the improved Gustafson-Kessel (G-K) algorithm, the proposed Fuzzy Associative Classification Rules (FACR) algorithm and the proposed diversification method. The improved G-K algorithm transforms quantitative data into fuzzy data, while the FACR algorithm generates significant rules (Fuzzy Classification Association Rules (FCARs)) by employing the improved multiple support threshold, associative classification and vertical scanning format approaches. These FCARs are then filtered by calculating the correlation value and the distance between them. The advantage of the proposed FACRM model is that it builds a generalized prediction model able to deal with different application domains. The validation of the FACRM model is conducted using different benchmark data sets from the University of California, Irvine (UCI) machine learning repository and the KEEL (Knowledge Extraction based on Evolutionary Learning) repository, and the results of the proposed FACRM are also compared with other existing prediction models. The experimental results show that the error rate and generalization performance of the proposed model are better than those of the commonly used models in the majority of data sets. A new method for feature selection entitled Weighting Feature Selection (WFS) is also proposed. The WFS method aims to improve the performance of the FACRM model. The prediction performance is improved by minimizing the prediction error and reducing the number of generated rules. The prediction results of FACRM employing WFS have been compared with those of the FACRM and Stepwise Regression (SR) models for different data sets. The performance analysis and comparative study show that the proposed prediction model provides an effective approach that can be used within a decision support system.
Type
Thesis
Qualification name
PhD
Related items
Showing items related by title, author, creator and subject.
-
The use of solubility parameters to predict the behaviour of a co-crystalline drug dispersed in a polymeric vehicle. Approaches to the prediction of the interactions of co-crystals and their components with hypromellose acetate succinate and the characterization of that interaction using crystallographic, microscopic, thermal, and vibrational analysis.
Forbes, Robert T.; Bonner, Michael C.; Isreb, Abdullah (University of Bradford, The School of Pharmacy, 2013-04-09)
Dispersing co-crystals in a polymeric carrier may improve their physicochemical properties such as dissolution rate and solubility. Additionally, co-crystal stability may be enhanced. However, such dispersions have been little investigated to date. This study focuses on the feasibility of dispersing co-crystals in a polymeric carrier and on theoretical calculations to predict their stability. Acetone/chloroform, ethanol/water, and acetonitrile were used to load and grow co-crystals in a HPMCAS film. Caffeine-malonic acid and ibuprofen-nicotinamide co-crystals were prepared using the solvent evaporation method. The interactions between each of the co-crystal components and their mixtures with the polymer were studied. A solvent evaporation approach was used to incorporate each compound, a mixture, and co-crystals into HPMCAS films. Differential scanning calorimetry data revealed a higher affinity of the polymer for acidic compounds than for their basic counterparts, as shown by the depression of the glass transition temperature (Tg). Moreover, the same drug loading produced films with different Tgs when different solvents were used. Solubility parameter (SP) values of the solvents were employed to predict that effect on the depression of polymer Tg with relative success. SP values were more successful in predicting the preferential affinity of two acidic compounds to interact with the polymer. This was confirmed using binary mixtures of naproxen, flurbiprofen, malonic acid, and ibuprofen.
On the other hand, dispersing basic compounds such as caffeine or nicotinamide with malonic acid in HPMCAS film revealed the growth of co-crystals. A dissolution study showed that the average release of caffeine from films containing caffeine-malonic acid was not significantly different from that of films containing a similar caffeine concentration. The stability of the caffeine-malonic acid co-crystals in HPMCAS was prolonged to 8 weeks at 95% relative humidity and 45°C. The theory developed in this project, that an acidic drug with an SP value closer to that of the polymer will dominate the interaction process and prevent the majority of the other material from interacting with the polymer, may have utility in designing co-crystal systems in polymeric vehicles.
-
Prediction of natural frequencies of turbine blades for turbocharger application. An investigation of the finite element method, mathematical modelling and frequency survey methods applied to turbocharger blade vibration in order to predict natural frequencies of turbocharger blades.
Olley, Peter; Zdunek, Agnieszka Izabela (University of Bradford, School of Engineering and Informatics, 2015-07-03)
Methods of determining natural frequencies of the D76D88, B76D88, A86E93, C86G90, C86L90 and C125L89 turbine wheel designs for various environmental conditions were investigated by application of Finite Element Analysis and beam theory. Three modelling and simulation methods were developed: the first was composed of 15 finite element simulations; the second of 15 finite element simulations and a set of experimental frequency survey results; the third of 5 simulations, an incorporated mathematical model and a set of experimental frequency survey results. Each of these methods was designed to allow prediction of resonant frequency changes across a range of exhaust gas temperatures and shaft rotational speeds. For the new modelling and simulation methods, an analysis template and a plotting tool were developed using Microsoft Excel and MATLAB software. A graph showing frequency-temperature-speed variations and a Campbell Diagram that incorporates material stiffening and softening effects across a range of rotational speeds were designed and applied to the D76D88, B76D88, A86E93, C86G90, C86L90 and C125L89 turbine wheel designs. New design methodologies for turbine wheels were formulated and validated, showing good agreement with a range of data points from frequency survey, strain-gauge telemetry and laser tip-timing test results. The results from the new design method were compared with the existing single compensation factor methodology, and showed a great improvement in the accuracy of prediction of modal vibration.
A new nomenclature for the mode shapes of a turbocharger’s blade was proposed, designed and demonstrated to allow direct identification of the associated mode shape. It is concluded that Finite Element Analysis combined with the frequency survey is capable of predicting changes in turbine natural frequencies and, when incorporated into the existing turbine design methodology, results in a major improvement in the accuracy of the predictions of vibration frequency.
-
Enhanced flare prediction by advanced feature extraction from solar images: developing automated imaging and machine learning techniques for processing solar images and extracting features from active regions to enable the efficient prediction of solar flares.
Qahwaji, Rami S.R.; Ipson, Stanley S.; Colak, Tufan; Ahmed, Omar W. (University of Bradford, School of Computing, Informatics & Media, 2012-04-16)
Space weather has become an international issue due to the catastrophic impact it can have on modern societies. Solar flares are one of the major solar activities that drive space weather and yet their occurrence is not fully understood. Research is required to yield a better understanding of flare occurrence and enable the development of an accurate flare prediction system, which can warn industries most at risk to take preventative measures to mitigate or avoid the effects of space weather. This thesis introduces novel technologies developed by combining advances in statistical physics, image processing, machine learning, and feature selection algorithms with advances in solar physics in order to extract valuable knowledge from historical solar data related to active regions and flares. The aim of this thesis is to achieve the following: i) The design of a new measurement, inspired by the physical Ising model, to estimate the magnetic complexity in active regions using solar images, and an investigation of this measurement in relation to flare occurrence. The proposed name of the measurement is the Ising Magnetic Complexity (IMC). ii) Determination of the flare prediction capability of active region properties generated by the new active region detection system SMART (Solar Monitor Active Region Tracking) to enable the design of a new flare prediction system. iii) Determination of the active region properties that are most related to flare occurrence in order to enhance understanding of the underlying physics behind flare occurrence.
The achieved results can be summarised as follows: i) The new active region measurement (IMC) appears to be related to flare occurrence and has potential use in predicting flare occurrence and location. ii) Combining machine learning with SMART's active region properties has the potential to provide more accurate flare predictions than the current flare prediction systems, i.e. ASAP (Automated Solar Activity Prediction). iii) A reduced set of 6 active region properties seems to contain the properties most significantly related to flare occurrence, and it can achieve a similar degree of flare prediction accuracy to the full set of 21 SMART active region properties. The developed technologies and the findings achieved in this thesis will serve as a cornerstone to enhance the accuracy of flare prediction, develop efficient flare prediction systems, and enhance our understanding of flare occurrence. The algorithms, implementation, results, and future work are explained in this thesis.