Iganfis Data Mining Approach for Forecasting Cancer Threats
Abstract
Healthcare facilities have at their disposal vast
amounts of cancer patients’ data. Medical practitioners
require more efficient techniques to extract relevant
knowledge from this data for accurate decision-making.
However, the challenge is how to extract this knowledge and act upon it in a
timely manner. If well engineered, this large volume of data can aid in
developing expert systems for decision support that can
assist physicians in diagnosing and predicting some
debilitating, life-threatening diseases such as cancer. Expert
systems for decision support can reduce cost and waiting
time, free medical practitioners for more research, and
reduce the errors that humans make
due to fatigue. The process of utilizing
health data effectively, however, involves many challenges,
such as missing feature values, the curse of
dimensionality due to a large number of attributes, and the
procedure for determining the features that can lead to
more accurate diagnosis. Effective data mining tools can
assist in early detection of diseases such as cancer. In this
paper, we propose a new approach called IGANFIS. This
approach minimizes the number of features using
the information gain (IG) algorithm, which is commonly used in
text categorization to select quality text features. Here, IG
is used to select quality cancer features, thereby
reducing their number. The reduced dataset of quality
features is then applied to the Adaptive Neuro-Fuzzy
Inference System (ANFIS) to train and test the
proposed approach. ANFIS is trained with the
hybrid learning algorithm, which combines the gradient descent
method and the least squares estimate (LSE), with an
error measure computed for each training pair. Each cycle of
ANFIS hybrid learning consists of a forward pass, which presents
the input vector and calculates the node outputs layer by layer,
repeating the process for all data, and a backward pass, which uses
the steepest descent algorithm to update the parameters, a
process called backpropagation.
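
The information gain ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes discrete feature values (continuous attributes would first need to be discretized), and the function names, feature names, and toy records below are hypothetical, made up purely for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG = H(class) - H(class | feature) for one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted entropy of the class labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_features(rows, labels, feature_names, k):
    """Score every feature by IG and keep the k highest-scoring names."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy records (not real patient data).
feature_names = ["feature_a", "feature_b", "feature_c"]
rows = [
    ("high", "x", "p"),
    ("high", "y", "q"),
    ("low", "x", "p"),
    ("low", "y", "q"),
]
labels = ["malignant", "malignant", "benign", "benign"]

selected, scores = select_top_features(rows, labels, feature_names, k=1)
print(selected, scores)
```

In this toy data only feature_a separates the two classes, so it receives the highest score and is the one retained. In the approach described above, the retained features would then form the reduced dataset passed to ANFIS.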