By Silva, A.; Oliveira, T.; Novais, P.; Neves, J.
Communications in Computer and Information Science
This work presents a survivability prediction model for rectal cancer patients, developed with machine learning techniques. The model is based on the SEER dataset, the most complete cancer dataset known worldwide. After preprocessing, the training data consisted of 12,818 records of rectal cancer patients. A feature selection process identified six features as the characteristics most relevant to the survivability of rectal cancer patients. The six-feature model was compared with an 18-feature model whose features were indicated by a physician. The results show that the six-feature model performs close to the 18-feature model, which suggests that the former may be a good compromise between usability and performance.
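The comparison described above can be illustrated with a minimal sketch. The abstract does not name the learning algorithm, the feature selection method, or the evaluation metric, so everything here is an assumption made for illustration: the data are synthetic (not SEER records), features are ranked by absolute correlation with the outcome, and a nearest-centroid classifier stands in for the actual model. All names (`select_features`, `fit_centroids`, `accuracy`, and so on) are hypothetical.

```python
import random
from statistics import mean

random.seed(0)  # reproducible synthetic data

# Assumed sizes for illustration only; these are NOT SEER records.
N_RECORDS, N_FEATURES, N_INFORMATIVE = 600, 18, 6

def make_record():
    x = [random.gauss(0, 1) for _ in range(N_FEATURES)]
    # Only the first N_INFORMATIVE columns influence the synthetic outcome.
    y = 1 if sum(x[:N_INFORMATIVE]) + random.gauss(0, 0.5) > 0 else 0
    return x, y

data = [make_record() for _ in range(N_RECORDS)]
train, test = data[:400], data[400:]

def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one feature and the outcome."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return abs(cov) / ((var_x * var_y) ** 0.5)

def select_features(rows, k):
    """Filter-style selection: keep the k columns with the highest score."""
    ys = [y for _, y in rows]
    scores = [abs_correlation([x[j] for x, _ in rows], ys)
              for j in range(N_FEATURES)]
    ranked = sorted(range(N_FEATURES), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def fit_centroids(rows, cols):
    """Per-class mean vector over the chosen columns."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in rows if y == label]
        centroids[label] = [mean(x[j] for x in members) for j in cols]
    return centroids

def predict(centroids, x, cols):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(cols, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def accuracy(rows_train, rows_test, cols):
    centroids = fit_centroids(rows_train, cols)
    hits = sum(predict(centroids, x, cols) == y for x, y in rows_test)
    return hits / len(rows_test)

all_cols = list(range(N_FEATURES))
# Selection uses the training split only, to avoid leaking test information.
selected_cols = select_features(train, k=6)
acc_full = accuracy(train, test, all_cols)
acc_reduced = accuracy(train, test, selected_cols)
print(f"18 features: accuracy = {acc_full:.3f}")
print(f"6 features {selected_cols}: accuracy = {acc_reduced:.3f}")
```

On real data, the same pattern would apply with the actual classifier and metric substituted in: select features on the training split, fit both models, and compare them on the same held-out records.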