Data mining project: predicting survival of oral and pharyngeal cancer using SQL, WEKA, and STATA

The project topic should relate to analyzing healthcare data in order to solve a clinical or administrative problem. You will be required to submit this project in three parts:

  1. Report
  2. Presentation
  3. Source code, results, materials, findings

You need to find data suitable for solving your problem. You may want to use only newer data (2004 or later); decide based on variable availability and sample size. Most importantly, the data need to be at the right level of aggregation.

The project report should include, but is not limited to:

  1. problem description
  2. data selection
  3. data pre-processing
  4. selection of data mining (DM) methods
  5. application of methods
  6. analysis of results
  7. review of available literature and related work
  8. conclusions and description of impact on healthcare
  9. a brief description of what you learned in the project

Direct application of existing software to publicly available datasets is not sufficient. Projects must demonstrate significant effort in data manipulation, processing, and mining. They must also show an understanding of the applied techniques and of the healthcare problem addressed.

1. You are not predicting the survival rate, but survival. A rate is a property of a population, so there is nothing to predict. What you want is a model that takes the data for one individual and outputs that individual's chance of survival.

2. You need to make the work sufficiently complex. In most cases, this means pulling many attributes/fields, trying several methods, etc.

3. SEER changes over time. Although its data go back to the 1970s, you may want to use only newer data (2004 or later). Decide based on variable availability and sample size.
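
The distinction in the first point, between a population rate and an individual prediction, can be made concrete with a minimal sketch. Everything here is an assumption for illustration: the records are invented toy data (not SEER data), the two features (age and stage) are placeholders, and a hand-written logistic regression stands in for whichever method the project ends up using in WEKA or STATA.

```python
import math

# Toy records invented for illustration only (NOT real SEER data).
# Each tuple: (age at diagnosis, stage 1-4, survived the follow-up window: 1 or 0)
TOY_RECORDS = [
    (45, 1, 1), (52, 1, 1), (50, 1, 1), (60, 2, 1),
    (48, 2, 1), (58, 3, 1), (72, 3, 1), (50, 4, 1),
    (68, 2, 0), (71, 3, 0), (80, 3, 0), (55, 4, 0),
    (66, 4, 0), (75, 4, 0),
]


def population_survival_rate(records):
    """Share of patients who survived: one number describing the whole group."""
    return sum(survived for _, _, survived in records) / len(records)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def column_stats(records, index):
    """Mean and standard deviation of one feature column, used for scaling."""
    values = [r[index] for r in records]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def scale(age, stage, stats):
    """Standardize the two features with the training means and deviations."""
    (age_mean, age_sd), (stage_mean, stage_sd) = stats
    return (age - age_mean) / age_sd, (stage - stage_mean) / stage_sd


def fit_logistic(records, learning_rate=0.1, epochs=2000):
    """Fit a two-feature logistic regression by batch gradient descent."""
    stats = (column_stats(records, 0), column_stats(records, 1))
    w_age, w_stage, bias = 0.0, 0.0, 0.0
    n = len(records)
    for _ in range(epochs):
        g_age, g_stage, g_bias = 0.0, 0.0, 0.0
        for age, stage, survived in records:
            x_age, x_stage = scale(age, stage, stats)
            error = sigmoid(bias + w_age * x_age + w_stage * x_stage) - survived
            g_age += error * x_age
            g_stage += error * x_stage
            g_bias += error
        w_age -= learning_rate * g_age / n
        w_stage -= learning_rate * g_stage / n
        bias -= learning_rate * g_bias / n
    return (w_age, w_stage, bias, stats)


def predict_survival(model, age, stage):
    """Estimated survival probability for ONE individual."""
    w_age, w_stage, bias, stats = model
    x_age, x_stage = scale(age, stage, stats)
    return sigmoid(bias + w_age * x_age + w_stage * x_stage)


model = fit_logistic(TOY_RECORDS)
rate = population_survival_rate(TOY_RECORDS)
p_low_risk = predict_survival(model, age=45, stage=1)
p_high_risk = predict_survival(model, age=75, stage=4)

print(f"population survival rate: {rate:.2f}")
print(f"individual (age 45, stage 1): {p_low_risk:.2f}")
print(f"individual (age 75, stage 4): {p_high_risk:.2f}")
```

The rate is a single number for the whole group, while `predict_survival` returns a different probability for each individual's attributes. A real project would evaluate such a model on held-out data rather than on the records used to fit it.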

Start exploring the SEER data now to learn how to use the SEER software for downloading data and related tasks.
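
One way to approach the cutoff-year decision during this exploration is to tabulate, per diagnosis year, how many rows there are and how completely each candidate variable is filled in. The sketch below assumes the extract has already been loaded as a list of dictionaries. The field names (`year_dx`, `stage`, `tumor_size_mm`), the thresholds, and the rows are hypothetical placeholders, not actual SEER variables or data.

```python
from collections import defaultdict

# Hypothetical extract: placeholder field names and invented values.
# None marks a value that is missing for that row.
TOY_ROWS = [
    {"year_dx": 2002, "stage": "II", "tumor_size_mm": None},
    {"year_dx": 2002, "stage": None, "tumor_size_mm": None},
    {"year_dx": 2003, "stage": "I", "tumor_size_mm": None},
    {"year_dx": 2003, "stage": "III", "tumor_size_mm": 22},
    {"year_dx": 2004, "stage": "I", "tumor_size_mm": 14},
    {"year_dx": 2004, "stage": "IV", "tumor_size_mm": 35},
    {"year_dx": 2004, "stage": "II", "tumor_size_mm": 18},
    {"year_dx": 2005, "stage": "III", "tumor_size_mm": 27},
    {"year_dx": 2005, "stage": "II", "tumor_size_mm": None},
    {"year_dx": 2005, "stage": "I", "tumor_size_mm": 9},
    {"year_dx": 2005, "stage": "II", "tumor_size_mm": 20},
]


def completeness_by_year(rows, year_field, variables):
    """Per year: row count and the share of non-missing values per variable."""
    counts = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for row in rows:
        year = row[year_field]
        counts[year] += 1
        for var in variables:
            if row.get(var) is not None:
                filled[year][var] += 1
    return {
        year: {
            "n": counts[year],
            "complete": {var: filled[year][var] / counts[year] for var in variables},
        }
        for year in sorted(counts)
    }


def usable_years(summary, min_rows, min_complete):
    """Years with enough rows and enough non-missing values in every variable."""
    return [
        year
        for year, info in summary.items()
        if info["n"] >= min_rows
        and all(share >= min_complete for share in info["complete"].values())
    ]


summary = completeness_by_year(TOY_ROWS, "year_dx", ["stage", "tumor_size_mm"])
years = usable_years(summary, min_rows=3, min_complete=0.7)

for year, info in summary.items():
    print(year, info["n"], info["complete"])
print("usable years:", years)
```

The thresholds trade sample size against variable availability: loosening `min_complete` keeps more years but leaves more missing values to handle in pre-processing.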
