MIS 650 GCU Key Steps in Data Mining Process Combines Values Discussion

Discussion 1: Tyler

Some of the key steps in the data mining process include data cleaning, data integration, data reduction, data transformation, data mining, pattern evaluation, and knowledge representation. During data cleaning, noisy or incomplete data is removed and missing values are filled in. In the data integration step, data from multiple sources such as data cubes or files is combined for analysis, which improves the speed and accuracy of the data mining process. Data reduction trims the dataset down to just the data necessary for the analysis. Data transformation converts the data into a form that is more suitable for mining, through methods such as smoothing, aggregation, normalization, and discretization, which makes the process more efficient. Data mining is the step in which intelligent methods are applied to extract patterns from the data. In pattern evaluation, the interesting patterns are identified, then summarized and visualized to make the data more understandable. A common final step is knowledge representation, in which the collected information is presented through various representation tools (Software Testing Help, 2021).
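To make the first four steps concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the toy records, the field names (`id`, `region`, `amount`, `segment`), and the helper functions are all hypothetical and do not come from the cited source. Filling missing values with the mean and using min-max normalization are just two of several possible choices.

```python
from statistics import mean

# Hypothetical toy records; field names are invented for illustration.
sales = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": None},   # missing value
    {"id": 2, "region": "west", "amount": None},   # duplicate row
    {"id": 3, "region": "east", "amount": 80.0},
]
# A second, hypothetical source keyed by the same id.
customers = {1: "retail", 2: "wholesale", 3: "retail"}


def clean(rows):
    """Data cleaning: drop duplicate ids, fill missing amounts with the mean."""
    seen, unique = set(), []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            unique.append(dict(row))
    fill = mean(r["amount"] for r in unique if r["amount"] is not None)
    for r in unique:
        if r["amount"] is None:
            r["amount"] = fill
    return unique


def integrate(rows, segments):
    """Data integration: attach the customer segment from the second source."""
    return [{**r, "segment": segments[r["id"]]} for r in rows]


def reduce_columns(rows, keep):
    """Data reduction: keep only the attributes needed for the analysis."""
    return [{k: r[k] for k in keep} for r in rows]


def normalize(rows, field):
    """Data transformation: min-max normalization of one numeric field."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values match
    return [{**r, field: (r[field] - lo) / span} for r in rows]


prepared = normalize(
    reduce_columns(integrate(clean(sales), customers), ["segment", "amount"]),
    "amount",
)
```

After these calls, `prepared` holds three rows with only `segment` and a normalized `amount`, ready for whatever mining step comes next.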

It is apparent that there are many steps to data mining, and while these steps were given in a specific order, this is not necessarily the order that one will follow when data mining. The process is not always linear, as the order needed to be successful often varies from dataset to dataset. Some steps may be skipped, and steps that appear to naturally take place later in the process may actually need to be taken earlier. This being the case, one should keep an open mind when data mining and make sure the steps are taken in the order that suits the dataset, so as not to hamper the success of the process. For example, one may recognize a pattern early on that can greatly speed up the data reduction and data transformation steps. In cases like these, performing pattern evaluation first may be a great solution. This is just one of many examples that illustrate the point.

References

Software Testing Help. (2021, September 27). Data mining process: Models, process steps & challenges involved. Retrieved October 8, 2021, from https://www.softwaretestinghelp.com/data-mining-process/#Data_Mining_Models

Discussion 2: David

Good afternoon class,

Methodologies for data mining such as SEMMA and CRISP-DM provide users with the necessary steps to create models that can be used to gain meaningful insights and solve complex business problems. “It can be complicated to know exactly which data sources to gather to align with business objectives” (Christiansen, 2021). The key steps in the data mining process involve cleaning the data by removing duplicates or filling in missing values, combining datasets and sources (such as connecting a SQL database to R), reducing the quantity of data by removing insignificant attributes, transforming data through techniques such as aggregation, and mining the data using data mining software such as R. In addition, the process includes evaluating the patterns and generating insights, which are then turned into visualizations and shared with management to support decision-making. It is important to remember that the data mining process is not linear, even though the process in its entirety does not change. Viewing the process as strictly linear may hamper the results, because users may deviate from the sequence. For example, missing the data reduction step that improves data quality could result in more time being required to complete the data mining process.

Reference:

Christiansen, L. (2021, April 1). 7 Key Steps in the Data Mining Process. Retrieved from https://zipreporting.com/en/data-mining/data-minin…

Discussion 3: Arcelia

As noted by Software Testing Help (2021), the data mining process used by SEMMA and CRISP-DM adheres to the Knowledge Discovery Process. Software Testing Help (2021) also notes that the key steps in data mining include data cleaning, data integration, data reduction, data transformation, data mining, pattern evaluation, and knowledge representation. The larger process begins with gathering business requirements and identifying data sources and data formats, which are part of understanding the business ask and the associated data. For example, meeting with stakeholders and obtaining a sample of the data from the data source would be representative of this step. Next, the data is prepared and a data model is created. An example of this would be obtaining data from various sources, aggregating it, and cleaning it. Once the data is clean, we would create various models (e.g., a logistic model) and test designs to validate the effectiveness of the model. The model is then assessed or evaluated and, if approved, deployed. An example of this would be testing the model on an application to see if the results are as expected. If the model passes this phase, it would be deployed according to business requirements.
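As a rough illustration of the modeling and evaluation phases described above, the sketch below fits a one-feature logistic model with batch gradient descent and checks it against a held-out sample. Everything here is an assumption made for the example: the data is invented and already centered around zero, and the function names (`fit_logistic`, `accuracy`), learning rate, and epoch count are hypothetical choices rather than anything prescribed by the cited source. In practice an analyst would more likely reach for a statistics package than hand-written code.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(points, lr=0.1, epochs=500):
    """Fit a one-feature logistic model with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in points) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in points) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def accuracy(model, points):
    """Share of points whose predicted class matches the label."""
    w, b = model
    correct = sum((sigmoid(w * x + b) >= 0.5) == bool(y) for x, y in points)
    return correct / len(points)


# Hypothetical, already centered feature: label is 1 when the value is positive.
train = [(-4.0, 0), (-3.0, 0), (-2.0, 0), (-1.0, 0),
         (1.0, 1), (2.0, 1), (3.0, 1), (4.0, 1)]
# Test design: a small holdout set the model never sees during fitting.
holdout = [(-2.5, 0), (-0.5, 0), (0.5, 1), (2.5, 1)]

model = fit_logistic(train)
holdout_accuracy = accuracy(model, holdout)
```

The holdout accuracy is the number that would be compared against the business requirement before deciding whether to deploy.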

The data mining process is iterative because it allows the model to be redefined or changed through the addition of new data (Software Testing Help, 2021). This flexibility allows analysts to construct models that effectively meet business needs. Viewing the data mining process as linear is not recommended, as an analyst may develop a model that reaches the evaluation phase but does not meet the business requirements and expected outcomes. In this situation, the data analyst may need to return to the understanding phase of data mining to gather more requirements. Alternatively, the analyst may create several models during the preparation and modeling phase before proposing a model for evaluation. As discussed, this cyclical process allows for the refinement of the data mining model so that it meets business needs.
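The cyclical idea can be sketched as a simple loop: build a candidate model, evaluate it, and either accept it or go back and try again. The example below is a hedged illustration only. The records, the 0.9 accuracy requirement, and the threshold rule standing in for a "model" are all hypothetical, and the names `evaluate` and `mine` are invented for this sketch.

```python
# Hypothetical labeled records: (monthly_spend, churned).
records = [(10, 1), (15, 1), (22, 1), (35, 0), (40, 0), (55, 0)]
REQUIRED_ACCURACY = 0.9  # assumed business requirement


def evaluate(threshold, rows):
    """Score the rule 'predict churn when spend is below the threshold'."""
    hits = sum((spend < threshold) == bool(label) for spend, label in rows)
    return hits / len(rows)


def mine(rows, candidate_thresholds, required):
    """Try candidate models in turn and return the first that passes.

    Returns (None, history) when no candidate meets the requirement,
    which signals a return to the understanding phase.
    """
    history = []
    for threshold in candidate_thresholds:
        score = evaluate(threshold, rows)
        history.append((threshold, score))
        if score >= required:
            return threshold, history
    return None, history


chosen, history = mine(records, [12, 50, 30], REQUIRED_ACCURACY)
```

Here the first two candidates fall short of the requirement and only the third is accepted, so `history` records three passes through the modeling and evaluation phases.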

References

Software Testing Help. (2021, September 27). Data mining process: Models, process steps & challenges involved. https://www.softwaretestinghelp.com/data-mining-process/#Data_Mining_Models
