
Data Mining Process - Coursework Example

Summary
"Data Mining Process" paper states that challenging task to develop effective techniques for preventing the disclosure of sensitive information in data mining, especially now that the use of data mining system is increasing in domains ranging from business analysis to medicine and government. …

Extract of sample "Data Mining Process"

In this information age, every firm strives to be competitive in its ventures. A company's competitive advantage is founded on its capabilities, and those capabilities are determined by the depth of the knowledge at its disposal. The search for better knowledge has given rise to data mining. Data mining can be regarded as the extraction of hidden predictive knowledge from the pool of data available to a firm. Through its advanced techniques, a company can highlight the most important information in its data store, also known as a data warehouse. Data mining tools can help a company answer business questions that would have been too time-consuming to resolve with traditional methods. A company can implement its data mining tools on existing hardware and software platforms to enhance the value of its existing information assets, and the tools can be integrated with new products and systems as they are brought online.

Every business needs depth of information to survive in this information age. A firm's data warehouse therefore provides a platform where the company holds all its data in a centralized and standardized manner for use by employees. In effect, a data warehouse is a repository of the data that a company needs to survive in an age characterized by information. It can also be seen as a current reporting trend that gives data users direct access to relevant information. Data warehousing is thus a modern strategic tool implemented by many companies. It is not just a rising-tide strategy that relieves employees of the need for strategic acumen, but also a weapon that lets a firm compete across time and across the globe.

Introduction
The data mining process refers to the procedure of extracting knowledge from a database.
It involves discovering interesting knowledge from a collection of data. Data mining has been acknowledged as one of the most essential steps in knowledge discovery. The information industry has seen growing interest in turning available data into useful information and knowledge. A firm will therefore employ various data mining techniques in its knowledge discovery process, drawing on the large amounts of data kept in databases, data warehouses, or other data repositories.

Generally, knowledge discovery involves a sequence of steps that can be considered as a system. Data cleaning is the first step: it handles noisy, missing, or irrelevant data. After cleaning comes data integration, in which different data sources are combined into one. The next step is data selection, where the data relevant to the analysis is retrieved from the database. The selected data is then transformed or consolidated into a form suitable for mining, for example through aggregation operations. After transformation, data mining itself is the next step: efficient methods are applied to extract data patterns. However, knowledge discovery does not end at the data mining phase. The next stage is pattern evaluation, which identifies the truly interesting patterns representing knowledge, based on some measure of interestingness. Finally, the knowledge obtained is presented: visualization and other knowledge presentation techniques are used to deliver the mined knowledge in a form suitable for decision making.

Data mining can generally be classified into two categories.
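
Before turning to these two categories, the knowledge discovery steps described above can be illustrated with a minimal sketch. The Python code below is only an illustration under stated assumptions: the sales records, the field names, and the threshold are invented for the example, and a simple per-region average stands in for a real data mining method.

```python
from collections import defaultdict

# Hypothetical toy records from two sources; every name and value is invented.
SOURCE_A = [
    {"customer": "c1", "region": "north", "amount": 120.0},
    {"customer": "c2", "region": "south", "amount": None},  # missing value
    {"customer": "c3", "region": "north", "amount": 80.0},
]
SOURCE_B = [
    {"customer": "c4", "region": "south", "amount": 200.0},
    {"customer": "c5", "region": "south", "amount": -5.0},  # noisy value
    {"customer": "c6", "region": "east", "amount": 40.0},
]


def clean(records):
    """Data cleaning: drop records whose amount is missing or negative."""
    return [r for r in records if r["amount"] is not None and r["amount"] >= 0]


def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [record for source in sources for record in source]


def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{field: r[field] for field in fields} for r in records]


def transform(records):
    """Data transformation: group the amounts by region."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["region"]].append(r["amount"])
    return dict(grouped)


def mine(grouped):
    """Data mining stand-in: compute the average amount per region."""
    return {region: sum(a) / len(a) for region, a in grouped.items()}


def evaluate(patterns, threshold):
    """Pattern evaluation: keep averages at or above an assumed threshold."""
    return {region: avg for region, avg in patterns.items() if avg >= threshold}


def present(patterns):
    """Knowledge presentation: format the surviving patterns as text lines."""
    return [f"{region}: average sale {avg:.2f}" for region, avg in sorted(patterns.items())]


def run_pipeline(threshold=100.0):
    """Run the steps in the order given in the text."""
    cleaned = [clean(SOURCE_A), clean(SOURCE_B)]
    combined = integrate(*cleaned)
    relevant = select(combined, ["region", "amount"])
    grouped = transform(relevant)
    patterns = mine(grouped)
    return evaluate(patterns, threshold)


if __name__ == "__main__":
    for line in present(run_pipeline()):
        print(line)
```

Each function corresponds to one step, so a step can be replaced (for instance, swapping the averaging in `mine` for a clustering or regression method) without changing the others.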
The categories can be referred to as descriptive data mining and predictive data mining. Descriptive data mining presents a pool of data in summary form, reflecting the general features of the data set. Predictive data mining, on the other hand, constructs a model or a set of models, performs inference on the available data, and attempts to predict the behavior of new data. Data mining involves various tasks, including prediction, classification, clustering, association, visualization, and time-series forecasting, as discussed below.

Prediction
Data mining helps predict the likely values of missing data or the value distribution of certain attributes in a group of items (Berry & Linoff, 2004). Prediction therefore involves finding, through statistical analysis, the set of attributes relevant to the attribute of interest. It then predicts the value distribution based on a set of data similar to the selected objects (Last, 2004). For example, given the distribution of salaries of similar employees in a company, one can predict the probable salary of a potential employee. To obtain a quality prediction, tools such as regression analysis, correlation analysis, generalized linear models, and decision trees are usually applied.

Classification
Classification analyzes the data at hand and constructs a model for each class based on attributes in the data (Berry, 2006). A set of classification criteria is then generated, which can be used to better understand each class in the database and to classify future data. Having well-defined classes of diseases, for example, can help in predicting the kind of disease based on a patient's symptoms.

Clustering
A group of data items that share similar attributes is referred to as a cluster (Berry, 2006).
The similarity can be based on different features, as specified by experts. A good clustering technique should yield low inter-cluster similarity and high intra-cluster similarity (Larose, 2005). Clusters of houses, for example, might be based on floor area, type of house, and the area in which the house is located.

Association
Association refers to the relationships or correlations among a set of data items. An association is expressed in rule form, showing attribute-value conditions that frequently occur together in a given set of data (Berry & Linoff, 2004). Association analysis is mostly applied to transaction data, especially for catalog design, direct marketing, and other business decision-making processes (Last, 2004). Techniques such as multi-dimensional association mining, multiple-level association mining, mining of correlations, and mining of interval data are among the most recommended for association analysis.

Visualization
Visualization can be defined as a presentation of data that allows users to view complex patterns in it (Aggarwal, 2012). Data mining models used together with visualization give better insight into the discovered trends or relationships. 3D graphs are the best-known example of visualization models.

Time-series forecasting
Time-series forecasting is mainly concerned with analysing a pool of time-series data. It analyses the variations and distinctive characteristics in the data (Larose, 2005). Time-series analysis thus helps in searching for similar trends or subsequences, and in establishing chronological trends, periodicities, tendencies, and deviations (Berry, 2006). For example, a company's stock history, its competitors' performance, and its business situation can be used to predict the trend of the company's stock values.

Conclusion
As the tasks involved indicate, the data mining process is not confined to interactive, transactional, and data warehouse data (Groth, 2000).
Other complex types of data may require complex data mining techniques. Text analysis methods and image content-based methods, on the other hand, play a significant role in extracting knowledge from text and multimedia data, respectively (Last, 2004). These methods can be combined with data cube and data mining techniques for efficient mining of such types of data. The diversity of data, data mining tasks, and data mining approaches, however, presents many challenging research topics (Aggarwal, 2012). For developers of data mining systems and applications, the most important issues are the design of data mining languages, the development of easy-to-use data mining techniques and systems, the construction of interactive and integrated data mining environments, and the application of data mining techniques to large application problems (Berry & Linoff, 2004). To address these challenges, many data mining systems have been developed in recent years.

Research and development of data mining systems is expected to flourish because of the huge amounts of data that have been collected in databases, and because of the growing need to understand and make good use of such data in decision making, where it plays a significant role in management and economic growth. Moreover, as societies become increasingly computerized, the social impact of data mining should not be underestimated. When large amounts of interrelated data are effectively analyzed from different perspectives, the goals of protecting data security and guarding against the invasion of privacy come under threat.
It is therefore a challenging task to develop effective techniques for preventing the disclosure of sensitive information in data mining, especially now that the use of data mining systems is rapidly increasing in domains ranging from business analysis and customer analysis to medicine and government.

Appendices
Figure 1.0: Explaining various tasks involved in the process of data mining.
Figure 2.0: Sample diagram of Visualization in Data Mining process

References
Aggarwal, C. (2012). Mining text data. New York: Springer.
Berry, M. (2006). Lecture notes in data mining. Singapore: World Scientific.
Berry, M. J. A., & Linoff, G. S. (2004). Data mining techniques: For marketing, sales, and customer relationship management (2nd ed.). Wiley.
Groth, R. (2000). Data mining: Building competitive advantage. Upper Saddle River, NJ: Prentice Hall PTR.
Larose, D. (2005). Discovering knowledge in data: An introduction to data mining. Hoboken, NJ: Wiley-Interscience.
Last, M. (2004). Data mining in time series databases. New Jersey: World Scientific.
Cite this document
Data Mining Process Coursework Example | Topics and Well Written Essays - 1500 words. https://studentshare.org/business/1855350-w3-a566-data-mining-process
