# Data Mining - Lab Report Example


## Extract of sample "Data Mining"

Data mining, June 18

The survey aimed to collect background information on students in order to inform teaching practices. The study's variables were gender, the number of previous data science courses taken by a student, students' self-assessed data mining efficiency, future career goals, geo-location, and preference for a one-by-one virtual meeting. Issues with the collected data, the cleaning steps, and the analysis results are discussed below. SPSS software was used for the analysis.
Data issues and cleaning
Missing data was the most prevalent issue in the data set (Tan, Steinbach, & Kumar, 2006). All data for one participant (ID = R_wZTAo2AjoAUTWvf) were missing. In addition, the number of science related courses that a student had taken and the years of professional experience that a student had before the course were missing for some participants. Data on expected salary for the first job also contained unrealistically low values that required cleaning. Means were used to fill in the number of previous science courses, professional experience, and expected salary, while the mode was used for ordinal data.
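The cleaning rule described above (mean for numeric scale variables, mode for ordinal variables) can be illustrated with a short Python sketch. The report's cleaning was done in SPSS, so this is only an illustration under stated assumptions: the function name `impute` and the sample responses are hypothetical and are not taken from the survey data.

```python
from statistics import mean, mode


def impute(values, strategy):
    """Replace None entries with the mean or the mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical responses (not the survey data); None marks a missing answer.
courses = [3, None, 4, 2, 3]
efficiency = ["fair", "good", None, "fair", "poor"]

courses_clean = impute(courses, "mean")        # numeric scale variable: mean
efficiency_clean = impute(efficiency, "mode")  # ordinal variable: mode

print(courses_clean)
print(efficiency_clean)
```

Mean imputation keeps the sample size intact but pulls the distribution toward its center, which is worth keeping in mind when reading the descriptive statistics.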
The following table summarizes descriptive statistics of the numeric scale variables.
Table 1: Descriptive statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation | Skewness (Statistic) | Skewness (Std. Error) |
|---|---|---|---|---|---|---|---|
| previous data science related courses | 23 | .00 | 4.00 | 2.8226 | 1.05466 | -1.098 | .481 |
| previous years of professional experience in data areas | 23 | .00 | 21.00 | 3.6478 | 4.17722 | 3.425 | .481 |
| expected first salary | 23 | 29795.78 | 145000.00 | 46605.9435 | 32496.99730 | 2.341 | .481 |
| Valid N (listwise) | 23 | | | | | | |
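A common rule of thumb for judging skewness is to divide each skewness statistic by its standard error and compare the result with ±1.96, the two-tailed 0.05 cut-off of the normal distribution. The sketch below applies that rule to the values in Table 1. The rule itself is an assumption about how skewness could be assessed, not necessarily the procedure used in the report, and the label "expected first salary" is inferred from context because the third row of the table has no label in the source.

```python
# Skewness statistics and their common standard error, copied from Table 1.
skewness = {
    "previous data science related courses": -1.098,
    "previous years of professional experience in data areas": 3.425,
    "expected first salary": 2.341,  # label inferred from context
}
STD_ERROR = 0.481
CRITICAL_Z = 1.96  # two-tailed 0.05 level, normal approximation

# z = skewness statistic / standard error of skewness
z_scores = {name: s / STD_ERROR for name, s in skewness.items()}
significantly_skewed = {name: abs(z) > CRITICAL_Z for name, z in z_scores.items()}

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}, skewed = {significantly_skewed[name]}")
```

Under this rule all three ratios exceed 1.96 in absolute value, with the courses variable skewed to the left and the other two skewed to the right.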
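All three variables are significantly skewed: each skewness statistic is more than twice its standard error of 0.481 in absolute value (p < 0.05). The median is therefore a better descriptive statistic than the mean. The following table shows the statistics.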
Table 2: Median for the numeric variables

| Statistic | previous data science related courses | previous years of professional experience in data areas | expected first salary |
|---|---|---|---|
| N (Valid) | 23 | 23 | 23 |
| N (Missing) | 0 | 0 | 0 |
| Mean | 2.8226 | 3.6478 | 46605.9435 |
| Median | 3.0000 | 3.6500 | 29795.7800 |
| Mode | 2.82a | 3.65 | 29795.78 |

a. Multiple modes exist. The smallest value is shown.
The typical (median) student, therefore, had taken about three science related courses and had about 3.65 years of professional experience in data areas. The median expected first salary was about \$29,795.78.
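To see why the median is preferred for skewed data, consider a minimal sketch with hypothetical salary expectations (these numbers are illustrative and are not the survey responses). A single very high value pulls the mean upward, while the median stays close to what most respondents reported.

```python
from statistics import mean, median

# Hypothetical expected salaries with one very high value (right skew).
expected_salaries = [30000, 30000, 32000, 35000, 145000]

salary_mean = mean(expected_salaries)
salary_median = median(expected_salaries)

print(f"mean   = {salary_mean}")
print(f"median = {salary_median}")
```

The same pattern appears in Table 2, where the mean expected salary (46605.94) is well above the median (29795.78).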
A majority of the students (60.9 percent) rated their data mining efficiency as fair, while only 8.7 percent rated it as good. Only 21.7 percent had much confidence in becoming data analysts after graduation, while 56.5 percent were not sure of their positions. Most of the students lived away from campus: 34.8 percent were within driving distance and 52.2 percent lived far away, though within the United States. Most of the students preferred a one-by-one virtual meeting. The following graphs illustrate the distributions.
Graph 1: Data mining efficiency
Graph 2: Interest in becoming data analyst after graduation
Graph 3: Distance from campus
Graph 4: Preference for a one-by-one virtual meeting
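The percentages reported above can be translated back into head counts, given the 23 valid responses in Table 1. The sketch below does that arithmetic and checks that each count reproduces the reported percentage to one decimal place. The counts are inferred from the percentages and do not appear in the report, and the dictionary labels are shorthand chosen for this example.

```python
N = 23  # valid responses reported in Table 1


def implied_count(percent, n=N):
    """Return the whole-number count whose share of n rounds to `percent`."""
    count = round(percent / 100 * n)
    if round(100 * count / n, 1) != percent:
        raise ValueError(f"{percent}% does not match a whole count out of {n}")
    return count


# Percentages quoted in the text; labels are shorthand for this example.
reported = {
    "fair efficiency": 60.9,
    "good efficiency": 8.7,
    "confident about becoming a data analyst": 21.7,
    "not sure about becoming a data analyst": 56.5,
    "within driving distance": 34.8,
    "far away, within the United States": 52.2,
}

counts = {label: implied_count(p) for label, p in reported.items()}
for label, c in counts.items():
    print(f"{label}: {c} of {N}")
```

With a sample this small, each student accounts for roughly 4.3 percentage points, so modest differences between categories rest on only one or two responses.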
The following table shows significant correlations, based on results in Appendix A.
Table 3: Significant correlations

| Variable 1 | Variable 2 | Pearson correlation |
|---|---|---|
| Previous data science related courses | Previous years of experience in data | 0.448 |
| Previous years of experience in data | Expected first salary | 0.494 |
| Efficiency | Interest in data analysis | 0.489 |

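Expected first salary correlates with years of prior experience (r = 0.494), and self-assessed efficiency correlates with interest in becoming a data analyst (r = 0.489). Expected salary is not significantly correlated with efficiency (r = .010) or with interest (r = -.115) in Appendix A, so the data do not show that expected salary motivates students toward the subject.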
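The significance flags on these correlations can be checked by hand with the standard t-test for a Pearson coefficient, t = r·sqrt(n − 2) / sqrt(1 − r²), compared with the two-tailed 0.05 critical value of Student's t with n − 2 = 21 degrees of freedom (about 2.080). The sketch below is an independent check rather than the SPSS procedure itself. The pair labels are shorthand for this example, and the salary label is inferred because that row is unlabeled in Appendix A.

```python
from math import sqrt

N = 23              # sample size reported in the tables
T_CRITICAL = 2.080  # two-tailed 0.05 critical value of Student's t, 21 df


def t_statistic(r, n=N):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Coefficients copied from Table 3 and Appendix A.
correlations = {
    ("courses", "experience"): 0.448,
    ("experience", "expected first salary"): 0.494,
    ("efficiency", "interest"): 0.489,
    ("expected first salary", "efficiency"): 0.010,
}

significant = {
    pair: abs(t_statistic(r)) > T_CRITICAL for pair, r in correlations.items()
}

for pair, r in correlations.items():
    print(f"{pair}: r = {r:.3f}, t = {t_statistic(r):.2f}, "
          f"significant = {significant[pair]}")
```

The three starred coefficients clear the critical value, while the correlation between expected salary and efficiency does not.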
Summary
The majority of the students have sufficient background knowledge in data mining, having taken several related courses. However, they lack experience in data mining and report average efficiency. Their motivation to enter the data analysis profession is low, they live far from campus, and they prefer one-by-one virtual meetings. A one-on-one approach to learning that relies on technology for online study is therefore recommended.
References
Tan, P., Steinbach, M., & Kumar, V. (2006). Introduction to data mining. Boston, MA: Pearson Addison Wesley.
Appendix A: Correlation coefficients

| Variable | Statistic | previous data science related courses | previous years of professional experience in data areas | expected first salary | Gender 1 | Efficiency1 | Interest1 | distance1 | virtualmeeting1 |
|---|---|---|---|---|---|---|---|---|---|
| previous data science related courses | Pearson Correlation | 1 | .448* | .122 | -.183 | -.044 | .226 | .287 | .068 |
| | Sig. (2-tailed) | | .032 | .578 | .402 | .842 | .299 | .184 | .759 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| previous years of professional experience in data areas | Pearson Correlation | .448* | 1 | .494* | -.273 | .009 | .086 | .212 | -.047 |
| | Sig. (2-tailed) | .032 | | .017 | .207 | .967 | .695 | .333 | .832 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| expected first salary | Pearson Correlation | .122 | .494* | 1 | -.099 | .010 | -.115 | -.111 | -.157 |
| | Sig. (2-tailed) | .578 | .017 | | .652 | .963 | .600 | .615 | .474 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| Gender 1 | Pearson Correlation | -.183 | -.273 | -.099 | 1 | -.086 | -.270 | -.111 | -.273 |
| | Sig. (2-tailed) | .402 | .207 | .652 | | .696 | .212 | .614 | .207 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| Efficiency1 | Pearson Correlation | -.044 | .009 | .010 | -.086 | 1 | .489* | -.285 | -.064 |
| | Sig. (2-tailed) | .842 | .967 | .963 | .696 | | .018 | .187 | .772 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| Interest1 | Pearson Correlation | .226 | .086 | -.115 | -.270 | .489* | 1 | -.094 | -.300 |
| | Sig. (2-tailed) | .299 | .695 | .600 | .212 | .018 | | .668 | .164 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| distance1 | Pearson Correlation | .287 | .212 | -.111 | -.111 | -.285 | -.094 | 1 | .154 |
| | Sig. (2-tailed) | .184 | .333 | .615 | .614 | .187 | .668 | | .483 |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| virtualmeeting1 | Pearson Correlation | .068 | -.047 | -.157 | -.273 | -.064 | -.300 | .154 | 1 |
| | Sig. (2-tailed) | .759 | .832 | .474 | .207 | .772 | .164 | .483 | |
| | N | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |

*. Correlation is significant at the 0.05 level (2-tailed).
