Cluster Analysis: Basic Concepts and Algorithms - Case Study Example

Summary
This study, "Cluster Analysis: Basic Concepts and Algorithms", discusses clustering, which is used as a tool for forecasting future costs, expenses, sales, and net income. The study argues that clustering customers according to income, location, sex, and other attributes will help increase company sales…

Extract of sample "Cluster Analysis: Basic Concepts and Algorithms"

Introduction

Clustering is a statistical technique that groups data into clusters. Cluster analysis (Lawal, 2003) is therefore the grouping together of similar accounts or data. We can see this every day: birds flock with birds, and students play with students. By placing data in homogeneous groups, we can use clustering as a tool for forecasting future costs, expenses, sales and net income. Clustering customers according to income, location, sex and other attributes will help increase company sales. With clustering, we can assign one salesman to concentrate on one cluster of prospects, because these customers share the same hobbies, needs and the like. In hierarchical cluster analysis, the cases in the population under study are grouped based on their similarities. The distance between each pair of members of the population is used as the basis for the grouping.

Part 1

Based on the criteria of years as member of the club, distance of residence from the club and club membership, stage one of the agglomeration schedule combines clusters 1 and 3, with a coefficient of 1788. Stage two combines clusters 1 and 2, with a coefficient of 33445. Among these three criteria, the cluster with the largest population is the distance from the club group. The next cluster belongs to the years as member group, and the last belongs to the club membership group. This shows that nearness to the club, or a long travel time, is a big factor in decision making for both the customers and the management. Since distance is a big factor in inviting new members to use the beautiful facilities of the club, the club must first entice people living near it.
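The stage coefficients quoted above come out of a merge-the-closest-pair procedure. The following is a minimal sketch of that procedure, assuming squared Euclidean distance and average linkage between groups (the settings named in the tables at the end of this extract). The function names and the toy data are hypothetical and are not taken from the club survey; the real analysis was run in SPSS.

```python
from itertools import combinations


def squared_euclidean(a, b):
    """Sum of squared coordinate differences between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(c1, c2, points):
    """Mean squared Euclidean distance over all cross-cluster pairs."""
    total = sum(squared_euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points):
    """Merge the two closest clusters until one remains.

    Returns rows of (stage, cluster_a, cluster_b, coefficient), where each
    cluster is identified by its lowest member label.
    """
    clusters = {label: [label] for label in points}
    schedule = []
    stage = 1
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest average distance.
        a, b = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        coefficient = average_linkage(clusters[a], clusters[b], points)
        schedule.append((stage, a, b, coefficient))
        # The merged cluster keeps the lower label, a.
        clusters[a] = clusters[a] + clusters.pop(b)
        stage += 1
    return schedule


# Hypothetical toy data: three items measured on two made-up scales.
toy_points = {1: (1.0, 2.0), 2: (8.0, 9.0), 3: (2.0, 3.0)}
toy_schedule = agglomerate(toy_points)
for row in toy_schedule:
    print(row)
```

In this toy run, items 1 and 3 are closest and merge first with a small coefficient, and item 2 joins at a much larger coefficient. A large jump between stages, like the one from 1788 to 33445 above, indicates that the last merge joined clusters that are far apart.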
This also shows that there are not many members at present: only a few of the club's total guests are members. The club membership cluster is the smallest of the three criteria, which reflects how difficult it is for companies to maintain or increase their present membership.

Part 2

Based on the criteria importance of pool facilities, importance of tennis facilities and importance of challenge of golf, stage one combines clusters 2 and 3, with a coefficient of 479.295. Stage two combines clusters 1 and 2, with a coefficient of 2015.879. Among these criteria, there is a big cluster around importance of pool facilities. The second cluster belongs to the importance of tennis facilities, and the last to the importance of golf. This shows that the pool facilities are a very strong marketing tool for increasing company sales, and a powerful tool for increasing the profitability of the business. More people in the club prefer to while away the sunshine at the pool area, while other members prefer tennis. More people prefer the pool because it is relaxing; it is also relaxing to do business thinking while splashing pool water on the face to refresh the tired thinker. Members also enjoy the sight of other men and women in their swimwear, which suggests a possible side business such as swimwear. By contrast, people who love tennis and golf have to sweat it out to enjoy themselves. Tennis is mostly for teenagers and people under forty, whereas golf is only for the really rich.
Golf clubs cost a great deal, and golf bags and other golf accessories are sold at prohibitively high prices. This shows that more people prefer to relax by swimming than to play strenuous tennis like Andre Agassi or Martina Hingis, the Swiss tennis legend who won her first Grand Slam title in her teens.

Part 3

Based on the criteria importance of social events, total family income and importance of dining facilities, stage one combines clusters 1 and 3, with a coefficient of 451.127. Stage two combines clusters 1 and 2, with a coefficient of 747.694. Among these criteria, the cluster with the most members is the total family income group. The next cluster belongs to the importance of social events group, and the last to the importance of dining facilities. This clearly shows that family income is a major factor in recruiting new members, and in helping present members so that they do not withdraw their membership and transfer to another sports and leisure club. Members often join a club in order to socialize or make new friends, who may become their future customers or suppliers, or just plain friends. That dining facilities form the last cluster suggests that people are more interested in the food being served than in the dining facilities themselves.

Part 4

Based on the criteria marital status, gender and age, stage one combines clusters 2 and 3 and stage two combines clusters 1 and 2. The coefficient of stage one is 146.
The coefficient of stage two is 517494. Among these criteria, the cluster with the most members is the marital status group. The next cluster belongs to the gender group, and the cluster with the fewest members is the age group. It is evident that most people fall under the marital status cluster. The next cluster, gender, shows that people still feel that preserving their self-identity by pampering their bodies with perfume and buying new clothes is a necessity they attend to every day. The smallest cluster is the age group, which means that the respondents are not interested in an age limit. In fact, age discrimination is against the anti-discrimination law of the United States.

OVER ALL

We now run the hierarchical cluster analysis on the top cluster from each of the four groups. Based on the criteria total family income, marital status, importance of pool facilities and distance of residence from the club, stage one combines clusters 2 and 4, stage two combines clusters 1 and 2, and stage three combines clusters 1 and 3. The coefficient of stage one is 1090.449, that of stage two is 1813.204 and that of stage three is 2081.743. Among these four criteria, the cluster with the most members is the marital status group. The next cluster belongs to the importance of pool facilities. The third cluster by size is the total family income group, and the last is the distance of residence from the club. This shows that people with money can afford to apply as members of the club, as can people who are sports-minded and love to sweat it out.
After learning the areas or clusters where it is strongest and weakest, the company can come up with a strategic plan that maintains its present share of the market. The company can also make strategic plans to increase the club's market share in areas where competitors have, over the years, firmly established themselves as better in terms of service and prices.

CONCLUSION:

This statistical cluster analysis was based on twelve criteria: years as club member, distance of residence from the club, club membership, importance of pool facilities, importance of tennis facilities, importance of challenge of golf, importance of social events, total family income, importance of dining facilities, marital status, gender and age. Hierarchical cluster analysis is used to make marketing and forecasting plans; in this case, the club uses it to help the client. The company can now capitalize on the cluster groups where it is strong. The biggest cluster belongs to the marital status group, which shows that married people love to enjoy quality family time in a quality family place such as this club. The second biggest cluster belongs to importance of pool facilities: more club members prefer to play with their family and loved ones beside the club pool. The third biggest cluster belongs to the total family income group, which means that family money is a necessity for paying the high club membership fees. The fourth biggest cluster belongs to the distance of residence from the club. This gives the club's management a clue that its marketing should first entice clients who live near the club premises.
The SPSS output will help management think about new and better ways to market the company's products. Other statistical tools, such as bar graphs, line graphs and frequencies, can back up or support this plan. Now that management has a copy of the hierarchical cluster analysis, it can maintain its presence in the product lines where the company, in this case a club with pool, golf, tennis and other amenities, enjoys the lion's share of the market. The company can also use better marketing strategies to take market share away from competitors in areas where the club's share is presently low.

BIBLIOGRAPHY:

Lawal, B., Categorical Data Analysis with SAS and SPSS Applications, Lawrence Erlbaum Associates, Mahwah, NJ, 2003.

TABLES

Part 1

Agglomeration Schedule

Stage | Combined: Cluster 1 | Combined: Cluster 2 | Coefficients | First Appears: Cluster 1 | First Appears: Cluster 2 | Next Stage
1     | 1                   | 3                   | 1788.000     | 0                        | 0                        | 2
2     | 1                   | 2                   | 33445.000    | 1                        | 0                        | 0

Vertical Icicle
Cases: Years as member, Distance of residence from club, Club membership
Number of clusters 1: X X X X X
Number of clusters 2: X X X X

Part 2

Cluster: Average linkage (between groups)

Agglomeration Schedule

Stage | Combined: Cluster 1 | Combined: Cluster 2 | Coefficients | First Appears: Cluster 1 | First Appears: Cluster 2 | Next Stage
1     | 2                   | 3                   | 479.295      | 0                        | 0                        | 2
2     | 1                   | 2                   | 2015.879     | 0                        | 1                        | 0

Vertical Icicle
Cases: Importance of pool facilities, Importance of tennis facilities, Importance of challenge of golf
Number of clusters 1: X X X X X
Number of clusters 2: X X X X

Part 3

Agglomeration Schedule

Stage | Combined: Cluster 1 | Combined: Cluster 2 | Coefficients | First Appears: Cluster 1 | First Appears: Cluster 2 | Next Stage
1     | 1                   | 3                   | 451.127      | 0                        | 0                        | 2
2     | 1                   | 2                   | 747.694      | 1                        | 0                        | 0

Vertical Icicle
Cases: Importance of social events, Total family income, Importance of dining facilities
Number of clusters 1: X X X X X
Number of clusters 2: X X X X

Part 4

Agglomeration Schedule

Stage | Combined: Cluster 1 | Combined: Cluster 2 | Coefficients | First Appears: Cluster 1 | First Appears: Cluster 2 | Next Stage
1     | 2                   | 3                   | 146.000      | 0                        | 0                        | 2
2     | 1                   | 2                   | 517494.000   | 0                        | 1                        | 0

Vertical Icicle
Cases: Marital status, Gender, Age
Number of clusters 1: X X X X X
Number of clusters 2: X X X X

Over all

Case Processing Summary (a)

Cases   | N   | Percent
Valid   | 244 | 96.8%
Missing | 8   | 3.2%
Total   | 252 | 100.0%

a. Squared Euclidean Distance used

Agglomeration Schedule

Stage | Combined: Cluster 1 | Combined: Cluster 2 | Coefficients | First Appears: Cluster 1 | First Appears: Cluster 2 | Next Stage
1     | 2                   | 4                   | 1090.449     | 0                        | 0                        | 2
2     | 1                   | 2                   | 1813.204     | 0                        | 1                        | 3
3     | 1                   | 3                   | 2081.743     | 2                        | 0                        | 0

Vertical Icicle
Cases: Total family income, Marital status, Importance of pool facilities, Distance of residence from club
Number of clusters 1: X X X X X X X
Number of clusters 2: X X X X X X
Number of clusters 3: X X X X X
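An agglomeration schedule can be read stage by stage to see which items sit together at each point. The sketch below replays the overall schedule using the stage, cluster and coefficient values from the table. It assumes the merged cluster keeps the number listed under Cluster 1, which is consistent with the Stage Cluster First Appears columns but is not stated in the extract. The function name is hypothetical, and items are left as numbers 1 to 4 because the extract does not say which criterion each number stands for.

```python
def replay_schedule(schedule, n_items):
    """Return the cluster membership after each stage of a schedule.

    Each row is (stage, cluster_1, cluster_2, coefficient). Assumption:
    the merged cluster keeps the number listed under cluster_1.
    """
    clusters = {i: {i} for i in range(1, n_items + 1)}
    history = []
    for stage, first, second, coefficient in schedule:
        # Fold the second cluster into the first and drop its old label.
        clusters[first] = clusters[first] | clusters.pop(second)
        snapshot = sorted(sorted(members) for members in clusters.values())
        history.append((stage, coefficient, snapshot))
    return history


# Rows copied from the overall agglomeration schedule.
overall_schedule = [
    (1, 2, 4, 1090.449),
    (2, 1, 2, 1813.204),
    (3, 1, 3, 2081.743),
]
overall_history = replay_schedule(overall_schedule, n_items=4)
for stage, coefficient, snapshot in overall_history:
    print(f"stage {stage}: coefficient {coefficient}, clusters {snapshot}")
```

Read this way, items 2 and 4 join first, item 1 joins them at stage two, and item 3 is the last to merge at stage three.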