Data Mining as a Methodology for Extracting Hidden Knowledge from Breast Cancer Patients' Records in Jordan


Contents

Related Work
Proposed Methods
The Data We Use
Experimental Results
Conclusions
Future Work
References

Abstract

Modern monitoring devices and other data collection devices have helped health care organizations reduce the cost of collecting and storing data. Specialized tools that come with such equipment have made data collection and storage more effective, thereby easing management's decision making. Large amounts of medical data exist in Jordan that require analysis to become useful. Records of breast cancer patients were taken, comprising eight fields: age, marital status, address, smoking status, topology, morphology, cancer summary stage, and the patient's status. These records were prepared for clustering. Two data mining algorithms were applied: Self-Organizing Map (SOM) and K-means. The results of the two algorithms were compared, and seven clusters were obtained that clarified the relationships among the data fields. Such results could be used in medical statistics as well as in studies.

Introduction

Health care providers today are equipped with up-to-date monitoring equipment and other sophisticated data collection devices. This provides an inexpensive way to collect and store the data that an information system requires. The large volume of data collected in medical databases calls for specialized tools that can store and access the data, use it properly, and analyze it. The constant accumulation of data causes great difficulty, especially in extracting the information that is crucial for decision making. Manual, traditional analysis must be put to use for the data to be harvested and used successfully.
To make this possible, medical informatics and advanced technology should be employed, drawing on databases, machine learning, pattern recognition, and statistical tools that support the analysis of encoded data and the exploration of regularities in data (Piateski & Frawley, 1991). Data mining enhances researchers' understanding of the complexities of data that is available in large quantities. The concept is key because it has provided deep insight into biomedical data overall. The growth of biomedical datasets, together with their understanding and continued effective interpretation, has concerned many researchers over time. In the mid-1990s this concern was addressed when the majority of researchers adopted data mining techniques in order to uncover novel healthcare knowledge for ethical decision making. Data mining has also been used to develop hypotheses from clinical databases, healthcare text, and bulky experimental data. Thanks to this technique, it is now possible to find isolated cases of under-diagnosed patients, to detect insurance fraud, and to identify individuals who are at risk, which has resulted in a drastic decrease in healthcare and biomedical costs (Yoo et al., 2012).

Breast cancer is a malignancy that occurs mostly in women and is considered the second leading cause of cancer deaths among women (DeSantis, Siegel, Bandi, & Jemal, 2011). Combining data mining approaches with breast cancer data has revealed a lot to researchers. For instance, they have discovered hidden patterns that can be used to understand risk associations as well as the prevalence of breast cancer among women.

Related Work

Most works in this field that deal with breast cancer data focus solely on predictive data mining methods.
In Jordan in particular, this was the first time data mining was applied to this kind of data, and the results were tremendous. Although there are a number of related works, this topic has never been covered adequately.

Proposed Methods

Breast cancer is one of the leading causes of death among women, especially in developing countries. The best way to reduce deaths is early detection, so an early diagnosis is required. This diagnosis has to involve accurate and reliable procedures that distinguish benign breast tumors from malignant ones.

Diagnosed cancer cases reported between 1st December 2010 and 31st January of the same year were registered by the Jordan Cancer Registry (JCR), including cancer patients of non-Jordanian origin who were treated in the hospitals and recorded throughout the year. The first method used is active data collection, in which trained registry personnel extract data directly on a regular basis, covering all hospitals throughout the country. A number of organizations, including the JCR, find it difficult to access some of this data from both private and public hospitals as well as laboratories around UK. Through the Male Cancer Data Coordination, trained health workers draw data from patient files, fill in a standard form, and send it straight to the Jordan Cancer Registry, where data is extracted including personal information, demographic information, diagnosis, and tumor details.

For the proposed method, statistical analysis is critical. Information about the tumor from specific examinations has to be gathered through staging techniques in order to establish how far the cancer has spread. The information is then combined through stage grouping to establish the stage of the disease, which can fall between stage 0 and stage IV. It is also important to consider knowledge discovery and data mining processes.
Knowledge discovery in databases (KDD) is the non-trivial extraction of implicit, previously unknown, and useful information from databases. This process is critical in data mining. Knowledge discovery in databases involves a number of steps that lead from the collection of raw data to the formation of new knowledge. These steps include:

i. Data cleaning
ii. Data integration
iii. Data selection
iv. Data transformation
v. Data mining
vi. Interpretation

Beyond these steps, a meaningful system framework is important, especially considering the role data mining plays in extracting knowledge from breast cancer data. The system framework passes through three key phases: data collection, data processing, and data clustering, as shown in Fig. 1.

Fig. 1: System Framework Phases

The data collection phase involves collecting data from the Ministry of Health in Jordan. In the data processing phase, the data are processed and prepared for use in the clustering phase. This phase has two sub-phases: filtering and normalizing. In filtering, records with errors and missing values are removed. Normalizing, one of the key processing steps, brings all the variables into proportion so that every value falls between 0 and 1.

The Data We Use

A total of 3388 breast cancer patients' records were taken from the Ministry of Health in Jordan. This comprises all the data from the years 2005 to 2010. The records contained 8 fields: age, marital status, address, smoking status, topology, morphology, summary stage, and status of the patient (Maximov, McDaniel & Jordan, 2013). Table 1 below shows the data used for the study.
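The filtering and normalizing sub-phases described above could be sketched roughly as follows. This is a minimal illustration in Python, not the tooling used in the study: the function names `filter_records` and `min_max_normalize` are hypothetical, the sample rows use only three of the eight fields, and the missing value is invented to show filtering. Normalization is assumed to be the usual min-max scaling, (x - min) / (max - min), which is one common way to bring values into the range 0 to 1.

```python
from typing import List, Optional, Sequence


def filter_records(records: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Filtering sub-phase: drop every record that has a missing (None) field."""
    return [list(r) for r in records if all(v is not None for v in r)]


def min_max_normalize(records: Sequence[Sequence[float]]) -> List[List[float]]:
    """Normalizing sub-phase: scale each column to [0, 1] with min-max scaling."""
    if not records:
        return []
    columns = list(zip(*records))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [
            # A constant column has no spread, so map it to 0.0 instead of dividing by zero.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]
        for row in records
    ]


# Illustrative rows only: (age, marital status code, topology code).
raw = [
    [28, 1, 502],
    [48, 2, 504],
    [41, None, 509],  # hypothetical missing marital status, removed by filtering
    [56, 2, 509],
]

clean = filter_records(raw)
scaled = min_max_normalize(clean)
```

Whether coded categorical fields such as topology should be scaled numerically in this way is a modeling choice; the sketch simply applies the same scaling to every column.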
Table 1: Used Data

Age  MarS  Add    Smo  Top  Mor   SumS  Stat
28   1     19999  1    502  8500  1     0
48   2     19999  0    504  8500  2     1
41   2     19999  0    509  8522  9     0
56   2     99999  0    509  8010  7     0

Two clustering algorithms were applied: K-means with Weka 3.6.9 and SOM with Matlab 2012. The results were as follows.

Table 2: K-mean Results

Table 3: SOM Results

Both K-means and SOM were generally good at clustering the patients' records. After the results were obtained, the next step was a comparison between the two algorithms. Fig. 2 shows the Self-Organizing Map (SOM).

Fig. 2: Self Organizing Map "SOM"

The Self-Organizing Map was developed by Professor Kohonen and has since proved useful in many applications. Kohonen's SOMs are a type of unsupervised learning whose goal is to discover some underlying structure in the data. The SOM is one of the most popular neural network models and belongs to the category of competitive learning networks. K-means, on the other hand, is among the simplest unsupervised learning algorithms and is capable of solving well-known clustering problems. Fig. 3 shows the K-means algorithm.

Fig. 3: K-mean Algorithm

Generally, the K-means algorithm is more reliable than the SOM algorithm because it can handle larger amounts of data without problems. Although the results of the two algorithms were compatible, the K-means algorithm was found to deliver more accurate results than the SOM algorithm.

Experimental Results

Clustering entails grouping objects based on similarity. The data is modeled as clusters in order to extract useful knowledge from it.
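To make the K-means procedure referred to above more concrete, the following is a minimal sketch of the standard iterative scheme (assign each point to its nearest centroid, then recompute each centroid as the mean of its members). It is written in plain Python for illustration and is not the Weka implementation used in the study; the function name `k_means`, the seeded random initialization, and the toy two-dimensional points are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    points: Sequence[Sequence[float]], k: int, max_iter: int = 100, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialization (an assumption of this sketch): pick k distinct data points.
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels: List[int] = [-1] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no assignment changed, so the algorithm has converged
        labels = new_labels

        # Update step: move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids


# Toy data: two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = k_means(data, k=2)
```

On real records, the input would be the filtered and normalized field values rather than toy coordinates, and the number of clusters would be chosen by the analyst.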
Searching for clusters is usually a form of unsupervised learning, which helps reveal hidden patterns. Data mining offers numerous procedures that apply the idea of clustering to large data. These methods are used in engineering, science, the healthcare industry, and business to discover useful information in available datasets (Berkhin, 2006). In this study, as illustrated, K-means was run using Weka 3.6.9 and SOM using Matlab 2012. The tests were performed on a computer with a 2.40 GHz Core i3 processor, 4 GB of RAM, 64-bit Windows 7, and 300 GB of hard disk space.

Using K-means, we obtained three clusters: cluster 0, cluster 1, and cluster 2. Cluster 0 contains 72% of the data. The outcome for each field is given below.

Fig. 5: Representation of the age percentage in cluster 0

As Fig. 5 shows, the greatest percentage was 32% for category D, followed by 17% for category F. For marital status, the highest percentage was 93.85% (rounded to 94%) for married women. This percentage was considered natural, since married women had a greater chance of having breast cancer than single or widowed women. The percentages for cluster 0 are shown in Fig. 6.

Fig. 6: Marital status

For address, close to 73% of the patients in cluster 0 were from Amman.

Fig. 6: Address

For smoking, the highest percentage in cluster 0 was about 87% for non-smoking patients. This was considered normal, since 88% of the patients are non-smokers.

Fig. 7: Smoking status in cluster 0

For topology in cluster 0, the greatest percentage was 61% for 509, followed by 11% for 508 and 5% for 504. Returning to topology, 509, 508, and 504 are represented by 57%, 18%, and 10% respectively. This trend is critical in understanding why some areas are more affected than others.
The percentage of topology in cluster 0 is shown in Fig. 9.

Fig. 9: The percentage of topology in cluster 0

For morphology, morph 8500 has the highest percentage at 80%. Morph 8520 and 8010 have 7% and 4% respectively.

Fig. 10: The percentage of morphology in cluster 0

For the summary stage, the highest percentage is 24% for summary stage 3, followed by stage 1 with 14% and stages 4 and 7 with 13% and 12% respectively. Figure 5.7 shows the percentage of summary stage in cluster 0.

Fig. 11: The percentage of summary stage in cluster 0

For patients' status in cluster 0, alive patients account for 96%, as shown below.

Fig. 11: The percentage of patient status in cluster 0

Cluster 1 constitutes 20% of the data. For age, 53% is the highest percentage, as depicted in Fig. 12 below.

Fig. 12: Representation of the age percentage in C1

For marital status, the highest percentage is 95.2% for married women, a higher rate than in cluster 0. This percentage is shown below.

Fig. 13: Representation of the marital status percentage in C1

For address, 50% of the patients in cluster 1 are from Irbid.

Fig. 14: The percentage of addresses in cluster 1

For smoking status, the highest percentage was 90% for non-smoking patients, higher than in cluster 0.

Fig. 15: The percentage of smoking status in cluster 1

For topology in this cluster, the highest percentage was 58% for 509 (Breast, NOS). 504 (Upper outer quadrant of breast) accounted for 16%, while 508 (Overlapping lesion of breast) accounted for 10%.

Fig. 16: The percentage of topology in cluster 1

For morphology, 78% was for morph 8500, 7% for morph 8520, and 5% for morph 8010.

Fig. 17: The percentage of morphology in cluster 1

For the summary stage, 43% was for summary stage 1, 15% for stage 4, and 12% and 10% for stages 9 and 7 respectively.
Fig. 18: The percentage of summary stage in cluster 1

Regarding the status of patients in cluster 1, 95% are alive.

Fig. 19: The percentage of patient status in cluster 1

In summary, cluster 1 covers about 20% of the data. Based on the obtained results, breast cancer was found to be commonest in the early-age group. More married women than single women were affected. The cases were mainly from Irbid. Most breast cancer patients were non-smokers. Topography 509 (Breast, NOS) accounted for most of the cases. The predominant morphology was 8500. Most of the patients studied were alive.

Cluster 2 covers 8% of the data. For age, category G was represented by 51%.

Fig. 20: Representation of the age percentage in C2

For marital status, married women accounted for 93.61%.

Fig. 21: Representation of the marital status percentage in C2

For address, 57% of the patients in C2 are from Al-Zarqaa.

Fig. 22: The percentage of addresses in cluster 2

For smoking status, 90% were non-smoking patients.

Fig. 23: The percentage of smoking status in cluster 2

For topology, 60% was the percentage for 504 (Upper outer quadrant of breast). Figure 5.21 shows the percentage of topology in cluster 2.

Fig. 24: The percentage of topology in cluster 2

For morphology, the highest percentage, 83%, was for morph 8500.

Fig. 25: The percentage of morphology in cluster 2

For the summary stage, the highest percentage is 30% for summary stage 9. Stages 4, 3, and 1 have 19%, 13%, and 11% respectively.

Fig. 26: The percentage of summary stage in cluster 2

For patients' status in this cluster, 97% are alive.

Fig. 27: The percentage of patient status in cluster 2

In summary, cluster 2 covers 8% of the data, and in it breast cancer is more common in age group G and among married women.
Most cases reported in Jordan are from Al-Zarqaa. Besides, most of these breast cancer patients are non-smokers, and most cases belonged to topography 504 (Upper outer quadrant of breast). The predominant morphology is 8500. Generally, the summary stage was 9, followed by 4. Most of the patients studied were alive.

The SOM results, produced using Matlab 2012, comprised four clusters. Clusters 1, 2, 3, and 4 held 32%, 29%, 10%, and 29% of the data respectively. For age, the highest percentage was 25% for category D in cluster 1, 22% for category E in cluster 2, 23% for category D in cluster 3, and 21% for category F in cluster 4.

Fig. 28: Representation of the age percentage in all clusters

For marital status, the percentage of married women was 93.84% in cluster 1, and 94%, 93.86%, and 95.29% in clusters 2, 3, and 4 respectively.

Fig. 29: Representation of the marital status percentage in clusters

For address, most patients are from Amman: 62% in cluster 1, 52% in cluster 2, 59% in cluster 3, and 52% in cluster 4.

Fig. 30: The percentage of addresses in clusters

For smoking status, non-smoking patients accounted for 89% in cluster 1, and 86.97%, 83.63%, and 90% in clusters 2, 3, and 4 respectively.

Fig. 31: The percentage of smoking status

For topology, the highest percentage was 93% for 509 (Breast, NOS) in cluster 4. Cluster 1 had 78% for 509, cluster 3 had 60% for 509, and cluster 2 had 59% for 504 (Upper outer quadrant of breast).

Fig. 32: The percentage of topology

For morphology, morph 8500 has the highest percentage: 82% in cluster 1, 80% in cluster 2, 79% in cluster 3, and 77.9% in cluster 4.

Fig. 33: The percentage of morphology

For the summary stage, the highest percentage is 31% for stage 3 in cluster 1, 29% for stage 3 in cluster 2, 47% for stage 5 in cluster 3, and 72% for stage 9 in cluster 4.

Fig. 34: The percentage of summary stage

Regarding the status of patients, alive patients accounted for 100% in cluster 1, 99% in cluster 2, 63% in cluster 3, and 100% in cluster 4.

Fig. 35: The percentage of patient status in cluster 2

In summary, across the SOM clusters breast cancer is found to be commonest in the early-age group. Mostly married women were affected. Most of the reported cases were from Amman, and most of the breast cancer patients were non-smokers. Most cases belonged to topographies 509, 508, and 504. The predominant morphology was 8500. The summary stage was 3 for clusters 1 and 2, stage 5 for cluster 3, and stage 7 for cluster 4. Mostly alive patients were studied, as opposed to deceased patients.

Conclusions

This study has presented definitions of some basic notions in the KDD field. A primary aim was to clarify the relation between knowledge discovery and data mining, with results illustrated using data from the Jordan Cancer Registry. We provided an overview of the KDD process and basic data mining methods. Given the broad spectrum of data mining methods and algorithms, our brief overview is inevitably limited in scope: there are many data mining techniques, particularly specialized methods for particular types of data and domains. Although various algorithms and applications may appear quite different on the surface, it is not uncommon to find that they share many common components. Understanding data mining and model induction at this component level clarifies the task of any data mining algorithm and makes it easier for the user to understand its overall contribution and applicability to the KDD process. Therefore this thesis represents a step towards a common framework that we hope will ultimately provide a unifying vision of the common overall goals and methods used in KDD.
We hope this will eventually lead to a better understanding of the variety of approaches in this multi-disciplinary field and how they fit together. The comparative study of multiple prediction models for breast cancer survivability using a large dataset along with 10-fold cross-validation provided us with an insight into the relative prediction ability of different data mining methods. Using sensitivity analysis on neural network models provided us with the prioritized importance of the prognostic factors used in the study.

Future Work

Breast cancer is known to be one of the deadliest diseases for women, and several data mining techniques have been used to support proper diagnosis. Data mining has changed many research sectors, especially the health care industry, by providing important information through clustering among other techniques. Cancer clustering in Jordan has been lacking for several years despite the wide availability of data. In future, serious investment in this kind of study is needed to help the health sector detect the prevalence of cancer in Jordan early. The results gained from this study can serve as a basis for future studies and will provide useful information about breast cancer cases in Jordan. Additionally, future studies dealing with breast cancer can be designed based on the outcome of the current study.

References

Abreu, P. H., Amaro, H., Silva, D. C., Machado, P., & Abreu, M. H. (2014). Personalizing breast cancer patients with heterogeneous data. In The International Conference on Health Informatics, Springer International Publishing, 39-42.
Banerjee, A., & Shan, H. (2010). Model-Based Clustering. Encyclopedia of Machine Learning, 686-689.
Berk, R. A. (2010). Data Mining within a Regression Framework. US: Springer.
Berkhin, P. (2006). A survey of clustering data mining techniques. Grouping multidimensional data (pp. 25-71). Berlin Heidelberg: Springer.
Bittern, R., Dolgobrodov, D., Marshall, R., Moore, P., Steele, R., & Cuschieri, A. (2007). Artificial neural networks in cancer management. Science All Hands Meeting, 251-263.
Cakir, A., & Demirel, B. (2011). A software tool for determination of breast cancer treatment methods using data mining approach. J Med Syst, 35(6), 1503-1511. doi: 10.1007/s10916-009-9427-x
Coughlin, S. S., & Ekwueme, D. U. (2009). Breast cancer as a global health concern. Cancer Epidemiol, 33(5), 315-318. doi: 10.1016/j.canep.2009.10.003
DeSantis, C., Siegel, R., Bandi, P., & Jemal, A. (2011). Breast cancer statistics, 2011. CA Cancer J Clin, 61(6), 409-418. doi: 10.3322/caac.20134
Djebbari, A., Liu, Z., Phan, S., & Famili, F. (2008). International journal of computational biology and drug design (ijcbdd). Conference on Neural Information Processing Systems.
Džeroski, S., & Lavrač, N. (1996). Rule induction and instance-based learning applied in medical diagnosis. Technology and Health Care, 4(2), 203–221.
Filipczuk, P., Kowal, M., & Obuchowicz, A. (2011). Automatic breast cancer diagnosis based on K-means clustering and adaptive thresholding hybrid segmentation. In Image Processing and Communications Challenges, Springer Berlin Heidelberg, 3, 295-302.
Filipczuk, P., Kowal, M., & Obuchowicz, A. (2012). Breast fibroadenoma automatic detection using k-means based hybrid segmentation method. Biomedical Imaging (ISBI), 9th IEEE International Symposium, 1623-1626.
Fürnkranz, J. (2014). Association rule discovery. Retrieved 31 August, 2014.
Han, J., & Kamber, M. (2006). Data Mining: Concepts and Techniques (2nd ed.). USA: Morgan Kaufmann Publishers.
Hotho, A., Nürnberger, A., & Paaß, G. (2005). A Brief Survey of Text Mining. KDE Group, University of Kassel.
Ilango, M. R., & Mohan, V. (2010). A survey of grid based clustering algorithms. International Journal of Engineering Science and Technology, 2(8), 3441-3446.
Kantardzic, M. (2011). Data mining: concepts, models, methods, and algorithms. John Wiley & Sons.
Karabatak, M., & Ince, M. C. (2009). An expert system for detection of breast cancer based on association rules and neural network. Expert Systems with Applications, 36(2), 3465-3469.
Koh, H. C., & Tan, G. (2005). Data mining applications in healthcare. J Healthc Inf Manag, 19(2), 64-72.
Kriegel, H. P., & Pfeifle, M. (2005). Hierarchical density-based clustering of uncertain data. In Data Mining, Fifth IEEE International Conference, IEEE, 4.
Kuo, R. J., Chao, C. M., & Chiu, Y. (2011). Application of particle swarm optimization to association rule mining. Applied Soft Computing, 11(1), 326-336.
Kusiak, A., Kernstine, K. H., Kern, J. A., McLaughlin, K. A., & Tseng, T. L. (2000). Data mining: medical and engineering case studies. In Proceedings of the industrial engineering research conference, Chicago, 21-23.
Larose, D. T. (2005). Discovering Knowledge in Data: An Introduction to Data Mining. John Wiley & Sons, Inc.
Liu, Y.-Q., Wang, C., & Zhang, L. (2009). Decision Tree Based Predictive Models for Breast Cancer Survivability on Imbalanced Data. International Conference on Bioinformatics and Biomedical Engineering.
Lokanatha, C. R. (2011). A Review on Data mining from Past to the Future. International Journal of Computer Applications, 15(7), 19-22.
Mansour, N., Zantout, R., & El-Sibai, M. (2013). Mining breast cancer genetic data. Natural Computation (ICNC), Ninth International Conference, IEEE, 1047-1051.
May, R. J., Maier, H. R., & Dandy, G. C. (2010). Data splitting for artificial neural networks using SOM-based stratified sampling. Neural Netw, 23(2), 283-294. doi: 10.1016/j.neunet.2009.11.009
Meshram, I. I., Hiwarkar, P. A., & Kulkarni, P. N. (2009). Reproductive Risk Factors for Breast Cancer: A Case Control Study. Online Journal of Health and Allied Sciences, 8(3).
Moftah, H. M., Azar, A. T., Al-Shammari, E. T., Ghali, N. I., Hassanien, A. E., & Shoman, M. (2014). Adaptive k-means clustering algorithm for MR breast image segmentation. Neural Computing and Applications, 24(7), 1917-1928.
Padhy, N., Mishra, P., & Panigrahi, R. (2012). The Survey of Data Mining Applications And Feature Scope. International Journal of Computer Science, Engineering and Information Technology, 2(3), 43-58.
Patel, B. C., & Sinha, D. G. (2010). An adaptive K-means clustering algorithm for breast image segmentation. International Journal of Computer Applications, 10(4), 35-38.
Piateski, G., & Frawley, W. (1991). Knowledge discovery in databases. MIT Press.
Pinto, H., Han, J., Pei, J., Wang, K., Chen, Q., & Dayal, U. (2001). Multi-dimensional sequential pattern mining. In Proceedings of the tenth international conference on Information and knowledge management, ACM, 81-88.
Rokach, L., & Maimon, O. (2005). Clustering methods. Springer US.
Sandhya, G., Gowda, S., Swamy, L. N., Raju, G. T., & Vasumathi, D. (2013). Automated detection of cancer tissues in mammograms using advanced K-Means clustering with homomorphic filtering. Circuits, Controls and Communications (CCUBE), International conference, IEEE, 1-4.
Sharma, R., Alam, M. A., & Rani, A. (2012). K-Means Clustering in Spatial Data Mining using Weka Interface. International Journal of Computer Applications, International Conference on Advances in Communication and Computing Technologies, 26-30.
Shaykhian, G. A., Martin, D., & Beil, R. (2011). Improve Data Mining and Knowledge Discovery through the use of MatLab. Sixth International Conference on Dynamic Systems and Applications, USA.
Latif, S. M., Baqer, H. H., & Habib, K. D. (2009). Study of Risk Factors for Breast Cancer in A Hundred Breast Cancer Patients. The Iraqi Postgraduate Medical Journal, 8(4).
Tamberi, F. (2007). Anomaly Detection.
Tolun, M. R., & Abu-Soud, S. M. (1998). ILA: an inductive learning algorithm for rule extraction. Expert Systems with Applications, April 1998, 361–370. Elsevier.
Vesanto, J., Himberg, J., Alhoniemi, E., & Parhankangas, J. (1999). Self-organizing map in Matlab: the SOM Toolbox. Proceedings of Matlab DSP Conference, 35-40.
Witten, I. H., Frank, E., & Hall, M. A. (2011). Data Mining: Practical machine learning tools and techniques (3rd ed.). San Francisco.
Xu, J., Xu, T., Sun, L., & Ren, J. (2013). An Improved Correlation Measure-based SOM Clustering Algorithm for Gene Selection. Journal of Software, 8(12), 3082-3087.
Yeung, K. Y., Fraley, C., Murua, A., Raftery, A. E., & Ruzzo, W. L. (2001). Model-based clustering and data transformations for gene expression data. Bioinformatics, 17(10), 977-987.
Yoo, I., Alafaireet, P., Marinov, M., Pena-Hernandez, K., Gopidi, R., Chang, J. F., & Hua, L. (2012). Data mining in healthcare and biomedicine: a survey of the literature. J Med Syst, 36(4), 2431-2448. doi: 10.1007/s10916-011-9710-5