Scheme Evaluation & Future Direction - GroupBased Scheme

8.4 The Proposed Scheme: General Evaluation

This section provides an overall evaluation of the proposed scheme based on the research hypothesis defined in Chapter 1, namely: “Applying a second layer of labels and grouping the nodes based on the parent-child relationship may facilitate node insertions in dynamic XML data in an efficient way, offering inexpensive labels with an adequate label size growth rate in which it is easy to maintain structural relationships, as well as improved query performance.” The main experimental findings are then highlighted.

The GroupBased scheme was designed to improve on XML labelling by handling insertions without re-labelling and without sacrificing query performance, construction time or memory usage. The overall rationale was that certain core qualities of existing XML labelling needed to be maintained in the new scheme, because of the advantages they offer to data interchange programming. As already mentioned, these qualities include query performance, construction time and memory usage (Fennell, 2013). To obtain a real measure of the GroupBased scheme’s merits, however, a baseline was needed against which it could be compared. The Dynamic Dewey labelling scheme (DDE) was therefore introduced, and the same experiments were run on it to allow a comparable evaluation under the same circumstances.

To test the research hypothesis, the scheme was implemented based on the defined rules and characteristics (Chapter 4). The design and implementation specifications were provided in detail in Chapter 5. As explained in the earlier chapters, the DDE scheme was also implemented, as it contributed to the formation of the proposed scheme.
To evaluate the performance of the proposed scheme, four main experiments were performed to test whether the scheme fulfilled its intentions. The experimental framework and an analysis of the results were discussed in Chapter 6 and Chapter 7. In general, it is fair to state that the research hypothesis was partly supported by the results, with some results being fully supportive. The hypothesis tested three major outcomes as far as performance is concerned: that the scheme facilitates node insertions in an efficient way, that it offers inexpensive labels, and that it achieves improved query performance.

Regarding the first outcome, facilitating node insertions in dynamic XML data in an efficient way, the results in Chapter 6 and Chapter 7 showed that, although time consumption increased to different degrees across the nodes’ levels of relationship, the GroupBased scheme performed better than the DDE scheme in determining different relationships in static form, with a time saving of up to 1.5%. For node insertions to be efficient, however, many insertions must be possible within a very short time frame (Murata, Kohn and Lilley, 2009). In this regard, the first component of the hypothesis was slightly supported, with the GroupBased scheme outperforming the DDE scheme in terms of determining relationships.

The second component of the hypothesis sought to offer inexpensive labels with an adequate label size growth rate in which the structural relationships are easily maintained. To measure this, the experiment focused on the growth recorded in label size in terms of memory allocation.
The findings showed only a slight difference in the change in label size between the GroupBased scheme and the DDE scheme. This indicates that the hypothesis can be accepted in this context. It is worth noting, however, that the difference, merely 0.002%, was lower than expected, given that the GroupBased scheme’s label is made up of two labels while that of the DDE scheme has only one.

The last aspect of the hypothesis focused on improvement in query performance. This was a very important aspect of the whole experiment, given the role that query performance plays in query response time. Again, the emphasis was on a comparative analysis of how effective the GroupBased scheme was relative to the DDE scheme. Of the 19 queries used, the results showed that different queries achieved different times. On a comparative basis, however, the GroupBased scheme gave better response times, as shown in Chapter 7: the DDE scheme outperformed the GroupBased scheme on only 5 queries. The hypothesis can therefore be accepted on the grounds of improved query performance as well.

Evaluating the hypothesis testing described above, it can be stated that the scheme’s implementation worked as intended and was superior to other similar labelling schemes in many comparable time-related aspects. Nevertheless, the scheme was found to be inferior in some aspects, such as the labels’ sizes and some time-related features. As a consequence of the evaluation process, some changes could be made to improve the scheme’s efficiency. From a design point of view, the proposed scheme could be redesigned using another dynamic labelling scheme instead of the DDE scheme in order to improve the growth of the label.
However, the DDE scheme was chosen in the first place because of its simple implementation and its ability to extract the different relationships from the label, which was one contribution of this thesis. From an implementation perspective, the scheme was implemented as efficiently as possible based on its design. The implementation was nevertheless complex because the proposed label is formed of two labels, each of which requires a number of processes; this could be considered a drawback. Also, although synthetic and real datasets were used in the experiments to cover most possible scenarios, the lack of resources and time restrictions limited the scalability experiments. The results gathered were nevertheless sufficient to analyse and evaluate the hypothesis. More datasets could be used to provide more analytical results.

8.4.1 The Main Experimental Findings

The most important finding confirms the hypothesis: the GroupBased scheme showed better time performance than the DDE scheme and other similar labelling schemes. The flexible and fast calculation of different relationships led to faster answering of queries. Despite the scheme’s complex implementation, calculating the labels was fast, which resulted in ‘uniform’ and ‘random skewed’ insertions being handled efficiently. Even though the labels’ sizes still grew slightly faster than the DDE labels, and the proposed scheme was more time-consuming when performing ‘ordered skewed’ insertions, it delivered better scalability by providing more consistent results before and after insertions. Moreover, the proposed scheme performed better on a deep-tree structure than on a wide-tree structure. The full research findings are described in the next chapter.

Following this general overview, the findings can be further evaluated along the two main objectives of the experiments, which were to check initial labelling time and label size.
In data interchange on the web and on any other platform, Rusty (2004) noted that debugging programs, storing small and large amounts of data, and providing scalability for configuration files are very important activities. It is for this reason that XML is most often relied upon to perform these functions. Cunningham (2005), however, emphasised that in XML the initial labelling time and the label size play crucial roles in efficiency and effectiveness respectively. For this reason, the GroupBased scheme was measured against the conventional DDE scheme on these two criteria.

The experimental findings on initial labelling time showed three major outcomes. The first is that both the GroupBased scheme and the DDE scheme showed significant time growth. This means that XML labelling as a whole does not readily guarantee lower turnaround times. The second is that, despite this limitation, the GroupBased scheme was more efficient than the DDE scheme: the time taken for initial labelling with the DDE scheme went up by 40% at the initial stage and rose to as much as 180% in later stages. The implication is that even if the GroupBased scheme does not reduce initial labelling time, using DDE would increase the time needed for initial labelling. The third is that file size plays an important role in initial labelling time, because the difference in initial labelling time between the GroupBased scheme and the DDE scheme kept increasing with file size. The implication is that, for the best initial labelling time, file size should be kept low and the GroupBased scheme should be used.
As far as label size is concerned, the experimental findings again showed that file size was an important determinant of the growth of the labels’ sizes. For both schemes, an increase in file size of 5MB was needed before there was a corresponding increase of 0.056MB in the label’s size. The growth of the label’s size can therefore be said to be marginal when compared to the file size. Between the two schemes, however, the growth of the label’s size was higher for the proposed GroupBased scheme, although the difference was only 0.002%. It is important to emphasise that an increase in label size for the GroupBased scheme was expected, but not at the low rate that was recorded. The increase was expected because, as shown in Chapter 4 and Chapter 5, two labels are used in the GroupBased scheme while only one is used in the DDE scheme. With such a twofold increase in labels, the increase in growth was expected to be even higher. For this reason the hypothesis was not rejected, despite the fact that the proposed GroupBased scheme brought about an increase in the growth of the label’s size.

8.5 The Consequences of Some Practical Decisions

Evaluating the main experimental findings, a strong case can be made for a relationship between time, size and performance. Sean (2006) argued that XML has several advantages over conventional data exchange formats, including HTML, that make it suitable and preferred. One of these qualities is that it allows multiple functionality because it accommodates dynamic add-ons and changes. The experimental findings, however, suggest that this attribute deserves a second look.
This is because, even though XML labelling may be able to accommodate extra load, overloading XML with additional schemes may lead to label sizes growing even faster and to processing becoming more time-consuming (Bosak and Bray, 1999). Future implementers may therefore have to choose between a feature-laden XML labelling programme that provides many functions at the cost of larger labels and longer processing times, and one that focuses on fewer functions to guarantee efficiency.

The tree structures on which the two schemes were run also have implications for implementers. In particular, there are aspects in which the proposed GroupBased scheme has advantages over the DDE scheme, and others in which the opposite is true. For example, the findings showed that performance on a deep-tree structure was roughly the same for the two schemes, so the choice of scheme for deep-tree structures could go either way. In terms of time, however, the proposed GroupBased scheme showed a clear advantage, offering better time performance on both wide-tree and deep-tree structures. Indeed, the difference in performance time between the two was as high as 165%. The issue that requires careful consideration when making a selection is that the DDE scheme produced more concise labels than the GroupBased scheme, so implementers have to be certain of their ultimate goal before choosing.

A DOM parser was used for the implementation of the proposed scheme, as discussed in Chapter 5.
This decision was made because of the features of the parser: it ensures that any section of the document can be easily accessed, allowing the XML tree to be modified effectively. The functionality of this parser also benefits the access and retrieval processes. However, one of the major disadvantages of DOM is that it is highly inefficient with regard to memory. It creates a tree of nodes that is held in memory and whose size reflects the size of the document, which can be especially problematic for a particularly large document. Parsing and labelling can consequently become more time-consuming and can run out of heap memory during tree loading, reducing the effectiveness and success of the operation.

The ArrayList suffers from a similar disadvantage. If the backing array is completely full, adding further elements requires more memory, often at a significant cost (approximately 1.5 times the original array size). The existing elements are copied from the old array into the new one, which is an O(n) operation. This issue is especially problematic because the number of labels cannot easily be estimated before the process takes place, as it varies between documents. In addition, inserting into an ArrayList disturbs the element positions: later elements need to be shifted to make room, even though addition and removal are otherwise efficient. This operation is highly time-consuming, especially when insertions are occurring.

8.6 Experimental Limitations

Although all the experiments worked and served their purposes, limitations were detected. These limitations arose as a consequence of the simple experimental design, which was selected to validate the scheme’s capabilities before extending it to a more complex level.
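The two ArrayList costs described in Section 8.5 (growth of the backing array and shifting on insertion) can be illustrated with a small sketch. It is not the thesis implementation: the class name `LabelStoreSketch`, the helper methods and the placeholder labels such as "1.3" are assumptions made purely for illustration, and the growth factor of the backing array is a detail of the JDK implementation rather than part of the `ArrayList` contract.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and label strings are hypothetical.
public class LabelStoreSketch {

    // Builds a list of placeholder labels "1.1", "1.2", ..., "1.count".
    // When presize is true the backing array is allocated once up front,
    // which avoids the repeated grow-and-copy steps that occur when an
    // ArrayList fills up. This only helps if the count is known in advance.
    public static List<String> buildLabels(int count, boolean presize) {
        ArrayList<String> labels = presize ? new ArrayList<>(count) : new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            labels.add("1." + i); // appending is cheap unless the array must grow
        }
        return labels;
    }

    // Inserting at an index shifts every later element one position to the
    // right, so the cost is proportional to the number of elements after it.
    public static void insertAt(List<String> labels, int index, String label) {
        labels.add(index, label);
    }

    public static void main(String[] args) {
        List<String> labels = buildLabels(5, true);
        insertAt(labels, 2, "new");
        System.out.println(labels); // prints [1.1, 1.2, new, 1.3, 1.4, 1.5]
    }
}
```

The result is the same with or without pre-sizing; only the number of internal reallocations differs. As the text notes, the number of labels is hard to estimate beforehand, so pre-sizing is not always available in practice.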
In general, all the experiments might be extended by using more datasets, more complex and varied queries, and different comparable schemes in order to obtain more elaborate results. The document size restrictions could be temporarily eased by using a more efficient platform; however, this will always be an issue as the data increases. Some calculation and storage approaches could also be improved to achieve better performance.

In order to adhere to the hypothesis set at the beginning of the study and restated in this chapter, the experiments had to be focused on, and limited to, testing the hypothesis. This was found to create a limitation in that the research could not be extended to XML document parsing and storage mechanisms. Laurent (2003), meanwhile, stressed that the storage component of XML has helped make it popular and preferable to other data interchange platforms. At this point, therefore, it is not certain whether the proposed GroupBased scheme has any effect on how XML documents are parsed and how the labels and their associated data are stored.

Moreover, although the approach was to avoid re-labelling when inserting new nodes, this could not be followed through entirely, because re-labelling was found to be required in cases where the structure of the XML document was changed. Sean (2006) confirms how rapidly the need to change XML documents arises in real-world scenarios. Nevertheless, the proposed scheme did not fully consider re-labelling. A further limitation is that some of the findings leave room for re-labelling to be required. A typical example of such a finding, backed by Bosak and Bray (1999), is the high exponential growth recorded in labelling time and label size.
Although the exponential growth and labelling time in the DDE scheme were much higher, it was expected that the proposed GroupBased scheme would bring the values much lower than those recorded in the study. Having said all this, the general scheme evaluation, which focused on testing the hypothesis, makes it possible to conclude that the study has fully tested the hypothesis and covered all the experimental aspects, particularly in terms of labelling time and label size.

8.7 Further Research Developments and Future Directions

Although the GroupBased scheme outperforms similar existing schemes in many aspects, it is not concise in terms of size, which needs further investigation. The novel idea in this thesis is that the GroupBased scheme is group-based and uses two labels instead of one. This facilitates improvement but, on the other hand, results in a more complicated scheme that is difficult to implement compared with the simple DDE scheme. However, some enhancements could be applied to the implementation and the experiments to achieve better performance. From an implementation perspective, as argued in the previous chapter, using ‘ArrayList’ to store the labels and ‘DOM’ as the parser resulted in inefficient memory usage; using other approaches might therefore improve the performance of the current development. From an experimental perspective, other datasets and more complex queries could be used in order to obtain more accurate results. This would be useful in the evaluation process and would help in highlighting future work on the technique, as well as pinpointing its limitations. These limitations could then be used to identify what needs to be added to the technique in the future to overcome them and enhance usability (Vlahavas, 1999).
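As one possible reading of the "other approaches" to parsing mentioned above, the sketch below uses the JDK's streaming StAX API to scan a document without building a DOM tree. This is an assumption for illustration, not something the thesis proposes or evaluates: the class name `StreamingCountSketch` and the idea of a counting pre-pass are hypothetical. Such a pass yields the number of elements and the maximum depth, which could be used to size a label store before labelling, at the cost of reading the document twice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch only; not the parser used in the thesis.
public class StreamingCountSketch {

    // Returns {elementCount, maxDepth}. Only the current event is held in
    // memory, so usage does not grow with document size as a DOM tree does.
    public static int[] scan(String xml) {
        int count = 0;
        int depth = 0;
        int maxDepth = 0;
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        count++;
                        depth++;
                        maxDepth = Math.max(maxDepth, depth);
                    } else if (event == XMLStreamConstants.END_ELEMENT) {
                        depth--;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
        return new int[] {count, maxDepth};
    }

    public static void main(String[] args) {
        int[] result = scan("<a><b><c/></b><d/></a>");
        // The count from the pre-pass can be used to pre-size the label list.
        List<String> labels = new ArrayList<>(result[0]);
        System.out.println("elements=" + result[0] + ", maxDepth=" + result[1]
            + ", labelsStored=" + labels.size()); // prints elements=4, maxDepth=3, labelsStored=0
    }
}
```

A streaming parser does not give random access to the tree, which the text identifies as the reason DOM was chosen, so this would complement rather than replace a structure that supports modification.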
Also, obtaining more results would facilitate comparisons with other existing techniques. Re-designing the GroupBased scheme using a different labelling scheme instead of the DDE scheme could improve efficiency or lead to new theory. Moreover, storing only one of the two labels and deriving the other when needed could improve memory usage and the time required to calculate the label. Investigating XML compression methods is the next direction to follow after this research, either to find a suitable compression technique that could be applied smoothly or to build a more suitable one that would preserve the scheme’s characteristics and provide better performance. As explained in this thesis, finding a labelling scheme that resolves all the issues is still a very challenging task and needs further investigation. Last but not least, as noted under the limitations, future researchers are expected to focus on ways in which their proposed schemes can handle XML documents whose structure changes while they are in use. It is hoped that, with such a focus, the problem of re-labelling will be properly addressed.

8.8 Conclusion

Evaluation is paramount in every software development process, although it is overlooked in some instances. It is important to evaluate any process or technique carefully in order to determine its limitations and to highlight future work to enhance usability. An evaluation of the experiments and their results was presented in this chapter, demonstrating the proposed scheme’s efficiency and scalability compared to the DDE scheme. An overall evaluation of the scheme was provided, along with the main findings of the experiments, while taking into account the simple experimental frameworks and the limited datasets used; this simplicity was intentional, in order to ensure that the scheme worked properly before extending it to further, more complex development.
Some suggestions of importance to the experiment were also mentioned briefly.

Reference list

Bosak, J. and Bray, T. (1999). "XML and the Second-Generation Web". Scientific American.
Cunningham, L. A. (2005). "Language, Deals and Standards: The Future of XML Contracts". Washington University Law Review. SSRN 900616.
Fennell, P. (2013). "Extremes of XML". XML London, 3(5): 80–86.
Laurent, S. (2003). "Five years later, XML". Washington University Law Review. SSRN 900616.
Murata, M., Kohn, D. and Lilley, C. (2009). "Internet Drafts: XML Media Types". Internet Engineering Task Force, 2(5): 75–89.
Rusty, H. E. (2004). Effective XML. Texas: Addison-Wesley.
Sean, K. (2006). "Making Mistakes with XML". Texas: Addison-Wesley.
