A Review of an Existing IR System - Essay Example

A REVIEW OF AN EXISTING IR SYSTEM

The aim of the present report is to identify the place of information retrieval systems in the task of information retrieval, as the need for and use of information from the internet continues to be an important phenomenon in academic and professional practice. Although there are a great many sources of information on the internet, a person needs a practical understanding of how each of these sources, particularly an information retrieval system, works in order to get the best out of the process of information retrieval. The researcher also acknowledges that each of the available information retrieval systems has its own strengths and weaknesses, which make it appropriate in some contexts and inappropriate in others. With this in mind, the report gives a detailed, critical description of a selected information retrieval system, which readers of the report can use as a basis for decisions about that system. The description is organised according to how the system allows specific information retrieval tasks to be performed. Thereafter, the findings of the study are outlined, dwelling mainly on the strengths, weaknesses and areas of improvement for the selected system. Finally, every act of information retrieval requires a corresponding information source, which leads to the next topic of this discussion.

Introduction to Information Retrieval and selected IR system

The internet continues to dominate as a tool for the search, retrieval and storage of information. Indeed, with the coming of the internet, there is now a central point where almost every kind of information and data can be found (Ando and Tong, 2005a).
This has been made possible by the easily accessible nature of the internet, where anyone can post information online from the comfort of home. Commonly, the same information exists online in several sources, so that when a person wants to use a particular piece of information, the options to choose from are more than required. Apart from the sheer number of options, there are also differences in source quality: some sources of information can be shown to be more reliable, authentic and valid than others (van Rijsbergen, 2009).

Han and George (2000) have explained information retrieval as a conscious activity aimed at obtaining information resources that are highly relevant to a person's information need from the available information sources. This definition gives a broad overview of the concept, and two points in it deserve elaboration. In the first place, information retrieval does not take place by accident but as an intentional process, hence the word 'conscious'. In effect, people who perform information retrieval sit before their computers and other internet media with the aim of finding something useful for further processing. Secondly, the definition establishes that information retrieval must lead to information that is highly relevant to the searcher's quest. In other words, information retrieval always leads to a pool of options for the person undertaking the search. That person therefore has a role to play when presented with the pool: to take part in the search by selecting only what is most relevant to the need.
Description of selected IR system

This section of the report describes the functionalities that PubMed has developed to support its users in their searches. These consist of five user interface functionalities.

Standard searches

The PubMed information retrieval system has a well laid out, easy-to-identify search panel where users can type a keyword or combination of keywords to search for in the system's database. This search panel is indicated in the diagram below. The standard search on PubMed is designed so that the user does less while the system does more. Based on the keywords entered, the system translates the initial search formulation by automatically adding components that help the user identify the best options more quickly (Baeza-Yates and Berthier, 2009). These automatic additions include field names, Medical Subject Headings (MeSH) terms, Boolean operators and synonyms, among others. An example of the expanded and enhanced form of a standard search is also presented below. From the diagram above, it can be seen that after typing obesity in children, the search has been refined to include as many of the options the user is likely to need as possible. Technically, the refining is done by combining text words with MeSH terms (Pagallo and David, 2010).

Journal Article Parameters

Another distinctive characteristic of the PubMed information retrieval system is that it not only focuses on keywords and topics but also stores and presents a stock of journal articles. As a science and medicine oriented IR system, its journal articles are, as one would expect, focused on science and medicine related topics.
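The translation step described under Standard searches, in which text words are combined with MeSH terms through Boolean operators, can be illustrated with a minimal sketch. It is not PubMed's actual term-mapping algorithm: the lookup table, the stopword list and the function names are hypothetical and exist only for illustration, while the field tags [MeSH Terms] and [All Fields] follow PubMed's search-tag notation.

```python
# Hypothetical mapping from free-text words to MeSH headings.
# Illustrative only; not loaded from the real MeSH vocabulary.
MESH_LOOKUP = {
    "obesity": "obesity",
    "children": "child",
}

# Small, assumed stopword list so that "in" is not searched as a term.
STOPWORDS = {"a", "an", "and", "in", "of", "the"}


def expand_term(term: str) -> str:
    """OR a text-word search with its MeSH heading when one is known."""
    text_word = f'"{term}"[All Fields]'
    heading = MESH_LOOKUP.get(term)
    if heading is None:
        return text_word
    return f'("{heading}"[MeSH Terms] OR {text_word})'


def expand_query(query: str) -> str:
    """Drop stopwords, expand each remaining word, and AND the parts together."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    return " AND ".join(expand_term(t) for t in terms)


print(expand_query("obesity in children"))
```

The point of the sketch is the division of labour the essay describes: the user types three plain words, and the system supplies the field names, the subject headings and the Boolean operators.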
What is perhaps more significant, and what sets PubMed apart from other search engines, is that the system has its own parameters for making the search and use of journal articles easier (Koller and Mehran, 2007). This feature gives the PubMed IR system a characteristic of the traditional library, where books are arranged according to specialised indexes or a cataloguing system. Here, however, the indexing and cataloguing are more comprehensive. Instead of indexing only the broad topic area of a book, so that the reader must gather as many books as possible to find out which of them contain the articles needed for a given purpose, the journal article parameters make it possible to locate articles directly. The parameters used for the search include article type, keywords, country, language, publication history and format of publication (Kudenko and Haym, 1998).

My NCBI

Personalisation and customisation on IR systems have rarely been as easy as PubMed makes them with the My NCBI portal. This is a specialised feature that allows enhanced use of the website by users who register as members of the system. Registration is free and open to the public. Once a person registers, an account is created and the My NCBI portal is enabled. Among the functionalities that registered members enjoy is the ability to save the searches they perform. With saved searches, users do not need to start previous searches all over again (van Rijsbergen, 2009). Clearly, this feature saves time and makes use of the website more efficient. The customised access also allows users to filter their search results. In addition, users can set up an automatic update request that is sent by e-mail on a periodic basis.
This way, users stay up to date with the IR system and learn of every new feature that is introduced (Nigam, McCallum, Thrun and Mitchell, 2000). Recognising the importance of references, the site also allows users to save a set of references used on the site and to configure display formats at their discretion.

askMEDLINE

Apart from the personalisation and customisation features, the system also has a well-developed interactivity feature in askMEDLINE. As the name implies, this feature allows users to pose a professional question related to the field of health and medicine and get answers from a professional perspective. This feature has been praised for making PubMed more of an interactive educational website than an ordinary search engine (Dumais, John, David and Mehran, 2008). This is because the site goes beyond the traditional duty of helping the user find a set of information and, through the askMEDLINE portal, helps users acquire in-depth knowledge of the subject areas they search. What makes this feature more unusual is that, while other websites provide professional knowledge only on payment of fees, askMEDLINE is a free-text query tool available to all users as long as they are ready to register and be part of the identified user base of the website.

Critique of selected IR system

PubMed is a typical example of an information retrieval system that has gained much popularity over the years for the enhanced retrieval services it offers. Generally, PubMed can be described as a free online information retrieval system, or more precisely a search engine, that gives users access to the MEDLINE database of references and abstracts (Turney and Michael, 2002).
It should be noted that MEDLINE, which stands for Medical Literature Analysis and Retrieval System Online, is a database predominantly made up of life sciences and biomedical topics and maintained by the United States National Library of Medicine (NLM). As said earlier, the internet is a place where everyone is free to put up information for access by the public. For anyone who wants to be critical and sure of the validity of the information he works with, the credibility of the source is therefore very important. That PubMed is maintained by none other than the United States National Library of Medicine goes a long way towards explaining its credibility as an information retrieval system. Later in the paper, there will be a detailed description of the information system and how it facilitates the process of information retrieval. First released in January 1996, PubMed has since served the public with features that include standard searches, the PubMed identifier, comprehensive searches, askMEDLINE, journal article parameters, LinkOut, related articles, My NCBI and many others (Ando and Tong, 2005b).

The Stanford Natural Language Processing Group (2008) offers three major variables for evaluating an information retrieval system, which are borrowed here to critique PubMed as a contemporary information retrieval system. The three variables are presented as the components of a test collection, the first of which looks at the IR system as a document collection. From this perspective, the question of relevance and irrelevance is asked of the available documents. Clearly, PubMed cannot be said to lack sufficient documents. However, its coverage is limited to a specific subject area, namely health and medicine. As far as the sciences and biomedical topics are concerned, therefore, one is certain to find almost all areas of the subjects fully covered (Giles, 2005).
Where a user's search falls outside the sciences and biomedical topics, there is no guarantee of finding any results. Within the sciences and biomedical topics, however, PubMed continues to be a world-acclaimed IR system where highly relevant data can be found. It is not surprising that several universities across the globe recommend PubMed to their students as an accepted source of articles and journals for research work (Etzioni et al, 2004). The key to this achievement may lie in the management of the IR system, which is carried out by a credible and reputable organisation, the United States National Library of Medicine (NLM) at the National Institutes of Health (John, Ron and Karl, 2004).

The second focus of evaluation is the test suite of information needs, expressible as queries. What this seeks to capture is the ability of users to subject the information base of the system to scrutiny through queries. Put differently, a credible IR system is one that allows its information to be interrogated. This is particularly important because most sources of information on the internet are open to the public, making it possible for anyone to post information from the comfort of home and claim it to be reliable. Considering the number of text inputs that some popular IR systems receive in a day, particularly encyclopaedia-based systems, the absence of a test suite of information needs would leave the user wondering whether the information presented comes from a credible source or a made-up one (Goldberg and Xiaojin, 2006). With PubMed, this is observed in two major ways. In the first place, the askMEDLINE feature makes it possible for users to interact freely with the managers of the system on the science and biomedical topics presented on the site that trouble them.
In addition, a feature of the website named Related Articles allows the user to probe the information presented on the site by consulting suggested related articles, some of which are held on sources external to PubMed (Dumais, John, David and Mehran, 2008).

The last criterion concerns the availability of a set of relevance judgments that makes it possible for third-party, neutral professional opinions to be formed about the database of the system. What is distinctive about this criterion is that it does not look at how the administrators or managers of the site justify the relevance of what they present, but at the freedom given to third parties to do so (Fawcett, 2001). In a typical organisational setting, this may be likened to a quality assurance or quality control measure that checks the quality of the organisation's output. For PubMed, no single platform of this nature can be found that gives such exclusive relevance judgments on the information available at the site. However, there are a number of partial and indirect ways in which this need is served.

Generally, three major strengths stand out when PubMed is discussed: relevance of information, interactivity and customisation. All three are features needed to make the journey through any information retrieval system user-friendly and user-centred. With the growing urge among the academically minded to be part of the internet revolution, it is important that there are places of convenient academic search where users have no cause to wonder about the authenticity of the information they come across.
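The three components used in this critique (a document collection, a set of information needs expressed as queries, and relevance judgments) matter because together they make retrieval effectiveness measurable. The sketch below computes precision and recall for one query. The document identifiers, the judged-relevant set and the ranked results are invented for illustration and are not real PubMed data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers the system returned for the query
    relevant:  identifiers judged relevant by independent assessors
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up relevance judgments and system output for a single query.
relevance_judgments = {"obesity in children": {"doc1", "doc3", "doc7", "doc9"}}
system_results = {"obesity in children": ["doc1", "doc2", "doc3", "doc4", "doc7"]}

query = "obesity in children"
precision, recall = precision_recall(system_results[query], relevance_judgments[query])
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.60 recall=0.75
```

In this toy case three of the five returned documents are relevant (precision 0.60) and three of the four relevant documents were found (recall 0.75). Without independently produced relevance judgments neither figure could be computed, which is why the absence of a dedicated judgment platform is treated as a gap in the critique.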
For new users, the fact that they have some command and control over the system, and can decide for themselves what it should include, is a step in the right direction, because it gives them the freedom to learn at their own pace. These strengths notwithstanding, it remains a major weakness that PubMed has not stepped outside its specialisation to become a more open-ended information retrieval system in which users are not limited to the sciences and biomedical topics. What is more, it is high time the system had an independent quality assurance division responsible for checking the site for quality issues. For instance, there is a feature on the website named 'LinkOut'. With this, a collection of full-text databases is drawn from as many as 3,200 sites, most of which are tertiary institutions (Goldberg and Xiaojin, 2006). These institutions are sampled from across the globe, which allows a form of inter-institutional critiquing and appraisal of information. In effect, each institution becomes a keeper of the others, creating a system of checks and balances which ensures that consensus on the contents presented on the site is formed not by the hosts of the IR system but by third-party users. Even though this is a positive move, there could still be a more comprehensive independent third party to undertake this form of quality assurance. This would be a means of making the system better.

Future Improvements

As established above, information retrieval cannot take place in the absence of an information source. Technically, this information source is referred to as an information retrieval system.
Indeed, 'information retrieval system' is the more appropriate term, as it incorporates the processes and technicalities that come into play in making the task of information retrieval possible (Mladenic and Marko, 2008). Generally, a system is a set of independent components or entities that are put together to perform a common function or a single role (Urena-Lopez, Buenaga and Gomez, 2001). In an information retrieval system, the independent functionalities that come together to make up the system include the sorting of keywords, synchronisation of dates, formulation of exclusion criteria and terminology highlighting. When all of these are put together, there are a number of tasks that the searcher is enabled to perform. Later in the paper, three of these tasks will be critically reviewed in relation to a selected information retrieval system. Commonly, though, a person undertaking a search will identify tasks such as factual search, known-item search, search for instructions, search for description, location search and finding introductory material. Because a typical information source can perform all of these roles, it is referred to as an information retrieval system. There is, moreover, an issue with the quality of the content found on individual sources, even those identified as the most credible. Because of these differences between searches and sources, information retrieval has always been necessary for people seeking information from online sources.

Based on all the findings that have been made, it can be concluded that PubMed is a highly credible information retrieval system offering users comprehensive features that make the task of information retrieval worth engaging in.
Before such credible sources of information existed, users of the internet could not be sure of the credibility of the information they gathered from various websites. However, information retrieval systems such as PubMed continue to show that the internet is not only a place for junk information (Han and George, 2000). If nothing else, the critique of the system using the three criteria given by the Stanford Natural Language Processing Group can be used to score PubMed highly for its credibility. A major recommendation for the system, going into the future, is that its interactivity be further enhanced. Even if this should come at a fee, a more instantaneous response to the academic queries posed by users would be a ground-breaking innovation for an IR system. If IR systems exist for searching, then this should come naturally, because it would be an innovative means by which members of the public search for knowledge. These improvements would make PubMed the best search system.

References

Ando, R. K. and Tong Z. (2005a). Framework for learning predictive structures from multiple tasks and unlabeled data. JMLR, 2(43) 1817-1853.
Ando, R. K. and Tong Z. (2005b). A high-performance semi-supervised learning method for text chunking. In Proceedings of the 43rd Annual Meeting of the ACL, 3(12) 1-9, Ann Arbor, MI, June.
Baeza-Yates, R. and Berthier R. (2009). Modern Information Retrieval. New York, NY: Addison Wesley.
Dumais S, John P., David H. and Mehran S. (2008). Inductive learning algorithms and representations for text categorization. In CIKM'98 3(4) 148-155.
Etzioni, Oren, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel Weld, and Alexander Yates. (2004). Webscale information extraction in knowitall (preliminary results). In Proceedings of the 13th International World Wide Web Conference (WWW'04), New York, USA, May. ACM Press.
Fawcett, Tom. (2001). Feature discovery for inductive concept learning. Technical Report COINS Technical Report 91-8, Department of Computer and Information Science, University of Massachusetts, Amherst, Massachusetts.
Giles, J. (2005). Internet encyclopaedias go head to head. Nature, 438:900-901.
Goldberg, A. and Xiaojin Z. (2006). Seeing stars when there aren't many stars: Graph-based semi-supervised learning for sentiment categorization. In Workshop on Textgraphs: Graph-based Algorithms for Natural Language Processing, HLT-NAACL 2006.
Han, E. and George K. (2000). Centroid-based document classification: Analysis and experimental results. In PKDD'00, 4(3) 34-54.
John, G. H., Ron Kohavi, and Karl P. (2004). Irrelevant features and the subset selection problem. In Proceedings of the 11th International Conference on Machine Learning, pages 121-129.
Koller, D. and Mehran S. (2007). Hierarchically classifying documents using very few words. In Proceedings of the 14th International Conference on Machine Learning, pages 170-178.
Kudenko, D. and Haym H. (1998). Feature generation for sequence categorization. In Proceedings of the 15th Conference of the American Association for Artificial Intelligence, 3(5) 733-738.
Pagallo, G. and David H. (2010). Boolean feature discovery in empirical learning. Machine Learning, 5(1):71-99.
Nigam, K., McCallum A., Thrun S. and Mitchell T. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2-3):103-134.
Mladenic, D. and Marko G. (2008). Word sequences as features in text-learning. In Proc. of ERK-98, 7th Electrotechnical and Computer Science Conference, pages 145-148.
Turney, P. and Michael L. L. (2002). Unsupervised learning of semantic orientation from a hundred-billion-word corpus. Technical Report ERB-1094, National Research Council Canada, May.
Urena-Lopez, L. A., Buenaga M. and Gomez J. M. (2001). Integrating linguistic resources in TC through WSD. Computers and the Humanities, 35:215-230.
van Rijsbergen, C. J. (2009). Information Retrieval. 2nd edition. London: Butterworths.
The Stanford Natural Language Processing Group (2008). Information Retrieval System Evaluation. http://nlp.stanford.edu/IR-book/html/htmledition/information-retrieval-system-evaluation-1.html