Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!newsfeed.sgi.net!news-xfer.netaxs.com!nntp.giganews.com!nntp.primenet.com!news1.mpcs.com!news.iinet.net.au!not-for-mail
From: david@cn.net.au (David Novak)
Newsgroups: sci.research,comp.infosystems.www.announce,comp.answers,sci.answers,news.answers
Subject: Information Research FAQ v.2.5 (Part 7/9)
Followup-To: poster
Date: 17 Apr 1998 00:00:00 GMT
Organization: iiNet Technologies
Lines: 433
Approved: news-answers-request@MIT.EDU
Message-ID: <6hfv1q$fvf$7@news.iinet.net.au>
NNTP-Posting-Host: gothic21.nv.iinet.net.au
Summary: Information Research FAQ: Resources, Tools & Training
Xref: senator-bedfellow.mit.edu sci.research:17452 comp.infosystems.www.announce:21918 comp.answers:30980 sci.answers:8160 news.answers:128177

Archive-name: internet/info-research-faq/part7
Posting-Frequency: monthly
Last-modified: Apr 17 1998
URL: http://cn.net.au
Copyright: (c) 1998 David Novak
Maintainer: David Novak

Information Research FAQ (Part 7/9)

This FAQ now continues to highlight other aspects of information research. This part of the FAQ is not duplicated in the website or Infokey shareware. This part is relatively concise, more of a discussion, an informative arm-chair read about the field and process of information research.

Note also the disclaimer statement in Part 1 of this FAQ.

Contents

----- Part 7 -----
31. More on the Internet as a research resource
32. More on the Commercial Information Sphere
33. More on the Information Service Industry
    33.1 judging information value
34. Emerging Trends in the information sphere
35. Education and Training in Professional Research
    35.1 Facts
    35.2 Practice
    35.3 Guidance
36. Question and Answer Section
    36.1 How do I find information on the Internet?
37. Acknowledgments

___________________________________________________
31. More on the Internet as a research resource

Let's agree the Internet is a great resource for surfing, but less valuable when you have a specific question to answer.
To find answers, we need to begin by understanding how information is arranged on the Internet. Contrary to myth, information is not disorganized but rather organized very carefully along clear patterns. Each pattern differs between the various forms the information may take. Further, awareness of information moves through several systems. Your understanding of the strengths and weaknesses of each pattern, each format, each system, will guide your search for information. I will share two insights here, then invite you to the website for more.

Insight One: Information tends to clump on the Internet, as with most resources, either by design or by simple habit.

The web is not the only source of information and often not the resource where the best information groups. If you routinely browse different Internet systems, you will find certain information is found primarily in certain systems. While much information is drifting to the web, this trend is far from complete. Which system is the dominant source for a given kind of information can usually be explained historically: websites, ftp-archives, online databases, software, telnet-databases, newsgroups, mailing lists, etc.

Insight Two: Information moves from the producers of information to the people who are seeking such information, and the way the information moves defines the resource.

This is far more general, and applicable to any information format. Let us use books as an example. Books are created by authors who have something to write. Books are printed and marketed by publishers to the bookstores, who then pass them on to the readers. Each facet of this process defines the resource. Books have quality, editorial vetting, sales value and a potentially lengthy preparation time.

Now let's look at FAQs. The best resource in the world on copyright law is the musings of a group of copyright lawyers who form the copyright mailing list.
The copyright FAQ supported by this group is a logical document which summarizes much of the discussion of this mailing list. FAQs are vetted by the news.answers team, automatically mirrored around the world, and read by millions. From its origins, the FAQ is a peer-reviewed document, often full of links to further resources, topical, knowledgeable, factual and few in number. Again, how the information is generated, organized and transmitted deeply affects the information.

As you search and surf the Internet, carefully note the address. This is the key. Certain qualities of information reside at commercial websites, government websites, or personal websites. Each tool (ftp, gopher, web) has certain identifiable qualities. Each system (FAQs, mailing lists, newsgroups, bureaucratic websites) has certain identifiable qualities. All this is delicately coded into the Internet address. Can you easily identify personal webpages from the address?

Your understanding of the relative qualities of information affects both the search process and your analysis of its value. This framework is very valuable when interacting with the Internet and cuts through much of the chaos which is the Internet. As I mentioned, http://cn.net.au/training/ discusses this further.

___________________________________________________
32. More on the Commercial Information Sphere

The commercial information sphere existed in the 1970s and earlier. It is far more developed, far better organized, far better funded, and almost always far more valuable and expensive than almost every other research resource.

Commercial information is arranged reasonably uniformly in large databases of full-text or bibliographic information. Some databases are small, single-source documents, while others are huge unfocused collections of resources. Most directories and journals can be made into a database, but single-source databases do not enjoy much financial success (except in a local market, as with newspapers).
To overcome this difficulty, single sources are grouped together into larger collections of databases on a particular topic. These larger database groups become the primary tool for commercial research.

Developing these databases requires the assistance and expertise of a range of skills. Sometimes this requires abstracting, interpreting, and, as with some Lexis-Nexis databases, even expert legal interpretation. Sometimes this is accomplished by large database developers with a range of databases in their portfolio. Sometimes this is accomplished privately. The marketing and consumer billing of such databases is then provided by a relatively small collection of very large database marketers. As an indication of the size of this market, Knight-Ridder sold Dialog & Datastar for a figure approaching half a billion dollars!

Thus, we have an industry consisting of a wide collection of players, each improving and developing the information from individual periodicals, journals, news items, etc. All very confusing for the end user, of course. This is elegantly illustrated by the database descriptions for Lexis-Nexis databases. (They prefer the term libraries. See http://www.lexis-nexis.com/lncc/sources/libcont/aust.html as an example.)

Luckily, there are actually very few large databases in existence. Many single sources exist in different commercial databases. The combinations are not endless, but they most certainly are difficult to understand. Further, different databases sometimes include different information from the same single source. One database may include just abstracts; another may have full text, chemical indexing and more. Most researchers are unfamiliar with what exactly is being searched.

This state of affairs is not unproductive. Searching a 'database about Australia' is uncomplicated. You receive information about whatever concerns Australia. It is simple, informative and incomplete.
This system gives rise to great customer loyalty to database marketers, a loyalty brought on by ignorance and obfuscation in the quest for simplicity.

Unfortunately, I am hard pressed to compare prices, let alone describe the differences between information products. Community Networking currently toils at this issue, and hopefully we will have something more next month. Our database of Commercial Database Descriptions may help - see http://cn.net.au

This system has distributed information for several decades. It is both sophisticated and quite difficult. You will need to become experienced with inverted indexes, search techniques (Boolean, truncation, proximity, field limits ...), and properly phrasing the question in a way which will be answered by a database search. I have always found the value of a database search directly proportional to the length of the query. If you are not fully skilled at research, you will take longer, pay more, and either locate far more information than necessary or unwisely discard more than you should. These skills are very different from searching Altavista and Webcrawler.

Doing your own research offers an opportunity to more closely influence the research process. Sometimes only you understand the topic and sometimes you can more quickly discard unimportant details. Certainly it is becoming simpler to undertake some of this work.

Many of the commercial databases are also available in a CD format. There are substantial subscription costs which limit their availability to large research institutions and libraries, though individual databases can be found in bookstores (I believe world books in print costs AU$5000+). Provided you can find casual access, it will cost you far less. Keep an eye on the age, though. Sometimes (and only sometimes) online information is more recent.
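As an aside on the search techniques named earlier in this section, the following is a minimal sketch in Python of an inverted index with Boolean and truncation searching. It is not how any particular commercial database is built; the three records and the function names build_index and lookup are invented purely for illustration, and proximity and field limits are left out.

```python
from collections import defaultdict

# Hypothetical three-record "database"; the titles are invented for illustration.
RECORDS = {
    1: "copyright law for publishers",
    2: "publishing databases in Australia",
    3: "Australian copyright cases",
}


def build_index(records):
    """Inverted index: map each lowercased word to the record ids containing it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index


def lookup(index, term):
    """Return matching record ids; a trailing '*' means truncation (prefix match)."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(stem):
                hits |= ids
        return hits
    return set(index.get(term, set()))


INDEX = build_index(RECORDS)

# Boolean operators map directly onto set operations.
both = lookup(INDEX, "copyright") & lookup(INDEX, "australia*")   # AND
either = lookup(INDEX, "law") | lookup(INDEX, "cases")            # OR
without = lookup(INDEX, "publish*") - lookup(INDEX, "copyright")  # NOT

print(both, either, without)
```

Each extra term narrows or widens the result set, which is one way to see why a longer, more carefully phrased query tends to return a more useful answer.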

The decision between undertaking research on your own or seeking external help is really a decision based on your research expertise, your budget, your access to information, your time, and the importance of finding all the information available. It also depends on your access to some decent research assistance. That is your decision.

What I do know is that a newcomer to the commercial information sphere will seriously underestimate the difficulty involved in searching, and underestimate both the cost of research and the cost of research assistance. Keep in mind this same system serves the needs of large commercial conglomerates, professional legal research, and well financed government studies. The commercial information sphere contains far more valuable information than you need. Often the Internet is just an interesting sneeze in comparison.

# Article: The Gale Directory of Databases (bi-annual in two volumes) includes a factual article as a foreword, which follows the development of this industry.
# Full text databases - by Carol Tenopir and Jung Soon Ro

___________________________________________________
33. More on the Information Service Industry

Private Detectives, Professional Database Researchers, Library Researchers, Legal Researchers, Commercial Database Producers, Commercial Database Marketers, Magazines, News Organizations, Libraries: this is a big industry. Information Research is just a process which links together those seeking information with those who provide it.

__ 33.1 judging information value

Information has value. It also has other qualities which will assist you to judge the value of information you may consider buying.

Accuracy: the factual nature of the information presented. If the statistics purport to show a particular trend - how large is the margin of error? How large is the sample size? How likely is it that factual errors crept in during their development?
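To put rough numbers on the margin-of-error and sample-size questions, here is a minimal sketch in Python. It assumes a simple random sample, a survey proportion, and the usual normal approximation at 95% confidence (z = 1.96); real surveys often violate these assumptions, and the function name margin_of_error is just an illustrative label.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: observed proportion (0..1), n: sample size,
    z: critical value (1.96 is the usual choice for 95% confidence).
    Assumes a simple random sample and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)


# The same 50% result is far less certain from 100 respondents than from 1000.
for n in (100, 1000):
    print(n, round(margin_of_error(0.5, n), 3))
```

Note that this covers sampling error only; it says nothing about a faulty collection procedure or about bias.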
The measurement of statistical error is now a refined science in some fields. A statistical result can be inaccurate when the sample size is too small, when the margin of error is too large, when the sample collection procedure is incorrect, or in a number of other situations.

Reliability: the support for trusting the solutions, both from additional resources and from being able to duplicate the conclusions. This includes the reputation of the researchers. No matter how inaccurate and biased you may believe certain facts to be, successful independent support of a suggested fact does improve its value. If facts cannot be duplicated, like cold fusion, they are of less value.

Bias: conscious or subconscious influences which affect information. Bias can occur in collection, preparation and presentation of information. Most information you find will be tainted. Secondary information is deeply affected. Statistics are not necessarily less biased.

We counter bias in several ways. Firstly, we try to be aware of bias. Where is bias likely? In which direction would the bias push the information? Secondly, we try to collect information which has different bias. This is why research based solely on government research, no matter how accurate and reliable, is less valuable. Often information from different countries can counter bias. Thirdly, we need to accept bias is likely to exist. This is why primary sources are often more valuable than secondary sources. This is why tertiary sources, like experts, are likely to be very biased.

Age: The date information was created or compiled will feature prominently in the value of information. Dates given sometimes mean the date information was created, or the date information was compiled. How old is a book compiled in 1995, which took the author 10 years to finish? I find statistics and forecasts often prominently display recent compilation dates yet still rely on old census data or the like to draw their conclusions.
Information on the Internet typically has no date.

Purpose: purpose merits further discussion. When you are uncertain about potential bias, you can look for reasons to distrust the information instead. Suspicion is not equivalent to bias, but it can be thought provoking. Privately, I have heard repeated rumours that important national statistics have been fudged in different countries.

A government research report investigating the price of books in Australia would have a political purpose, a purpose which provides the climate for some potentially significant bias. A tell-all book by industry experts often includes a tremendous quality of insider experience that is difficult to find elsewhere. While there may be a purpose of self-aggrandizement, that purpose is less of a climate for significant bias. Medical research has perhaps the greatest climate for significant bias, and this calls for the highest standard of proof and for external, reliable support.

This explanation of accuracy, reliability, bias, age and purpose is very important in research. This is what leads us to an appraisal of value.

For years, the tobacco industry funded 'independent' research finding smoking minimally harmful to health. It now seems likely there were errors of accuracy and of bias. Certainly, purpose was in doubt. As other studies showed smoking is harmful, we can also say this research lacked reliability. In business and the Internet, research is perpetually suspect because it also ages so very quickly.

Once you are acclimatized to these elements, you begin to see potential for error in a whole range of information. Real-estate association figures, expert opinions, toothpaste advertisements and national GDP figures all occasionally display some degree of warping and manipulation, clouding the truth. The solution is awareness, comparison and careful analysis.

As a personal aside, this is part of the reason for my personal dislike of market research: it is often taken far more seriously than warranted and means far less than is suggested.

___________________________________________________
34. Emerging Trends in the information sphere

I will outline three emerging trends whose impact is not fully understood.

Firstly, for the past few years, individual database owners/maintainers have been flirting with the idea of making paid access available through the Internet, rather than the existing system of allowing database marketing firms to promote and market their databases. I have heard rumours most database producers earn up to 30% of retail price when delivered through database marketing firms. The Internet is not a commercially viable alternative...yet, but some have emerged with alternative funding despite this (Library of Congress, ERIC, see section 6.2). Others are creeping in around the edges by offering subscribers access at a much reduced flat annual fee (Computer Select at one time). I expect to see much more of this once a meaningful way to charge by the page emerges - which despite the hype appears to be some time away.

A second trend is Internet publishing itself. Gradually, the information is getting easier to locate (don't laugh please - it's undignified). We are also getting better at using the Internet as a tool to disseminate information. Emerging from these efforts are the very visible, if perhaps short-lived, search engines. Other efforts, like archives of FAQs, archives of guidebooks, applying the Dewey Decimal system to the Internet, specialist directories, specialist search engines and more, ensure this will be a lively field for several years to come. As it gets easier to locate the good information, perhaps the lines between commercial quality and Internet quality will begin to merge. I have seen some promising plans for raising the quality of Internet information.

Thirdly, there is the very interesting prospect of paying for information by the page through the Internet - and viewing the results in a web page immediately. There are many technical hurdles yet. Certain elements are already appearing, including ventures like DialogWeb, but much more is in the future. This step may prove profitable for ATM vendors and owners of Internet cafes, pubs and kiosks. It may also herald a dramatic drop in the cost of information.

___________________________________________________
35. Education and Training in Professional Research

Practice, Guidance and Facts are required to become better at research. None of these is particularly hard to get; what is hard to find is the time and effort to get better. As with art, professional research is a lifetime study, made more complicated by a moving target.

__ 35.1 Facts

Facts on professional research are relatively easy to find. Making some coherent sense of them takes practice. You will want to learn of each tool in your field and their relative strengths, weaknesses and qualities. You will also want to learn about the technology supporting this industry, secondary experience of the skills you will need to learn, and some understanding of clear thinking and statistical comprehension. Research as a business may also interest you.

Technology:
# Full text databases - by Carol Tenopir and Jung Soon Ro

Research Skills:
# http://cn.net.au

Research as a Business:
# _The Information Broker's Handbook_ by Sue Rugge and Alfred Glossbrenner (updated occasionally, so seek the latest edition)
# _Find It Fast_ by Robert I. Berkman

__ 35.2 Practice

Almost all University Libraries make an assortment of Research CD-ROMs available to their patrons. Most University Libraries, for a small sum, will issue members of the public with a library card and permission to use these databases. Practice on these as they are free, relatively current, and provide instant feedback.
The commercial databases which have migrated to the Internet (LOCIS & ERIC) are atypical or overly simplified. Plus there is no time pressure (things change when you are being charged $3 a minute + download costs), so beyond this, subscribe to a commercial database provider. They will also begin to send you ample resources to further educate you in the art of database research.

Interviewing both primary resources (those involved) and secondary resources can become an elegant and quick way to learn something, and also a rapid way to get advice on further uncommon resources. There is a definite skill to learn here which you will not get out of a book - though there are books that will help you learn this skill. I will see if I can find some useful resources on this topic.

__ 35.3 Guidance

If you read and practice without guidance, you will become proficient, but incomplete as a professional researcher. For starters, you will have picked up inefficient habits along the way. More importantly, you will be set apart from other researchers, with few ways of learning of new resources/new techniques/new concepts. This is a particular problem among professional researchers who are not fortunate enough to be librarians or work closely with other researchers - there are few opportunities to share and discuss professional research with your peers.

Besides seeking these opportunities, you may wish to consider:

# InfoPro - a private mailing list devoted to discussing professional research and detective work.
# There is a professional research periodical printed in Texas.
# Professional Associations in the World.
# Periodically read books by other authors on Research Techniques.

___________________________________________________
36. Question and Answer Section

__ 36.1 How do I find information on the Internet?

Basically, you need to remember that a search for information on the Internet is not different in process from the standard information search.
You still need to start by outlining carefully just what you are hoping to locate. Secondly, you need to be aware of the peculiarities of the Internet as a researchable resource (or rather a collection of resources).

If you expect instant delivery of exactly what you require, free, then you need a reality check (and I am sure you will get one as soon as you log in!). Sadly, the printed media tends to forget this, but that is another story.

As with all resources, the more familiar you are with a given resource, the more efficiently you will work. Get to know the Internet for a time first. Understand how it works. Then re-adjust your expectations and file it as just another few resources which may be preferable in certain circumstances.

___________________________________________________
37. Acknowledgments

I would like to thank my past clients, the Western Australians I have trained and all you internauts who will shortly inundate me with endless snippets of wisdom to be included here. Your help is greatly appreciated.

___________________________________________________
Copyright (c) 1998 by David Novak, all rights reserved. This FAQ may be posted to any USENET newsgroup, on-line service, website, or BBS as long as it is posted unaltered in its entirety including this copyright statement. This FAQ may not be included in commercial collections or compilations without express permission from the author. Please post permission requests to david@cn.net.au

-----------------------------------
David Novak - david@cn.net.au