On-line data retrieval – EDN
This piece was scanned and added to the archive in 2018, over 35 years after it was written. The contrast is breathtaking, with the article explaining how to use dollar-a-minute online searching tools (1982 dollars!) to hunt for bibliographic results, patents, and other limited text-only resources. Google was unimaginable back then…
by Steven K. Roberts
September 1, 1982
Taking advantage of the computer-resident databases described here can save you and your staff considerable time and effort in the task of keeping technically current. Learn how to use these resources, and in the time you would spend just driving to the library, you can peruse lists of articles, conference papers, reports and books on a wide range of topics.
A problem of some magnitude
Staying abreast of the latest technology is a classic problem for engineers. Technological change occurs with such deadly swiftness that the Old Dog syndrome strikes the complacent with a vengeance. Remember the first few dizzying years of the microprocessor, when the engineering community split into subtly warring factions? Bright-eyed enthusiasts, usually but not necessarily young, greedily devoured every scrap of published literature to keep pace with developments. Meanwhile, those who stopped learning on their graduation day skipped uncomfortably over the growing number of magazine articles about µPs and looked around for management positions that would isolate them from an increasingly alien technology. (Even they couldn’t hide forever, though.)
A fair percentage of this information problem is based on attitude and personal philosophy, of course. But it’s very easy these days for anyone to feel hopelessly swamped by the sheer magnitude of the industry. And the problem leads to very real and very expensive effects. Even if an engineer could somehow physically gather all available literature on one field, trying to organize it would be nearly impossible. How would you attempt such a task? Should you sort articles by title, author, device type, application area, date published or technique described? If you sort them one way, how do you get at them via any other? And if you do manage to implement a workable cross-reference scheme, how do you add new material without updating all the indexing? Who has that much time?
Data is at your fingertips
For a long time, the electronics industry has needed a robust, up-to-date source of engineering information to deal with such problems, and now it has one. Exemplified by systems such as Lockheed’s DIALOG information-retrieval service and System Development Corp’s ORBIT, the approach takes the form of computer-resident databases that individual users can easily search at a surprisingly low cost (see box, “A look at available services”).
INSPEC, for example, seems expensive at $75 per hr for connect time, but an average search through its more than two million bibliographic citations costs less than $10. And INSPEC is just one of DIALOG’s more than 130 databases; it comprises literature in computer, control and electronics engineering. Updated monthly from more than 2000 journals as well as conference proceedings, theses and books, it’s the world’s most concentrated source of references to published literature in electronics and related fields.
Such a reference is hard to compare with a few magazine subscriptions, although without doubt you must ultimately depend on full-text source documents. Using computer-resident databases, however, you can scan the world’s engineering literature in minutes for references to a particular combination of terms, then refine the search by applying Boolean operators to yield the specific information you need.
Suppose, for example, that your firm manufactures concrete-batching systems. The basic objective of your µP-based design is control of equipment in the batch plant to yield a certain quantity of concrete having a customer-specified balance of aggregate material, cement, water and admixtures.
This problem is not unreasonably complex, yet it does involve a few factors beyond your control. For example, how do you know how much water to add to a given batch of concrete to achieve the specified consistency if you don’t know how much moisture is present in the sand or gravel aggregate? A sudden rain could make the algorithms worthless and produce a few cubic yards of soup instead of concrete. You therefore need a moisture meter to deal with aggregate materials of unknown composition. Simple schemes based on conductivity and specific gravity alone are useless.
Begin your search for information by firing up a terminal, dialing the local Telenet or Tymnet node, specifying DIALOG’s network address, entering your password and selecting the INSPEC database for the period 1977 to the present (file 13).
You then type s moisture? The “s” means select; the command simply directs the system to create a set of records that, in one context or another, mention moisture. Almost immediately, the system responds — 1 1485 MOISTURE — meaning that set 1 consists of 1485 documents. This response is obviously too general to prove useful, because most of those documents won’t discuss moisture measurement in aggregate materials.
So, because you are looking for measurement techniques, you enter the command s meter or measur?. The system then identifies 2930 records referring to “meter” and 142,388 containing words beginning with “measur”; set 2 totals 143,440 records, fewer than the sum of the two counts because records containing both terms are counted only once. On a hunch, you create set 3 for “humidity,” obtaining 1527 members.
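The size of set 2 follows the usual inclusion-exclusion rule for a union. A minimal sketch of that arithmetic, using only the three counts quoted above (the variable names are illustrative):

```python
# Counts reported by the system in the example above.
meter_hits = 2930        # records containing "meter"
measur_hits = 142_388    # records containing words beginning with "measur"
set2_size = 143_440      # records in set 2 (meter OR measur?)

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
overlap = meter_hits + measur_hits - set2_size
print(overlap)  # 1878 records contain both terms
```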
How many of those 143,440 articles on measurement or metering do you suppose deal with moisture or humidity? Using the combine command, c 2 and (1 or 3), you create set 4, which narrows the field to 1108 members.
This information, however, is still too general. Try the same command, this time without “humidity,” and reduce your options to 565 (set 5). But because your real interest is moisture of aggregates, you then produce set 6 with the command s bulk (w) material or aggregate? The “(w)” requires that the words “bulk” and “material” appear side by side somewhere in the record (abstract, title, etc).
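The two matching rules used so far, the trailing “?” for truncation and “(w)” for adjacent words, can be sketched in a few lines of Python. This is only an illustration of the behavior described in the text, not DIALOG's implementation: the function names and the sample record are invented, and the sketch assumes that “(w)” requires the words in the order given, which the article does not state.

```python
import re

def tokens(text):
    """Lowercase a record's text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_term(word, term):
    """True if word matches term; a trailing '?' means 'any ending'."""
    if term.endswith("?"):
        return word.startswith(term[:-1])
    return word == term

def has_term(text, term):
    """True if any word in the text matches the (possibly truncated) term."""
    return any(matches_term(w, term) for w in tokens(text))

def has_adjacent(text, first, second):
    """Emulate 'first (w) second': the two terms side by side, in order."""
    words = tokens(text)
    return any(
        matches_term(a, first) and matches_term(b, second)
        for a, b in zip(words, words[1:])
    )

# Invented record text, for illustration only.
record = "Microwave moisture measurement in bulk material on conveyors"

print(has_term(record, "measur?"))               # True
print(has_adjacent(record, "bulk", "material"))  # True
print(has_adjacent(record, "material", "bulk"))  # False
```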
1715 articles on bulk materials or aggregates make up set 6. How many of them also appear in set 5 on moisture measurement? The command c 5 and 6 yields the answer: two.
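The select and combine steps above are ordinary set operations on lists of record numbers. The following sketch replays the same sequence against a toy index; every term-to-record mapping in it is invented, the helper `select` is hypothetical, and the “bulk (w) material” branch is left out for brevity.

```python
# Toy inverted index: term -> set of record numbers (all values invented).
index = {
    "moisture": {1, 2, 3, 4, 7},
    "humidity": {4, 5, 8},
    "meter": {2, 6},
    "measurement": {1, 3, 6, 7},
    "measuring": {2, 8},
    "aggregate": {3, 9},
    "aggregates": {7, 10},
}

def select(term):
    """Like 's term': records for the term; a trailing '?' truncates."""
    if term.endswith("?"):
        stem = term[:-1]
        hits = [ids for word, ids in index.items() if word.startswith(stem)]
        return set().union(*hits)
    return set(index.get(term, set()))

sets = {}
sets[1] = select("moisture")                    # s moisture
sets[2] = select("meter") | select("measur?")   # s meter or measur?
sets[3] = select("humidity")                    # s humidity
sets[4] = sets[2] & (sets[1] | sets[3])         # c 2 and (1 or 3)
sets[5] = sets[2] & sets[1]                     # same, without humidity
sets[6] = select("aggregate?")                  # s aggregate?
sets[7] = sets[5] & sets[6]                     # c 5 and 6

for number, ids in sets.items():
    print(number, len(ids), sorted(ids))
```

With this toy data the final set holds two records (3 and 7), mirroring the way each combine step narrows the field in the walk-through.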
From this point, printing the two bibliographic records on the terminal is simple—Listing 1 shows the first. By the time both have been printed, your total on-line time (including head scratching) equals 0.139 hrs—which amounts to a $10.43 billing on your next statement from DIALOG.
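The billing figure is simply connect time multiplied by the hourly rate. A minimal sketch, assuming charges are rounded half-up to the nearest cent (the article does not say how DIALOG rounds, but this assumption reproduces the quoted amount); `connect_cost` is a hypothetical helper:

```python
from decimal import Decimal, ROUND_HALF_UP

def connect_cost(hours, rate_per_hour):
    """Connect-time charge in dollars, rounded half-up to the cent."""
    charge = Decimal(hours) * Decimal(rate_per_hour)
    return charge.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# 0.139 hr on INSPEC at $75 per hr, as in the example above.
print(connect_cost("0.139", "75"))  # 10.43
```

Passing the figures as strings keeps the decimal arithmetic exact; binary floating point could land a hair on either side of 10.425 and round the wrong way.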
But you’re not done yet. Additional information of interest might be available which, for one reason or another, is absent from the INSPEC database—perhaps in the form of patents. So you exit file 13 with the command End/Save, whereupon the system stores your search strategy and assigns it a serial number. You then log on to file 25 (CLAIMS—US Patent Abstracts 1971-present) and execute the strategy by typing .ex and the serial number. You find that eight US patents granted during the last 10 yrs meet the criteria established by the Boolean relationships among the terms in your search strategy. Their titles are displayed in Listing 2, and an abstract of one of them appears in Listing 3.
The provocative aspect of this type of search is that for approximately $20 you can perform a fairly comprehensive scan of the world’s literature on moisture-measurement techniques for aggregate materials. At this point, you could search further by tracking down key words found in the references obtained thus far, or just go ahead and order copies of the abstracted documents. In any case, you have saved considerable time and probably run circles around competitors who approach the same problem by visiting a library and searching industry catalogs.
Shop in a database supermarket
Lockheed’s DIALOG system offers more than 130 databases, serving such diverse application areas as pharmaceuticals, philosophy, labor statistics, molecular structures, government publications, water resources, social sciences and international trade. Here’s a look at some of those likely to interest engineers and engineering managers.
- CAREER PLACEMENT REGISTRY. The process of finding qualified employees becomes a little easier with the addition of the CPR/STUDENT database to your repertoire of head-hunting tools. Containing mini resumes of college seniors and recent graduates, this database can be searched on the basis of such characteristics as degrees, special skills, geographic preferences, career objectives and academic credentials. A companion file for experienced personnel is also available.
- CLAIMS. Actually a group of approximately 10 databases, CLAIMS contains more than 1.5 million patent documents issued by the US Patent and Trademark Office since 1950. Updated monthly and heavily cross indexed via more than 10 million citations, this resource can save a firm considerable time and money during the research phase of filing a patent application. Individual records are searchable not only on the basis of key words, but also on approximately 20 different categories of information such as inventor name, assignee, class code and issue date. Another database, INPADOC, provides references to approximately 16,000 patents per week from 45 countries.
- COMPENDEX. Similar in many ways to INSPEC, this database contains more than one million records abstracted from significant engineering and technical literature. Monthly updates reference about 3500 journals as well as society publications and conference proceedings.
- COMPREHENSIVE DISSERTATION INDEX. For $55 per hr, you can search to your heart’s content through a database of more than 800,000 doctoral dissertations accepted at accredited American institutions since 1861—as well as thousands from other countries.
- CONFERENCE PAPERS INDEX. Providing data on more than 100,000 scientific and technical papers presented at conferences since 1973, this file is a source of R&D information that might not have yet found its way into professional journals. In addition to the usual bibliographic information, records include the authors’ names and addresses as well as information for ordering proceedings and other publications.
- DISCLOSURE. Have you ever wondered about the financial stability of a company with which you are about to become involved? Any publicly owned firm is, of course, open to scrutiny, but sometimes tracking down information quickly can prove inconvenient. DISCLOSURE solves such problems with abstracts of Form 10K financial reports on all SEC-filing corporations, as well as reports of changes, proxy statements and management discussion. For about $10 per full report, this database can dramatically streamline the research required for such activities as corporate planning, marketing intelligence and portfolio analysis.
- ENCYCLOPEDIA OF ASSOCIATIONS. Corresponding to the publication of the same name, this compendium of approximately 15,000 trade, professional, labor and fraternal associations provides details on key people and their addresses, membership size, group objectives and annual meetings.
- INSPEC. Specializing in physics, electrotechnology, computers and control, this resource represents a powerful tool for the electronics engineer. In addition to the capabilities described earlier, this database offers the Selective Dissemination of Information (SDI) service. If you want to keep up, for example, with the world’s literature on PLZT-based imaging techniques, you perform a search once, then request monthly mailings that incorporate anything new on that topic that meets your search criteria. At $5.95 per update (the price varies with the database), this approach proves an inexpensive way to stay abreast of specific areas.
- INTERNATIONAL SOFTWARE DIRECTORY (forthcoming). Aimed at microcomputer users, this file provides an industry-wide cross reference of software, indexed by topics including applications area and processor type. The database provider (Imprint Editions, Ft Collins, CO) delivers software packages via DIALOG’s on-line ordering facility (which can also be used to acquire the full-text documents corresponding to the bibliographic references in databases such as INSPEC and the MAGAZINE INDEX of more than 370 popular periodicals).
- MICROCOMPUTER INDEX (forthcoming). This bibliographic file is a subject and abstract guide to more than 30 microcomputer journals.
- SCISEARCH. With more than 7.5 million records, this massive multidisciplinary index to the literature of science and technology offers an interesting feature not shared by most bibliographic databases: citation indexing, which allows retrieval of articles on the basis of subject relationships established through the authors’ references to earlier articles. This approach extends the depth of a search by reducing the dependence on key words and also permits a searcher to more easily track the history of a particular idea. One of the most thorough databases, SCISEARCH includes cover-to-cover indexing of approximately 2600 journals, including letters, editorials, meeting reports and correction notices.
- SSIE CURRENT SEARCH. This database, originally published by the Smithsonian Science Information Exchange, contains reports of both government and privately funded research projects, generally providing information long before it’s published.
- STANDARDS AND SPECIFICATIONS. You can more easily deal with the profusion of government and industry standards that can affect a design by using this comprehensive index. The information includes MIL, ANSI, IEEE and other specs for topics including terminology, testing, safety and materials.
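The citation indexing described for SCISEARCH amounts to inverting each article's reference list, so that you can ask which later articles cite a given one. A minimal sketch of the idea, with invented article identifiers and hypothetical helper names; it says nothing about how SCISEARCH itself is built:

```python
from collections import defaultdict

# Hypothetical records: article id -> ids of the earlier articles it cites.
references = {
    "A": [],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["C"],
}

# Invert the reference lists: cited article -> set of articles citing it.
cited_by = defaultdict(set)
for article, cited in references.items():
    for earlier in cited:
        cited_by[earlier].add(article)

def citing(article):
    """Articles that cite the given article directly."""
    return sorted(cited_by.get(article, set()))

def descendants(article):
    """All later articles reachable by following citations forward."""
    seen, stack = set(), [article]
    while stack:
        for later in cited_by.get(stack.pop(), set()):
            if later not in seen:
                seen.add(later)
                stack.append(later)
    return sorted(seen)

print(citing("A"))       # ['B', 'C']
print(descendants("A"))  # ['B', 'C', 'D']
```

Following the chain forward from one early article is what lets a searcher track the history of an idea without depending on key words.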
Streamlining the search
The handful of databases described tell only part of the story. No matter how substantial a body of data, it’s of little use without fast, flexible search software. More than anything else, the lack of such software has retarded the growth of the on-line-database industry until relatively recently. Storage was not a serious problem; the communication links have been in place for a while; users have had terminals. The problems associated with the attempt to efficiently search a database housing millions of records, however, were many. Early systems bogged down during peak periods and encouraged users to operate in batch mode, sacrificing the interactive refinement of a search that makes modern systems like DIALOG so useful.
Indeed, this interactive operation is the key. As the opening example demonstrates, the Boolean operators can be used to arrive at references satisfying an expression of the form (A AND (B OR C) AND (D OR E)). Given sufficient thought beforehand, you might have entered that expression directly. But it is more useful to take advantage of intermediate results to refine the search strategy and match the final set as closely as possible to the original question. Discoveries made along the way often suggest new avenues of inquiry that wouldn’t be obvious with an open-loop approach.
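An expression of the form (A AND (B OR C) AND (D OR E)) can be evaluated in one pass or built up step by step, and the result is the same set either way. The sketch below shows both, using a nested-tuple notation and a toy index that are assumptions made for illustration:

```python
def evaluate(expr, index):
    """Evaluate a nested ('and' | 'or', operand, ...) tuple against an index."""
    if isinstance(expr, str):
        return set(index.get(expr, set()))
    op, *operands = expr
    results = [evaluate(operand, index) for operand in operands]
    if op == "and":
        return set.intersection(*results)
    if op == "or":
        return set.union(*results)
    raise ValueError(f"unknown operator: {op}")

# Toy index: term -> record numbers (all values invented).
index = {
    "A": {1, 2, 3, 4},
    "B": {1, 2},
    "C": {3, 5},
    "D": {1, 6},
    "E": {3, 7},
}

# Open-loop: the whole expression at once.
query = ("and", "A", ("or", "B", "C"), ("or", "D", "E"))
print(sorted(evaluate(query, index)))  # [1, 3]

# Interactive: inspect an intermediate set before narrowing further.
narrowed = evaluate(("and", "A", ("or", "B", "C")), index)
print(len(narrowed))                   # 3
final = narrowed & evaluate(("or", "D", "E"), index)
print(sorted(final))                   # [1, 3]
```

The intermediate count is the feedback that the one-shot form never shows you; it is what tells a searcher whether the next term is worth adding.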
Save valuable time
Although on-line systems make searches much easier to perform, they don’t provide the full text of documents. But you can order those documents (or find them in your stack of magazines). The systems do allow you to take advantage of an up-to-date wealth of literature that would otherwise be out of reach, in a fashion that permits unrestricted search refinement. Just state, for example, “I want all articles published after 1980 that emanated from Schlumberger, dealing with visual object recognition, written in languages other than German or French.”
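A request like the one quoted above is a conjunction of tests on separate fields of each record. A minimal sketch, assuming a record layout with year, affiliation, language and subject fields; the field names and sample records are invented and do not reflect any actual database format:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    title: str
    year: int
    affiliation: str
    language: str
    subjects: frozenset

TOPIC = "visual object recognition"

# Invented records, for illustration only.
records = [
    Record("Paper 1", 1981, "Schlumberger", "English", frozenset({TOPIC})),
    Record("Paper 2", 1979, "Schlumberger", "English", frozenset({TOPIC})),
    Record("Paper 3", 1982, "Schlumberger", "French", frozenset({TOPIC})),
    Record("Paper 4", 1982, "Another firm", "English", frozenset({TOPIC})),
]

def wanted(record):
    """Published after 1980, from Schlumberger, on topic, not German or French."""
    return (
        record.year > 1980
        and record.affiliation == "Schlumberger"
        and TOPIC in record.subjects
        and record.language not in {"German", "French"}
    )

hits = [record.title for record in records if wanted(record)]
print(hits)  # ['Paper 1']
```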
In many cases, the advantage of making such a search is simply one of saving time. The DISCLOSURE database, for example, offers nothing that isn’t available from other public sources. You can, however, use it while simultaneously talking on the phone with a potential customer and quietly call forth his last few income statements, information about his recent merger and the salary listings of his board of directors. Waiting for the same information to arrive in the mail would be far less efficient. And like the bibliographic databases, DISCLOSURE can be searched on the basis of any criteria you choose, such as “companies in Silicon Valley with 1981 sales over $5 million that have since been acquired by Exxon.”
Of course, this method of research has its dangers: A robust information resource like DIALOG is seductive. Information junkies already exist in American industry, people more concerned with the countless ways of interrelating data than with the task-specific information it represents. But the cost of on-line searching does tend to discourage browsing, and the system, unlike some that have arisen to address the hobby market, is clearly not a toy.
It is, instead, a useful tool for anyone who needs information enough to make the cost of obtaining it worthwhile. Connect-time rates average $1 per min, making the service appear expensive. Considering the lack of reasonable alternatives, however, on-line services are the most cost-effective resources available for accessing engineering information.
EDN Author’s biography
Steven K Roberts, president of Words’Worth, a consulting firm, has written Industrial Design with Microcomputers (Prentice-Hall, Englewood Cliffs, NJ, 1982) as well as two other books and numerous articles about related technologies. A freelance industrial systems consultant, he maintains a home office in Dublin, OH as well as one in a computer-equipped motor home. Steve uses DIALOG extensively for article research, consulting projects and an ongoing attempt to keep current. His hobbies include photography, bicycling and playing the flute.