(C) Grolier 1991

information storage and retrieval

In modern society the storage and retrieval of vast amounts of information has been taken over almost entirely by COMPUTER systems. No longer a tool exclusively for mathematical computation, the computer now handles large collections of information called DATABASES. Government agencies such as the Internal Revenue Service, the Social Security Administration, and the National Crime Information Center maintain databases, as do private industry (personnel records) and other organizations (medical records, credit records). Computer access to such stored information raises questions of great importance to modern society: How accurate is the information, and how can inaccuracies be corrected? Who has access to the information? How is improper or illegal access prevented? These questions are currently under study by computer scientists, lawmakers, and others with an interest in databases.

Information is stored on various media by means of devices interfaced to computers. This storage is considered secondary, as opposed to the primary, internal COMPUTER MEMORY. Secondary memory has greater capacity than primary, but access is slower. Access to secondary memory takes from several thousandths of a second to several seconds, in contrast to primary memory access times of less than one-millionth of a second.

Memory hierarchies extend to a tertiary, or archival, level with capacity for trillions of bits of information. An 8-million-word encyclopedia such as this one contains about 400 million bits of information (8 million words x 6 characters per word x 8 bits per character). A trillion-bit store can hold 2,500 times as much information as there is in such an encyclopedia, all of it directly accessible to the computer.
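
The arithmetic above can be checked with a short calculation. The following Python sketch uses only the figures given in the text; the variable names are illustrative.

```python
# Rough size of an 8-million-word encyclopedia, using the figures in the text.
WORDS = 8_000_000
CHARS_PER_WORD = 6   # average word length assumed in the text
BITS_PER_CHAR = 8

encyclopedia_bits = WORDS * CHARS_PER_WORD * BITS_PER_CHAR
print(encyclopedia_bits)  # 384000000, i.e. about 400 million bits

# Number of such encyclopedias that fit in a trillion-bit archival store,
# using the rounded figure of 400 million bits, as the text does.
ARCHIVE_BITS = 10**12
encyclopedias_per_archive = ARCHIVE_BITS // 400_000_000
print(encyclopedias_per_archive)  # 2500
```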

COMMON STORAGE DEVICES

Digital computers process information in the form of binary codes. Devices for storing the coded information include drums, disks, and tapes. For many years all three of these technologies have employed the property of magnetic material that allows particles to be oriented, or polarized, in one of two directions, corresponding to each of the values of binary code--zero or one. Strings of BINARY NUMBERS, referred to as BYTES, are represented by arrays of magnetic particles. Other technologies are just beginning to replace magnetic devices.
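
As an illustration of binary coding, the following Python sketch shows how characters might be represented as 8-bit bytes. The text does not name a particular character code; ASCII is assumed here purely for the example, and the function name is hypothetical.

```python
def to_bits(text: str) -> list[str]:
    """Return one 8-bit binary string per character of ASCII text."""
    # ASCII is an assumption for this sketch; other codes assign other patterns.
    return [format(byte, "08b") for byte in text.encode("ascii")]


print(to_bits("Hi"))  # ['01001000', '01101001']
```

On a magnetic device, each 0 or 1 in such a pattern would correspond to one of the two directions of polarization.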

Drums and disks are shaped differently but have many other properties in common. A drum is a cylinder with the recording material wrapped around its outside surface. Information is recorded in individual tracks that form circles around the drum. Several hundred tracks lie adjacent to each other along its length. A disk is flat and round, like a phonograph record, with the recording material on its flat surfaces. Information is recorded in individual concentric circular tracks. Several hundred tracks may be on each surface.

Both drums and disks have read/write heads similar to the play and record heads on TAPE RECORDERS used for sound recording. In both devices, rotation of the recording medium causes the information storage tracks to pass under the heads. Heads are positioned on selected tracks, to access selected information, by mechanical or electronic means. Both of these devices keep the recording surface in constant rotational motion when they are on-line (actively interfaced) to the computer. This procedure eliminates the time delay of waiting for a device to reach operational speed.

Both drums and disks store information that needs to be readily accessible but that either exceeds the capacity of the computer's primary memory or is used too infrequently to be maintained there. Disks are by far the more common, especially in the form of the small, flexible "floppy" disks that are used with small computers and that can hold several hundred thousand characters (see COMPUTER, PERSONAL).

Magnetic tape units record digital signals on reels or cassettes of magnetic tape. Information is recorded in parallel tracks that run the length of the tape. Spinning the reels causes the tape to move from the supply reel, past the read/write heads, onto the take-up reel. Common tape drives have seven or nine parallel tracks across the tape so that at one recording position along the tape length, a character (such as a letter or number) of information may be stored. Storage capacity is determined by the total length of the tape and by the recording density in characters per inch. A typical tape is 2,400 ft (730 m) long, with a density of 1,600 characters per inch (630 characters per cm). Not all of the length is usable, however, since recording methods require certain unused areas. Tapes may be dismounted from tape drives and stored in libraries, which may hold essentially unlimited amounts of information. Tapes in a manually maintained library are not directly accessible from the computer. A computer operator must find the desired tape and mount it on a tape drive.
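
The capacity of such a tape follows directly from its length and recording density. The Python sketch below works through the typical figures quoted above. The text gives no figure for the unused areas, so the fraction passed to the helper function is a hypothetical value chosen only for illustration.

```python
LENGTH_FT = 2_400      # tape length in feet, from the text
DENSITY_CPI = 1_600    # recording density in characters per inch, from the text

length_in = LENGTH_FT * 12
raw_capacity = length_in * DENSITY_CPI
print(raw_capacity)  # 46080000 characters if every inch were usable


def usable_capacity(raw: int, unused_fraction: float) -> int:
    """Capacity left after a given fraction of the tape is lost to unused areas."""
    return int(raw * (1 - unused_fraction))


# 0.25 is a made-up fraction for illustration, not a figure from the text.
print(usable_capacity(raw_capacity, 0.25))  # 34560000
```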

Information is recorded from the beginning of a tape towards the end in sequential fashion. To retrieve selected information it is necessary to wind or rewind the tape to the place where the information was recorded. This form of access, known as sequential access, requires examination of recorded information in the order (or reverse order) that it was recorded.
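
Sequential access can be sketched in a few lines of Python. The tape is modeled here as a simple list of (key, value) records; the record contents and the function name are hypothetical.

```python
def sequential_read(tape, wanted_key):
    """Scan records in recorded order; return (value, records_examined)."""
    for examined, (key, value) in enumerate(tape, start=1):
        if key == wanted_key:
            return value, examined
    # The whole tape was examined without finding the key.
    return None, len(tape)


tape = [("A101", "payroll"), ("B202", "inventory"), ("C303", "orders")]
print(sequential_read(tape, "C303"))  # ('orders', 3)
```

The count of records examined grows with the distance of the wanted record from the start of the tape, which is the cost that sequential access imposes.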

NEW DEVELOPMENTS

Perhaps the most promising new information storage technology involves laser discs (sometimes in the form of compact discs, or CDs)--round, metal-coated plastic discs on which digital information is stored in the form of pits and bumps (see COMPACT DISC). Information is retrieved by reflecting a small laser beam from the surface of the disc as it rotates rapidly. Laser discs offer great storage capacity, retrieval speed, and durability.

Newer storage devices are being developed out of electronic technology rather than the electromechanical technology of rotating or spinning devices. Bubble memories, in which microscopic bubble patterns are formed on wafers of garnet crystals, are available. These use the presence or absence of magnetic domains to represent the binary values zero and one. Similarly, charge-coupled devices (CCDs) use the presence or absence of electronic charges to store information. Both devices are sometimes called electronic disks because they store information in patterns that make it accessible in a cyclic fashion, much like disk storage. Electronic disks have a significant advantage, however. Since there is no mechanical motion, the cycling may be stopped when the next needed storage area is available at the input/output (I/O) port.
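
The cyclic access described above can be modeled roughly as a loop of cells that shifts past a single port. The Python sketch below is a toy model under that assumption, not a description of any actual bubble or CCD device; the class and attribute names are hypothetical.

```python
from collections import deque


class CyclicStore:
    """Toy model of a cyclic memory: cells circulate past one I/O port."""

    def __init__(self, cells):
        self.loop = deque(cells)  # index 0 is the cell currently at the port
        self.shifts = 0           # total number of shift steps performed

    def read(self, wanted):
        # Shift one cell at a time until the wanted cell reaches the port,
        # then stop cycling and leave it there.
        for _ in range(len(self.loop)):
            if self.loop[0] == wanted:
                return self.loop[0]
            self.loop.rotate(-1)
            self.shifts += 1
        raise KeyError(wanted)


store = CyclicStore(["a", "b", "c", "d"])
store.read("c")
print(store.shifts)  # 2
store.read("d")
print(store.shifts)  # 3
```

Because the loop stops with "c" at the port, the neighboring cell "d" is only one further shift away. A constantly rotating disk cannot pause in this way.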

Another new technology is the electron beam accessed memory (EBAM). This may become the fastest of the secondary memory devices, but it is not as well developed or readily available as bubble and CCD memories. The EBAM stores information by using an electron beam to charge a small area on a silicon dioxide plane.

ARCHIVAL STORAGE

Archival storage is the third level in the memory hierarchy. Archival storage devices are intended to hold information that is infrequently accessed but extensive in quantity. The devices are slow but have great capacity. The most successful devices in this category use magnetic tape. A straightforward system is simply an automation of the conventional tape library. Reels of tape stored in racks are electromechanically selected, moved to one of several tape drives, and mounted. If the tape to be selected can be specified in a program, then the system is fully automated. There is no need for a person to walk through the library, select, and manually mount the tape. Automatic dismounting and replacing of the tape in its rack is also supplied. An alternative tape system operates in a similar manner but uses a newly designed tape cartridge rather than standard reels.

SOFTWARE CONSIDERATIONS

Information storage and retrieval involves more than hardware devices and storage media alone. To make the physical equipment readily usable it is necessary to provide SOFTWARE routines or systems (programs). If the information exists as a database, the operations associated with it are adding information, changing existing information, deleting information, and retrieving items of interest.
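
The four database operations named above can be sketched as a small Python class. This is a minimal in-memory illustration only; the class, its method names, and the sample records are hypothetical and do not represent any particular database system.

```python
class TinyDatabase:
    """Hypothetical in-memory database keyed by record identifier."""

    def __init__(self):
        self._records = {}

    def add(self, key, record):
        if key in self._records:
            raise KeyError(f"record {key!r} already exists")
        self._records[key] = dict(record)

    def change(self, key, **fields):
        self._records[key].update(fields)  # raises KeyError if key is absent

    def delete(self, key):
        del self._records[key]             # raises KeyError if key is absent

    def retrieve(self, predicate):
        # Return copies of all records for which the predicate is true.
        return [dict(r) for r in self._records.values() if predicate(r)]


db = TinyDatabase()
db.add(1, {"name": "Lee", "dept": "Sales"})
db.add(2, {"name": "Ortiz", "dept": "Research"})
db.change(1, dept="Research")
db.delete(2)
print(db.retrieve(lambda r: r["dept"] == "Research"))
# [{'name': 'Lee', 'dept': 'Research'}]
```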

Large databases employ managers who maintain the database for users. The need for powerful retrieval methods, combined with the need for strict security against unauthorized access, calls for complex software. Users inevitably want to retrieve information in ways different from the way it is stored. For example, because of the way information is stored in a phone book, it is easy to find a phone number if the name is known. A complex software system without sufficient safeguards can make it equally easy to reverse the process and find the name if the phone number is known.
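
The phone book example can be made concrete with a short Python sketch. The names and numbers are invented, and the authorization flag is a deliberately simple, hypothetical stand-in for whatever safeguards a real system would apply.

```python
# A phone book is organized for lookup by name.
phone_book = {
    "Adams, J.": "555-0101",
    "Baker, M.": "555-0102",
    "Clark, R.": "555-0103",
}
print(phone_book["Baker, M."])  # 555-0102

# Software can build the reverse index in a single pass over the data.
reverse_index = {number: name for name, number in phone_book.items()}


def reverse_lookup(number, authorized=False):
    """Find a name from a number, but only for authorized callers."""
    if not authorized:
        raise PermissionError("reverse lookup not permitted")
    return reverse_index.get(number)


print(reverse_lookup("555-0102", authorized=True))  # Baker, M.
```

Without the authorization check, the reverse lookup would be as easy as the forward one, which is the concern raised above.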

Information storage and retrieval is an active area of computer-related work both for hardware devices and for software systems.

Edward W. Davis

Bibliography: Aikins, A. C., Computers and Data Processing Today (1983); Burch, John G., et al., Information Systems, 3d ed. (1979); Couger, J. Daniel, and McFadden, Fred, Introduction to Computer-Based Information Systems (1975); Kruzas, Anthony, ed., Encyclopedia of Information Systems and Services, 3d ed. (1978); Parker, Donn B., Fighting Computer Crime (1983); Salton, G., and McGill, M., Introduction to Modern Information Retrieval (1983); Stern, Nancy and Robert A., An Introduction to Computers and Information Processing (1982); Van Rijsbergen, C. J., Research and Development in Information Retrieval (1984); Zmud, R. W., Information Systems in Organizations (1983). See also: DATA PROCESSING; INPUT-OUTPUT DEVICES.