Sound on the PC

Julian J. Bunn

January 1995

This article appeared in a revised form in Issue 8/95 of Speaker Builder.

Introduction

Those of you who have been following trends in the computing industry will know that desktop computer multimedia is presently undergoing a technology explosion. This is evidenced by the vast variety of hardware add-ons such as sound cards and video frame grabber cards, CD-ROM drives and purpose-built desktop loudspeakers, as well as software offerings like multimedia application authoring tools, voice synthesis and recognition tools, image processing packages and so on. In this article I will provide some background material that I hope will be thought-provoking, together with some guesses as to how the technology advances may affect the desktop computer's role as a tool for the amateur loudspeaker builder. The end of the article includes a prediction about how these areas of technology will merge and evolve towards an integrated home multimedia system based on a sophisticated audio/video computer. One use of such a system may well be to correct for deficiencies of room acoustics and loudspeaker performance or placement in real time.

Digital Sound Processing

Over the last few years exciting new products have appeared in the domestic digital audio marketplace, at one time an area populated almost exclusively by CD players. Witness the burgeoning field of home cinema amplifiers with sophisticated on-board chips that allow almost unlimited creativity in terms of digital signal processing and a wide variety of artificial sound effects.

During the same period, the growth in the range of products for producing sound on home computers has been prodigious, so much so that it is now estimated that over half of all home computers being sold are capable of generating and recording sound. These computers typically contain extra printed circuit cards designed specifically to treat sound. Only a few years ago, the electronics on the board, and the software techniques used for playing and recording sound, were of low quality: typically eight bit mono samples, rates of 8 kHz, and poor signal to noise ratio. This situation has improved considerably, so that today the norm is sixteen bits per stereo channel, with sampling rates up to 44.1 kHz, and signal to noise ratios and response curves that qualify such cards for inclusion under the "HiFi" umbrella. Additionally, the techniques for generating and recording sounds have advanced, with the result that home computers can sound rather pleasing, and no longer like an old 78 played across a bad 'phone line. Methods for compressing audio and video data are well advanced too. This, coupled with cheaper, faster and larger capacity hard disk drives, means that audio/video tracks or clips of a satisfactory length may now be stored within the computer. Finally, the speed of CPU chips (for example the Pentium), the greater memory bandwidth afforded by wider (more bits) and faster busses, and the larger amounts of RAM that are usually installed, all result in sufficient audio and video data rates for jerk-free multimedia, and open up interesting possibilities for processing the digital audio and video signals in real time.
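To put rough numbers on the storage point, the arithmetic for uncompressed samples can be sketched as follows. The function name is purely illustrative, and the 44,100 Hz figure assumes that the higher sampling rate quoted above refers to the CD rate.

```python
def pcm_bytes_per_second(rate_hz, bits_per_sample, channels):
    """Uncompressed data rate: samples per second x bytes per sample x channels."""
    return rate_hz * (bits_per_sample // 8) * channels

# Older cards: 8 bit mono at 8 kHz.
old_rate = pcm_bytes_per_second(8000, 8, 1)
# Current cards: 16 bit stereo, assumed here to run at the CD rate of 44,100 Hz.
new_rate = pcm_bytes_per_second(44100, 16, 2)

for label, rate in (("8 bit mono, 8 kHz", old_rate),
                    ("16 bit stereo, 44.1 kHz", new_rate)):
    print(f"{label}: {rate} bytes/s, {rate * 60 / 1e6:.2f} million bytes per minute")
```

On these assumptions, sixteen bit stereo needs roughly 10.6 million bytes per minute, about twenty-two times the rate of the older format, which is why compression and larger disks both matter.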

The SoundBlaster Standard

The generic home PC is a desktop device running MS-DOS and, maybe, Windows 3.1, with an Intel (or clone) x86 architecture chip as the CPU. PCs like this by far outnumber any other sort of computer, whether in homes or in offices. Of course, computer sound is not the exclusive province of the PC; it is standard on the Macintosh, as well as on Sun and most other high-end "workstations" (Silicon Graphics desktop workstations even include a small video camera for use in video-conferencing as well as a microphone and loudspeakers). Although sound card technology is not PC-specific, I will concentrate on the PC specifics in this article.

The de-facto standard for the PC is the SoundBlaster card, manufactured by Creative Labs, a Singapore-based company. This card was first introduced in the 1980s. Being "SoundBlaster compatible" is a major marketing consideration for other card manufacturers, since vast numbers of PC games require it. The SoundBlaster "standard" includes the specification of a certain set of programmable registers that perform functions such as receiving command strings from the application, returning information on the card set-up to the application, setting the play/record mode, altering the mixer settings, starting and stopping DMA transfers, and so on. Creative's stranglehold on the standard is unlikely to last indefinitely as newer operating systems allow programmers to shield themselves from the hardware details by the use of appropriate software "drivers".

Digital sound basics

Since the PC cannot directly manipulate analogue signals it has to deal with digital units that are, in general, multiples of an 8 bit "byte". This means that both ADC (Analogue to Digital Converter) and DAC (Digital to Analogue Converter) chips must be present to convert the signals to and from a format that can be handled. The basic sound generation operation is to convert the value represented by one byte into a voltage level. Since an eight bit byte can represent 256 (i.e. 2 to the power of 8) different values, the generated voltage can take any one of 256 levels. Conversely, the basic sound recording operation is to convert a voltage level into a byte value. By stringing together a series of bytes, each representing the voltage level at a successive instant, the waveform of a sound is approximated. By manipulating the string of bytes in various ways the resulting sound wave may be altered. Digital Signal Processing (DSP) is the term used to refer to the methods by which signals are treated algorithmically. Sound synthesis is a special DSP technique for generating a digital signal that, when converted to analogue and played through a transducer, sounds like a musical instrument. Sound synthesis has been an especially important area of development in sound card technology, and various methods are commonly used.
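The conversion between voltage levels and byte values can be illustrated with a minimal sketch. The function names, the unsigned 0 to 255 encoding and the one volt full scale are assumptions made for the example; a real card does this in its ADC and DAC hardware.

```python
import math

def quantize_8bit(voltage, full_scale=1.0):
    """Recording: map a voltage in [-full_scale, +full_scale] to a byte 0..255."""
    clipped = max(-full_scale, min(full_scale, voltage))
    return round((clipped + full_scale) / (2 * full_scale) * 255)

def byte_to_voltage(value, full_scale=1.0):
    """Playback: map a byte 0..255 back to a voltage level."""
    return value / 255 * 2 * full_scale - full_scale

def sample_sine(freq_hz, rate_hz, n_samples):
    """String together bytes that approximate a sine wave."""
    return bytes(
        quantize_8bit(math.sin(2 * math.pi * freq_hz * n / rate_hz))
        for n in range(n_samples)
    )

# One cycle of a 1 kHz tone sampled at 8 kHz is eight bytes long.
samples = sample_sine(1000, 8000, 8)
print(list(samples))
```

The difference between a voltage and its reconstructed value is the quantisation error, which with 256 levels can never exceed half of one step.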

FM synthesis

This is based on the idea of "operators". The more operators, the more satisfactory the synthesis of the sound is. One or more sine tones (the "carriers") are modified with one or more sine tone operators. The frequencies of the operators determine how the carriers are modified: the resulting sound is frequency modulated. This is a very general technique that makes it possible not only to emulate traditional sounds, but also to generate completely new ones. One disadvantage of FM synthesis is that the synthesised sounds of real instruments are rarely very realistic. The initial SoundBlaster cards sported FM chips, which were made (at that time) only by Yamaha. Since FM synthesis is required for SoundBlaster compatibility, it is likely to be around for some time.
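As an illustration of the carrier and operator idea, here is a minimal two-operator sketch using the textbook formula, in which a sine tone at the modulator frequency is added to the phase of the carrier. The function name and the parameter values are assumptions chosen for the example, and no claim is made that this mirrors the internals of any particular FM chip.

```python
import math

def fm_tone(carrier_hz, modulator_hz, index, rate_hz, n_samples):
    """One carrier modified by one sine operator.

    `index` sets the depth of modulation; 0 leaves a pure sine tone.
    """
    samples = []
    for n in range(n_samples):
        t = n / rate_hz
        modulation = index * math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + modulation))
    return samples

# Same carrier, with and without the operator applied.
plain = fm_tone(440.0, 220.0, 0.0, 44100, 1024)
bright = fm_tone(440.0, 220.0, 5.0, 44100, 1024)
```

Raising the index or changing the ratio between the two frequencies alters the harmonic content, which is where the variety of FM sounds comes from.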

Wave table synthesis

In contrast to FM synthesis, Wave Table synthesis allows extremely faithful simulation of real instrument sounds, since it makes use of digitised recordings (in the form of "wave files") of real instrument sounds. Boards offering Wave Table synthesis usually come with a selection of available instrument sounds. If a new instrument sound is needed, it can be downloaded from a Wave Table file repository accessed over a network or by modem, or purchased on a diskette. A disadvantage of Wave Table synthesis is that it requires a lot of RAM to store the Wave Tables, although some chips (e.g. the Yamaha OPL4) have a permanent ROM that contains those that are commonly required.
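The playback side of this can be sketched in a few lines: a stored table is read in a loop, and the pitch is changed by stepping through it faster or slower. This is a simplified illustration only. The 64-entry sine table stands in for a digitised instrument recording, and the function name and linear interpolation are assumptions rather than a description of any particular board.

```python
import math

def play_wavetable(table, pitch_ratio, n_samples):
    """Read a looped table, advancing `pitch_ratio` entries per output sample."""
    size = len(table)
    position = 0.0
    out = []
    for _ in range(n_samples):
        index = int(position)
        frac = position - index
        current = table[index]
        following = table[(index + 1) % size]
        # Linear interpolation between neighbouring table entries.
        out.append(current + frac * (following - current))
        position = (position + pitch_ratio) % size
    return out

# Stand-in for a recorded instrument: one cycle of a sine wave.
TABLE = [math.sin(2 * math.pi * i / 64) for i in range(64)]

same_pitch = play_wavetable(TABLE, 1.0, 128)   # table played as recorded
octave_up = play_wavetable(TABLE, 2.0, 128)    # every second entry: twice the pitch
```

A real Wave Table is far longer than this one, which is where the RAM requirement comes from.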

Digital Signal Processing

In this context we mean the inclusion on the sound card of one or more DSP chips (as opposed to FM or Wave Table chips etc.). This will almost certainly be the future means of handling computer-based sound. The key advantage of DSP is that it is a technique that gives the developer full control over how sounds are generated or treated: it does not rely on a fixed method built into chip logic, as FM and Wave Table synthesis do. DSP chips are programmed to apply an algorithmic process to a digitised audio signal or to generate an audio signal directly. The details of the process are then up to the application designer. Some examples are the addition of reverberation to an existing signal, the application of an FFT for voice recognition purposes, and the simulation of the sound of each digit on a touch-tone telephone. Two disadvantages of DSP today are that the chips tend to be expensive and are not easily programmed: both these objections are likely to become less and less valid.
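The touch-tone example is simple enough to sketch in software. Each key is signalled by the sum of two sine tones, one for its keypad row and one for its column, using the standard DTMF frequencies. The function names, the 8 kHz rate and the tone length are assumptions made for the illustration; a DSP chip would run the equivalent loop in its own instruction set.

```python
import math

ROW_HZ = (697, 770, 852, 941)        # standard DTMF row frequencies
COLUMN_HZ = (1209, 1336, 1477)       # standard DTMF column frequencies
KEYPAD = ("123", "456", "789", "*0#")

def dtmf_frequencies(key):
    """Return the (row, column) frequency pair for one keypad character."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            column = keys.find(key)
            if column >= 0:
                return ROW_HZ[row], COLUMN_HZ[column]
    raise ValueError(f"not a keypad character: {key!r}")

def dtmf_tone(key, rate_hz=8000, duration_s=0.1):
    """Generate the two-tone signal for a key as floats in [-1, 1]."""
    low, high = dtmf_frequencies(key)
    n_samples = round(rate_hz * duration_s)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / rate_hz)
               + math.sin(2 * math.pi * high * n / rate_hz))
        for n in range(n_samples)
    ]

tone_for_5 = dtmf_tone("5")
```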

Sound Chips

The heart of the sound card, then, is the chip set that controls the DSP. Audio DSP is just a special case of a general need throughout the industry for chips that can process digitized signal data at high rates and with great precision. Consequently the silicon industry is ramping up production capacity and pouring research and development money into new chip designs. We'll take some specific examples that are aimed at the audio markets. Yamaha's OPL4 combines 20 FM and 24 Wave Table synthesised voices on the same chip. An optional effect processor provides surround sound, echo and reverberation. The Wave Table data are stored in ROM, and can be in 8, 12 or 16 bit sample format. Analog Devices recently announced the AD1845 chip, which incorporates full duplex record and play and variable sampling rates. This company's chips are already widely distributed on boards from manufacturers such as Orchid Technology, Hewlett Packard, and Kurzweil Music Systems. The latter company markets a "MASS Sound Engine" with 32-voice Wave Table synthesis and effects like echo, reverberation, flanging and pitch shifting. Texas Instruments' TMS320C30 and TMS320C40 DSP devices are used by a number of board manufacturers who offer PC hardware and software products for analysing audio signals. Companies such as Sonitech and Loughborough Sound Images have product ranges that include sophisticated spectrum analysis, speech recognition, and filtering tools based on these TI chips. Hitachi are working on chips dedicated to audio/video applications. Intel has announced its intention to incorporate DSP in the next generation of Pentium-based PCs. TriMedia/Philips are working on a multimedia chip known as a programmable DSP/CPU, which, as its name suggests, combines CPU and DSP partitions on the same chip. This will probably be programmable from C.

The Versatile PC

Taking into account the technology trends and equipment already available, we look now at how the tasks of the PC will likely diversify in the domestic setting in years to come.

The speaker builder's PC

The PC already comes into its own as a tool for the speaker builder when crossovers need to be designed, box dimensions calculated, optimum loudspeaker positions in a room estimated, and so on. Many such tools exist both as commercial products, and in the public domain or as Shareware. They are all passive tools, however, in the sense that you need to type in parameters and measurements before the PC can make the required calculations. More exciting possibilities are now emerging, where the PC itself takes care of gathering the data, which it then analyses and displays in a meaningful form. The obvious examples of this are several products that turn the PC into an audio frequency response measurement tool: you just take care of positioning the microphone, and the PC plots the response of the equipment.

We can also imagine a tool that completely automates the process of designing a crossover for a loudspeaker enclosure and drivers. Imagine the situation where you have built the box and installed the drivers in it, and you want to build a crossover that produces the flattest response curve (even though this may not in fact produce the best sound!). Leave your copy of Vance Dickason in the bookcase, and turn your PC on instead. This PC contains two sound cards: you're going to use one to drive the tweeter, and one to drive the woofer. Connect the tweeter to the output of one card, and the woofer to the output of the other. Connect a microphone to the input of one sound card, and place the microphone a short distance from the loudspeaker. Now fire up the automatic crossover design software, and watch the PC screen as it displays the progress of its deliberations! After a few moments it prints out the circuit diagram and LCR values of the optimum crossover for your box and drivers. It asks if you'd like to listen to some music in order to evaluate what the system would sound like, or if you'd like to alter the order of the crossover and see if that would produce any improvement. In order to do all this, the crossover design software simulates the response of an Nth-order crossover with given LCR values, sends the appropriately filtered (and therefore different) signals to the tweeter and woofer, and then records and analyses the resulting signal from the microphone. According to this analysis, it simulates an adjustment of the LCR values in order to approach a flat response curve. Fitting the recorded response curve to a flat curve by a least squares method, using the simulated variation in LCR values, is a very simple computational task.
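The fitting step can be sketched under heavy simplifying assumptions. Here the crossover is first-order (a series inductor for the woofer, a series capacitor for the tweeter), each driver is treated as an ideal 8 ohm resistor, the response is simulated rather than measured with a microphone, and a coarse search over inductor values stands in for a proper least squares minimiser. All names and component values are hypothetical.

```python
import math

R_OHMS = 8.0  # assumption: each driver behaves as an ideal 8 ohm resistor

def summed_response(freq_hz, l_henry, c_farad, r_ohms=R_OHMS):
    """Complex sum of the woofer (low-pass) and tweeter (high-pass) outputs."""
    jw = 2j * math.pi * freq_hz
    woofer = r_ohms / (r_ohms + jw * l_henry)            # series inductor
    tweeter = r_ohms / (r_ohms + 1.0 / (jw * c_farad))   # series capacitor
    return woofer + tweeter

def flatness_error(l_henry, c_farad, freqs):
    """Sum of squared deviations, in dB, of the summed response from flat."""
    total = 0.0
    for f in freqs:
        level_db = 20.0 * math.log10(abs(summed_response(f, l_henry, c_farad)))
        total += level_db ** 2
    return total

# 31 test frequencies, logarithmically spaced from 20 Hz to 20 kHz.
FREQS = [20.0 * 1000.0 ** (i / 30) for i in range(31)]

# Hold the capacitor at 10 microfarads and try inductors from 0.10 to 2.00 mH.
C_FIXED = 10e-6
candidates = [k * 1e-5 for k in range(10, 201)]
best_l = min(candidates, key=lambda l: flatness_error(l, C_FIXED, FREQS))
print(f"best inductor: {best_l * 1e3:.2f} mH")
```

With ideal resistive drivers the search settles on the inductor whose corner frequency matches that of the capacitor, as theory predicts for a first-order network. The software imagined above would replace `summed_response` with the signal actually recorded from the microphone.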

A PC in your hi-fi system

Modern consumer audio systems allow the listener to superimpose the acoustic response of one venue on top of a piece of music recorded in a different venue. These systems offer the owner varying amounts of control over the parameters of such audio signal conversion. We can imagine the audio signal conversion component of such a system being replaced by an audio DSP-capable PC. In this case the only limits to the variety of effects that can be achieved are the flexibility of the DSP computer programs running in the PC and our imagination. We can imagine a set of software DSP "tools" that we can bolt together as building blocks to achieve the processing effect or monitoring we want. Here is a list of some of the more obvious ones:

An FFT tool that displays the frequency components of the source signal (i.e. a spectrum analyser),

A graphic equalizer that allows us to boost or suppress bands of frequency in the signal,

A stereo L-R difference signal extractor, that we use to detect mono, or use as an extra output,

A Dolby Pro-Logic emulator, that in software extracts the surround sound information from signals of this type,

A digital filter, whose coefficients are calculated to reproduce the acoustic environment of an arbitrary venue,

A convolver, which convolves the digital filter above with an audio signal to reproduce the sound in the venue,

A room response calculator, used with a microphone placed in the room, which folds out the source signal from the measured signal in the room, and yields the coefficients of the digital filter for the room,

An inverter, which flattens out the measured room response, by using the inverse of the measured room response.
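Several of these building blocks can be sketched in a few lines each. What follows is a toy illustration, not a usable audio package. The function names are invented, a slow direct Fourier transform stands in for the FFT, and the room response calculator assumes a noise-free measurement and a source whose first sample is non-zero. The inverter only behaves well for responses that can be stably inverted.

```python
import cmath
import math

def spectrum(signal):
    """Spectrum analyser: magnitudes from a direct (slow) Fourier transform."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]

def difference(left, right):
    """L-R extractor: all zeros means the source is mono."""
    return [l - r for l, r in zip(left, right)]

def convolve(signal, coefficients):
    """Convolver: apply a digital filter (e.g. a venue response) to a signal."""
    out = [0.0] * (len(signal) + len(coefficients) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(coefficients):
            out[i + j] += x * h
    return out

def room_response(source, measured, n_taps):
    """Fold the source out of the measured signal, one filter tap at a time."""
    taps = []
    for n in range(n_taps):
        acc = measured[n]
        for k in range(n):
            if n - k < len(source):
                acc -= taps[k] * source[n - k]
        taps.append(acc / source[0])
    return taps

def inverse_filter(response, n_taps):
    """Inverter: the filter that, convolved with `response`, gives a unit pulse."""
    unit_pulse = [1.0] + [0.0] * (n_taps - 1)
    return room_response(response, unit_pulse, n_taps)

# Bolt the tools together: simulate a room, then recover its coefficients.
source = [1.0, -0.5, 0.25, 0.0]
room = [1.0, 0.0, 0.5]                  # direct sound plus one echo
measured = convolve(source, room)
estimated_room = room_response(source, measured, len(room))
tone_bins = spectrum([math.cos(2 * math.pi * 2 * i / 16) for i in range(16)])
```

The point of the sketch is that each tool is just a function from one list of samples to another, so they can be chained in whatever order the listener wants.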

Futures

The main identifiable trend in PC-based audio today is clearly that of integrating audio DSP on the motherboard of the PC. This not only does away with sound card installation hassles, but also allows the manufacturer to properly integrate and optimise the power of the sound chip set with the rest of the electronics. We are likely to see full duplex play and record, increasingly sophisticated on-board DSP algorithms (including perhaps multi-channel surround sound decoders), higher sampling rates and better filters, and 32 bit sampling, rising to 64 bits in the longer term.

A discussion of PC-based audio futures would not be complete without mentioning areas which, in the end, will most likely be combined into a single device that acts as a home control centre. These include the integration of the telephone system in the PC, with the full functionality of an answering machine and FAX machine, together with voice recognition software that can identify the caller, voice synthesis software that can formulate an appropriate reply(!), and software that will allow the caller to interact with the PC by using, for example, the buttons on a touch-tone telephone. With the PC an integral part of the home telephone system, it can provide you with access to remote computers via dial-up lines, allowing you to download or upload audio/video files, and even to play them in real time once compression algorithms have advanced sufficiently. Access across dial-up lines to the Internet, and hence to the vast wealth of online information there, is another highly attractive possibility. Those of us fortunate enough to be able to use the Internet-based World Wide Web, with its fully interactive choice of many terabytes of audio and/or visual data files held on computers spread around the globe, already appreciate the exciting possibilities of the fully networked home computer.

Summary

This has been a personal view of what I believe the future holds for PC-based audio. It is primarily an enthusiast's, rather than an expert's, opinion.

Bibliography

Cheryl Ajluni, "Audio-IC Technologies Tackle New Challenges", Electronic Design, February 20, 1995.

Analog Devices, "The Architectural Needs for Signal Processing Functionality in Personal Computers", Technology Trends Backgrounder, November 1994.

Dennis Cronin, "Examining Audio DSP Algorithms", Dr. Dobb's Journal, July 1994.

Phil Atherton, "Could low cost DSP signal the end for analogue audio?", Electronics World and Wireless World, May 1993.

Loughborough Sound Images, "A Synthetic Concert Hall in the Home", DSP Link, Issue 11.