Wave field synthesis (WFS) is a spatial audio rendering technique, characterized by creation of virtual acoustic environments. It produces artificial wavefronts synthesized by a large number of individually driven loudspeakers. Such wavefronts seem to originate from a virtual starting point, the virtual source or notional source. Contrary to traditional spatialization techniques such as stereo or surround sound, the localization of virtual sources in WFS does not depend on or change with the listener's position.
WFS is based on the Huygens–Fresnel principle, which states that any wavefront can be regarded as a superposition of elementary spherical waves. Therefore, any wavefront can be synthesized from such elementary waves. In practice, a computer controls a large array of individual loudspeakers and actuates each one at exactly the time when the desired virtual wavefront would pass through it.
The basic procedure was developed in 1988 by Professor A.J. Berkhout at the Delft University of Technology. Its mathematical basis is the Kirchhoff–Helmholtz integral. It states that the sound pressure is completely determined within a volume free of sources, if sound pressure and velocity are determined in all points on its surface.
Therefore, any sound field can be reconstructed, if sound pressure and acoustic velocity are restored on all points of the surface of its volume. This approach is the underlying principle of holophony.
For reproduction, the entire surface of the volume would have to be covered with closely spaced loudspeakers, each individually driven with its own signal. Moreover, the listening area would have to be anechoic, in order to avoid sound reflections that would violate source-free volume assumption. In practice, this is hardly feasible. Because our acoustic perception is most exact in the horizontal plane, practical approaches generally reduce the problem to a horizontal loudspeaker line, circle or rectangle around the listener.
The origin of the synthesized wavefront can be at any point on the horizontal plane of the loudspeakers. For sources behind the loudspeakers, the array will produce convex wavefronts. Sources in front of the speakers can be rendered by concave wavefronts that focus in the virtual source and diverge again. Hence the reproduction inside the volume is incomplete - it breaks down if the listener sits between speakers and inner virtual source. The origin represents the virtual acoustic source, which approximates an acoustic source at the same position. Unlike conventional (stereo) reproduction, the perceived position of the virtual sources is independent of listener position allowing the listener to move or giving an entire audience consistent perception of audio source location.
A sound field with very stable position of the acoustic sources can be established using wave field synthesis. In principle, it is possible to establish a virtual copy of a genuine sound field indistinguishable from the real sound. Changes of the listener position in the rendition area can produce the same impression as an appropriate change of location in the recording room. Listeners are no longer relegated to a sweet spot area within the room.
The Moving Picture Expert Group standardized the object-oriented transmission standard MPEG-4 which allows a separate transmission of content (dry recorded audio signal) and form (the impulse response or the acoustic model). Each virtual acoustic source needs its own (mono) audio channel. The spatial sound field in the recording room consists of the direct wave of the acoustic source and a spatially distributed pattern of mirror acoustic sources caused by the reflections by the room surfaces. Reducing that spatial mirror source distribution onto a few transmitting channels causes a significant loss of spatial information. This spatial distribution can be synthesized much more accurately by the rendition side.
Compared to conventional channel-orientated rendition procedures, WFS provides a clear advantage: Virtual acoustic sources guided by the signal content of the associated channels can be positioned far beyond the conventional material rendition area. This reduces the influence of the listener position because the relative changes in angles and levels are clearly smaller compared to conventional loudspeakers located within the rendition area. This extends the sweet spot considerably; it can now cover nearly the entire rendition area. WFS thus is not only compatible with, but potentially improves the reproduction for conventional channel-oriented methods.
Since WFS attempts to simulate the acoustic characteristics of the recording space, the acoustics of the rendition area must be suppressed. One possible solution is use of acoustic damping or to otherwise arrange the walls in an absorbing and non-reflective configuration. A second possibility is playback within the near field. For this to work effectively the loudspeakers must couple very closely at the hearing zone or the diaphragm surface must be very large.
In some cases, the most perceptible difference compared to the original sound field is the reduction of the sound field to two dimensions along the horizontal of the loudspeaker lines. This is particularly noticeable for reproduction of ambience. The suppression of acoustics in the rendition area does not complement playback of natural acoustic ambient sources.
There are undesirable spatial aliasing distortions caused by position-dependent narrow-band break-downs in the frequency response within the rendition range. Their frequency depends on the angle of the virtual acoustic source and on the angle of the listener to the loudspeaker arrangement:
For aliasing-free rendition in the entire audio range a distance of the single emitters below 2 cm would be necessary. But fortunately our ear is not particularly sensitive to spatial aliasing. A 10–15 cm emitter distance is generally sufficient.
Another cause for disturbance of the spherical wavefront is the truncation effect. Because the resulting wavefront is a composite of elementary waves, a sudden change of pressure can occur if no further speakers deliver elementary waves where the speaker row ends. This causes a 'shadow-wave' effect. For virtual acoustic sources placed in front of the loudspeaker arrangement this pressure change hurries ahead of the actual wavefront whereby it becomes clearly audible.
In signal processing terms, this is spectral leakage in the spatial domain and is caused by application of a rectangular function as a window function on what would otherwise be an infinite array of speakers. The shadow wave can be reduced if the volume of the outer loudspeakers is reduced; this corresponds to using a different window function which tapers off instead of being truncated.
A further and resultant problem is high cost. A large number of individual transducers must be very close together. Reducing the number of transducers by increasing their spacing introduces spatial aliasing artifacts. Reducing the number of transducers at a given spacing reduces the size of the emitter field and limits the representation range; outside of its borders no virtual acoustic sources can be produced.
Early development of WFS began 1988 at Delft University. Further work was carried out from January 2001 to June 2003 in the context of the CARROUSO project by the European Union which included ten institutes. The WFS sound system IOSONO was developed by the Fraunhofer Institute for digital media technology (IDMT) by the Technical University of Ilmenau in 2004.
The first live WFS transmission took place in July 2008, recreating an organ recital at Cologne Cathedral in lecture hall 104 of the Technical University of Berlin. The room contains the world’s largest speaker system with 2700 loudspeakers on 832 independent channels.
Research trends in wave field synthesis include the consideration of psychoacoustics to reduce the necessary number of loudspeakers, and to implement complicated sound radiation properties so that a virtual grand piano sounds as grand as in real life.