Science Fair Project Encyclopedia
In many applications of acoustics and audio signal processing it is necessary to know what humans actually hear. Sound, which consists of air pressure waves, can be measured accurately with sophisticated equipment. However, understanding how these waves are received and mapped into thoughts in the brain is not trivial. Sound is a continuous analog signal which (assuming infinitely small air molecules) can theoretically contain an infinite amount of information, there being an infinite number of frequencies, each carrying both magnitude and phase information.
Recognizing which features matter to perception lets scientists and engineers concentrate on the audible ones and ignore the less important features of the system involved. What humans hear is not only a physiological question about the ear but also very much a psychological one.
Limits of perception
The human ear can usually hear sounds in the range 20 Hz to 22 kHz. With age, the range decreases, especially at the upper limit. Frequencies below this range cannot be heard, but if they are loud enough they can be felt on the skin.
The frequency resolution of the ear is, in the middle range, about 2 Hz. That is, changes in pitch larger than 2 Hz can be perceived. However, even smaller pitch differences can be perceived through other means. For example, the interference of two tones that are close in pitch can often be heard as a slow fluctuation in loudness at the difference frequency. This effect is called beating.
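The beating effect follows from a trigonometric identity: the sum of two sinusoids equals a tone at the average frequency multiplied by a slowly varying envelope. The sketch below is only an illustration; the function names are hypothetical and the tones are assumed to be ideal sinusoids of equal amplitude.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Rate (in Hz) at which the loudness of two summed tones fluctuates."""
    return abs(f1_hz - f2_hz)

def two_tone_sample(f1_hz, f2_hz, t):
    """Instantaneous value of two equal-amplitude sinusoids added together."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)

def product_form(f1_hz, f2_hz, t):
    """Same signal written as carrier times envelope:
    sin(a) + sin(b) = 2 * sin((a + b) / 2) * cos((a - b) / 2)."""
    carrier = math.sin(math.pi * (f1_hz + f2_hz) * t)
    envelope = 2 * math.cos(math.pi * (f1_hz - f2_hz) * t)
    return carrier * envelope

def envelope_magnitude(f1_hz, f2_hz, t):
    """Absolute value of the slow envelope; it peaks |f1 - f2| times per second."""
    return abs(2 * math.cos(math.pi * (f1_hz - f2_hz) * t))
```

For tones at 440 Hz and 441 Hz, `beat_frequency` gives 1 Hz: the two tones differ by less than the 2 Hz resolution mentioned above, yet the once-per-second swell in loudness is easy to notice.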
The "intensity" range of audible sounds is enormous. The eardrum is sensitive only to variations in sound pressure. The lower limit of audibility is defined as 0 dB, but the upper limit is less clearly defined; it is more a question of the level at which the ear will be physically harmed (see also hearing disability). This limit also depends on the duration of exposure. The ear can tolerate short periods in excess of 120 dB without permanent harm, but long-term exposure to sound levels over 80 dB can cause permanent hearing loss.
A more rigorous exploration of the lower limits of audibility shows that the minimum level at which a sound can be heard depends on frequency. By measuring this minimum intensity for test tones of various frequencies, a frequency-dependent Absolute Threshold of Hearing (ATH) curve can be derived. Typically, the ear shows a peak of sensitivity (i.e., its lowest ATH) between 1 kHz and 5 kHz, though the threshold changes with age, with older ears showing decreased sensitivity above 2 kHz.
The ATH is the lowest of the equal-loudness contours. An equal-loudness contour gives, across the range of audible frequencies, the sound pressure level (in dB) at which a tone is perceived as being equally loud. Equal-loudness contours were first measured by Fletcher and Munson at Bell Labs in 1933 using pure tones reproduced via headphones, and the data they collected are called Fletcher-Munson curves. Because subjective loudness was difficult to measure, the Fletcher-Munson curves were averaged over many subjects.
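The decibel figures above are logarithmic ratios of sound pressure. As a minimal sketch, assuming the conventional reference pressure of 20 micropascals for sound in air (a value the text does not state) and using hypothetical function names, the conversion can be written as:

```python
import math

# Assumed reference pressure for 0 dB SPL in air: 20 micropascals.
P_REF_PA = 20e-6

def spl_db(p_pa):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    return 20.0 * math.log10(p_pa / P_REF_PA)

def pressure_from_spl(level_db):
    """RMS pressure in pascals corresponding to a sound pressure level in dB."""
    return P_REF_PA * 10.0 ** (level_db / 20.0)

def pressure_ratio(level_a_db, level_b_db):
    """How many times larger the pressure at level A is than at level B."""
    return 10.0 ** ((level_a_db - level_b_db) / 20.0)
```

Under this convention the span from 0 dB to 120 dB corresponds to a pressure ratio of one million, which gives a sense of how large the audible range is.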
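The shape of the ATH curve can be sketched with a closed-form approximation. The formula below is the one commonly attributed to Terhardt and widely used in audio coding; it does not come from this article and is only an approximation for a young listener with healthy hearing. The function names are hypothetical.

```python
import math

def ath_db_spl(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL (Terhardt-style fit).

    An assumed model, not measured data: valid roughly over the audible range.
    """
    k = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

def most_sensitive_frequency(start_hz=20, stop_hz=20000, step_hz=10):
    """Frequency on a coarse grid at which the approximate threshold is lowest."""
    return min(range(start_hz, stop_hz + 1, step_hz), key=ath_db_spl)
```

In this approximation the threshold rises steeply toward both ends of the audible range and reaches its minimum between 1 kHz and 5 kHz, consistent with the peak of sensitivity described above.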
Robinson and Dadson refined the process in 1956 to obtain a new set of equal-loudness curves for a frontal sound source measured in an anechoic chamber. The Robinson-Dadson curves were standardized as ISO 226 in 1986. In 2003, ISO 226 was revised using data collected from 12 international studies.
What do we hear?
To a first approximation, human hearing works like a spectrum analyzer: the ear resolves the spectral content of the pressure wave largely without regard to the phase of the signal. In practice, though, some phase information can be perceived. The inter-aural (i.e. between ears) phase difference is a notable exception, since it provides a significant part of the directional sensation of sound. The filtering effects of head-related transfer functions provide another important directional cue.
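The spectrum-analyzer analogy can be illustrated with a discrete Fourier transform: shifting the phase of a tone changes the waveform but not its magnitude spectrum. The sketch below uses a naive DFT with hypothetical helper names; it illustrates the analogy only and is not a model of the ear.

```python
import cmath
import math

def tone(freq_bin, phase_rad, n=64):
    """One frame of a sinusoid that completes freq_bin whole cycles in n samples."""
    return [math.sin(2 * math.pi * freq_bin * i / n + phase_rad) for i in range(n)]

def dft_magnitudes(samples):
    """Magnitude spectrum (bins 0 .. n/2) computed with a direct O(n^2) DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags

def same_spectrum(a, b, tol=1e-9):
    """True if two magnitude spectra agree bin by bin within a tolerance."""
    return len(a) == len(b) and all(abs(x - y) < tol for x, y in zip(a, b))
```

Comparing `dft_magnitudes(tone(5, 0.0))` with `dft_magnitudes(tone(5, 1.0))` shows identical magnitudes even though the two waveforms differ sample by sample; only the discarded phase distinguishes them.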
In some situations an otherwise clearly audible sound can be masked by another sound. For example, conversation at a bus stop can be completely impossible if a loud bus is driving past. This phenomenon is called masking. A weaker sound is masked if it is made inaudible in the presence of a louder sound.
If two sounds occur simultaneously and one is masked by the other, this is referred to as simultaneous masking. A sound close in frequency to the louder sound is more easily masked than one that is far from it in frequency. For this reason, simultaneous masking is also sometimes called frequency masking. The tonality of a sound partially determines its ability to mask other sounds. A sinusoidal masker, for example, requires a higher intensity to mask a noise-like maskee than a loud noise-like masker does to mask a sinusoid. Computer models which calculate the masking caused by sounds must therefore classify their individual spectral peaks according to their tonality.
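The two properties described above, that masking weakens with distance in frequency and that tonal maskers mask less effectively than noise-like ones, can be captured in a toy model. Every number in the sketch below (the offsets and the slope) is an illustrative placeholder, not a measured value, and the symmetric slope is a simplification; real masking curves are asymmetric and level dependent.

```python
import math

# Illustrative placeholder values only (assumptions, not measurements).
TONAL_MASKER_OFFSET_DB = 15.0   # a tonal masker masks less effectively
NOISE_MASKER_OFFSET_DB = 5.0    # a noise-like masker masks more effectively
SLOPE_DB_PER_OCTAVE = 30.0      # how fast masking falls off with frequency distance

def masked_threshold_db(masker_level_db, masker_hz, probe_hz, tonal_masker=False):
    """Level below which a probe at probe_hz is inaudible next to the masker."""
    offset = TONAL_MASKER_OFFSET_DB if tonal_masker else NOISE_MASKER_OFFSET_DB
    distance_octaves = abs(math.log2(probe_hz / masker_hz))
    return masker_level_db - offset - SLOPE_DB_PER_OCTAVE * distance_octaves

def is_masked(probe_level_db, probe_hz, masker_level_db, masker_hz, tonal_masker=False):
    """True if the probe falls below the toy masked threshold."""
    threshold = masked_threshold_db(masker_level_db, masker_hz, probe_hz, tonal_masker)
    return probe_level_db < threshold
```

With these placeholder values, a 40 dB probe at 1100 Hz is masked by a 70 dB noise-like masker at 1000 Hz, while the same probe at 4000 Hz, two octaves away, is not.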
Similarly, a weak sound emitted soon after the end of a louder sound is masked by the louder sound. In fact, even a weak sound just before a louder sound can be masked by the louder sound. These two effects are called forward and backward temporal masking, respectively.
Psychoacoustics in software
The psychoacoustic model makes high-quality lossy signal compression possible by describing which parts of a given digital audio signal can be removed (or aggressively compressed) safely, that is, without significant loss in the quality of the sound. It explains, for example, how a sharp clap of the hands might seem painfully loud in a quiet library but be hardly noticeable after a car backfires on a busy urban street. It might seem as if this would provide little benefit to the overall compression ratio, but psychoacoustic analysis routinely leads to compressed music files that are 10 to 12 times smaller than high-quality original masters with very little discernible loss in quality. Such compression is a feature of nearly all modern audio compression formats. These include MP3, Ogg Vorbis, Musicam (used in digital radio, DAB or DR, in Europe and elsewhere, based on Eureka 147), and the compression used in MiniDisc, to mention a few common audio compression standards.
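The 10 to 12 times figure can be checked with simple arithmetic. The sketch below assumes the original master is CD-quality PCM (44.1 kHz, 16 bits, stereo) and the compressed file uses 128 kbit/s; both are assumptions chosen for illustration, since the text does not name a specific source format or bitrate.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

def compression_ratio(original_kbps, compressed_kbps):
    """How many times smaller the compressed stream is than the original."""
    return original_kbps / compressed_kbps

# Assumed example: CD-quality stereo PCM against a 128 kbit/s compressed file.
cd_kbps = pcm_bitrate_kbps(44100, 16, 2)
ratio = compression_ratio(cd_kbps, 128.0)
```

Under these assumptions the uncompressed stream runs at about 1411 kbit/s and the ratio comes out at roughly 11, inside the range quoted above.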
Psychoacoustics is based heavily on human anatomy, especially the ear's limitations in perceiving sound as outlined previously. To summarize, these limitations are:
- High frequency limit
- Absolute Threshold of Hearing
- Absolute Threshold of Pain
- Temporal masking
- Simultaneous masking
Because the ear perceives sounds near or beyond these limits poorly or not at all, a compression algorithm can give such sounds a lower priority. By carefully shifting bits away from the unimportant components and toward the important ones, the algorithm ensures that the sounds the listener hears most clearly are of the highest quality.
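The idea of shifting bits toward the audible components can be sketched as a greedy allocation loop. This is a simplified illustration, not the procedure of any particular codec: it assumes each frequency band comes with a signal-to-mask ratio (SMR) in dB, that every extra bit of quantizer resolution lowers the quantization noise by about 6.02 dB, and the function name is hypothetical.

```python
DB_PER_BIT = 6.02  # approximate noise reduction per added bit of resolution

def allocate_bits(smr_db, total_bits):
    """Greedily hand out bits to the band whose noise is most audible.

    smr_db: signal-to-mask ratio per band in dB (negative means already masked).
    total_bits: bit budget to distribute across all bands.
    """
    bits = [0] * len(smr_db)
    for _ in range(total_bits):
        # Noise-to-mask ratio: how far quantization noise sits above the
        # masking threshold in each band with the current allocation.
        nmr = [smr - DB_PER_BIT * b for smr, b in zip(smr_db, bits)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # noise is below the masking threshold everywhere
        bits[worst] += 1
    return bits
```

For example, `allocate_bits([30.0, 12.0, -5.0, 20.0], 8)` returns `[4, 1, 0, 3]`: the band that is already masked receives no bits, and the budget goes to the bands where noise would be heard most clearly.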
Psychoacoustics and music
Psychoacoustics covers many subjects and produces discoveries that are relevant to music, its composition and its performance. Some musicians, such as Benjamin Boretz, consider some or all of the results of psychoacoustics to be meaningful only in a musical context.
Yet to be done:
- Bark scale, Equivalent rectangular bandwidth (ERB), Mel scale and other scales
- Loudness, that is, perceived volume, Bel, sone
- Perception of non-existent sounds, such as the missing fundamental frequency, and other auditory illusions. Compare with the telephone, which transmits roughly 300 Hz to 3400 Hz.
- Auditory Scene Analysis (incl. 3D-sound perception, localisation, etc.)
- auditory illusions
- audio compression
- speech recognition
- sound localization
- source separation
- musical tuning
- rate-distortion theory
- Haas effect
The content of this article is licensed from www.wikipedia.org under the GNU Free Documentation License.