Ignoring human hearing for a moment, "volume" literally refers to the volume of air displaced by a sound. That is more or less proportional to the displacement of the speaker cone (or surface), which in turn is proportional to the peak-to-peak voltage applied to the electromagnetic coil in the speaker.
phew.
With that said, "volume" works by taking an input signal (your music, etc.) and mapping it to a range of voltages.
E.g. if -1 to 1 volts is "100% volume", then for your music the lowest voltage driving the coil is -1 volt and the highest is 1 volt. This pushes and pulls the cone of the speaker, which creates pressure waves, which is what sound is.
At, say, "50%" the same input signal is mapped to -0.5 to 0.5 volts, meaning the cone travels less distance and displaces less air as a result. In reality the mapping between the volume "numbers" you see on screen and the voltage range is not linear but typically logarithmic.
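To make the mapping concrete, here is a rough sketch in Python. The function names and the -60 dB floor are my own assumptions for illustration, not how any particular device or OS does it; real volume controls use a variety of curves.

```python
import math

def slider_to_gain(percent, min_db=-60.0):
    """Map a 0-100 volume setting to a linear gain using a decibel taper.

    min_db is an assumed floor: the attenuation just above 0%.
    """
    if percent <= 0:
        return 0.0  # treat 0% as silence
    db = min_db * (1.0 - percent / 100.0)  # 100% -> 0 dB, just above 0% -> min_db
    return 10.0 ** (db / 20.0)             # decibels -> amplitude ratio

def apply_volume(samples, gain):
    """Scale every sample, which shrinks the voltage range the signal spans."""
    return [s * gain for s in samples]

# One cycle of a sine wave spanning -1..1 (the "100% volume" range).
signal = [math.sin(2 * math.pi * n / 16) for n in range(16)]

half_linear = apply_volume(signal, 0.5)                 # the -0.5..0.5 case
half_taper = apply_volume(signal, slider_to_gain(50))   # what a dB taper gives at "50%"

print(max(half_linear), max(half_taper))
```

With the linear mapping, "50%" halves the voltage swing exactly as described above. With the assumed dB taper, "50%" lands far below half the voltage, which is why the on-screen number and the voltage range don't track each other linearly.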
As for perceived volume: the tricky part is that what you "hear" (what your brain perceives) and what is physically happening are not the same thing. Our ears are not equally sensitive to all frequencies of sound (frequency being how many pressure peaks arrive per second). Low frequencies, for instance, require a lot more energy to be perceived as equally loud, whereas frequencies in, say, the human voice range (200-4000Hz) need smaller changes in volume to be perceived as louder or quieter.
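One common way to put numbers on that uneven sensitivity is the A-weighting curve used in sound level meters. It is not something I mentioned above, and it is only a rough approximation of hearing at moderate levels, but it gives a feel for how much low frequencies get discounted. A small sketch (the function name is mine):

```python
import math

def a_weighting_db(freq_hz):
    """Approximate A-weighting in dB for a frequency in Hz (freq_hz > 0).

    Uses the standard analog A-weighting formula; the +2.00 dB term
    normalizes the curve to roughly 0 dB at 1 kHz.
    """
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00

for freq in (50, 100, 1000, 4000):
    print(f"{freq:>5} Hz: {a_weighting_db(freq):6.1f} dB")
```

The curve sits near 0 dB at 1 kHz and drops to roughly -19 dB at 100 Hz, meaning a 100 Hz tone needs a lot more physical level to count as equally loud under this model.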
As a side note, this is what lossy audio compression takes advantage of. If you're less sensitive to, say, sounds in the sub-200Hz range, then you don't need as much precision when you encode them. Additionally, some frequencies can "mask" others. E.g. a loud 2kHz tone could make you far less able to perceive changes in a simultaneous 750Hz tone. Our hearing also isn't "real time": a loud tone can mask quieter sounds that come just after it and even ones that came just before it.
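A toy illustration of the "less precision" idea, under heavy assumptions: real codecs work on frequency-domain data with a proper psychoacoustic model, and the band names and bit counts below are made up. The point is only that fewer bits means coarser steps and a bigger rounding error.

```python
def quantize(sample, bits):
    """Round a sample in -1..1 to the nearest step available with `bits` bits."""
    levels = 2 ** (bits - 1)
    return round(sample * levels) / levels

# Hypothetical bit budget: spend fewer bits where hearing is less sensitive.
bits_per_band = {"sub-200Hz": 4, "200-4000Hz": 16}

sample = 0.3
for band, bits in bits_per_band.items():
    q = quantize(sample, bits)
    print(f"{band}: {bits} bits -> {q} (error {abs(q - sample):.6f})")
```

The 4-bit version is off by a visible amount while the 16-bit one is very close to the original. A lossy encoder bets that you won't notice the larger error in the bands where your ears are less sensitive or where something louder is masking it.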