The basic reason is the complex nature of a loudspeaker.
I will try to explain, but I'm at the limit of my knowledge here, so some details may be wrong or incomplete. I hope you get the general picture though.
Loudspeakers are inductive because they are built around a coil of wire (the voice coil). They also have a property called motional impedance, which is one way of looking at the effect of the cone's inertia. That effect is very significant, especially for large speakers.
A speaker driver has its own self-resonant frequency, and a more or less complicated and uneven frequency response, which are a result of that inertia, and which can be greatly affected by the speaker enclosure. Its impedance and the phase relationship between applied voltage and current flow also vary wildly over the frequency spectrum.
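To give a rough feel for how much the impedance and phase can move around, here is a minimal sketch of a commonly used simplified driver model: the voice coil resistance and inductance in series with a parallel resonant branch standing in for the motional part. The function name and all component values (6 ohm coil, 0.5 mH, 40 Hz resonance and so on) are made-up illustrative assumptions, not measurements of any real speaker, and the enclosure is ignored.

```python
import cmath
import math

def speaker_impedance(freq_hz, re_ohm=6.0, le_h=0.5e-3,
                      res_ohm=30.0, fs_hz=40.0, qms=4.0):
    """Complex impedance of a simplified driver model (all values are assumed)."""
    w = 2 * math.pi * freq_hz
    w0 = 2 * math.pi * fs_hz
    # Motional branch modelled as a parallel RLC circuit:
    # it peaks at res_ohm at the resonant frequency fs_hz.
    z_motional = res_ohm / (1 + 1j * qms * (w / w0 - w0 / w))
    # Voice coil resistance and inductance are in series with that branch.
    return re_ohm + 1j * w * le_h + z_motional

for f in (20, 40, 80, 200, 1000, 10000):
    z = speaker_impedance(f)
    print(f"{f:>6} Hz  |Z| = {abs(z):6.2f} ohm  "
          f"phase = {math.degrees(cmath.phase(z)):6.1f} deg")
```

With these assumed values the load looks inductive just below resonance, close to a plain (but much larger) resistance at resonance, capacitive just above it, and inductive again at high frequencies where the voice coil inductance takes over.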
An audio amplifier is really a voltage amplifier with high output current capability. The input signal is a voltage, and the amplifier amplifies it enough that when it is applied across the loudspeaker, significant current flows. The current waveform is related to the voltage waveform according to the speaker's complex characteristics.
The output stage of the amplifier is supplied from one or two power supply rails - for example, +30V and -30V. The output stage is able to drive its output over a certain voltage range, say +25V to -25V, to cause current to flow in the speaker.
Because of the nature of impedance, and especially motional impedance, the voltage needed to force a specific current through an inductance can become extremely large. The voltage across an inductance is proportional to how fast the current through it changes, so a sudden change in current would require an almost infinite voltage. It is therefore not feasible for an amplifier to deliver a current signal that follows the input voltage signal into an inductive load such as a loudspeaker.
For a simple resistive load this is no problem, since voltage and current are proportional, but for an inductive load, it is not feasible.
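As a rough illustration of the point about resistive versus inductive loads, the sketch below works out the voltage needed to force a given current through a plain series resistance and inductance, and compares it with the 25 V swing from the example above. The load values (6 ohm, 0.5 mH), the 2 A target current and the function names are assumptions chosen for illustration; a real speaker is more complicated than this.

```python
import math

def required_peak_voltage(i_peak_a, freq_hz, r_ohm=6.0, l_h=0.5e-3):
    """Peak voltage needed to force a sinusoidal current through a series R-L load."""
    w = 2 * math.pi * freq_hz
    # |Z| of a series R-L load is sqrt(R^2 + (wL)^2)
    return i_peak_a * math.hypot(r_ohm, w * l_h)

def ramp_voltage(delta_i_a, rise_time_s, l_h=0.5e-3):
    """Voltage across the inductance while current ramps linearly: v = L * di/dt."""
    return l_h * delta_i_a / rise_time_s

V_SWING = 25.0  # output swing taken from the +25V to -25V example

for f in (100, 1000, 10000, 20000):
    v = required_peak_voltage(2.0, f)
    status = "ok" if v <= V_SWING else "exceeds swing"
    print(f"{f:>6} Hz: {v:7.1f} V needed for 2 A peak ({status})")

# The faster the current has to change, the more voltage the inductance demands.
print("2 A step in 1 microsecond:", ramp_voltage(2.0, 1e-6), "V across the inductance")
```

Setting `l_h=0.0` turns the load into a pure resistor, and the required voltage then stays at 12 V at every frequency, which is the "no problem" case.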
People have experimented with current output amplifiers for audio use. The amplifier has to be designed to gracefully handle the situation where producing the correct output current is not possible given the limited supply voltage available.
There's a lot more to this subject. I've only told you the little I know about it. You can find more information by Googling some of the keywords I've mentioned, and other keywords you find in your search.
Beware of audiophile claims about sonic quality. These are often not backed up by so much as a double-blind test, may vary greatly from one self-proclaimed "golden-eared expert" to another, and ultimately may mean nothing at all.
Edit: In answer to your second question, an audio signal is a single voltage that varies in a complex way over time. In a stereo system, two signals (channels) are used, and more channels are used for 5.1 etc. Each channel signal contains all of the sounds that will be reproduced by the relevant speaker or earphone. In digital devices such as CD players, DVD players, iPods etc, the data is obtained from a digital source and decompressed if necessary. The result is a digital data stream with a certain bit width and sample rate (16 bits and 44,100 samples per second for CD audio). This stream is fed through a digital-to-analogue converter (DAC), which turns it into a voltage signal that represents the sound for one channel. Search for PCM (pulse-code modulation) with Google or on Wikipedia for more information.
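If it helps to see the PCM idea concretely, here is a minimal sketch of one channel: a 1 kHz test tone is sampled at 44,100 samples per second, stored as signed 16-bit numbers, and then mapped back to voltages the way an idealized DAC would. The function names and the 1.0 V full-scale value are hypothetical, and a real DAC also needs an output filter to smooth the steps between samples.

```python
import math

SAMPLE_RATE = 44_100                  # samples per second
BIT_DEPTH = 16
MAX_CODE = 2 ** (BIT_DEPTH - 1) - 1   # 32767, largest signed 16-bit value

def encode_pcm(samples):
    """Quantize floats in [-1.0, 1.0] to signed 16-bit integer codes."""
    return [max(-MAX_CODE - 1, min(MAX_CODE, round(s * MAX_CODE)))
            for s in samples]

def decode_pcm(codes, v_full_scale=1.0):
    """Idealized DAC: map each code to a voltage (v_full_scale is assumed)."""
    return [c / MAX_CODE * v_full_scale for c in codes]

# 10 ms of a 1 kHz sine wave for a single channel
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 100)]
codes = encode_pcm(tone)
volts = decode_pcm(codes)

worst = max(abs(a - b) for a, b in zip(tone, volts))
print(len(codes), "samples, worst-case rounding error:", worst)
```

A stereo stream would simply carry two such sequences, one per channel.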