Digital Sound Processing - DSP (up-encoding analysis), page 1

SpasV V.I.P. on April 17th, 2007 / post 18184

THE MORE YOU KNOW THE MORE CHANCES YOU HAVE

My intention is to discuss the topic of Digital Sound more rigorously.
The reason is simple. We work with Digital Sound every day and I believe “The more we know the more chances we have”.

Skype:spas.velev

SpasV V.I.P. on April 17th, 2007 / post 18185

Sound is a physical phenomenon: a wave that propagates through space at some speed. By its nature it is an analog process: it is defined at every moment of time and can take any value in some range of values. If we put a sensitive device at some point in space, it can transform the sound wave into an analog signal.
What about humans? Our sensors, the ears, transform the sound wave, and the brain processes their signals, generating what we call sound.
So sound is, so to say, twofold: a wave that can be transformed into a signal, and the human perception of it.
It would be interesting to discuss both of them.


SpasV V.I.P. on April 17th, 2007 / post 18186

An analog signal cannot be processed by computers for a simple reason: it contains an infinite amount of information, no matter how small the time interval and value range are. A computer can process a huge amount of information, but not an infinite amount. This is where digitization comes into play. The analog signal is digitized, that is, transformed into a digital signal which contains a limited (finite) amount of information.

A digitized signal represents the corresponding analog signal by samples taken at discrete moments of time, with their values quantized, that is, taken from a finite set of numbers. To hear a sound represented in digital form, it is converted back to an analog signal.
And here is the question: how close is the reconstructed signal to the source analog signal?
The answer is: under some restrictions it can be the same, up to the quantization error. The restrictions are that the original signal must have a limited spectrum bandwidth and must be sampled properly, that is, at a rate of more than twice its highest frequency.
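To make the two steps concrete, here is a minimal sketch of sampling and quantizing in Python. The function name `sample_and_quantize`, the 1 kHz test tone and the half-scale amplitude are my own assumptions for illustration; only the 44,100 Hz rate and the 16-bit range come from the audio CD format.

```python
import math

SAMPLE_RATE = 44_100   # samples per second (the audio CD rate)
FULL_SCALE = 32_767    # largest positive 16-bit sample value

def sample_and_quantize(freq_hz, n_samples, amplitude=0.5):
    """Sample a sine wave at discrete times, then round each value to an integer."""
    exact = [
        amplitude * FULL_SCALE * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]
    quantized = [round(v) for v in exact]  # values now come from a finite set
    return exact, quantized

# Hypothetical test tone: 1 kHz, 441 samples (10 ms of sound).
exact, quantized = sample_and_quantize(1_000.0, 441)
max_error = max(abs(e - q) for e, q in zip(exact, quantized))
print(f"largest rounding error: {max_error:.3f} of one quantization step")
```

The rounding error never exceeds half a quantization step, which is the only information lost here as long as the tone is well below half the sampling rate.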

Now, my favorite: the spectrum.
Signals, analog and digital, can be treated as mathematical functions: continuous-time and discrete-time functions.
A spectrum is the result of a mathematical transformation of a function of time (continuous or discrete) into a function of a parameter called frequency. This process is known as a transformation of a function from the time domain to the frequency domain. The transformation is reversible: a function of time can be reconstructed from its spectrum, which means that what is said about the spectrum is true for the function itself. In other words, conclusions based on evaluating the spectrum are also valid for the original signal.

For discrete signals, discrete spectra are defined. Under the same restrictions, the discrete spectrum can be considered as produced by sampling the continuous signal's spectrum, which means the continuous signal's spectrum can be reconstructed from the discrete one. There are powerful methods for calculating discrete spectra based on the Fast Fourier Transform (FFT), and (I think) every sound editor has a Spectrum or Frequency Analyzer. So it is an ordinary tool now.
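As a small illustration of the time-to-frequency transformation and its reversibility, here is a sketch in Python. It uses the plain discrete Fourier transform written out directly, not an FFT (an FFT computes the same values, only much faster). The names `dft` and `idft` and the 64-sample test signal are assumptions made for this example.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform: time samples -> complex spectrum."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse transform: complex spectrum -> time samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

N = 64
# Test signal: a sine that completes exactly 5 cycles in the window.
signal = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

spectrum = dft(signal)
# Look for the strongest component in the lower half of the spectrum.
peak_bin = max(range(N // 2), key=lambda k: abs(spectrum[k]))

# Going back to the time domain recovers the original samples.
restored = [v.real for v in idft(spectrum)]
round_trip_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Here `peak_bin` comes out as 5, matching the 5 cycles in the test signal, and `round_trip_error` is at the level of floating-point rounding.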


SpasV V.I.P. on April 17th, 2007 / post 18188

Based on studies of human hearing, Sony and Philips defined their standard for audio CD recording in a document now known as “The Red Book”. It specifies two channels, a sampling rate of 44,100 Hz, and 16-bit numbers to represent the sample values.
This means:
• the spectrum of such a digital signal cannot have a bandwidth of more than 22.05 kHz
• the sample value range is -32768 to 32767; the absolute value range of non-zero samples is 1 to 32767, or on a logarithmic scale -90.3 dB to 0 dB
• the maximum quantization error is ½ (half a quantization step), and it is treated as white noise with a variance of 1/12. Expressed as 20·log10(1/12) relative to full scale this is about -112 dB; treated strictly as a power, 10·log10(1/12), it is about -101 dB. It seems reasonable to me to define the spectrum bandwidth from the part of the spectrum that lies above -92 dB (-112 + 20 = -92)
• the bit stream corresponding to such a digital sound has a bit rate of 1,411.2 kbps
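The numbers in this list can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are mine. The two noise figures at the end show how the result depends on whether 1/12 is treated as a power (10·log10) or as an amplitude (20·log10); the -112 dB figure quoted above appears to correspond to the second form.

```python
import math

CHANNELS = 2
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16

# Highest frequency a signal sampled at this rate can contain.
nyquist_hz = SAMPLE_RATE_HZ / 2

# Range of a signed 16-bit sample.
min_value = -(2 ** (BITS_PER_SAMPLE - 1))
max_value = 2 ** (BITS_PER_SAMPLE - 1) - 1

# Level of the smallest non-zero sample relative to full scale.
smallest_level_db = 20 * math.log10(1 / max_value)

# Uncompressed bit rate in kilobits per second.
bit_rate_kbps = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1000

# A rounding error spread uniformly over [-1/2, 1/2] has variance 1/12.
# Treated as a power, relative to full scale:
noise_power_db = 10 * math.log10(1 / 12) + smallest_level_db
# Treated as an amplitude (20*log10), relative to full scale:
noise_as_amplitude_db = 20 * math.log10(1 / 12) + smallest_level_db

print(nyquist_hz, min_value, max_value, bit_rate_kbps)
print(round(smallest_level_db, 1), round(noise_power_db, 1),
      round(noise_as_amplitude_db, 1))
```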


(user gone) on April 18th, 2007 / post 18210

This is very interesting, though I cannot comprehend all of it because I know very little about sound. I still find it fascinating. Spas, very nice thread... perhaps you would share more of your knowledge with us :-D

vegy MusicFreak on April 18th, 2007 / post 18211

Thanks SpasV for the nice info!!! :-D

SpasV V.I.P. on April 18th, 2007 / post 18235

Now I need to mention something about human hearing as a basis for further discussion.
First of all, I do not have good knowledge of the topic, so I took some references from the Psychoacoustics article (Wikipedia, https://en.wikipedia.org).
Some Limits of perception:
• The human ear can nominally hear sounds in the range 20 Hz to 20,000 Hz (20 kHz). This upper limit tends to decrease with age, most adults being unable to hear above 16 kHz. (Check yourself out.) Some recent research has demonstrated a hypersonic effect which is that although sounds above 20 kHz cannot consciously be heard, they can have an effect on the listener.
• The "intensity" range of audible sounds is enormous. Our ear drums are sensitive only to the sound pressure variation. The lower limit of audibility is defined to 0 dB, but the upper limit is not as clearly defined. The upper limit is more a question of the limit where the ear will be physically harmed or with the potential to cause a hearing disability. This limit depends also on the time exposed to the sound. The ear can be exposed to short periods in excess of 120 dB without permanent harm, but long term exposure to sound levels over 80 dB can cause permanent hearing loss.
I could add:
• The spectral components can be interpreted as real sounds that could be heard if you could extract them from the complex sound signal. So if you can hear, for example, a sound at 20 kHz on its own, you can also hear it in the complex sound. But if you cannot hear it on its own, I am not sure that its influence on the complex sound cannot be perceived. For me these are different processes.
• “The lower limit of audibility is defined to 0 dB” means the lower limit is taken as 1 and other levels are measured relative to it. In other words, a sound of 80 dB has 10,000 times the sound pressure of the lower limit of audibility (10^8 times the intensity).
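The decibel arithmetic can be sketched in Python as follows. The function names are my own; the formulas are the standard ones: 20·log10 for pressure (amplitude) ratios and 10·log10 for intensity (power) ratios.

```python
def pressure_ratio(level_db):
    """Sound pressure relative to the 0 dB reference (20*log10 scale)."""
    return 10 ** (level_db / 20)

def intensity_ratio(level_db):
    """Sound intensity relative to the 0 dB reference (10*log10 scale)."""
    return 10 ** (level_db / 10)

for level in (0, 80, 120):
    print(level, "dB ->", pressure_ratio(level), "x pressure,",
          intensity_ratio(level), "x intensity")
```

So the same 80 dB corresponds to a factor of 10,000 in sound pressure and a factor of 100,000,000 in intensity, which is why it matters to say which quantity is meant.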

As can be seen, the audio CD recording standard covers human hearing very well.
A conclusion that can be made about sound quality: the closer the sound comes to these limits, the better its quality, because the whole of the human hearing potential is used in processing the sound.


SpasV V.I.P. on April 19th, 2007 / post 18237

Here is an example of a spectrum. As can be seen, it is calculated using 65,536 sound samples, or (at 44,100 samples per second) over 1.486 s. Of course, the sound signal varies over time, but nevertheless such a spectrum can be used to draw conclusions about the whole sound file.
• The spectrum is spread over the whole frequency range [20 Hz – 22.05 kHz]
• Dynamic range is 65.2 dB based on the low-frequency component, or 61.5 dB based on the high-frequency component. The whole dynamic range (to me) is 92 dB, but it is never used entirely. Good quality sound utilizes more than 60 dB.
The figure on the right-hand side presents the middle and high frequency range of the spectrum, where the sound quality cutoff usually occurs: around 12 kHz, 16 kHz, or 18.5 kHz.
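For anyone who wants to check the window length, here is a short Python sketch of how an analysis size of 65,536 samples maps to time and to frequency resolution. The helper names `bin_to_hz` and `hz_to_bin` are assumptions for this example, not functions of any particular analyzer.

```python
FFT_SIZE = 65_536         # number of samples used for one spectrum
SAMPLE_RATE_HZ = 44_100   # audio CD sampling rate

# Length of sound covered by one analysis window.
window_seconds = FFT_SIZE / SAMPLE_RATE_HZ

# Distance in Hz between neighbouring points of the spectrum.
bin_spacing_hz = SAMPLE_RATE_HZ / FFT_SIZE

def bin_to_hz(k):
    """Frequency represented by spectrum point number k."""
    return k * bin_spacing_hz

def hz_to_bin(freq_hz):
    """Nearest spectrum point for a given frequency."""
    return round(freq_hz / bin_spacing_hz)

print(f"window: {window_seconds:.3f} s, resolution: {bin_spacing_hz:.3f} Hz")
print("16 kHz falls at spectrum point", hz_to_bin(16_000))
```

With this size, each point of the spectrum is about 0.67 Hz apart, which is why cutoffs around 12, 16 or 18.5 kHz show up so sharply in the plots.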


Willy84 Bongaz on April 19th, 2007 / post 18243

Man, these things are awesome, and the quality thing is really great, but I really understand nothing from the pics. All I can tell you is that if we have a good-sounding set without skips or static in it, I am glad to have such a set. If there is anything wrong in the sound, then we would want to do what you are doing, and I think you and a few people here can do such stuff. By the way, I am going to get books about this to learn how you do such things with sound. I really admit that you have great quality in your sets, but there is not a big difference between the mp3 we upload and the m4a you do :whistle:

arnani D-Formation Gue on April 19th, 2007 / post 18244

WELL DONE PROFESSOR SPAS V :smart:

NICE INFO

SpasV V.I.P. on April 20th, 2007 / post 18265

I have three applications that implement Frequency Analysis: two sound editors and EAC. Frequency Analysis is a common tool, and I believe it is a must for an uploader and (still) an option for music fans. My intention is to make this topic interesting for you and, yes, to get you to take the books and read more.
As another illustration, here are four more spectra: one of the original ((01) MYSTICA. Bliss (Mystica Mix).wav, an audio CD rip which I think is very high quality) and three generated from encoded versions of the original .wav file.
I am not comparing the encoders. I am showing the results as examples of different spectra obtained from one original after applying some processing to it.
It is easy to see that the AAC encoder @400 kbps produces the best signal match.
The best mp3 result (not surprisingly) is @ 320 kbps. And finally, there is an mp3 @ 128 kbps.


SpasV V.I.P. on April 20th, 2007 / post 18266

Now, out of curiosity, another example: a spectrum which you can easily recognize.
With all my respect for the uploader, what does this spectrum have in common with the spectrum of an mp3@320 kbps audio CD quality signal? Only the file container is the same: mp3@320. You can see from the previous spectra that an mp3 @128 kbps can hold such a signal. In other words, this file can be safely recoded to a 128 kbps mp3 without losing quality.
And this was the reason for me to offer a 128 kbps version of the last posted Carl Cox's Global show. The spectra of TALiON's files were actually 128 kbps spectra.


SpasV V.I.P. on April 23rd, 2007 / post 18341

I am thinking of discussing the topic of audio compression later on, but right now I would like to clarify what I mean when I say a file is 128 kbps quality.
What follows are two spectra (as a reminder, there is a reversible mapping between a signal and its spectrum):
• The first one is calculated from the data of the Paul Oakenfold - Exclusive Guest Mix To Nocturnal - 21-Apr-2007.mp3 file.
• The second one: I converted the above file to .wav (nothing changed, only the file format), then encoded the .wav file to an mp3@128kbps using the Lame 3.97 encoder.

Can you see differences between the two spectra? There are some, but they are too small to matter for the sound quality.
That is why, at least for me, the file “Paul Oakenfold - Exclusive Guest Mix To Nocturnal - 21-Apr-2007.mp3” holds 128kbps sound quality.


SpasV V.I.P. on April 23rd, 2007 / post 18342

Now I am trying to show what an mp3@192kbps spectrum looks like.
• First is the spectrum of an original file (.wav) which is difficult to compress; it is an audio CD rip from (9) – ASOT 2005 CD2. The spectrum is limited to the (max possible) 22.05 kHz.
• Second is the best result I could obtain using the Lame encoder (lame -t -k -ms -q0 -V0 -b192 -B192 test2.wav vbr192test2.mp3). You can see the spectrum is now limited to 18.5 kHz.

So, what is actually the real mp3@192kbps sound?
I would add: This is the quality of Armin’s ASOT radio show on DI.fm.


SpasV V.I.P. on May 7th, 2007 / post 18684

We can read many comments about sound quality that use bit rate as a measure of quality. It is obvious that sound quality cannot be measured in bit rates. So, let us speak the same language.
A bit rate is a measure of the information rate at which a discrete (sound) time function is represented. For example, CD quality sound is represented by a function at a bit rate of 2 × 44,100 × 16 = 1,411,200 bps, or 1,411.2 kbps. The information needed to reconstruct this time function is 1,411.2 kb for every second.
When the information is reduced, the sound is compressed. When the information is reduced, the reconstructed function (sound) is different. If it is different, the quality is not the same. It is worse. How much worse? I do not rely on hearing perception. Instead I use the spectrum, which is the result of a strict, invertible mathematical transformation. (Invertible means you can exactly reconstruct the original function from its spectrum.) If the compressed sound's spectrum is not the same as the original's, the quality is worse. The bigger the differences, the worse the quality.
The easiest way to compare two spectral functions (spectra) is by using the spectrum bandwidth, which is a widely used characteristic of signals and signal-transmitting devices.
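One possible way to read a bandwidth off a spectrum is sketched below in Python. It is only an illustration under stated assumptions: the function name `bandwidth_hz` is made up, the input is assumed to be linear magnitudes for the points from 0 Hz up to half the sampling rate with 1.0 meaning full scale, and the -92 dB threshold is the value proposed earlier in this thread. A real analyzer may normalize its spectrum differently.

```python
import math

def bandwidth_hz(magnitudes, sample_rate_hz, threshold_db=-92.0):
    """Highest frequency whose level is above the threshold.

    magnitudes[k] is the linear magnitude of spectrum point k, for points
    evenly spaced from 0 Hz to sample_rate_hz / 2, with 1.0 = full scale.
    """
    n_bins = len(magnitudes)
    if n_bins < 2:
        return 0.0
    bin_spacing = (sample_rate_hz / 2) / (n_bins - 1)
    # Walk down from the top of the spectrum until something is loud enough.
    for k in range(n_bins - 1, -1, -1):
        m = magnitudes[k]
        if m > 0 and 20 * math.log10(m) > threshold_db:
            return k * bin_spacing
    return 0.0

# Synthetic spectrum, 10 Hz per point: -30 dB up to 15.5 kHz, -110 dB above.
test_spectrum = [
    10 ** (-30 / 20) if k * 10 <= 15_500 else 10 ** (-110 / 20)
    for k in range(2206)
]
print(bandwidth_hz(test_spectrum, 44_100))
```

For this synthetic spectrum the function reports 15,500 Hz, because everything above that frequency sits below the threshold.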

Now I am showing three spectra:
• The first one – of a test signal with spectrum bandwidth of 22.05 kHz – the maximum possible for an audio CD sound.

• The second is the same but the frequency scale is up to 16 kHz - to compare it easily to the third.
• The third is the spectrum of the same test signal but compressed as 96kbps mp3. Its spectrum bandwidth is 11.133 kHz

You can see:
• An audio CD quality sound has a spectrum bandwidth of 22.05 kHz,
• An mp3 compression @192kbps has a spectrum bandwidth of 18.5 kHz,
• An mp3 compression @128kbps has a spectrum bandwidth of 15.5 kHz,
• An mp3 compression @ 96kbps has a spectrum bandwidth of 11.1 kHz.
It is easy to understand that reducing the spectrum bandwidth results in worse sound quality, no matter how someone perceives, evaluates, or feels the sound.

Next:
The bit rate shows the maximum possible information contained in the file. But part of this information can be irrelevant to the signal spectrum. Let us say it is meaningless zeros, like “0011” instead of “11”. Both numbers are eleven, but the first uses four digits instead of the two actually needed.
Or the file also contains surround information. How does this information affect the spectrum?

This means that when, for example, an mp3 file @192kbps actually contains a sound with a spectrum bandwidth of 15.5 kHz, the sound quality is that of an mp3@128kbps.
I used this example because it is typical. Most radio broadcasts are band-limited to 15.5 kHz (or 16.0, I do not know exactly) by the radio channel and recorded as mp3@192kbps. The recording cannot make them 192kbps sound. It puts more meaningless zeros in the file. Nothing more.
I would say all the mp3@192kbps files from Kiss100 that I have seen are actually 128kbps sound.
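The reasoning can be written down as a small lookup in Python. This is a sketch only: the table holds the bandwidths read from the spectra in this thread, which depend on the encoder and settings used, and the function name `effective_bitrate_kbps` and the 0.5 kHz tolerance are assumptions made for the example.

```python
# Bandwidths (kHz) observed in this thread for each mp3 bit rate (kbps).
OBSERVED_BANDWIDTH_KHZ = {96: 11.1, 128: 15.5, 192: 18.5}

def effective_bitrate_kbps(measured_bandwidth_khz, tolerance_khz=0.5):
    """Lowest listed bit rate whose observed bandwidth covers the measured one."""
    for kbps in sorted(OBSERVED_BANDWIDTH_KHZ):
        if measured_bandwidth_khz <= OBSERVED_BANDWIDTH_KHZ[kbps] + tolerance_khz:
            return kbps
    # Wider than anything in the table, e.g. CD quality at 22.05 kHz.
    return None

# A file labelled 192 kbps whose spectrum stops at 15.5 kHz:
print(effective_bitrate_kbps(15.5))
```

For a measured bandwidth of 15.5 kHz (or 16.0 kHz) the lookup returns 128, regardless of what bit rate the file container claims.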

Finally, I would say: please do not talk about XM radio as Ojay did: “broadcasted by XM in the aac format at around 96kbps (they are always doing that)”.
You have a Frequency Analyzer, you have the files. Use them and then let us talk.
Ojay’s statement above shows only the author’s ignorance, nothing more.

I am also thinking of discussing the MPEG-4 aacPlus audio compression technology. Not only is XM radio based on it; I have also seen at least two Internet audio streams based on it. And it is natural: new knowledge, more efficiency, better quality.
