Sound quality - what is this?

SpasVstar V.I.P. on June 4th, 2007 / post 19397
:-) I have already posted this reference elsewhere, but I think this is the right place for it.
I try to use sources that may seem abstract but are established scientific texts.
Here is the source: THE DIGITAL SIGNAL PROCESSING HANDBOOK
Edited by VIJAY K. MADISETTI, DOUGLAS B. WILLIAMS.
both editors:
Center for Signal and Image Processing
School of Electrical and Computer Engineering
Georgia Institute of Technology
Atlanta, Georgia

Nikil Jayant
Bell Laboratories, Lucent Technologies

says in chapter IX - Digital Audio Communications:

"The three parameters of digital audio quality are: signal bandwidth, fidelity and spatial realism".
The spatial realism is provided by increasing the number of spatial channels. The common formats are:
1-channel (mono), 2-channel (stereo), 5-channel, 5.1 channel, 8-channel.
The fidelity refers to the level of perceptibility of quantization or reconstruction noise. It is evaluated by formal listening tests.

Clearly, we cannot talk strictly about digital audio quality without considering the signal bandwidth, or in other words without using a spectral analysis tool.

And more:
"Compact-disk (CD) signals have bandwidth of 20-20,000 Hz, while traditional telephone speech has a bandwidth of 200-3400 Hz. Intermediate bandwidths characterize various grades of wideband speech and audio, including roughly defined ranges of quality referred to as AM radio and FM radio quality (bandwidth on the order of 7-10 kHz and 12-15 kHz, respectively)."
Skype:spas.velev
SpasVstar V.I.P. on June 10th, 2007 / post 19534
;-) Is the sound quality an empty word?
For many, yes.

I saw three releases of Tiesto’s Club Life 010 (Radio538 broadcast) recently.
• a TALiON release – 44.1 kHz mp3 VBR avg 171 kbps
• a Monse release – 48 kHz mp3 VBR avg 213 kbps
• a Monse release – 48 kHz mp2 @192 kbps (the same release is also shared at TM)
(Monse is a TBM uploader and also a TM user)

Obviously Monse rips his music from some mp2 stream, which is why Ojay, a Resident of TMB, said this about Monse’s release of ASOT 302:
“This is not one of the usual 192kbps DI.fm versions!!
This release was obtained from a source with a bitrate of up to 384kbps!
It was converted by Monse to … - roughly the same quality as 320kbps CBR MP3.
This is the highest quality ASOT version on the entire Internet!!!”

The quality of ASOT 302, like that of Monse’s releases of Tiesto’s Club Life 010 above, is 128 kbps quality, and the reason is obviously the source feeding the mp2 streamer (in spite of its bitrate of up to 384 kbps). In the case of Club Life the source is the Radio538 broadcast, which is band-limited to 16 kHz (an FM radio station).
This is obvious, but here I am showing the spectra that prove it as well.
(All spectra have been calculated over the whole length of the files.)
• First, these are two spectra: Tiesto010-MoN.wav and Copy of 01_-_Tiesto_-_Club_Life_010_MoN.mp3. The file Tiesto010-MoN.wav is the result of converting the file Tiesto010-MoN.mp2. This conversion (decoding to PCM) adds no further loss, and I needed it because Sony Sound Forge 8.0 does not work with mp2 files. The second file is a copy of … because … (if someone wants to know why I will explain, but it has nothing to do with the topic).

Both spectra are the same apart from a shift along the Y axis, which is not surprising because the spectra are determined by the source (the radio broadcast), not by the encoders. Both files could hold sound of better quality than this, but the source spectrum itself is not intact.
So, as can be seen, the spectrum bandwidth of both files (mp2 @192 kbps and mp3 VBR 213 kbps) is the same: around 16 kHz.



• Second, here are spectra comparing the Club Life 010 versions (mp2 @192 kbps, mp3 VBR 213 kbps, and two mp3s @128 kbps) to show they are essentially the same.
How were the 128 kbps files generated?
I think the idea is clear. I used the file Tiesto010-MoN.wav (converted from Tiesto010-MoN.mp2) as the source and compressed it to mp3 @128 kbps using two different encoders: Lame 3.97 and Blaze Media Pro. I used the same software (Blaze Media Pro) to convert Tiesto010-MoN.mp2 to Tiesto010-MoN.wav.
(All mp3 files have a sampling frequency of 48 kHz, the same as Monse’s source files.)

These spectra are shown relative to a floor of -112 dB, because this is the level of the quantization noise and it makes no sense, for now, to consider spectral components below the noise.
The red line is at -92 dB, to mark the spectral components that are at least 20 dB above the noise. I prefer to determine the spectrum bandwidth at the -92 dB level.
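
For anyone who wants to repeat this kind of measurement without Sound Forge, here is a minimal Python sketch of the idea: average the spectrum over the whole file, then take the highest frequency whose level is still at or above -92 dB. It assumes NumPy, mono samples already decoded to floats in the range -1..1, and levels measured relative to a full-scale sine. The function names, the 4096-point frame and the Hann window are my own assumptions, not the settings used for the figures in this post.

```python
import numpy as np

def average_spectrum_db(x, fs, nfft=4096):
    """Average power spectrum of x in dB relative to a full-scale sine.

    x  : 1-D array of samples scaled to -1..1 (assumed mono)
    fs : sampling frequency in Hz
    """
    window = np.hanning(nfft)
    n_frames = len(x) // nfft
    # Cut the signal into non-overlapping frames and window each one.
    frames = x[: n_frames * nfft].reshape(n_frames, nfft) * window
    power = np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0)
    # A sine of amplitude 1 centred on a bin peaks at (sum(window) / 2) ** 2.
    ref = (window.sum() / 2) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1 / fs)
    return freqs, 10 * np.log10(power / ref + 1e-30)

def bandwidth_hz(freqs, spectrum_db, threshold_db=-92.0):
    """Highest frequency whose averaged level is at or above the threshold."""
    above = np.nonzero(spectrum_db >= threshold_db)[0]
    return float(freqs[above[-1]]) if above.size else 0.0
```

Feeding it the decoded samples of a file and its sampling frequency should give a figure comparable to reading the bandwidth off the red line, although the absolute dB values depend on the reference level and the window chosen.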


mp2@192 kbps (Tiesto010-MoN.wav) vs mp3@128 kbps (k128_lame.mp3)



And
mp3 VBR avg 213 kbps (Copy of 01_-_Tiesto_-_Club_Life_010-MoN.mp3) vs mp3@128 kbps




As can be seen, the four spectra are essentially the same (the differences are of the magnitude of the quantization noise), hence Monse’s files are of 128 kbps quality, and this is determined by the Radio538 broadcast channel. (Of course, the same is true for the TALiON release as well.)
(What makes a sound 128 kbps quality? If the original sound spectrum, and specifically its bandwidth, can be preserved by mp3 compression @128 kbps, then the sound is at most 128 kbps quality. It could also be worse.)

I am not showing the spectra of the TALiON files because, with the same source (Radio538 broadcast) and mp3 VBR avg 171 kbps, they should be the same.

To put a bottom line for now:
A 128 kbps sound offered as:
mp3 VBR avg 171 kbps
mp3 VBR avg 213 kbps
mp2 @192 kbps.

Isn’t it funny? No, it is not.
bidonavip user on June 11th, 2007 / post 19547
SpasV wrote:
Isn’t it funny? No, it is not.
sure its not
Quote: This is not one of the usual 192kbps DI.fm versions!!
This release was obtained from a source with a bitrate of up to 384kbps!
It was converted by Monse to … - roughly the same quality as 320kbps CBR MP3.
This is the highest quality ASOT version on the entire Internet!!!
now thats funny :lol:
SpasVstar V.I.P. on June 23rd, 2007 / post 19794
What do you think, is there any relation between the bit rate and the quality of digital sound?
If there is, then …

A three-hour set of 15.5 kHz bandwidth sound in an mp3 @320 kbps!
(After all an mp3 @320 can hold a sound of 20 kHz bandwidth.)

The file Sasha & Digweed - Live @ Bonnaroo 17-06-2007.mp3 has a size of 516,416 KB and its content is a 15.5 kHz bandwidth sound.
The size can be safely reduced without losing quality:
• First, decode it to a wav file using LAME (a command-line application): lame --decode Sasha&Digweed-Live@Bonnaroo17-06-2007.mp3  Sasha&Digweed-Live@Bonnaroo17-06-2007.wav (or rename the files)
• Then, encode: lame -v --lowpass 16 Sasha&Digweed-Live@Bonnaroo17-06-2007.wav  @new_bitrate.mp3
(No spaces in the file names.)

The result will be a 156 kbps file of size 251,375 KB and a sound of the same quality. :wink:
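
The figures above can be cross-checked with a little arithmetic. This is a rough sketch that assumes the original file is constant 320 kbps, that 1 KB means 1024 bytes, and that tags and headers are negligible; the function names are only for illustration.

```python
def duration_seconds(size_kb, bitrate_kbps, bytes_per_kb=1024):
    """Playing time of a constant-bitrate file of the given size."""
    return size_kb * bytes_per_kb * 8 / (bitrate_kbps * 1000)

def average_bitrate_kbps(size_kb, duration_s, bytes_per_kb=1024):
    """Average bitrate of a file of the given size and playing time."""
    return size_kb * bytes_per_kb * 8 / duration_s / 1000

original_kb = 516_416   # size of the 320 kbps file
recoded_kb = 251_375    # size of the re-encoded VBR file

length_s = duration_seconds(original_kb, 320)
recoded_kbps = average_bitrate_kbps(recoded_kb, length_s)
extra_percent = (original_kb / recoded_kb - 1) * 100

print(f"length: {length_s / 3600:.2f} h")
print(f"average bitrate after recoding: {recoded_kbps:.0f} kbps")
print(f"extra size of the original: {extra_percent:.1f} %")
```

Under these assumptions the set is a bit over three and a half hours long and the recoded file averages about 156 kbps, which agrees with the numbers quoted.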

You can download this result here: https://www.megaupload.com/?d=83O6FIYI
bidonavip user on June 23rd, 2007 / post 19799
yes, i know all this SpasV, and i said in the description that this is not the real sound quality, just the bitrate of the file. Now, i didnt decode it because this is what i found, its not mine, so for later request/reseed/use there will be the original file. as for the decode, i think whoever wants can do it himself as you did :-)
SpasVstar V.I.P. on June 23rd, 2007 / post 19801
:-) I remember you were the first to tell me that file size matters even when quality is considered.
Since then I have always tried to find a balance between quality and size. I often recode improperly encoded files to cut the needless size.
In this case the needless extra size is 105.4% of the recoded file's size. The file size is more than doubled!
It took me more than three hours to download the file.
And even with my relatively fast computer it took a long time to recode it to the proper bit rate.
I do not understand why haven’t you done the recoding?
Anyway, thanks.
I have uploaded a properly encoded version of the file and the megaupload link works.
(again the download link: https://www.megaupload.com/?d=83O6FIYI)
bidonavip user on June 24th, 2007 / post 19809
SpasV wrote:
I do not understand why haven’t you done the recoding?
as i said, its not my recording, thats why i've put up what i found and its the one which will be available all over the www as it was posted in 1st place. as for the unnecessary size, i have another copy made by myself for my creative, and you have yours as well. cheers SpasV, i know you do your best as i do :-)
SpasVstar V.I.P. on June 27th, 2007 / post 19896
:-)
SpasVstar V.I.P. on June 28th, 2007 / post 19899
I would like to show the very interesting Energy Compaction Property of the Discrete Cosine Transform (DCT).
An MP3 encoder is too complicated to discuss here, but at some point in the encoding process it uses the Modified Discrete Cosine Transform (MDCT). I do not know the MDCT well, which is why I am going to use a DCT example from the book Discrete-Time Signal Processing by Alan V. Oppenheim and Ronald W. Schafer.
I have redone the calculations for the example so that I can show figures illustrating the results. I should point out that my results seem slightly different from the book’s, so there is at least one mistake somewhere, though I do not believe it is mine. The main result is the same either way.

The DCT used in the example (DCT-2) is defined by the transform pair:



and β[k] = ½ if k=0 and β[k] = 1 if k≠0.
Here x[n] is the discrete sequence (a discrete signal like a PCM signal) of N points
and X[k] is the discrete sequence of DCT coefficients, also of N points, calculated for the sequence x[n].
(Actually, the content of an mp3 file is essentially such MDCT coefficients, quantized and entropy-coded.)
The signal sequence is represented by a set of basis (cosine) functions and a set of coefficients X[k] calculated for the specific signal using the same set of basis functions.
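
Written out, the transform pair is presumably the textbook DCT-2 definition. The β[k] weights given above match that normalization, but treat the exact scaling here as an assumption rather than a copy of the original figure:

```latex
X[k] = 2\sum_{n=0}^{N-1} x[n]\,\cos\!\left(\frac{\pi k\,(2n+1)}{2N}\right),
\qquad 0 \le k \le N-1

x[n] = \frac{1}{N}\sum_{k=0}^{N-1} \beta[k]\,X[k]\,\cos\!\left(\frac{\pi k\,(2n+1)}{2N}\right),
\qquad 0 \le n \le N-1
```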
The sequence x[n], n = 0, 1, …, 31 (N = 32),
and the sequence of coefficients X[k] are shown in the figure below.
As can be seen in the figure, the magnitude of the coefficients X[k] decreases very fast, which means many of them do not contribute significantly to reconstructing the original sequence x[n].



To demonstrate the Energy Compaction Property of the Discrete Cosine Transform, the example uses the sequence
x_p[n] = (1/N) Σ_{k=0}^{p-1} β[k] X[k] cos(πk(2n+1)/(2N)),   n = 0, 1, …, N-1,   p = 1, 2, …, N
which is the sequence x[n] reconstructed using only the first p coefficients.
The figure below shows the original sequence x[n] and the sequences x_5[n], x_6[n], and x_10[n] reconstructed using 5, 6, and 10 coefficients. (For perfect reconstruction all 32 coefficients are needed.)
Visual inspection suggests that 10 coefficients already produce a nearly perfect reconstruction.



Finally, the example shows how the approximation error depends on p by defining
E[p] = (1/N) Σ_{n=0}^{N-1} |x[n] − x_p[n]|²
to be the mean squared (ms) approximation error, where x_p[n] is the reconstruction from the first p coefficients.
It is plotted in the figure below along with the coefficients X[k]. As can be seen, the ms approximation error is largest when only one coefficient is used, and it is 1000 times smaller when 9 or more coefficients are used.



The example illustrates how powerful the DCT Energy Compaction Property is and why DCT is widely used for data compression.

One more thing to mention. If we think of x[n] as a PCM sound that we need to encode, this particular example shows that, if the acceptance criterion were an ms error of less than 0.0001, no more than 9 coefficients would be needed to reconstruct the original signal, and the reconstruction would be, I would say, perfect. Every further coefficient encoded would only increase the file size without adding anything significant to the quality of the sound.
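
For readers who want to try the experiment themselves, here is a small pure-Python sketch of the calculation: a DCT-2, a reconstruction from the first p coefficients, and the ms error for each p. The test signal (a damped cosine with N = 32) is my own assumption for illustration, since the formula of the sequence used in the example is not reproduced in this post, so the numbers it prints need not match the figures.

```python
import math

def dct2(x):
    """DCT-2 with the scaling X[k] = 2 * sum_n x[n] cos(pi k (2n+1) / (2N))."""
    N = len(x)
    return [
        2.0 * sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                  for n in range(N))
        for k in range(N)
    ]

def beta(k):
    """Weight of coefficient k in the inverse transform."""
    return 0.5 if k == 0 else 1.0

def reconstruct(X, p):
    """Inverse DCT-2 that uses only the first p coefficients."""
    N = len(X)
    return [
        sum(beta(k) * X[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
            for k in range(p)) / N
        for n in range(N)
    ]

def ms_error(x, x_p):
    """Mean squared difference between the original and the reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_p)) / len(x)

N = 32
# Assumed test signal: a damped cosine, chosen only for illustration.
x = [0.9 ** n * math.cos(0.1 * math.pi * n) for n in range(N)]
X = dct2(x)
errors = [ms_error(x, reconstruct(X, p)) for p in range(1, N + 1)]

for p in (1, 5, 6, 9, 10, N):
    print(f"p = {p:2d}   ms error = {errors[p - 1]:.3e}")
```

Because the cosine basis functions are orthogonal, the error can only go down as p grows, and it reaches zero (up to rounding) when all N coefficients are used.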
TomMixlightning 3daywarning on June 30th, 2007 / post 19973
:lol: spasv  :smile:
ur uploading xmstreams with two to three times the bandwidth they are streamed originaly  ;-)
peeps please read on here


a big fuck to the ajax if the link doesnt work!
TomMixlightning 3daywarning on June 30th, 2007 / post 19974
and u can post as much funny self test pictures from ur laboratory as u like:

XM-HAS-NO-CD-QUALITY!!!
:-D

"XM PROPAGANDA SITE SAYS" wrote:
XM searched the world for the best sound quality technologies and found them in customized CT-aacPlus audio encoding with Neural Audio optimization, which provides superior sound quality remarkably close to Compact Disc.
Ojaylightning mp2/mp3/aac/ogg on July 2nd, 2007 / post 20019
TomMix wrote:
:lol: spasv  :smile:
ur uploading xmstreams with two to three times the bandwidth they are streamed originaly  ;-)


Let him do his job. As long as people like it (there are enough of them), even the bloated m4a files are not important - as long as people think they downloaded great music it is okay. The quality of XM is very limited (to say the least) but still a few million people are willing to pay for the streams - so why not offer the same music for free?
TomMixlightning 3daywarning on July 2nd, 2007 / post 20020
i appreciate spasvs efforts, no doubt about that!
but the technique ...  :-(
SpasVstar V.I.P. on July 5th, 2007 / post 20048
:-) I have two problems: one easy and one not so easy.

You have a Digital Audio Source, a Digital Audio Broadcast Channel (DABC), and a Receiver as shown in the figure below:



The DABC transmits the Digital Sound in real time; you are at the receiver site, listening to the music. You have perfect hearing and a perfect sound system, but …
What is the quality of the sound you are listening to?

1) The first problem.
The DABC streams a 48 kHz digital sound @ 1536 kbps. You have recorded the sound, done a Spectrum Analysis, and found that the sound spectrum bandwidth is 16 kHz.
a) What is the source signal spectrum bandwidth?
b) What would be the minimum channel bit rate if it were to transmit an mp3 stream so that the source signal bandwidth is preserved?

2) The second problem – the more difficult one.
You know nothing about either the DABC or the Digital Source.
You have recorded the sound, analyzed it with a Spectrum Analyzer, and found that the digital sound you received is sampled at 44.1 kHz and its spectrum bandwidth is 22.05 kHz.
a) What was the source spectrum bandwidth?
b) What would the DABC bit rate have to be to broadcast this sound as a PCM signal?
c) What would the DABC bit rate have to be to broadcast this sound as an AAC encoded signal?
d) What would the DABC bit rate have to be to broadcast this sound as an AAC++ version 2 encoded signal?

Notes:
• What could help in solving the problems:
the Shannon Sampling Theorem / the Nyquist rules,
some “laboratory” work experience,
maybe Ojay and TomMix also, but I am not sure about that.
• Part d) is the most difficult part; TomMix couldn’t solve it at all.
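
For the PCM parts of both problems the arithmetic is simple: the bit rate is sample rate × bits per sample × channels, and the widest bandwidth a sampled signal can carry is half the sample rate (the Nyquist limit). A short sketch, assuming 16-bit stereo, which the problems do not state but which is consistent with 48 kHz at 1536 kbps:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16, channels=2):
    """Bit rate of uncompressed PCM (16-bit stereo assumed by default)."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2

for rate in (48_000, 44_100):
    print(f"{rate} Hz: {pcm_bitrate_kbps(rate)} kbps PCM, "
          f"bandwidth limit {nyquist_hz(rate)} Hz")
```

The bit rates for the mp3 and AAC parts cannot be computed this way; they depend on the encoder and on the bandwidth that has to be preserved.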
TomMixlightning 3daywarning on July 6th, 2007 / post 20077
oh part d is solved: 34 to 50kbit/s
https://www.tribalmixes.com/#/forums.php%QUaction=viewtopic&topicid=3148&ajaxloader=1&poid=1183751598270
ah i see u havent posted a response there yet ... maybe u missed it.

i dont get ur strange setup, sorry.

todays DABC just sends 192kbit 48khz mp2 in my country. ordinary DAB.
and these streams are just audio cds lossy encoded to mp2
to me they are transparent also...