Digital Sound Processing - DSP (up-encoding analysis), page 2

slash ProDanceCulture on July 5th, 2007 / post 20052
reopened...
TomMixlightning 3daywarning on July 7th, 2007 / post 20088
SpasV wrote:
... Or in other words, the conclusions made based on the spectrum evaluations are valid for the original signal also. ...

you have to consider that it is only valid for the signal analyzed, not for the source of that signal if it is a lossy encoded one!
if a lossy encoding process is involved, information from the original source is lost.
transcoding from one lossy codec to another always means a further loss of information from
the original source!

SpasV wrote:
:-) Based on the studies of human hearing, Sony and Philips have defined their standard for audio CD recording in a document now known as "The Red Book"

almost right champ, but u always forget BUSINESS!!!

History of Red Book wrote:
Audio format

The format of the audio disc, known as the 'Red Book' standard, was laid out by SONY and Philips in 1981. Philips is responsible for the licensing program of the intellectual property pertinent to the Compact Disc including the 'CDDA' logo that appears on the disc. In broad terms the format is a two-channel (four-channel sound is an allowed option within the Red Book format) stereo 16-bit PCM encoding at a 44.1 kHz sampling rate. Reed-Solomon error correction allows the CD to be scratched to a certain degree and still be played back.

The sampling rate of 44.1 kHz is inherited from a method of converting digital audio into an analog video signal for storage on video tape, which was the most affordable way to store it at the time the CD specification was being developed. A device that turns an analog audio signal into PCM audio, which in turn is changed into an analog video signal is called a PCM adaptor. This technology could store 6 samples (3 samples per each stereo channel) in a single horizontal line. A standard NTSC video signal has 245 usable lines per field, and 59.94 fields a second, which works out at 44,056 samples/second. Similarly PAL has 294 lines and 50 fields, which gives 44,100 samples/second. This system could either store 14-bit samples with some error correction, or 16-bit samples with almost no error correction. There was a long debate over whether to use 14 or 16 bit samples and/or 44.056 k or 44.1 k samples/s when the Sony/Philips taskforce designed the compact disc; 16 bits and 44.1 k samples/s prevailed. The Sony PCM-1610 and PCM-1630 are well-known examples of PCM-adaptors used in conjunction with the Sony U-Matic VCR.

Originally, audio-format CDs came with a three letter code on the back, where "A" stood for analog and "D" stood for digital. The first letter represented how the album had been recorded, the second how it had been mixed, and the third how it had been transferred. As a result, almost all early CDs are "AAD" (analog recording and mixing, digital transfer to CD) quality. The rock band Rush was the first musical act to record a full digital, "DDD," album—Signals.
SOURCE
so it was a business strategy decision ... no studies of human hearing involved.
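The sample-rate arithmetic in the quoted text can be checked in a few lines. This is only a sketch of that arithmetic; the function name is made up for illustration, and the line and field counts are the ones given in the quote.

```python
# Samples per second stored by a PCM adaptor on video tape:
# samples per line per channel x usable lines per field x fields per second.
SAMPLES_PER_LINE_PER_CHANNEL = 3  # 6 samples per line, shared by 2 channels


def pcm_adaptor_rate(usable_lines_per_field, fields_per_second):
    """Return the per-channel sample rate for the given video format."""
    return SAMPLES_PER_LINE_PER_CHANNEL * usable_lines_per_field * fields_per_second


ntsc_rate = pcm_adaptor_rate(245, 59.94)  # NTSC figures from the quote
pal_rate = pcm_adaptor_rate(294, 50)      # PAL figures from the quote

print(round(ntsc_rate), pal_rate)
```

The NTSC figure rounds to 44,056 and the PAL figure is exactly 44,100, matching the two candidate rates named in the quote.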

real research into human hearing really took off with the development of lossy audio codecs. why?
to let them achieve TRANSPARENCY!
this means: the usual human ear cant tell the difference
between the original source and the compressed one.
as spasv mentioned: check urself out :wink:

and i see one did it already:
Willy84 wrote:
... i am going to get books about this stuff to learn how u do such a thing with the sound. i really admit that u have a great quality in ur sets, but there is not a big difference between the mp3 we upload and the m4a u do :whistle:
seems the whole thing is transparent.
the only difference is the size of the file.
and the question is why a very advanced codec like aac gives the same transparent result as an
ordinary mp3, only at a smaller size?

okay, so what do we know about the 'limits of perception'? spasv already wrote some lines about it above, and one should read here about the psychoacoustic stuff.
i would also like to bring in a quote from Karlheinz Brandenburg:
"Karlheinz Brandenburg one of the dev of mp3" wrote:
Read the following text about bandwidth by Karlheinz Brandenburg
from MP3 and AAC explained :

The bandwidth myth
Reports about encoder testing often include the mention of the bandwidth of the compressed audio signal. In a lot of cases this is due to misunderstandings about human hearing on one hand and encoding strategies on the other hand.

Hearing at high frequencies
It is certainly true that a large number of (especially young) subjects are perfectly able to hear single sounds at frequencies up to and sometimes well above 20 kHz. However, contrary to popular belief, the author is not aware of any scientific experiment which showed beyond doubt that there is any listener (trained or not) able to detect the difference between a (complex) musical signal with content up to 20 kHz and the same signal, but bandlimited to around 16 kHz. To make it clear, there are some hints to the fact that there are listeners with such capabilities, but the full scientific proof has not yet been given. As a corollary to this (for a lot of people unexpected) theorem, it is a good encoding strategy to limit the frequency response of an MP3 or AAC encoder to 16 kHz (or below if necessary). This is possible because of the brick-wall characteristic of the filters in the encoder/decoder filterbank. The generalization of this observation to other types of audio equipment (in particular analog) is not correct: Usually the frequency response of the system is changed well below the cutoff point. Since any deviation from the ideal straight line in frequency response is very audible, normal audio equipment has to support much higher frequencies in order to have the required perfectly flat frequency response up to 16 kHz.

Encoding strategies
While loss of bandwidth below the frequency given by the limits of human hearing is a coding artifact, it is not necessarily the case that an encoder producing higher bandwidth compressed audio sounds better. There is a basic tradeoff where to spent the bits available for encoding. If they are used to improve frequency response, they are no longer available to produce a clean sound at lower frequencies. To leave this tradeoff to the encoder algorithm often produces a bad sounding audio signal with the high frequency cutoff point varying from block to block. According to the current state of the art, it is best to introduce a fixed bandwidth limitation if the encoding is done at a bit-rate where no consistent clean reproduction of the full bandwidth signal is possible. Technically, both MP3 and AAC can reproduce signal content up to the limit given by the actual sampling frequency. If there are encoders with a fixed limited frequency response (at a given bit-rate) compared to another encoder with much larger bandwidth (at the same bit-rate), experience tells that in most cases the encoder with the lower bandwidth produces better sounding compressed audio. However, there is a limit to this statement: At low bit-rates (64 kbit/s for stereo and lower) the question of the best tradeoff in terms of bandwidth versus cleanness is a hotly contested question of taste. We have found that even trained listeners sometimes completely disagree about the bandwidth a given encoder should be run at.

so it is very doubtful that the range above 16khz is necessary for the quality of a lossy encoded audio file. and that is what we are dealing with: lossy encoded audio files. no hd-audio content! not even cd-quality!

again the important influence is BUSINESS: press more data thru the limited bandwidth of satellites or terrestrial broadcasters, to make ppl pay for every shit, to have drm to make more money ...

SpasV wrote:
:-) We could read many comments about sound quality using bit rates as a measure of quality. It is obvious that the sound quality can not be measured in bit rates. So, let us talk the same language.

finally an interesting point. but i would prefer it like this:

"Wiki" wrote:
Sound quality generally is the quality of the audio output from various electronic devices.

Sound quality can be defined as the degree of accuracy with which a device records or emits the original sound waves. For digital recording/digital playback, this accuracy depends on the range of sound which is sampled, the rate at which it is sampled, and the various conversions that occur in any sound reproduction system. With lossy codecs such as MP3 and Ogg Vorbis, sound quality is a quantifiable factor that determines how much sound data the encoder is allowed to discard in order to reduce file size. MP3-encoded sound is generally CBR, so its quality is defined by its bitrate, in kilobits per second (kbit/s). Quality of Ogg Vorbis-encoded files, which are most commonly VBR, is a decimal value ranging from –1 to 10, with –1 being suitable only for low-quality voice.
The range of sound (in hertz) which the equipment detecting the sound samples affects sound quality. Humans can hear vibrations ranging from about 20 Hz to approximately 20 kHz, so sampling that doesn't extend this far will have a detrimental effect on the resultant quality.
The rate at which the sound is sampled refers to the amount of information the detection equipment records about each second of sound. More information about the shape of the sound waves results in a more accurate sample, in other words, this is due to the digital quantization of the analogue sound wave.
The conversions of sample range and sample rate between different pieces of equipment in a sound recording and reproduction system will affect the quality of the sound. More conversions usually results in a lower level of quality.
Sound quality is the physical pleasure or fatigue experienced by a listener, and is typically characterized in a live setting by the skill of the musicians, the tonal quality of their instruments, and the physical traits of the venue. In a playback setting it is characterized by the same traits as in a live setting but is also affected by the recording techniques and equipment used, from the microphones at the session to the final pressing at the record or compact disc factory, to the quality of electronics and speakers used to recreate the sound in a listener's home.

well this reads nice!
summing this up with the facts about the limits of perception and knowing about the sources we have on tm, i see no big deal with audio quality here.
but sometimes i wonder about  >200kbit/s aac files ...
if we are talking about audio quality we always have to consider the source.
here comes a very very special point on aac and other post-processing codecs:
these codecs do add artificial information at the time of decoding the signal.
this means information that is not in the original source is added to enhance the audio feeling for the listener.

"Wikipedia" wrote:
How it works

It can be combined with any audio compression codec: the codec itself transmits the lower frequencies of the spectrum, while SBR synthesizes associated higher frequency content based on the lower frequencies and transmitted side information.

When applicable, it involves reconstruction of a noise-like frequency spectrum by employing a noise generator with some statistical information (level, distribution, ranges), so the decoding result is not deterministic among multiple decoding processes of the same encoded data.

Both ideas are based on the principle that the human brain tends to consider high frequencies to be either harmonic phenomena associated with lower frequencies or noise, and is thus less sensitive to the exact content of high frequencies in audio signals.

read here and here
and here

regarding mp3 quality, this has all been done already. go to the web and search or try hydrogenaudio.org
with the latest lame 3.97, -V 2 --vbr-new is recommended for archiving transparent cd-copies for the usual mortal human being.
i dont have the corresponding data for aac. but i highly doubt that it has to be more than 200kbit/s ...


SpasV wrote:
It is easy to understand that a spectrum bandwidth reducing results in worse sound quality no matter how someone perceives and evaluate and feel the sound.

but always consider the SOURCE u have! in case of ur cd samples it is okay. and it is nice to see that the 192kbit mp3 has a range of sound up to 18khz.
seems to be lossy encoded and transparent.


SpasV wrote:
Finally, I would say: Please, do not talk about XM radio as Ojay did: broadcasted by XM in the aac format at around 96kbps (they are always doing that).
You have a Frequency Analyzer you have the files. Use them and then let us talk.
Ojay's statement above shows only the author's ignorance - nothing more.

well u think ojay is ignorant?
i think ur resistant against some simple facts!
man i dont know how often i have to say it again and this time i try it bold, i hope u like bold, not uppercase, that would be offending i guess, though i like 3daywarnings:

xm uses a proprietary customized CT-aacPlus codec from coding technologies
some facts of that codec are
"facts from developer of the codec" wrote:
Features
• True Superset architecture
• Multi-channel support for 5.1, 7.1 and beyond (48 channels total)
• Built-in error concealment for mobile applications
• CD-quality stereo down to 48 kbps
• Near CD-quality stereo at 32 kbps
• Excellent quality stereo down to 24 kbps
• Widest available audio bandwidth
• Optimized speech, mixed speech/music down to 8 kbps mono
• Compliant with ISO/IEC 14496-3, incl. Amd.1:2003, Amd.2:2004, and all corrigenda



xmradio propaganda website tells: xmradio provides superior sound quality remarkably close to Compact Disc
this means 32 kbps to 48 kbps, or, to give it some extra bits, maybe 64 kbps

it is not even 96kbitps. any questions?!
oh and regarding ur spectra from xmradio it should be full cd-quality, so where is the error?
maybe he-aac+sbr+ps? but this has nothing to do with audio quality! this means artificial manipulation! business!

SpasV wrote:
:-) I am thinking of discussing the MPEG-4 aacPlus audio compression technology also. Not only is XM radio based on it; I have seen at least two Internet audio streams based on it as well. And it is natural - new knowledge, more efficiency, better quality.

yepp this should be the next station :)
SpasVstar V.I.P. on July 11th, 2007 / post 20190
:-) No answers to my two problems posted in “Sound quality - what is this?”
Most probably there is not an interest or they are difficult.
Anyway, I decided to answer the first problem.
Here it is.
1) The first problem.
The DABC streams a 48 kHz digital sound @ 1536 kbps. You have recorded the sound, you have done a Spectrum Analysis and you have found the sound spectrum bandwidth was 16 kHz.
a) What is the source signal spectrum bandwidth?
Since the bit rate is 1536 kbps, each of the two channels uses 1536/2 = 768 kbps. With 16 bits per sample, each channel transmits 768,000/16 = 48,000 samples per second, which means a channel can carry a signal with a BW of up to 24 kHz (Nyquist rule). So the source signal BW is 16 kHz, because the channel BW is not the restriction.
b) What would be the minimum channel bit rate if it had to transmit an mp3 stream so that the source signal bandwidth is preserved?
By inspection of signal spectra we can find an mp3 stream @ 128 kbps can carry a signal with a BW of 16 kHz.
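The arithmetic for part a) can be written out as a short sketch. The function name and its default arguments are illustrative assumptions; the 1536 kbps, two channels and 16 bits per sample come from the problem statement.

```python
def channel_capacity(total_kbps, channels=2, bits_per_sample=16):
    """Return (bits/s per channel, samples/s per channel, Nyquist limit in Hz)."""
    per_channel_bps = total_kbps * 1000 / channels
    sample_rate_hz = per_channel_bps / bits_per_sample
    nyquist_hz = sample_rate_hz / 2  # highest frequency the channel can carry
    return per_channel_bps, sample_rate_hz, nyquist_hz


per_channel_bps, sample_rate_hz, nyquist_hz = channel_capacity(1536)
print(per_channel_bps, sample_rate_hz, nyquist_hz)

# The measured 16 kHz bandwidth is below the Nyquist limit, so the channel
# is not what limits the bandwidth; the source must already be band limited.
source_is_the_limit = 16_000 < nyquist_hz
```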

But I decided to use simulation modeling to find a stricter answer, changing the problem a little bit.
Here is the model.
I have a digital signal of 22 kHz BW in a wav file at the Source site – it is a CD rip from ASOT 2005.
The DAB channel is band limited to 16 kHz, which means its output signal would have a BW of no more than 16 kHz. To simulate this, since I still do not have a low pass filter, I have used an encoder – NeroAACenc – which has created a signal of 15.8 kHz BW @96 kbps. I have converted its output m4a file to a wav and have used the wav file as the input file for mp3 encoding @ 80 kbps, 128 kbps, 192 kbps, 256 kbps, and 320 kbps. These are illustrated in the figure below.



I have converted these five mp3 encoded files to wav files again so I had PCM files containing samples I could compare. As a reminder, the conversion from mp3 to wav is lossless: decoding adds no further loss.
So, I had a received file - void.15.8 kHz (spectrum BW: 15.8 kHz) - and five more files: 80kbps, 128kbps, 192kbps, 256kbps, and 320kbps. Even the 80kbps sound has a 15.8 kHz BW, maybe because I have used an AAC encoder instead of a low pass filter.

I have used the Mean Squared Error (MSE) to compare two files. The MSE is defined by:
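The formula image did not survive in this copy of the thread. The standard definition of the mean squared error, which is presumably what was shown, is given here; x[n] and y[n] are assumed names for the samples of the two files being compared, and N is the number of samples:

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} \left( x[n] - y[n] \right)^{2}
```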



The sound I used was 4:41 min long, so I had to calculate the MSE over N = 12,379,436 samples, which obviously I had to program.
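The program itself was not posted. A minimal sketch of such a comparison, using only the Python standard library, could look like the following. It assumes 16-bit PCM WAV files that are already sample-aligned (decoder delay or padding would inflate the error), and the function names are illustrative.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Return all samples of a 16-bit PCM WAV file as a flat array of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples


def mse(reference, candidate):
    """Mean squared error over the samples both sequences have in common."""
    n = min(len(reference), len(candidate))
    if n == 0:
        raise ValueError("nothing to compare")
    return sum((reference[i] - candidate[i]) ** 2 for i in range(n)) / n


# Hypothetical usage (file names are placeholders):
# error = mse(read_pcm16("received.wav"), read_pcm16("decoded_128kbps.wav"))
```

For stereo files the samples are interleaved, so this averages over both channels together; comparing channels separately would need a stride of two.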

First, I have compared void.15.8 kHz to each of the other five files and calculated the MSE for every pair. Because the MSE is a large number - the biggest one is 4,010,817 - I decided to express every MSE relative to the biggest one, so it varies between 0 and 1. Files that are close to the reference (the mp3 has better quality) have a low MSE, while a high MSE shows a file of worse quality.
All these files have the same BW of 15.8 kHz and spectra that are difficult to distinguish by inspection. Although it is to be expected, I was surprised to see that a higher mp3 bit rate means better quality: a lower MSE and a sound which is closer to the simulated received sound.
To be honest, I have always thought the proper encoding for a 16 kHz BW sound is 128 kbps, not 320 kbps, but it turned out the 320 kbps sound was closer to the received 15.8 kHz sound than any of the others at lower bit rates.
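The scaling relative to the biggest MSE is a one-line normalisation. In the sketch below only the value 4,010,817 comes from the post; the other numbers and the labels are made-up placeholders, since the individual results were only shown in the figure.

```python
def normalise_to_largest(errors):
    """Scale every error by the largest one, so the worst file maps to 1.0."""
    largest = max(errors.values())
    return {name: value / largest for name, value in errors.items()}


# Placeholder values for illustration; only 4_010_817 is taken from the post.
example_errors = {"file_a": 4_010_817, "file_b": 2_000_000, "file_c": 500_000}
relative = normalise_to_largest(example_errors)
print(relative)
```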
These results are shown in the figure below.



Second, I have compared the five mp3s to the sound at the source site: five new comparisons, done with the source file void.22kHz. This time things look as I thought they should. The closest to the source 22 kHz BW sound turned out to be the 80 kbps mp3, and the higher the mp3 bit rate, the bigger the MSE, that is, the worse the quality of the mp3.
These results are shown in the figure below.



The 256kbps file has the biggest MSE value, which is represented by 1.0, but the 320kbps value is very close: it is 0.9999.

The last result is not a proof. It is a result of modeling and it opens at least one problem for me: I need to program a good real low pass filter to simulate an FM broadcast channel instead of using an encoder. That said, it is quite possible for a DAB channel to use some AAC encoded stream, and in such a case this model would be quite adequate.

P.S. After having posted this I saw I have actually analyzed six files – not five - one more @96 kbps.

To summarize the results:
• The model simulates a band limited broadcast channel like every FM radio channel, with a bandwidth (BW) of 16 kHz. The drawback of the model is my use of an AAC encoder as a low pass filter, which can differ from a real channel.
• The results are obtained by comparing the corresponding sound samples of the received/sent sound and six mp3s created from the received sound, which has been band reduced by the channel. It is an objective, low level comparison – not a listening test or signal spectrum inspection.
• A higher bit rate, when encoding the received sound, makes the mp3 sound closer to the received sound but worse when compared to the broadcast source sound.
• The closest to the broadcast source sound turned out to be the mp3 @80 kbps, which also preserves the received sound spectrum.
Skype:spas.velev
SpasVstar V.I.P. on July 17th, 2007 / post 20416
:-) Bandwidth (BW) and Bit Rate

The known fact is: if you have a high BW, you have quality.

Here I am showing this fact.
The setup is already known. I encode an original file in .wav format (containing the wave form samples)  to mp3 file with different encoder parameters, then convert the resulting files to .wav files, and finally compare the original file to the resulting files using the Mean Square Error (MSE).

The mp3 encoder Lame (in v 3.97) has three options to control the bit rate: CBR, ABR, VBR.
The “ABR options:
 --abr  specify average bitrate desired (instead of quality)”.
Instead of using this option I have used the VBR option, varying the bit rate through the parameters -b (the lowest bit rate) and -B (the highest bit rate).
In all cases I have used the parameter -q 0 for the highest quality.
(“Noise shaping & psycho acoustic algorithms:
  -q          = 0...9.  Default  -q 5
                  -q 0:  Highest quality, very slow”)

I have used two of Lame's internal low pass filters to get mp3s with different BW:
• lame -q0 -Vx --lowpass 19 (transition band: 18671 Hz - 19205 Hz)
• lame -q0 -Vx --lowpass 16 (transition band: 15826 Hz - 16360 Hz)
where x = 0, 1
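The encode/decode round trip described above could be scripted roughly as follows. This is a sketch, not the script that was actually used: the helper names and file names are hypothetical, and it only builds the command lines (with the `-q`, `-V`, `--lowpass` and `--decode` options of the lame command-line tool) rather than running them.

```python
import shlex


def lame_encode_cmd(src_wav, dst_mp3, vbr_quality, lowpass_khz, algo_quality=0):
    """Build a lame VBR encode command with a forced low pass frequency."""
    return [
        "lame",
        "-q", str(algo_quality),        # 0 = highest quality, slowest
        "-V", str(vbr_quality),         # VBR quality level, e.g. 0 or 1
        "--lowpass", str(lowpass_khz),  # low pass frequency in kHz
        src_wav,
        dst_mp3,
    ]


def lame_decode_cmd(src_mp3, dst_wav):
    """Build the command that decodes an mp3 back to a WAV file."""
    return ["lame", "--decode", src_mp3, dst_wav]


# One command pair per (V level, low pass) combination; file names are placeholders.
for v in (0, 1):
    for lowpass in (19, 16):
        mp3 = f"test_V{v}_lp{lowpass}.mp3"
        print(shlex.join(lame_encode_cmd("original.wav", mp3, v, lowpass)))
        print(shlex.join(lame_decode_cmd(mp3, mp3.replace(".mp3", ".wav"))))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)`; the decoded WAV files can then be compared with the original sample by sample.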
The results are shown in the figure below.
It is quite clear that:
• The quality of the 19 kHz BW sound is always better (the sound wave is closer to the original sound wave) at any bit rate and any x (0, 1) in Vx – the parameter controlling the quality.
• The higher bit rate does not mean higher quality. You can see the MSE decreases slightly when the bit rate increases depending on V0 or V1 and the BW also.
• The MSE for the 16 kHz BW sound is 2-3 times bigger than that of the 19 kHz BW sound.
• The bit rate does not determine the actual quality at all because (most probably) you can have many different sound qualities at the same bit rate depending on other factors.
• The points marked as 1, 2, 3, and 4 are generated using the default Lame parameters. The interesting fact is that increasing the bit rate beyond these points does not decrease the error.



As to CBR, the results are the same but there is a peculiarity: the lowpass option does not work as I expected (maybe this is a bug). The MSE values for the files generated with the different lowpass parameters, 16 and 19, are very close at the same bit rate. Things are as expected when I used, as original files to encode, sounds with different BW (produced by NeroAACenc used as a low pass filter).
The results are shown in the figure below.
Here, the MSE generated using the original file is shown also. These errors are smaller than the others, but the reason should be the NeroAACenc low pass filter used.



Finally, I am showing a comparison between the errors generated for VBR and CBR.
As can be seen, the CBR results lie between the results for VBR: -V1 --lowpass 16 and -V0 --lowpass 19.
@192 kbps, CBR is equivalent to -V1 --lowpass 16, and @320 kbps it is slightly worse than -V0 --lowpass 19, which is surprising to me also.
FlowerPowdervip V.I.P. on July 19th, 2007 / post 20477
I think this topic is very interesting and I think that it is nice that people know a bit more about this giant chewing-gum that is sound...

But, In the end,  it is good to know how to master sound.....

....but for me the colour of sounds tend to disappear with highest digital quality and that's why musicians like analog synths which have so much colours due to imperfections and age...

So, the truth is always somewhere in the middle, and what science cannot explain so accurately is the effect of music and sounds on our souls...

So a perfect spectrum can have sometimes the poorest results in what is the most essential in music:
Transportation of feelings and emotions.......

Having said that... Thankx to all for a nice contribution at educate us and collecting valuable info...

Cheers,
F.
TomMixlightning 3daywarning on July 20th, 2007 / post 20498
@FlowerPowder
you are so right! no mathematical pictures nor spectras can tell about the quality!

@spasv
man i really really beg you to join the hydrogenaudio.org forums to post ur tests there.
since i am the only one here participating with some technical background i guess.
there are a lot of peeps doing this stuff for ages and not since the xmradiostream is on air :wink:

man i only take this one example that u take a wave encoding it with a lossy codec and decoding it back to wave ... WTF?!

please please with sugar on top: go to hydrogenaudio.org, take the few seconds to register and post ur spectras there! (a tip from me is to read the FAQs though)

here is an example how to compare audio quality!

ur on the very wrong way to compare audio quality!
"HydrogenAudio Forum" wrote:
The regulars here know that looking at graphs is pretty irrelevant when coming to the actual listening. But I still think it's interesting from a technical point of view to see what the codecs are doing.


i want give up the hope u will someday understand ...
Ojaylightning mp2/mp3/aac/ogg on July 20th, 2007 / post 20508
TomMix wrote:
@FlowerPowder
you are so right! no mathematical pictures nor spectras can tell about the quality!

@spasv
man i really really beg you to join the hydrogenaudio.org forums to post ur tests there.
ur on the very wrong way to compare audio quality!


SpasV would be banned immediately on the hydrogenaudio forums. That is because any claim on the sound quality of files needs to be based on "blind listening tests" or 'ABX'ing. And that is nothing SpasV did up to now.
SpasVstar V.I.P. on July 20th, 2007 / post 20511
:-) I would definitely go to the hydrogenaudio forum if someone defends the idea that a 16 kHz AVB’s ASOT has a better quality than a 19 kHz has. And I hope I will not be banned for that as at TMB.
TomMixlightning 3daywarning on July 20th, 2007 / post 20512
Ojay wrote:
SpasV would be banned immediately on the hydrogenaudio forums. That is because any claim on the sound quality of files needs to be based on "blind listening tests" or 'ABX'ing. And that is nothing SpasV did up to now.
:smile:

SpasV wrote:
:-) I would definitely go to the hydrogenaudio forum if someone defends the idea that a 16 kHz AVB’s ASOT has a better quality than a 19 kHz has. And I hope I will not be banned for that as at TMB.

i will say it! regarding the source of ASOT from DI.fm it comes as a 192kbit/s mp3 stream
and i say it has a cutoff at ~16khz so it would be useless to record it via the line in with 256kbit/s and a ~19khz cutoff ... :evil:

spasv please take the time and post ur 'tests' on hydrogenaudio because here on tm it is all about the music and not the technique! no offence to the peeps on tm, but i think they dont care what they get as long as it plays and its free.
just a very few, or maybe just i, do care about the technique behind codecs.

im sorry but if one knows that xmradio has a stream with a bitrate of approx. 56kbit/s he aac+
how can u post this as a >256kbit/s aac!?!?!?
and all this only because of some spectras? cmon! they tell nothing at all!

the whole world knows that mp3 delivers a transparent quality of a cd-rip at 192kbit/s vbr!
so one should not accept anything >192kbit/s in mp3 if it is not a line-in or dat rip!!!

im sorry to say that but i do care about bandwidth and storage space!
SpasVstar V.I.P. on July 21st, 2007 / post 20513
:-) In a topic such as Digital Sound Processing I neither consider nor discuss assumptions.
For me, the only acceptable way to discuss the topic is by providing verifiable facts.
TomMixlightning 3daywarning on July 21st, 2007 / post 20546
SpasV wrote:
:-) In a topic such as Digital Sound Processing I neither consider nor discuss assumptions.
For me, the only acceptable way to discuss the topic is by providing verifiable facts.
:lol:
damn u will never get it! ur telling tales here!
posting some pictures that say nothing at all but they are coloured!
and ur whole posts are nothing but assumptions! because u cant measure nor prove audible quality with graphs. and ur ALONE on ur very own strange way of 'testing' and 'verifying'.

i just recall ur statement:
"SpasV" wrote:
Now I am ready to prove that the XM Sat Radio Channel 80 broadcasts have an audio CD quality
We can use any method you want.
here
:lol: and at the end it turned out, that xmradio has no cd quality despite of all funny pictures and formulas u have posted!!!

GO TO HYDROGENAUDIO! try to sell ur tales there man. unfortunately it seems i am the only one here technically participating in ur sweet topics.
and since this is a BTT community, it is the very wrong place for ur pictures!!!
arnanivip D-Formation Gue on July 21st, 2007 / post 20547
i m really so tired from this topic ur genuis guys  :-D
SpasVstar V.I.P. on August 24th, 2007 / post 21186
:-) What follows are results I have already posted at hydrogenaudio forum. The replies were in general negative, some were friendly some not. I am not going to discuss them but I would like to stress on my point of view.
First of all I work with functions – discrete time functions that represent a sound also. This means if these functions are processed with a proper device they generate wave sounds.
Second, I have processed these functions with two kinds of processors – filters and codecs. Both kinds of processors implement legal operations over functions that represent sound, so their results are functions that are sounds also.
Sound is not a formal object but something that can be heard and perceived by a human. And since human perception has many peculiarities, sound evaluation is done by listening tests.
I have not done listening tests at all. I have not evaluated the sound quality.
I compared functions – the results of the processing with the original function – and I measured the differences between them. The difference is not a measure of sound quality.
The difference can tell how close the functions are, and under the conditions of the model I used, it can also be a reliable measure of how close the sound waves are.
That is why, I think, I can draw repeatable conclusions:
• The AAC encoder (m4a) Neroaacenc is better than the Lame encoder (mp3),
• Variable bit rate offers better possibilities for the encoder. The only exception is related to Lame: the best you can get from it is @320 kbps, which is constant bit rate,
• If you have a quality sound with a bandwidth of more than 16 kHz (more than FM radio sound) and you are concerned about the quality, use a bit rate higher than, let me say, 220 kbps.

********************************************************************************

mp3, m4a  – how close to the original sound they are.
Is there a difference between a 16 kHz BW and 19 kHz BW sounds?

All these started a few months ago when I saw on a tracker a statement which sounded like this:
“We share (as an mp3 file) the best quality Armin van Buuren’s ASOT show even better than the DI.fm’s.”
It was intriguing; I downloaded the file, checked the spectrum and saw … the bandwidth (BW) was around 16 kHz. So their source seemed to have been an FM radio broadcast, and the recording had been encoded with Lame's V0 option. I thought it was impossible for an mp3 having a BW of 19 kHz, as an mp3 @192 kbps has, to have worse quality than any other mp3 originating from the same source with a BW of less than 19 kHz. The guys did not agree, so I decided to use a simulation model to check this.

First of all I decided to measure the distance between two sounds using the mean squared error:



One of the sounds, the reference, is the original, while the second is derived from it through filtering and encoding. With this metric I could say how close the two sounds are and conclude that the one which is closer to the original is better. So I needed the .wav files to work with the sound samples.
I used as a test (reference sound) a CD rip with a full BW of 22.05 kHz. I filtered this sound through two filters having cut off frequencies of 19 kHz and 16 kHz. I encoded/decoded these two sounds and measured the differences between the results and the reference sound. Initially I thought to compare all the results to the test signal, but I couldn't obtain well distinguished measurements. That is why I used a third filter with a cut off frequency of 22.05 kHz, only so that the reference test signal also carries the filter's phase distortions.
The filters I used are FIR filters designed by the Kaiser window method with parameters: ripple = 0.001 (attenuation of 60 dB in the stop band) and a transition band of 250 Hz. I used a type I FIR filter with an order of 638 (639 coefficients).
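The filter design was not posted, so the following is only a sketch of a Kaiser-window low pass design in plain Python, using Kaiser's published empirical formulas. It assumes a 44.1 kHz sample rate (a CD rip); the function names are illustrative. Note that a 60 dB stop band corresponds to a ripple of 0.001 and, by Kaiser's formula, to a window shape parameter (usually called beta) of about 5.65.

```python
import math


def kaiser_parameters(atten_db, transition_hz, sample_rate_hz):
    """Kaiser's empirical estimates for the window shape (beta) and filter order."""
    if atten_db > 50:
        beta = 0.1102 * (atten_db - 8.7)
    elif atten_db >= 21:
        beta = 0.5842 * (atten_db - 21) ** 0.4 + 0.07886 * (atten_db - 21)
    else:
        beta = 0.0
    delta_omega = 2 * math.pi * transition_hz / sample_rate_hz
    order = math.ceil((atten_db - 7.95) / (2.285 * delta_omega))
    return beta, order


def bessel_i0(x):
    """Zeroth-order modified Bessel function, summed from its power series."""
    term = total = 1.0
    k = 1
    while term > 1e-12 * total:
        term *= (x / (2 * k)) ** 2
        total += term
        k += 1
    return total


def kaiser_lowpass(cutoff_hz, sample_rate_hz, order, beta):
    """Windowed-sinc low pass FIR taps (order + 1 coefficients, unity DC gain)."""
    mid = order / 2
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(order + 1):
        m = n - mid
        ideal = 2 * fc if m == 0 else math.sin(2 * math.pi * fc * m) / (math.pi * m)
        shape = math.sqrt(max(0.0, 1 - (m / mid) ** 2))
        taps.append(ideal * bessel_i0(beta * shape) / bessel_i0(beta))
    gain = sum(taps)
    return [t / gain for t in taps]


beta, estimated_order = kaiser_parameters(60, 250, 44100)
taps_19k = kaiser_lowpass(19000, 44100, 638, beta)  # order 638 as quoted in the post
```

The order estimate from these formulas comes out close to the 638 quoted in the post; different tools round the estimate slightly differently.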
Here is the model.



The distances (as sqrt(mse)) from the reference for the two band-restricted sounds are 110 for 19.wav and 246 for 16.wav.
To check the model, here are the spectra of the model sounds, where the spectra 22-19.wav, 22-16.wav, and 19-16.wav are those of the corresponding difference signals.









I encoded the test signals using Lame 3.98 beta 5 with these options:
-q 0 -m j for all cases (the best possible results per bit),
-b with 192, 224, 256, 320 for constant bit rate,
-Vx --lowpass YY.Y, where x = [0, 1, 2] and YY.Y = [19.5, 16.5], for variable bit rate.
(I used the --lowpass option to stop the encoder from choosing its own low-pass filter, and thereby changing the spectrum bandwidth, according to the Vx parameter.)
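The encoding matrix above can be written out as a short Python sketch that only builds the command lines. The output file names and the pairing of 19.wav with --lowpass 19.5 and 16.wav with --lowpass 16.5 are assumptions made for illustration; the post does not spell them out.

```python
CBR_BITRATES = [192, 224, 256, 320]
VBR_LEVELS = [0, 1, 2]
# Assumed pairing of each test file with its --lowpass value in kHz.
INPUTS = {"19.wav": 19.5, "16.wav": 16.5}
COMMON = ["-q", "0", "-m", "j"]  # options used for all cases


def lame_commands():
    """Return one lame argument list per encode in the test matrix."""
    commands = []
    # Constant bit rate: only -b is varied.
    for wav in INPUTS:
        stem = wav[:-4]
        for kbps in CBR_BITRATES:
            out = f"{stem}_cbr{kbps}.mp3"  # hypothetical output name
            commands.append(["lame", *COMMON, "-b", str(kbps), wav, out])
    # Variable bit rate: -V level plus a forced low-pass cut-off.
    for wav, lowpass in INPUTS.items():
        stem = wav[:-4]
        for level in VBR_LEVELS:
            out = f"{stem}_v{level}.mp3"  # hypothetical output name
            commands.append(
                ["lame", *COMMON, "-V", str(level), "--lowpass", str(lowpass), wav, out]
            )
    return commands


for cmd in lame_commands():
    print(" ".join(cmd))
```

Each list could be handed to subprocess.run to do the actual encoding; here the commands are only printed so they can be checked first.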
Here are the results (given as sqrt(mse)) for the Lame 3.98 beta 5 mp3 encoder:



What is easy to see is:
• When working at CBR, the encoder distinguishes the BW difference only at 320 kbps.
• When working at VBR, the encoder distinguishes the BW difference at bit rates higher than, say, 210 kbps, although the bit rate is not a parameter you can control directly.
(The lines at 110 and 246 are the differences between the reference 22 kHz signal and the 19 kHz and 16 kHz test signals.)

As to the question “Is it possible for a 16 kHz BW sound to be closer to the original than an 18.9 kHz one (mp3 @192 kbps)?”: yes, in this model it is possible. To check this I encoded/decoded the TEST.wav and 22.wav signals, and their distances to the original are 592 and 561. So the V0 and V1 settings can generate a file from the 16 kHz test file that is closer to the original.
At this point I have to say: I was wrong when I said this was impossible. Sorry for that. The guys, maybe, were right.
But what about AAC @192 kbps? The 22.wav encoded @192 CBR has a distance of 432, which Lame reaches at about 232 kbps VBR. The 22.wav encoded @192 VBR has a distance of 402, which Lame reaches at 320 kbps. So, the signal bandwidth does essentially matter.

When I was done, I decided to see what an AAC encoder (neroaacenc) is capable of.
Here are the results for both Lame and neroaacenc. The new information I added for the Lame encoder is the ABR results obtained for the 19 kHz test signal. This option is close to the CBR option.

As can be seen, neroaacenc's results are closer to the originals at any bit rate, for both CBR and VBR.
Just out of curiosity: Lame with the 19 kHz sound and neroaacenc at CBR with the 16 kHz sound reach the same closeness only @320 kbps. Which encoder do you think is better?

Skype:spas.velev
TomMixlightning 3daywarning on September 1st, 2007 / post 21301
:lol:
i would say found an organisation! build up a website and try to get some members.
there are lots of people out there in the web doing the strangest things ...
why not to uphold such a ridiculous theory.

but please please with sugar on top: STOP POSTING UR STRANGE SELFMADE TESTS!!! :evil:

i guess the post from uart of hydrogenaudio hits it best:
"uart of hydrogenaudio forums" wrote:
Hi SpacV. People often come here with exactly the same metric and very similar analysis as you've just presented and they all get the same response. It's not a good metric because it makes no reference to psycho-acoustics. Your metric is blind to the ears differential sensitivity at different frequencies, it's blind to the fact that some sounds mask others, it's blind to the fact that some forms of distortion sound really bad and others don't. The metric is just too simple and is no substitute for actual listening tests.


so take ur sweet spectras and go :smile:

here is the full link to spasv's post at the hydrogenaudio forums


post scriptum: im impressed that ur still chasing ur strange-theory. i guess ur evolving/learning on the wrong points.
Ziggylightning EDIT ME!!!! on September 18th, 2007 / post 21488
i have exams all of this week, but as soon as i have time i want to read every word on here... (im studying this in school!!!! i love it)