Posts: 759
Threads: 35
Joined: 2018 Feb
Thanks: 651
Given 1071 thank(s) in 405 post(s)
Country:
2023-05-15, 03:26 PM
(This post was last modified: 2023-05-15, 03:33 PM by bronan.)
Alright, think I got it! I realized after my last post that I was going in the wrong direction by looking only at the difference in amplitude, so I started studying up more on the FFT method. Unfortunately, switching to the frequency domain has its own issues, and this is mostly a time-domain problem for us, since the two channels should be very similar if the track is mono.
So I went back to the drawing board yet again and looked into some different ideas for comparing two data sets, namely correlation and Mean Squared Error. Essentially, you multiply the two channels' samples against each other and sum the products, then divide by the root (the square root of the two channels' summed squares multiplied together, as in a standard normalized correlation). You're left with a value between 0 and 1, and the closer it is to 1, the closer the two channels are to being the same.
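For anyone following along, here is a minimal sketch of that kind of score in Python. It assumes the metric is a plain zero-lag normalized correlation over the whole file; the function name `channel_similarity` and the handling of silent channels are my own choices for illustration, not the actual code from the tool.

```python
import math

def channel_similarity(left, right):
    """Zero-lag normalized correlation of two equal-length sample sequences.

    Assumed form of the score: sum(L*R) / sqrt(sum(L*L) * sum(R*R)).
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    sum_lr = sum(l * r for l, r in zip(left, right))  # cross products
    sum_ll = sum(l * l for l in left)                 # left channel energy
    sum_rr = sum(r * r for r in right)                # right channel energy
    denom = math.sqrt(sum_ll * sum_rr)
    if denom == 0.0:
        # Assumption: a completely silent channel counts as "nothing in common".
        return 0.0
    return sum_lr / denom
```

Note that in this general form the result can go negative when one channel is polarity-inverted relative to the other, so a 0 to 1 range implies either that such material never came up in the tests or that the real implementation handles the sign differently.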
Using this new metric I re-ran tests for all the tracks I've been looking at, and it's done the best so far, without any optimizations like normalizing a channel or cutting out content from the beginning and end. Anything close to 1 is mono, and anything 0.5 or less is definitely stereo. Will do a little more experimenting and see if I can improve the scores with a few tweaks, but I think I'm on the right track now.
Code:
Stereo Examples
---------------------------------
Akira (LD): 0.49
Akira (BD): 0.508
Commando (JP LD): 0.400
Twin Peaks (LD): 0.628
Sorcerer (LD): 0.512
Predator (JP LD): 0.318

Mono Examples
---------------------------------
Day of the Dead (BD): 0.998
Day of the Dead (VHS): 0.963
Day of the Dead (LD): 0.868
Vampire Hunter D (LD): 0.942
Vampire Hunter D (VHS): 0.940
Rear Window (UHD): 1.000
Posts: 192
Threads: 9
Joined: 2022 Sep
Thanks: 6
Given 114 thank(s) in 66 post(s)
Country:
you have a fake stereo track you can test for fun? don't see them too often but wonder what one would score.
2023-05-15, 07:36 PM
(This post was last modified: 2023-05-15, 07:39 PM by bronan.)
Any tracks to recommend? I tested one that is supposedly the Terminator fake stereo Chace mix, but it came back mono, and LDDB even lists it as mono (LD 2535), so I guess that one can be put to bed.
i got a couple i can think of off the top of my head, i'll send them your way.
2023-05-15, 07:58 PM
(This post was last modified: 2023-05-15, 07:59 PM by bronan.)
Ended up removing the windowing altogether after remembering the cardinal sin of averaging averages. It really tightened things up: none of the mono tracks I've tested now dip below 0.98, even the Day of the Dead LD, which scored 0.86 previously with the window averaging. Also of note, tracks with digitally duplicated channels like Rear Window will come back as a pure 1.0 score, so we still know which are which.
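To illustrate the "averaging averages" problem, here is a small sketch with made-up samples. It assumes the score is a zero-lag normalized correlation and that the windowed version took a plain mean of per-window scores; both function names are hypothetical, and this is only one plausible way a mono track could get dragged down, not a reconstruction of the actual windowing code.

```python
import math

def similarity(left, right):
    """Zero-lag normalized correlation (assumed form of the score)."""
    sum_lr = sum(l * r for l, r in zip(left, right))
    sum_ll = sum(l * l for l in left)
    sum_rr = sum(r * r for r in right)
    denom = math.sqrt(sum_ll * sum_rr)
    return sum_lr / denom if denom > 0 else 0.0

def windowed_similarity(left, right, window):
    """Score each window separately, then take the plain mean of the scores."""
    scores = [
        similarity(left[i:i + window], right[i:i + window])
        for i in range(0, len(left), window)
    ]
    return sum(scores) / len(scores)

# Made-up data: a loud passage that is identical in both channels, followed
# by a very quiet passage where the channels share nothing (think tape hiss).
loud = [0.8, -0.8, 0.8, -0.8]
left = loud + [0.01, -0.01, 0.01, -0.01]
right = loud + [0.01, 0.01, -0.01, -0.01]

# The quiet window counts exactly as much as the loud one here.
per_window = windowed_similarity(left, right, window=4)
# Over the whole file the quiet window barely contributes any energy.
whole_file = similarity(left, right)
```

With this data the per-window mean lands at about 0.5 while the whole-file score stays above 0.99, because the whole-file sums weight every sample by its energy instead of giving each window an equal vote.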
I also tried normalizing the right channel to the left, and it actually made this test worse, so I will leave it as is. That's good news because I can combine this calculation with the scan for RMS and peak, removing yet another pass through the file and saving some time. Once that's done we'll test a few more tracks and hopefully get a good idea of where the cutoff point for the scoring is.
Code:
Stereo
---------------
Akira (LD): 0.712
Commando (JP LD): 0.620
Sorcerer (LD): 0.692
Twin Peaks (LD): 0.852

Mono
---------------
Day of the Dead (LD): 0.987
Day of the Dead (VHS): 0.999
Vampire Hunter D (LD): 0.998
Rear Window (UHD): 1.000
(2023-05-15, 07:42 PM)Yarp Wrote: i got a couple i can think of off the top of my head, i'll send them your way.
Butch Cassidy (VHS): 0.394
The Graduate (VHS): 0.298
They are VERY stereo compared to the scores of the original stereo tracks in my previous post lol. So we can definitely detect fake stereo, if you know the original mix is mono.
(2023-05-15, 08:08 PM)bronan Wrote: (2023-05-15, 07:42 PM)Yarp Wrote: i got a couple i can think of off the top of my head, i'll send them your way.
Butch Cassidy (VHS): 0.394
The Graduate (VHS): 0.298
They are VERY stereo comparing to the scores in previous post of original stereo tracks lol. So we can definitely detect fake stereo, if you know the original mix is mono.
cool, figured it would show up like they were super stereo since the L channel usually looks/sounds like it has half the info as the R channel. fortunately these usually mix down to mono really nicely.
now for the next upgrade, make it detect whether a 2.0 track is matrixed surround or not
2023-05-15, 08:45 PM
(This post was last modified: 2023-05-15, 08:45 PM by bronan.)
(2023-05-15, 08:28 PM)Yarp Wrote: cool, figured it would show up like they were super stereo since the L channel usually looks/sounds like it has half the info as the R channel. fortunately these usually mix down to mono really nicely.
now for the next upgrade, make it detect whether a 2.0 track is matrixed surround or not
Ah, I see what you mean about the channels. Well, it may actually be possible to detect fake stereo then, if it always falls within this low range between 0.2 and 0.4 and other tracks don't. And oh boy, that sounds like another can of worms.
2023-05-15, 08:51 PM
(This post was last modified: 2023-05-15, 08:52 PM by bronan.)
OK, I got the stereo check integrated with the pass for the peak & RMS checks, so it shouldn't add much overhead at all. I actually reused the squaring multiplication for RMS for this check as well, so talk about efficiency. Here is how the output for a file looks now.
Code:
File: dotd-vhs.wav
Format: Extensible
Duration: 01:36:58.95
Bitrate: 288000
Channel(s): 2
Bit Depth: 24
Sampling Rate: 48000
Block Align: 6
Data Index: 68
Data Length: 1675858944
Samples (per channel): 279309824
Calculating channel info...
Channel 0
Peak (dB): -2.625
RMS (dB): -25.667
Average (dB): -131.830
Channel 1
Peak (dB): -2.716
RMS (dB): -25.732
Average (dB): -129.364
Stereo Analysis: Mono (0.999)
Calculating Loudness (EBU R 128)...
Integrated Loudness:
I: -21.38 LUFS
Threshold: -32.82 LUFS
Dynamic Range (LRA): 16.93 LU
Max Short Term Loudness: -10.99 LUFS @ 00:56:04.40
Max Momentary Loudness: -9.89 LUFS @ 00:56:05.80
Processing time: 02:39
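As a rough idea of how the stereo check can share a pass with peak and RMS, here is a sketch in Python. The function name `analyze_stereo`, the dictionary layout, and the assumption that samples are floats with full scale at 1.0 are all mine; the real tool's internals are not shown in this thread.

```python
import math

def analyze_stereo(frames):
    """Single pass over (left, right) float samples, assumed in [-1.0, 1.0]."""
    n = 0
    peak_l = peak_r = 0.0
    sum_ll = sum_rr = sum_lr = 0.0
    for l, r in frames:
        n += 1
        peak_l = max(peak_l, abs(l))
        peak_r = max(peak_r, abs(r))
        sum_ll += l * l  # needed for RMS, reused for the similarity score
        sum_rr += r * r  # needed for RMS, reused for the similarity score
        sum_lr += l * r  # the only extra multiply the stereo check adds
    if n == 0:
        raise ValueError("no samples to analyze")

    def to_db(x):
        # dB relative to full scale (assumption: full scale is 1.0)
        return 20.0 * math.log10(x) if x > 0 else float("-inf")

    denom = math.sqrt(sum_ll * sum_rr)
    return {
        "peak_db": (to_db(peak_l), to_db(peak_r)),
        "rms_db": (to_db(math.sqrt(sum_ll / n)), to_db(math.sqrt(sum_rr / n))),
        "similarity": sum_lr / denom if denom > 0 else 0.0,
    }
```

The point is that the per-channel sums of squares are already needed for RMS, so the stereo score only costs one more multiply and add per frame plus a single square root at the end.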
2023-05-16, 09:32 PM
(This post was last modified: 2023-05-16, 09:40 PM by bronan.)
Been thinking more about the scoring of the latest algorithm. The mono stuff being dead on 1.0 makes sense, though I was kinda surprised just how close even the imperfect analog sources ended up scoring. The thing I didn't really expect, though, is that the stereo stuff scored so high, in the 65-70% range. It seemed awfully high to me to say that the channels are 70% the same on average. But after giving it some more thought and seeing the results of the fake stereo tests, it does make some sense.
If a 100% score is the channels being exactly the same, then 0% would be the channels having nothing in common, i.e. one super loud while the other is super quiet. The fake stereo stuff is processed so heavily, with voices completely in one channel, that this ends up happening, and those tracks scored low at around 30-40% the same. But then it occurred to me that when you mix for stereo, the actors are usually directly in front of the camera, so the separation level is small anyway. So our scale is roughly:
Code:
0 ----- 0.3 ----- 0.4 ----- 0.6 ----- 0.8 ----- 0.9 ----- 1.0
   Wtf   Fake Stereo            Stereo               Mono
But we'll see once we get more examples and can figure out for sure where the bounds of each range are.
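If the scale holds up, turning a score into a label could be as simple as the sketch below. The band edges are just my reading of the rough scale above, the gaps between bands are deliberately left undecided, and the function name `classify` is hypothetical, so treat every threshold here as provisional.

```python
def classify(score):
    """Map a similarity score to a label using provisional band edges."""
    if score >= 0.9:
        return "Mono"
    if 0.6 <= score < 0.8:
        return "Stereo"
    if 0.3 <= score < 0.4:
        return "Fake Stereo"
    if score < 0.3:
        return "Wtf"
    # Scores in 0.4-0.6 or 0.8-0.9 fall between the rough bands.
    return "Undetermined"
```

Some of the scores already posted land outside these bands (Twin Peaks at 0.852, The Graduate at 0.298), which is a good reminder that the edges need more examples before they mean much.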