Like I said, I haven’t been into the code for this yet to know offhand exactly what it (or the trivial ‘fade’ in/out) is doing - but I can’t think of any user-visible control that isn’t purporting to operate in dB. And as a general thing, I would have thought that “digital audio” is mature enough in 2025 that most things do operate in dB unless they have some reason to deliberately not.
I do know all too well that there are lots of places for subtle bugs to hide in DSP code, so it was safe to say you’d found something we should investigate - but not safe to just pile on to the assertion of what was wrong without actually investigating.
So having a quick look around now:
- It looks like MLT provides an autofade filter, which is indeed just a naive fade with a linearly stepped decay_factor - used for both audio and video. And since vision (like audio and pretty much all our senses) also has an approximately logarithmic relationship between stimulus and sensation, that will do neither of them in a perceptually linear manner.
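To illustrate the difference (this is just a sketch, not MLT’s actual code - the function names and the -60 dB floor are my own choices): a linear amplitude ramp is already at -6 dB halfway through, whereas a ramp that is linear *in dB* spends its time much more evenly across the perceived loudness range.

```python
import math

def linear_fade(step: int, steps: int) -> float:
    """Naive linear amplitude ramp, like a stepped decay_factor."""
    return step / steps

def db_linear_fade(step: int, steps: int, floor_db: float = -60.0) -> float:
    """Fade whose level in dB ramps linearly from floor_db up to 0 dB,
    which is closer to a perceptually even loudness change."""
    if step == 0:
        return 0.0
    db = floor_db * (1 - step / steps)  # floor_db .. 0 dB
    return 10 ** (db / 20)

# Halfway through, the linear fade is at amplitude 0.5 (-6 dB),
# while the dB-linear fade is still near -30 dB (amplitude ~0.032).
```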
- It also provides an audioseam filter, which is correctly computing dBFS (20 * log10 of amplitude). At first blush I thought it might actually be doing a dynamic constant-power transition of the kind I figured was overkill - but it’s actually doing something much weirder. It looks at only two sample values, the last sample of the old clip and the first of the new - and if those two in isolation differ by some ‘dB’ threshold, then it synthesises a (linear) crossfade by reversing the last frame of the old clip and mixing that with the first frame of the new clip …
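For reference, the dBFS computation itself is straightforward - the sketch below shows the calculation and the kind of two-sample threshold comparison described above (function names and the 6 dB default are mine, not audioseam’s; and as noted, two isolated samples make a pretty weak discontinuity detector):

```python
import math

def dbfs(sample: float, eps: float = 1e-12) -> float:
    """Level of a single sample relative to full scale (1.0), in dBFS."""
    return 20 * math.log10(max(abs(sample), eps))

def seam_exceeds_threshold(last_old: float, first_new: float,
                           threshold_db: float = 6.0) -> bool:
    """Compare just the two boundary samples, as audioseam appears to do."""
    return abs(dbfs(last_old) - dbfs(first_new)) > threshold_db
```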
Which is an … interesting … idea - but not how I’d be looking to detect and smooth a discontinuous seam.
But neither of those appears to actually be in play here. It would seem that single-track transitions are handled by yet another implementation, in the mix transition - which does a ‘weighted’ linear fade, with sides of decimation, low-pass filtering, drift compensation, and channel-count matching - because the old and new clips may have different sample rates and channel counts, and both of those may be different to the project output audio stream.
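Stripped of the resampling, channel matching, and drift handling, the core of a weighted linear fade is just a per-sample weighted sum across the overlap - a minimal sketch (my own function, not the mix transition’s code):

```python
def weighted_mix(old: list[float], new: list[float]) -> list[float]:
    """Linear crossfade over an overlap region: the weight ramps 0 -> 1,
    scaling the incoming clip up and the outgoing clip down per sample."""
    n = len(old)
    assert len(new) == n, "overlap buffers must be the same length"
    out = []
    for i in range(n):
        w = i / (n - 1) if n > 1 else 1.0
        out.append((1 - w) * old[i] + w * new[i])
    return out
```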
And at first blush that all seems to be entirely independent of any manipulation done by the volume control - but definitely earns itself a place on the cans’o’worms league table.
That’s all from just a very quick look at some of the code - so I wouldn’t bet too heavily that I’m not missing some important things in that description - but that quick look did make doing something fundamentally much simpler to achieve the same goal seem like a pretty attractive idea. And it doesn’t look like something you could just replace with afade; it’s doing a lot more than just fading, and it’s all very hardwired together.
I would probably do it in multitrack with more control, only if I want to be very precise.
That’s kind of the crux of the angle I’m looking at this from - the style of transitions you choose to use for your editing (multi- or single-track) should mostly be a purely UI preference. It shouldn’t affect your ability to be precise or to perform some fundamental operation.
And from what I’m seeing right now, it really is just this entanglement of a hardcoded fade with the mix operation that makes this troublesome. The volume control is still functional over the full extent of both clips, so you can already use it to (re)shape the mix more or less however you like - though it’s not terribly intuitive and may not interact well with the hardcoded fade if you try to force things too hard. But you could definitely compensate for the (non)linearity in the (probably few) cases where that is actually audible.
not provide all the curves possible by avfilter.afade, but useful ones
Who gets to decide what is useful? Limiting people to just what someone else arbitrarily considered useful is pretty much always guaranteed to upset someone and come back to bite you later. It’s ok to have simple defaults, but if something is possible, eventually there will be someone or some use that really needs to do it. Limiting possibility to just a subset of already limited possibilities is a recipe for needing to do this all again later, for someone else with different needs.
be a little bit more selected/curated, as there are many “duplicates”, and that can be overwhelming
Yeah, let’s not get that tangled up in this thread - but you’re not alone in thinking that either, Bernd has been putting a lot of work into improving this, but it’s an iterative process - and as we’ve just seen in the last few days can be confounded by each ‘duplicate’ having one tiny little useful thing that it can do which the other versions can’t. The ‘main effects’ category tries to prune out some without completely removing them for people who need them (or historically used them in old projects).
I guess you meant if we want a crossfade we would edit fade out of the left clip, and fade in on the right clip
Yes, that’s effectively all a crossfade is: a mix where the gain of each channel varies with time. I’m not suggesting we force people to create them manually like that for the simple cases - for a few selected common cases we can have magic buttons and a default option. But if we implement it this way we get a very simple front end for the “common” cases while still exposing full control when that is what you need - without actually creating anything new or more complicated than what already exists, and without adding even more layers and complexity to the DSP pipeline.
So we’d go from having two (as it is now) or maybe three (if we threw afade into the mix) things potentially trying to multiply the volume up and/or down to just one doing that, with some shortcuts for adding commonly desired keyframes.
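In those terms, each ‘standard’ crossfade style is just a pair of gain envelopes - a hedged sketch (the function and style names are mine; ‘equal power’ is the usual choice when you want the combined level to stay roughly constant through the mix):

```python
import math

def crossfade_gains(t: float, style: str = "linear") -> tuple[float, float]:
    """Gain pair (outgoing, incoming) at normalised time t in [0, 1]."""
    if style == "linear":
        return 1 - t, t
    if style == "equal_power":
        # cos/sin pair keeps g_out^2 + g_in^2 == 1 throughout the fade,
        # so total power stays constant for uncorrelated material.
        return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)
    raise ValueError(style)
```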
This is what a crossfade editor looks like in professional audio software
…but that is too much for a video editor
A continental breakfast dialog like that probably is - but we already have all the features that it provides (including user-defined spline curves) in the standard Volume effect. So all we’d need to do is stop using the hard-coded fade in that mix and provide buttons in the audio mix widget to create suitable keyframes for whatever ‘standard’ crossfade styles we want to offer.
Anyone for whom that doesn’t work remains free to fix those keyframes manually however they like. And we get both a cleaner UI and a cleaner DSP pipeline, just by removing limits instead of adding more of them.
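Generating those keyframes is cheap - a sketch of the idea, with an entirely hypothetical (frame, gain) representation that is not Kdenlive’s actual keyframe format:

```python
def fade_keyframes(start_frame: int, end_frame: int, fade_in: bool,
                   steps: int = 4) -> list[tuple[int, float]]:
    """(frame, gain) keyframes for a Volume-style effect approximating a
    linear fade over [start_frame, end_frame]; more steps = finer curve."""
    span = end_frame - start_frame
    kfs = []
    for i in range(steps + 1):
        t = i / steps
        gain = t if fade_in else 1 - t
        kfs.append((start_frame + round(t * span), gain))
    return kfs
```

A ‘crossfade button’ would then just be one call for the outgoing clip (fade_in=False) and one for the incoming clip (fade_in=True) over the same frame range, with everything left editable afterwards.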
Volume control should not introduce phase shifts
Right, but you don’t get to pick the start of what you’re mixing in with better than video-frame granularity - and other audio effects can and will introduce phase shifts anyway. The FIR lowpass in the samplerate-matching mixer certainly will.
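To make that concrete: a symmetric (linear-phase) FIR filter delays every frequency equally by (N - 1) / 2 samples, so filtering one side of a mix shifts it in time relative to the unfiltered side. A toy demonstration with a moving average (the simplest symmetric FIR - not MLT’s actual resampling filter):

```python
def moving_average(signal: list[float], taps: int) -> list[float]:
    """Apply a causal moving-average FIR of the given length."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - taps + 1): i + 1]
        out.append(sum(window) / taps)
    return out

# An impulse at sample 0 comes out smeared, centred at (taps - 1) / 2:
# the whole filtered stream lags the unfiltered one by that group delay.
```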
Let’s split it into a 3 levels from perspective of development complexity
Yeah, I think that’s overcomplicating what I’m thinking of here. : )
I’m (increasingly) thinking the problem is that it’s already overcomplicated, with too much duplication of basic DSP tasks. The way to make it simpler, more correct, more powerful, and less computationally complex is not to spend more time developing new layers and processing modules, but to delete the duplication we already have that’s just getting in the way of using, to their fullest potential, the things we already consider intrinsic.
I’m just waiting to see what someone points out that I’ve forgotten which makes this too Not That Simple …