It totally depends on whether you are creating a ‘final’ copy (for people to simply listen to) or a ‘master’ copy for further signal processing. But if you really care about the latter, you’re probably also working at a greater bit depth and sampling rate to give you the headroom and precision you need for that… (and this gets a bit blurry if you intend to publish your ‘final’ copy through a service that will lossy->lossy transcode it - for that you probably do want to think a bit more like you’re mastering rather than publishing).
24- or 32-bit audio at a 96kHz sample rate is snake oil if you’re sold it on the claim that you’ll Hear The Difference (most people can’t hope to hear even what 16-bit/48k can perfectly represent, let alone on everyday equipment in everyday locations) - but it does help with minimising the errors introduced by complex signal processing in the production stage.
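Back-of-envelope, the standard ideal-quantiser figure (SNR of a full-scale sine is roughly 6.02*N + 1.76 dB for N bits) puts the numbers in perspective:

```python
# Theoretical dynamic range of an ideal N-bit quantiser (full-scale sine).
for bits in (16, 24, 32):
    print(f"{bits}-bit: ~{6.02 * bits + 1.76:.0f} dB")
# 16-bit: ~98 dB   - already spans roughly quiet-room to pain-threshold
# 24-bit: ~146 dB  - headroom for error to accumulate during processing
# 32-bit: ~194 dB  - far below the noise floor of any analogue converter
```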
It works the same way as it does for images: you hide the quantisation-error distortion in a part of the spectrum that’s less perceptible. And for very low-level signals, it’s remarkable how well a good noise-shaping dither can take a signal that would otherwise be totally lost below the quantisation floor and make it clearly recognisable.
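A minimal sketch of the idea, assuming TPDF dither and a first-order error-feedback noise shaper (the function and parameter names are just illustrative). A -100 dBFS tone sits below the 16-bit LSB, so plain rounding erases it entirely, while the dithered version keeps it recoverable:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantise(x, bits=16, dither=True, shape=True):
    # Quantise floats in [-1, 1] to `bits`, with optional TPDF dither
    # and first-order error-feedback noise shaping.
    q = 2.0 ** (bits - 1)
    out = np.empty_like(x)
    e = 0.0
    for i, s in enumerate(x):
        v = s * q - (e if shape else 0.0)   # feed the previous error back in
        d = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5) if dither else 0.0
        r = np.round(v + d)
        e = r - v                            # this step's quantisation error
        out[i] = r / q
    return out

t = np.arange(48000) / 48000
x = 10 ** (-100 / 20) * np.sin(2 * np.pi * 1000 * t)  # 1kHz tone at -100 dBFS
print(np.abs(quantise(x, dither=False, shape=False)).max())  # 0.0: rounded to pure silence
print(np.abs(quantise(x)).max())                             # non-zero: the tone lives on in the noise
```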
I’m not sure I’d say the risk was “low” (on a simple proportion-of-products scale) - just that on any hardware where this is likely to be a problem, it’s probably among the least of the problems that hardware is introducing to the fidelity of your recording.
If you’re listening to it on Beats headphones on the bus, you lost this game before you even got out of bed : ) You can’t fix that with a little extra headroom for your peak signal level.
Fewer and fewer devices have user-adjustable analogue gain at all; it’s all done in the digital domain. So it generally makes sense to maximise the dynamic range you offer (which minimises your quantisation distortion) and assume all non-distorting gain changes will be reducing the digital signal level.
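A toy illustration of why that matters (the values are made up; the point is that re-quantising after purely digital attenuation discards roughly one bit per 6 dB):

```python
import numpy as np

x = np.int16(20000)                    # a sample near 16-bit full scale
atten_db = 48                          # "volume down" done digitally
scale = 10 ** (-atten_db / 20)
y = np.int16(round(int(x) * scale))    # re-quantised onto the same 16-bit bus
print(int(y))                          # ~80: the signal now spans only ~7 bits
print(int(y) / scale, "vs", int(x))    # scaling back up doesn't restore the lost LSBs
```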
Yes, absolutely. Anything that changes the sampled signal in a more complex way than simply scaling it up or down runs the risk of landing a sample on an out-of-range transient and getting it clipped (and of introducing other distortions too, which is why the extra headroom is useful for production processing).
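You can see the classic version of this with nothing more exotic than a resampler. A sketch, assuming scipy is available: every sample of this sine sits exactly at ±1.0 after normalising, but the waveform between the samples peaks ~3 dB higher, and any re-interpolation will find those intersample peaks:

```python
import numpy as np
from scipy.signal import resample

fs = 48000
n = np.arange(512)
# A sine at fs/4, phased so every sample lands at ~0.707 of the true peak:
x = np.sin(2 * np.pi * (fs / 4) * (n / fs) + np.pi / 4)
x /= np.abs(x).max()            # "normalise" to 0 dBFS: no sample clips
up = resample(x, 8 * len(x))    # any re-interpolation (EQ, SRC, DAC filter)
print(np.abs(x).max())          # 1.0   - looks perfectly safe
print(np.abs(up).max())         # ~1.41 - the waveform between samples was over full scale
```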
But if you are certain that none of your samples were clipped (either in the intermediate stages of processing or in the final result), and your sampling frequency is greater than double the maximum frequency in your source - then the sampling theorem (correctly : ) says the mapping between bandlimited signals and sets of samples is one-to-one: each possible signal in that domain corresponds to precisely one set of samples, and vice versa. Which means you both precisely recorded it and can precisely reproduce it, constrained only by the accuracy of the analogue equipment you recorded and reproduced it with.
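If you want to convince yourself, Whittaker-Shannon reconstruction is a few lines of numpy (the frequencies here are arbitrary, just kept below fs/2):

```python
import numpy as np

fs = 100.0                              # sample rate: Nyquist is 50 Hz
n = np.arange(-200, 200)                # a finite window of sample indices
sig = lambda t: np.sin(2 * np.pi * 13.7 * t) + 0.5 * np.cos(2 * np.pi * 41.2 * t)
samples = sig(n / fs)                   # both components are below fs/2

# Whittaker-Shannon: x(t) = sum_n x[n] * sinc(fs*t - n), at arbitrary times t
t = np.linspace(-0.5, 0.5, 1001)
recon = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])
print(np.max(np.abs(recon - sig(t))))   # small, and shrinks as the sample window grows
```

The leftover error is purely from truncating the (infinite) sinc sum to a finite window, not from the sampling itself.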