Here's a little graphic I drew up. It's a toy example of a hypothetical camera with 4 stops of dynamic range and a bit depth of 5, counting individual photons as exposure for simplicity.
The X axis represents the range of the camera, similar (not identical) to the X axis in the histogram you see on your camera's rear LCD. It is scaled by stops.
5 bits = 32 possible discrete luminosity values (let's say 0 = "no photons during exposure, blocked shadow" and 31 = "more than we can detect, blown highlight"). They are NOT divided equally across the range.
The top stop of dynamic range has 1/2 of all the lightness values the sensor can detect, but is only 1/4 of the range, since stops are logarithmic.
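Here's a quick sketch of that arithmetic, in case it helps. It assumes the sensor codes are linear in photon count (true for the toy camera, roughly true for RAW data), and the function name `levels_per_stop` is just something I made up for illustration.

```python
def levels_per_stop(bit_depth, stops):
    """Count how many linear codes fall in each stop, brightest stop first."""
    top = 2 ** bit_depth          # 32 codes for the 5-bit toy camera
    counts = []
    for _ in range(stops):
        # Each stop spans the upper half of whatever codes are left.
        counts.append(top - top // 2)
        top //= 2
    return counts

print(levels_per_stop(5, 4))  # [16, 8, 4, 2]
```

So the brightest stop gets 16 of the 32 codes and the darkest of the four stops gets only 2. The two codes left over (0 and 1) sit below the bottom stop.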
Above are two examples of histograms. Normally the LCD screen doesn't have the resolution to show you these differences, but our low bit depth example makes them apparent. One is not exposed to the right. Notice that it has less information and is blockier, because most of the image falls in the low-precision part of the range. The one exposed to the right has much more data in it, because it uses the higher-precision part of the range. You then edit the RAW file to your liking with all this extra data, and digitally stop it back down to a proper exposure.
TL;DR: You'll get more posterization if you expose the same image to the left of the histogram than if you expose it to the right.
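To put rough numbers on the posterization claim, here's a minimal simulation of the toy camera. Everything in it is an assumption for illustration: the smooth test gradient, the helper name `distinct_codes`, and the idea that the sensor simply truncates the photon count to a whole code and clips at 31.

```python
def distinct_codes(scene, photons_at_white, bit_depth=5):
    """Count how many different sensor codes a scene lands on.

    scene: relative luminances in 0..1 (made-up test data)
    photons_at_white: photons recorded where scene luminance is 1.0
    """
    max_code = 2 ** bit_depth - 1  # 31 = blown highlight in the toy camera
    return len({min(max_code, int(lum * photons_at_white)) for lum in scene})

# A smooth gradient covering the top two stops of the scene (0.25 to 1.0).
gradient = [0.25 + 0.75 * i / 999 for i in range(1000)]

ettr = distinct_codes(gradient, photons_at_white=30)   # brightest pixel just under clipping
left = distinct_codes(gradient, photons_at_white=7.5)  # same scene, two stops darker
print(ettr, left)
```

With these numbers the ETTR exposure spreads the gradient over 24 distinct codes, while the exposure two stops darker squeezes the same gradient into 7. Fewer codes for the same smooth gradient is exactly what shows up as banding.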
AFAIK, this is all still as true of modern sensors as older ones.
And when you stop and think about it, it becomes pretty intuitive: when you ETTR, you're getting the same image, without clipping any of it, but with more light in every part of the image.
Of course more light = more fine-grained information available. You're literally letting more data in through the lens, and sensors are fundamentally capable of taking advantage of that extra data (as long as you shoot in RAW, although even if you don't, you still get a trivial little bit of benefit at the very lowest stops!)
It's like trying to understand somebody on a low-bandwidth audio connection versus a higher one. Same person talking, same sentences, but with fewer data bins it becomes almost incomprehensible if you're being stingy with your megabytes. Same thing here, except you're being stingy with your shutter speed (or whatever). If you NEED to (very low light), then go for it. Otherwise ETTR.