I would have thought a bridged design would be cleaner than an equivalent non-bridge design. Isn't the former a balanced design by nature, greatly suppressing even-ordered harmonics?
The differences between the two measurements are very small, and don't forget we are still at the mercy of the signal source. The final result is a complex summation of the contributions from the signal source, the device under test and the measurement system.
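To put rough numbers on that summation, here is a minimal Python sketch. It assumes each contributor's share of a single harmonic can be treated as a phasor (a level in dB relative to the fundamental plus a phase angle). The function name `combined_harmonic_db` and the example levels and phases are made up for illustration; they are not taken from the measurements discussed here.

```python
import cmath
import math


def combined_harmonic_db(contributions):
    """Vector-sum several contributions to ONE harmonic.

    contributions: list of (level_db, phase_deg) pairs, one per contributor
    (e.g. source, device under test, analyzer). Levels are in dB relative
    to the fundamental. Returns the level of the summed harmonic in dB.
    """
    total = sum(
        10 ** (level_db / 20) * cmath.exp(1j * math.radians(phase_deg))
        for level_db, phase_deg in contributions
    )
    magnitude = abs(total)
    if magnitude == 0.0:
        return float("-inf")  # perfect cancellation
    return 20 * math.log10(magnitude)


# Hypothetical values: two equal -140 dB contributions.
in_phase = combined_harmonic_db([(-140, 0), (-140, 0)])     # about 6 dB higher
offset_120 = combined_harmonic_db([(-140, 0), (-140, 120)])  # same as one alone
print(round(in_phase, 2), round(offset_120, 2))
```

The point of the sketch is only that the reading depends on relative phase as much as on level: two equal contributions can add, partially cancel, or leave the reading unchanged, so a lower number on the analyzer does not by itself show that the device under test is cleaner.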
There is a lot more to it than expecting a "textbook" improvement. We are right at the limits of all the contributing components.
Distortion down at -140 dB is really difficult to measure accurately. You will also see run-to-run variations, with each subsequent measurement being slightly different.
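For a sense of scale, the short Python sketch below converts -140 dB into an amplitude ratio. The 1 V RMS fundamental is an assumed reference level chosen for easy arithmetic, not a figure from these measurements.

```python
def db_to_ratio(db):
    """Convert a level in dB to an amplitude (voltage) ratio."""
    return 10 ** (db / 20)


ratio = db_to_ratio(-140)          # amplitude ratio relative to the fundamental
percent = ratio * 100              # the same ratio expressed as a percentage
fundamental_v = 1.0                # assumed 1 V RMS fundamental (hypothetical)
residual_v = ratio * fundamental_v # harmonic level in volts

print(f"ratio = {ratio:.1e}, percent = {percent:.1e} %, residual = {residual_v:.1e} V")
```

That works out to one part in ten million, or about 100 nV on a 1 V signal, which helps explain why small changes anywhere in the chain show up as run-to-run differences.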

