This separates code that relies on inline assembly from code that relies on external assembly, and fixes instances where the coalesced check was incorrect.
Move vector_fmul() from DSPContext to AVFloatDSPContext.