vec128_storage
Generic wrapper for unparameterized storage of any of the possible impls.
Converting into and out of this type should be essentially free, although it may be more
aligned than a particular impl requires.
dispatch
Generate the full set of optimized implementations to take advantage of the most important
hardware feature sets.
dispatch_light128
Generate only the basic implementations necessary to be able to operate efficiently on 128-bit
vectors on this platform. For x86-64, that would mean SSE2 and AVX.
dispatch_light256
Generate only the basic implementations necessary to be able to operate efficiently on 256-bit
vectors on this platform. For x86-64, that would mean SSE2, AVX, and AVX2.
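
The dispatch entries above describe generating several implementations of one routine and choosing among them by hardware feature set. As a rough illustration of that general idea, and not of how these macros are actually implemented, the sketch below hand-writes a two-way dispatch using only the standard library. The names `sum`, `sum_generic`, and `sum_avx2` are hypothetical and do not come from the documented items; the choice of AVX2 as the single optimized tier is likewise an assumption made for brevity.

```rust
// Hypothetical example: none of these names come from the items documented above.

/// Portable fallback: sums the slice with wrapping addition.
fn sum_generic(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Same body, but compiled with AVX2 enabled, so the compiler is allowed
/// (not required) to vectorize it with 256-bit instructions.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Picks an implementation at run time from the features the CPU reports.
pub fn sum(xs: &[u32]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed by the check above.
            return unsafe { sum_avx2(xs) };
        }
    }
    // Other architectures, or x86-64 CPUs without AVX2, use the fallback.
    sum_generic(xs)
}
```

Calling `sum(&[1, 2, 3, 4])` returns the same value on every machine; only the code path differs. A "light" variant in this style would simply provide fewer feature-specific copies, which trades some peak performance for smaller generated code.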