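Port libcutils memset16/32 SSE2 assembly optimizations to x86_64

Port the SSE2 assembly optimizations for memset16/32 in libcutils to
the x86_64 architecture, so that 64-bit builds get the same performance
as the existing x86 implementation.

Change-Id: I874a71a884c0d28a152933ddff9cb886c9a6e99e
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>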