[X86] Don't use EXTRACT_ELEMENT from v1i1 with i8/i32 result type when we need... (48d5ed26) · Commits · Lorenzo Albano / LLVM bpEVL

Commit 48d5ed26 authored Feb 28, 2018 by Craig Topper

[X86] Don't use EXTRACT_ELEMENT from v1i1 with i8/i32 result type when we need to guarantee zeroes in the upper bits of return.

An extract_element where the result type is larger than the scalar element type is semantically an any_extend from the scalar element type to the result type. If we expect zeroes in the upper bits of the i8/i32, we need to make sure those zeroes are explicit in the DAG.
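To illustrate why an implicit any_extend is not enough, here is a minimal Python model of the two extension semantics. It is not LLVM code: the function names and the `upper_garbage` parameter are hypothetical, introduced only to stand in for whatever bits happen to sit above the source width.

```python
def any_extend(value, src_bits, dst_bits, upper_garbage):
    """Model an any_extend: the low src_bits are preserved, the rest are unspecified.

    upper_garbage is an assumption of this model; it represents the arbitrary
    contents of the wider register, not an operand of the real DAG node.
    """
    low_mask = (1 << src_bits) - 1
    dst_mask = (1 << dst_bits) - 1
    return ((upper_garbage & ~low_mask) | (value & low_mask)) & dst_mask


def zero_extend(value, src_bits):
    """Model a zero_extend: every bit above src_bits is guaranteed to be zero."""
    return value & ((1 << src_bits) - 1)


# An i1 value of 0 any-extended to i8 keeps a correct low bit,
# but the whole i8 need not compare equal to 0.
widened = any_extend(0, 1, 8, 0xA4)
low_bit = widened & 1
```

In this model only `low_bit` is reliable after the any_extend; a consumer that compares the full `widened` value against zero would see a nonzero result.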

For these cases, the best way to accomplish this is to use an insert_subvector to pad the v1i1 with zeroes in the upper bits first. We extend to either v16i1 (for i32) or v8i1 (for i8), then bitcast that to a scalar and finish with a zero_extend up to i32 if necessary. We can't extend past v16i1 because that's the largest mask size on KNL, but isel is smart enough to know that a zext of a bitcast from v16i1 to i16 can use a KMOVW instruction. The insert_subvectors will be dropped during isel because we can determine that the producing instruction already zeroed the upper bits of the k-register.
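The sequence described above (pad with zeroes, bitcast, then zero-extend) can be sketched as a small Python model. This is only an illustration of the data flow, not the actual lowering code in the X86 backend; all function names here are hypothetical, and mask vectors are modeled as Python lists with element 0 as the least significant bit.

```python
def insert_subvector_into_zero(v1i1, num_elts):
    """Place a 1-element mask vector at index 0 of an all-zero vector of num_elts."""
    if len(v1i1) != 1 or v1i1[0] not in (0, 1):
        raise ValueError("expected a single i1 element")
    return list(v1i1) + [0] * (num_elts - 1)


def bitcast_mask_to_int(mask):
    """Reinterpret a vNi1 mask as an N-bit integer; element i becomes bit i."""
    return sum(bit << i for i, bit in enumerate(mask))


def lower_v1i1_to_scalar(v1i1, result_bits):
    """Model the sequence: pad to v8i1 for i8, or to v16i1 plus a zext for i32."""
    if result_bits == 8:
        # v1i1 -> v8i1 -> i8, upper seven bits are explicit zeroes.
        return bitcast_mask_to_int(insert_subvector_into_zero(v1i1, 8))
    if result_bits == 32:
        # v1i1 -> v16i1 -> i16, then zero_extend i16 -> i32.
        i16 = bitcast_mask_to_int(insert_subvector_into_zero(v1i1, 16))
        return i16 & 0xFFFF
    raise ValueError("this sketch only models i8 and i32 results")
```

Because every padded element is an explicit zero, the resulting scalar is exactly 0 or 1 regardless of what any wider register previously held, which is the guarantee the plain extract_element could not give.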

llvm-svn: 326308

parent 7275da0f
