Commits · 38cd821dc426deb51e1141a721b55229fc301d19 · Roger Ferrer / llvm-epi-0.8

Aug 24, 2011
- Fix whitespace. · 38cd821d
  Eli Friedman authored Aug 24, 2011
```
llvm-svn: 138487
```
  38cd821d
- Basic x86 code generation for atomic load and store instructions. · 342e8df0
  Eli Friedman authored Aug 24, 2011
```
llvm-svn: 138478
```
  342e8df0
- Mark VZEROALL as clobbering all YMM registers · ce028406
  Bruno Cardoso Lopes authored Aug 24, 2011
```
llvm-svn: 138461
```
  ce028406
- Move TargetRegistry and TargetSelect from Target to Support where they belong. · 2bb40357
  Evan Cheng authored Aug 24, 2011
```
These are strictly utilities for registering targets and components.

llvm-svn: 138450
```
  2bb40357
- Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid... · de92622a
  Craig Topper authored Aug 24, 2011
```
Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711.

llvm-svn: 138427
```
  de92622a
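
The splitting idea behind de92622a can be illustrated with a small Python model. This is only a sketch under stated assumptions: `split_256`, `add_128`, and `add_256_via_halves` are hypothetical names, lanes are modeled as Python integers with 32-bit wraparound, and the real lowering works on SelectionDAG nodes rather than lists.

```python
MASK32 = 0xFFFFFFFF  # model i32 lanes with wrapping arithmetic


def split_256(vec):
    """Split an 8 x i32 vector into its low and high 128-bit halves."""
    assert len(vec) == 8
    return vec[:4], vec[4:]


def add_128(a, b):
    """Model one 128-bit packed 32-bit add (four lanes, wrapping)."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_256_via_halves(a, b):
    """Perform a 256-bit integer add as two 128-bit adds, then concatenate.

    This replaces eight scalar adds (full scalarization) with two vector adds.
    """
    a_lo, a_hi = split_256(a)
    b_lo, b_hi = split_256(b)
    return add_128(a_lo, b_lo) + add_128(a_hi, b_hi)
```

The same shape applies to sub and mul: only the per-half operation changes.
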
- Fix a nasty bug where a v4i64 was being wrongly emitted with 32-bit · 9e9f2ce3
  Bruno Cardoso Lopes authored Aug 23, 2011
```
permutations. Also tidy up some patterns and make them close to their
instruction definition!

llvm-svn: 138392
```
  9e9f2ce3
Aug 23, 2011
- Some refactoring so TargetRegistry.h no longer has to include any files · 4d6c9d71
  Evan Cheng authored Aug 23, 2011
```
from MC.

llvm-svn: 138367
```
  4d6c9d71
- PerformSubCombine to work on integers larger than i128. Fixes a crasher. · 4c8ff77f
  Nick Lewycky authored Aug 23, 2011
```
llvm-svn: 138354
```
  4c8ff77f
- Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit... · 6612e35b
  Craig Topper authored Aug 23, 2011
```
Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding scalarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712.

llvm-svn: 138321
```
  6612e35b
- Introduce a pass to insert vzeroupper instructions to avoid AVX to · 2a3ffb5d
  Bruno Cardoso Lopes authored Aug 23, 2011
```
SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper"
llc command line option. This is only the first step (very naive and
conservative one) to sketch out the idea, but proper DFA is coming next
to allow smarter decisions. Comments and ideas now and in further commits
will be very appreciated.

llvm-svn: 138317
```
  2a3ffb5d
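
A naive, conservative insertion policy of the kind described in 2a3ffb5d might look like the following Python sketch. It is not claimed to be the policy the commit implements: the function name `insert_vzeroupper`, the string representation of instructions, and the rule "insert before every call if the function touches any YMM register" are all assumptions made for illustration.

```python
def insert_vzeroupper(instrs):
    """Insert 'vzeroupper' before each call in a function that uses YMM registers.

    `instrs` is a list of AT&T-syntax instruction strings (an assumed, simplified
    representation). The sketch ignores calls that pass 256-bit arguments in YMM
    registers, which a real pass must not clobber.
    """
    uses_ymm = any("%ymm" in inst for inst in instrs)
    if not uses_ymm:
        # No 256-bit state can be dirty, so no transition penalty to avoid.
        return list(instrs)

    out = []
    for inst in instrs:
        parts = inst.split()
        opcode = parts[0] if parts else ""
        if opcode in ("call", "callq"):
            out.append("vzeroupper")
        out.append(inst)
    return out
```

A dataflow analysis, as the commit message anticipates, would let the pass skip calls where the upper halves are already known to be clean.
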
- X86: Add some operand types required to identify calls. · 9dc808e7
  Benjamin Kramer authored Aug 22, 2011
```
llvm-svn: 138285
```
  9dc808e7
Aug 22, 2011
- Add support for breaking 256-bit int VSETCC into two 128-bit ones, · 74f090d4
  Bruno Cardoso Lopes authored Aug 22, 2011
```
avoiding scalarization of the compare. Reduces code from 59 to 6
instructions. Fix PR10712.

llvm-svn: 138271
```
  74f090d4
- Add 128-bit AVX codegen for PCMP* family of integer instructions · 6e62ca94
  Bruno Cardoso Lopes authored Aug 22, 2011
```
llvm-svn: 138270
```
  6e62ca94
Aug 20, 2011
- Re-write part of VEX encoding logic, to be easier to read! Also fix · d126347f
  Bruno Cardoso Lopes authored Aug 19, 2011
```
a bug and add a testcase!

llvm-svn: 138123
```
  d126347f
Aug 19, 2011
- Add TB encoding to VEX versions of SSE fp logical operations to fix disassembler · ba6c2a52
  Craig Topper authored Aug 19, 2011
```
llvm-svn: 138034
```
  ba6c2a52
- Fix PR10677. Initial patch and idea by Peter Cooper but I've changed the · 22241acc
  Bruno Cardoso Lopes authored Aug 19, 2011
```
implementation!

llvm-svn: 138029
```
  22241acc

Re-encoded 128-bit AVX versions of SQRT, RSQRT, RCP have 3 operands · 5647d84a

Bruno Cardoso Lopes authored Aug 18, 2011

instead of 2. They were already defined this way in their regular
version, but not for the intrinsics versions (*_Int), and that would work
for assembly emission but not for object code, since a MachineOperand
would be missing. This commit fixes PR10697.

Also removed the {VSQRT,VRSQRT,VRCP}r_Int forms and match the intrinsic
via INSERT_SUBREG+EXTRACT_SUBREG patterns. The same couldn't be done for
memory versions because sse_load_f32/sse_load_f64 operands need special
handling and don't work like regular "addr" operands.

There are right now 114 "*_Int" and 98 "Int_*" forms! I'm slowly
removing them as I step through, but hope we can get rid of these
someday, they are really annoying :)

llvm-svn: 138012

5647d84a

Aug 18, 2011
- Cleanup vector logical ops in AVX and use int versions for simple · 3c7d6eb6
  Bruno Cardoso Lopes authored Aug 18, 2011
```
v2i64

llvm-svn: 137919
```
  3c7d6eb6
- Fix PR10688. Add support for splitting 256-bit vector shifts when the · 1a87fcb9
  Bruno Cardoso Lopes authored Aug 17, 2011
```
shift amount is variable

llvm-svn: 137885
```
  1a87fcb9
Aug 17, 2011

Allow the MCDisassembler to return a "soft fail" status code, indicating an... · a4043c4b

Owen Anderson authored Aug 17, 2011

Allow the MCDisassembler to return a "soft fail" status code, indicating an instruction that is disassemblable, but invalid. Only used for ARM UNPREDICTABLE instructions at the moment.
Patch by James Molloy.

llvm-svn: 137830

a4043c4b
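
A three-valued decode status of the kind a4043c4b describes can be modeled as below. This is a sketch, not the MCDisassembler interface itself: the Python names `DecodeStatus`, `combine`, and `is_disassemblable` are hypothetical, and the numeric values are an assumption chosen so that a bitwise AND of two statuses yields the worse one.

```python
from enum import IntEnum


class DecodeStatus(IntEnum):
    """Assumed encoding: AND-ing two statuses yields the worse of the two."""
    FAIL = 0       # bytes do not decode to any instruction
    SOFT_FAIL = 1  # decodes, but the instruction is invalid (e.g. UNPREDICTABLE)
    SUCCESS = 3    # decodes to a valid instruction


def combine(a, b):
    """Merge the statuses of two decoding steps into one overall status."""
    return DecodeStatus(a & b)


def is_disassemblable(status):
    """A soft failure still produces printable disassembly; a hard failure does not."""
    return status != DecodeStatus.FAIL
```

With this encoding a decoder can accumulate the result of checking each operand field without branching on every intermediate status.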

Introduce matching patterns for vbroadcast AVX instruction. The idea is to · be5e9873

Bruno Cardoso Lopes authored Aug 17, 2011

match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there is no way to match loads, because
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

llvm-svn: 137810

be5e9873

- Update comments about vector splat handling in x86 · 6d33c7f3
  Bruno Cardoso Lopes authored Aug 17, 2011
```
llvm-svn: 137808
```
  6d33c7f3

Now that we have a canonical way to handle 256-bit splats: · ed786a34

Bruno Cardoso Lopes authored Aug 17, 2011

vinsertf128 $1 + vpermilps $0, remove the old code that used to first
do the splat in a 128-bit vector and then insert it into a larger one.
This is better because the handling code gets simpler and also makes
room for the upcoming vbroadcast!

llvm-svn: 137807

ed786a34
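
The canonical 256-bit splat sequence named in ed786a34 (vinsertf128 $1 followed by vpermilps $0) can be checked with a lane-level Python model. This is a sketch: the function names mirror the instruction mnemonics but are hypothetical helpers, vectors are plain lists of 32-bit lanes, and `None` stands in for undefined upper lanes.

```python
def vinsertf128(dst, src128, imm):
    """Return a copy of `dst` (8 lanes) with 128-bit half `imm & 1` replaced by `src128`."""
    out = list(dst)
    base = 4 * (imm & 1)
    out[base:base + 4] = src128
    return out


def vpermilps(src, imm):
    """Immediate form: permute 32-bit lanes within each 128-bit half.

    Each result lane i of a half takes the source lane selected by the
    2-bit field imm[2i+1:2i]; the same control applies to both halves.
    """
    out = []
    for base in range(0, len(src), 4):
        for i in range(4):
            sel = (imm >> (2 * i)) & 3
            out.append(src[base + sel])
    return out


def splat_256(xmm):
    """Splat lane 0 of a 128-bit vector across all eight lanes of a 256-bit vector."""
    ymm = list(xmm) + [None] * 4   # low half holds xmm, upper half undefined
    ymm = vinsertf128(ymm, xmm, 1)  # copy xmm into the upper half
    return vpermilps(ymm, 0)        # select lane 0 everywhere within each half
```

The in-lane permute is what makes the insert necessary: vpermilps cannot move data across the 128-bit boundary, so the scalar must be present in both halves first.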

Aug 16, 2011

Instead of always leaving the work to the generic legalizer when · 2e99f1b3

Bruno Cardoso Lopes authored Aug 16, 2011

there is no support for native 256-bit shuffles, be smarter in some
cases, for example, when you can extract specific 128-bit parts and use
regular 128-bit shuffles for them. Example:

For this shuffle:
  shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32>
                <i32 1, i32 0, i32 7, i32 6>

This was expanded to:
  vextractf128  $1, %ymm1, %xmm2
  vpextrq $0, %xmm2, %rax
  vmovd %rax, %xmm1
  vpextrq $1, %xmm2, %rax
  vmovd %rax, %xmm2
  vpunpcklqdq %xmm1, %xmm2, %xmm1
  vpextrq $0, %xmm0, %rax
  vmovd %rax, %xmm2
  vpextrq $1, %xmm0, %rax
  vmovd %rax, %xmm0
  vpunpcklqdq %xmm2, %xmm0, %xmm0
  vinsertf128 $1, %xmm1, %ymm0, %ymm0
  ret

Now we get:
  vshufpd $1, %xmm0, %xmm0, %xmm0
  vextractf128  $1, %ymm1, %xmm1
  vshufpd $1, %xmm1, %xmm1, %xmm1
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

llvm-svn: 137733

2e99f1b3
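
The decision made for the shuffle in 2e99f1b3 can be sketched in Python as follows. This shows one simple condition under which a 256-bit shuffle decomposes into 128-bit shuffles (each half of the result draws from a single 128-bit half of one input); the commit may handle more cases. The name `split_shuffle_by_halves` and the tuple layout of the plan are hypothetical.

```python
def split_shuffle_by_halves(mask):
    """Plan a 256-bit shuffle as two independent 128-bit shuffles, if possible.

    `mask` indexes the concatenation of inputs a and b, each with len(mask)
    elements. Returns a list with one (operand, half, local_mask) entry per
    result half, or None when a result half mixes several 128-bit sources.
    """
    n = len(mask)
    half = n // 2
    plan = []
    for h in range(2):
        part = mask[h * half:(h + 1) * half]
        sources = {idx // half for idx in part}  # 0: a.lo, 1: a.hi, 2: b.lo, 3: b.hi
        if len(sources) != 1:
            return None
        src = sources.pop()
        operand = "a" if src < 2 else "b"
        plan.append((operand, src % 2, [idx % half for idx in part]))
    return plan
```

For the mask `<1, 0, 7, 6>` from the commit message, the plan is: swap the two elements of the low half of `%a`, swap the two elements of the high half of `%b`, then join the results. That corresponds to the two `vshufpd $1` instructions, the `vextractf128 $1`, and the final `vinsertf128 $1` in the new output.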

- While I'm here, remove the "_alt" hacks to a series of INSERT_SUBREG and · c1676e41
  Bruno Cardoso Lopes authored Aug 15, 2011
```
also add the AVX versions of the 128-bit patterns

llvm-svn: 137685
```
  c1676e41
- Reorder declarations of vmovmskp* and also put the necessary AVX · 67005029
  Bruno Cardoso Lopes authored Aug 15, 2011
```
predicate and TB encoding fields. This fixes the encoding for the
attached testcase. This fixes PR10625.

llvm-svn: 137684
```
  67005029

MCTargetAsmParser target match predicate support. · 120a96a7

Jim Grosbach authored Aug 15, 2011

Allow a target assembly parser to do context sensitive constraint checking
on a potential instruction match. This will be used, for example, to handle
Thumb2 IT block parsing.

llvm-svn: 137675

120a96a7

Aug 15, 2011
- Fix PR10656. It's only profitable to use 128-bit inserts and extracts · cbe7feea
  Bruno Cardoso Lopes authored Aug 15, 2011
```
when AVX mode is on. Otherwise it is just more work for the type
legalizer.

llvm-svn: 137661
```
  cbe7feea
Aug 12, 2011
- Fix comment! · c53dd2ac
  Bruno Cardoso Lopes authored Aug 12, 2011
```
llvm-svn: 137521
```
  c53dd2ac
- The VPERM2F128 is an AVX instruction which permutes between two 256-bit · f15dfe58
  Bruno Cardoso Lopes authored Aug 12, 2011
```
vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

llvm-svn: 137519
```
  f15dfe58
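
Recognizing a shuffle mask that suits VPERM2F128, as f15dfe58 describes, amounts to checking that each 128-bit half of the result is a whole 128-bit half of one of the inputs. The Python sketch below does that check and builds the immediate; `vperm2f128_imm` is a hypothetical name, undef mask elements and the instruction's zeroing bits are not modeled, and the actual legalizer code may differ.

```python
def vperm2f128_imm(mask):
    """Return the VPERM2F128 immediate for `mask`, or None if it does not fit.

    `mask` indexes the concatenation of two 256-bit inputs a and b. Each
    128-bit result half gets a 2-bit selector (0: a.lo, 1: a.hi, 2: b.lo,
    3: b.hi), placed in bits [1:0] for the low half and [5:4] for the high half.
    """
    n = len(mask)
    half = n // 2
    imm = 0
    for h in range(2):
        part = list(mask[h * half:(h + 1) * half])
        first = part[0]
        if first % half != 0:
            return None  # does not start on a 128-bit boundary
        if part != list(range(first, first + half)):
            return None  # elements within the half are reordered or mixed
        imm |= (first // half) << (4 * h)
    return imm
```

For example, the v8 mask `<4, 5, 6, 7, 12, 13, 14, 15>` takes the high half of each input and yields the immediate `0x31`.
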
- Move code around and add comments · 960c8f71
  Bruno Cardoso Lopes authored Aug 12, 2011
```
llvm-svn: 137518
```
  960c8f71
- Silence a bunch (but not all) "variable written but not read" warnings · a41634e3
  Duncan Sands authored Aug 12, 2011
```
when building with assertions disabled.

llvm-svn: 137460
```
  a41634e3
- findDeadCallerSavedReg fix: Missing NULL terminator in register arrays. · 210bf835
  Andrew Trick authored Aug 12, 2011
```
Fix by Ivan Baev. Sorry I don't have a unit test, but the fix is obvious so I don't want to delay it.

llvm-svn: 137404
```
  210bf835
Aug 11, 2011
- Add a dag combine to xform 256-bit shuffles into simple vector · 8fbf023c
  Bruno Cardoso Lopes authored Aug 11, 2011
```
inserts and extracts. This simple combine makes us generate only 1
instruction instead of 11 in the v8 case.

llvm-svn: 137362
```
  8fbf023c
- Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict. · 043c8208
  Bruno Cardoso Lopes authored Aug 11, 2011
```
llvm-svn: 137324
```
  043c8208
- Add a comment, per Bruno's CR. · efdd183f
  Nadav Rotem authored Aug 11, 2011
```
llvm-svn: 137313
```
  efdd183f
- [AVX] If the data which is going to be saved is already in two XMM registers · 1542d5a0
  Nadav Rotem authored Aug 11, 2011
```
(for example, after integer operation), do not pack the registers into a YMM
before saving. It's better to save as two XMM registers.

Before:
                vinsertf128         $1, %xmm3, %ymm0, %ymm3
                vinsertf128         $0, %xmm1, %ymm3, %ymm1
                vmovaps              %ymm1, 416(%rsp)

After:
                vmovaps              %xmm3, 416+16(%rsp)
                vmovaps              %xmm1, 416(%rsp)

llvm-svn: 137308
```
  1542d5a0
- Cleanup: Remove Int_ CVTSS2SI* forms · dbd1352c
  Bruno Cardoso Lopes authored Aug 11, 2011
```
llvm-svn: 137297
```
  dbd1352c
- Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing · a2d8bb97
  Bruno Cardoso Lopes authored Aug 11, 2011
```
infinite recursive calls in legalize. Fix PR10562

llvm-svn: 137296
```
  a2d8bb97
- Use the splat index to generate the desired shuffle. Otherwise we · 572c9aaf
  Bruno Cardoso Lopes authored Aug 11, 2011
```
could only get undefs and the vector shuffle becomes an undef,
generating wrong code.

llvm-svn: 137295
```
  572c9aaf