Commits · d6cc4062be820df20d047050bdc959503ca02b2d · Roger Ferrer / llvm-epi-0.8

Sep 01, 2012
- Typos · d6cc4062
  Craig Topper authored Sep 01, 2012
  
  llvm-svn: 163053
  d6cc4062
- SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its · 26c5d0f6
  Manman Ren authored Aug 31, 2012
  
  output chain is correctly set up. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://11457792 llvm-svn: 163036
  26c5d0f6
- Mark FMA4 instructions as commutable and add them to the folding tables. · 908e6851
  Craig Topper authored Aug 31, 2012
  
  llvm-svn: 163035
  908e6851
- Remove an unused argument. The MCInst opcode is set in the ConvertToMCInst() · 451ef13c
  Chad Rosier authored Aug 31, 2012
  
  function nowadays. llvm-svn: 163030
  451ef13c
- Add selection of RegOp2MemOpTable3 to canFoldMemoryOperand · 7573c8f0
  Craig Topper authored Aug 31, 2012
  
  llvm-svn: 163029
  7573c8f0
Aug 31, 2012
- Fix PR12359 · 3224543b
  Michael Liao authored Aug 31, 2012
  
  - In addition to the undefined case, if V2 is a zero vector, skip the 2nd PSHUFB and the POR as well, since PSHUFB zeroes elements with negative indices. Patch by Sriram Murali <sriram.murali@intel.com> llvm-svn: 163018
  3224543b
- The instruction DINS may be transformed into DINSU or DINSM depending · b3f3b17e
  Jack Carter authored Aug 31, 2012

  on the size of the insertion and its position in the 64 bit word. This patch allows support of the dins transformations with mips64 direct object output.

  0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
  DINS
  The field is entirely contained in the right-most word of the doubleword

  32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
  DINSM
  The field straddles the words of the doubleword

  32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
  DINSU
  The field is entirely contained in the left-most word of the doubleword

  llvm-svn: 163010
  b3f3b17e
- Add a comment to explain what's really going on. · 9d1fc367
  Chad Rosier authored Aug 31, 2012
  
  llvm-svn: 163005
  9d1fc367
- The ConvertToMCInst() function can't fail, so remove the now dead Match_ConversionFail enum. · a8f3c4fe
  Chad Rosier authored Aug 31, 2012
  
  llvm-svn: 163002
  a8f3c4fe
- Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted. · c0387f6b
  Craig Topper authored Aug 31, 2012
  
  llvm-svn: 163001
  c0387f6b
- Add support for converting llvm.fma to fma4 instructions. · c30fdbc4
  Craig Topper authored Aug 31, 2012
  
  llvm-svn: 162999
  c30fdbc4
- Clean up AddedComplexity further after adding UseSSEx · 969f3913
  Michael Liao authored Aug 31, 2012
  
  llvm-svn: 162973
  969f3913
- Fix a couple of typos in EmitAtomic. · d3bda3c5
  Jakob Stoklund Olesen authored Aug 31, 2012
  
  Thumb2 instructions are mostly constrained to rGPR, not tGPR which is for Thumb1. rdar://problem/12203728 llvm-svn: 162968
  d3bda3c5
- X86: Fix encoding of 'movd %xmm0, %rax' · e423e865
  Jim Grosbach authored Aug 31, 2012
  
  The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v' prefix, resulting in mis-assembly of the vanilla movd instruction. llvm-svn: 162963
  e423e865
- With the fix in r162954/162955 every cvt function returns true. Thus, have · 98cfa104
  Chad Rosier authored Aug 31, 2012
  
  the ConvertToMCInst() return void, rather than a bool. Update all the cvt functions as well. llvm-svn: 162961
  98cfa104
- Fix for r162954. Return the Error. · db482ef7
  Chad Rosier authored Aug 30, 2012
  
  llvm-svn: 162955
  db482ef7
- Move a check to the validateInstruction() function where it more properly belongs. · 8513ffbb
  Chad Rosier authored Aug 30, 2012
  
  llvm-svn: 162954
  8513ffbb
- Typo. · 5eec49fe
  Chad Rosier authored Aug 30, 2012
  
  llvm-svn: 162952
  5eec49fe
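
The PR12359 fix listed above relies on PSHUFB zeroing every lane whose mask byte has its high bit set. The sketch below emulates that behaviour on plain Python lists to show why the second PSHUFB and the POR are redundant when V2 is all zeroes. The helper names (`pshufb`, `shuffle_two_pshufb`, `shuffle_one_pshufb`) are hypothetical and this is not the LLVM lowering code itself.

```python
ZERO_LANE = 0x80  # mask byte with the high bit set, i.e. a negative signed byte


def pshufb(src, mask):
    """Emulate PSHUFB on a 16-byte vector represented as a list of ints."""
    out = []
    for m in mask:
        if m & 0x80:
            out.append(0)              # high bit set: the lane is zeroed
        else:
            out.append(src[m & 0x0F])  # otherwise the low 4 bits pick a source byte
    return out


def shuffle_two_pshufb(v1, v2, shuffle):
    """General form: indices 0-15 select from v1, 16-31 select from v2."""
    m1 = [i if i < 16 else ZERO_LANE for i in shuffle]
    m2 = [i - 16 if i >= 16 else ZERO_LANE for i in shuffle]
    a = pshufb(v1, m1)
    b = pshufb(v2, m2)
    return [x | y for x, y in zip(a, b)]  # POR of the two partial results


def shuffle_one_pshufb(v1, shuffle):
    """Shortcut for an all-zero v2: lanes taken from v2 come out as zero anyway."""
    m1 = [i if i < 16 else ZERO_LANE for i in shuffle]
    return pshufb(v1, m1)


v1 = list(range(100, 116))
zero = [0] * 16
shuffle = [0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23]
same = shuffle_two_pshufb(v1, zero, shuffle) == shuffle_one_pshufb(v1, shuffle)
```

With a zero V2, both paths produce the same vector, so `same` is true for any shuffle mask with indices in the range 0-31.
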
Aug 30, 2012

Introduce 'UseSSEx' to force SSE legacy encoding · bbd10792

Michael Liao authored Aug 30, 2012

- Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX
  is enabled.

  Because of the penalty for inter-mixing SSE and AVX instructions, we
  need to prevent SSE legacy insns from being generated unless they are
  explicitly requested through intrinsics. For patterns supported by
  both SSE and AVX, we have so far forced the AVX insn to be tried first
  by relying on AddedComplexity or on position in the td file. That is
  error-prone and easily introduces bugs.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force either VEX encoding or SSE
  legacy encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.
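
A minimal sketch of how the two predicate families described above relate, assuming 'UseSSEx' means "SSE level x is available and AVX is not" (which matches the statement that it is disabled when AVX is turned on). The `Subtarget` class and the function names are hypothetical stand-ins, not the actual TableGen predicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtarget:
    sse_level: int   # hypothetical encoding: 0 = none, 1 = SSE1, 2 = SSE2, ...
    has_avx: bool


def has_sse(st, level):
    """'HasSSEx': the feature level is available, regardless of AVX."""
    return st.sse_level >= level


def use_sse(st, level):
    """'UseSSEx' (assumed meaning): available, and AVX is not enabled."""
    return has_sse(st, level) and not st.has_avx


sse2_only = Subtarget(sse_level=2, has_avx=False)
avx_machine = Subtarget(sse_level=4, has_avx=True)
```

Under this reading, a pattern guarded by `use_sse` can never compete with its VEX-encoded twin, while instructions with no AVX counterpart keep the plain `has_sse` guard and stay selectable on AVX machines.
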

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation differs from SSE. 'vcvtsi2sdq' cannot be selected
   by fast-isel because its pattern is too complicated, so fast-isel
   falls back to materializing it from the constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation differs from SSE after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' over
   'vmovaps'.

llvm-svn: 162919

bbd10792

PPCISelLowering.cpp: Fix r162725. · ac49029f

NAKAMURA Takumi authored Aug 30, 2012

[Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good!

Right. That's the case where there is no chain of register copies before the call, so InFlag == 0... Attached is a new revision of the patch which should fix this for good.

llvm-svn: 162916

ac49029f

PPCISelLowering.cpp: Whitespace. · 8ad54e04
NAKAMURA Takumi authored Aug 30, 2012
```
llvm-svn: 162915
```
8ad54e04
Add support for moving pure S-register to NEON pipeline if desired · ca9f384f
Tim Northover authored Aug 30, 2012
```
llvm-svn: 162898
```
ca9f384f
Only perform DAG combine on FMAs of legal types. · e39ad7b5
Craig Topper authored Aug 30, 2012
```
llvm-svn: 162892
```
e39ad7b5

Fix PR13727 · 3c898064

Michael Liao authored Aug 30, 2012

- The root cause is that target constant materialization in X86 fast-isel
  creates PC-relative addressing, which may overflow the 32-bit range in a
  non-Small code model if the .rodata section is allocated too far away from
  the code segment in MCJIT, which uses the Large code model so far.
- Follow the similar logic already used in fast-isel and skip this
  materialization in non-Small code models.
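
The overflow described above comes from the signed 32-bit displacement that x86-64 PC-relative addressing uses. The following is only an illustrative range check with made-up addresses. `fits_rel32` is a hypothetical helper, not part of fast-isel.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_rel32(next_insn_addr, target_addr):
    """Return True if target is reachable with a signed 32-bit displacement.

    The displacement is measured from the address of the next instruction.
    """
    disp = target_addr - next_insn_addr
    return INT32_MIN <= disp <= INT32_MAX


# Made-up addresses: code and data close together, then 4 GiB apart.
near = fits_rel32(0x0040_0000, 0x0060_0000)
far = fits_rel32(0x0040_0000, 0x0040_0000 + (1 << 32))
```

When code and constant pool may be placed arbitrarily far apart, as with the Large code model, the check can fail, which is why the constant cannot be addressed this way.
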

llvm-svn: 162881

3c898064

Aug 29, 2012

Rename hasVolatileMemoryRef() to hasOrderedMemoryRef(). · cea3e774

Jakob Stoklund Olesen authored Aug 29, 2012

Ordered memory operations are more constrained than volatile loads and
stores because they must be ordered with respect to all other memory
operations.

llvm-svn: 162861

cea3e774

Reserve space for the mandatory traceback fields on PPC64. · 1859d265

Hal Finkel authored Aug 29, 2012

We need to reserve space for the mandatory traceback fields,
though leaving them as zero is appropriate for now.

Although the ABI calls for these fields to be filled in fully, no
compiler on Linux currently does this, and GDB does not read these
fields.  GDB uses the first word of zeroes during exception handling to
find the end of the function and the size field, allowing it to compute
the beginning of the function.  DWARF information is used for everything
else.  We need the extra 8 bytes of pad so the size field is found in
the right place.

As a comparison, GCC fills in a few of the fields -- language, number
of saved registers -- but ignores the rest.  IBM's proprietary OSes do
make use of the full traceback table facility.

Patch by Bill Schmidt.

llvm-svn: 162854

1859d265

Refactor setExecutionDomain to be clearer about what it's doing and more robust. · 771f1607
Tim Northover authored Aug 29, 2012
```
llvm-svn: 162844
```
771f1607
Make helper function static. · 8f5c5ded
Benjamin Kramer authored Aug 29, 2012
```
llvm-svn: 162843
```
8f5c5ded

Make MemoryBuiltins aware of TargetLibraryInfo. · 8bcc9711

Benjamin Kramer authored Aug 29, 2012

This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.

Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadInstructions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.
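
The consequence of an optional TLI argument can be pictured with a small model. This is a sketch under stated assumptions: `LibraryInfoSketch`, `is_malloc_call` and `remove_dead_allocations` are hypothetical names that only mimic the behaviour described above, and they are not the LLVM API.

```python
class LibraryInfoSketch:
    """Hypothetical stand-in for TargetLibraryInfo: which builtins are usable."""

    def __init__(self, available):
        self._available = set(available)

    def has(self, name):
        return name in self._available


def is_malloc_call(callee, tli=None):
    # Without library info we must not assume "malloc" is the C allocator.
    if tli is None:
        return False
    return callee == "malloc" and tli.has("malloc")


def remove_dead_allocations(calls, tli=None):
    """calls: list of (callee, result_is_used). Drop unused malloc calls only."""
    return [(c, used) for c, used in calls
            if used or not is_malloc_call(c, tli)]


calls = [("malloc", False), ("malloc", True), ("puts", False)]
hosted = LibraryInfoSketch({"malloc"})
freestanding = LibraryInfoSketch(set())   # models -fno-builtin / -ffreestanding
```

With `hosted`, the unused `malloc` is dropped. With `freestanding`, or with no library info passed at all, the list comes back unchanged, which is the "won't remove dead mallocs anymore" case.
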

Fixes PR13694 and probably others.

llvm-svn: 162841

8bcc9711

Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. · a999c662
Craig Topper authored Aug 29, 2012
```
llvm-svn: 162829
```
a999c662
Cleanup sloppy code. Jakob's review. · b57e2257
Andrew Trick authored Aug 29, 2012
```
llvm-svn: 162825
```
b57e2257
[arm-fast-isel] Add support for ARM PIC. · e87e559e
Jush Lu authored Aug 29, 2012
```
llvm-svn: 162823
```
e87e559e

Fix ARM vector copies of overlapping register tuples. · bd0073dd

Andrew Trick authored Aug 29, 2012

I have tested the fix, but have not been successful in generating
a robust unit test. This can only be exposed through particular
register assignments.

llvm-svn: 162821

bd0073dd

cleanup · 4cc6949a
Andrew Trick authored Aug 29, 2012
```
llvm-svn: 162820
```
4cc6949a
Typo. · 3b1336ce
Chad Rosier authored Aug 28, 2012
```
llvm-svn: 162807
```
3b1336ce
Add comments on the literal value used. · 407d659f
Michael Liao authored Aug 28, 2012
```
llvm-svn: 162805
```
407d659f

Aug 28, 2012

The instruction DEXT may be transformed into DEXTU or DEXTM depending · cd6b0e13

Jack Carter authored Aug 28, 2012

on the size of the extraction and its position in the 64 bit word.

This patch allows support of the dext transformations with mips64 direct
object output.

0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword

32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword

32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword
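
The three ranges above name DINS, DINSM and DINSU, and they can be read as a selection rule on the field position and size. The sketch below assumes `lsb = pos` and `msb = pos + size - 1`, which is consistent with the listed bounds. `select_dins_variant` is a hypothetical helper, not the code from this patch.

```python
def select_dins_variant(pos, size):
    """Pick the insert-field variant for a field of `size` bits at bit `pos`."""
    if not (0 <= pos < 64 and 1 <= size <= 64 and pos + size <= 64):
        raise ValueError("field does not fit in a 64-bit doubleword")
    msb = pos + size - 1   # assumed relation between pos/size and msb
    if pos >= 32:
        return "DINSU"     # field entirely in the left-most word
    if msb >= 32:
        return "DINSM"     # field straddles the two words
    return "DINS"          # field entirely in the right-most word


examples = {
    (5, 10): select_dins_variant(5, 10),
    (31, 2): select_dins_variant(31, 2),
    (40, 8): select_dins_variant(40, 8),
}
```

Checking `pos` first and `msb` second covers every valid combination exactly once, because a field starting in the upper word can never reach back into the lower one.
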

llvm-svn: 162782

cd6b0e13

Explicitly update the number of nodes to be traversed · 710e1a59
Michael Liao authored Aug 28, 2012
```
llvm-svn: 162780
```
710e1a59

Some instructions are passed to the assembler to be · c20a21b8

Jack Carter authored Aug 28, 2012

transformed to the final instruction variant. An
example would be dsll, which is transformed into
dsll32 if the shift amount is 32 or greater.

For direct object output we need to do this transformation
in the codegen. If the instruction was inside branch
delay slot, it was being missed. This patch corrects this
oversight.
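
The transformation described above can be sketched as follows, assuming the 5-bit shift-amount field of the MIPS64 doubleword shifts, so that amounts 32 through 63 need the "32" variant with the amount reduced by 32. `lower_large_shift` is a hypothetical name and the mnemonics are handled as plain strings.

```python
def lower_large_shift(mnemonic, shamt):
    """Map a 64-bit shift with amount 0..63 onto an encodable instruction."""
    if not 0 <= shamt < 64:
        raise ValueError("shift amount out of range for a 64-bit shift")
    if shamt >= 32:
        # Amounts 32..63 do not fit the 5-bit field: use the +32 variant.
        return mnemonic + "32", shamt - 32
    return mnemonic, shamt


small = lower_large_shift("dsll", 31)
large = lower_large_shift("dsll", 40)
```

The same rewrite has to run on every instruction that is emitted, including one sitting in a branch delay slot, which is the case the patch reports as previously missed.
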

llvm-svn: 162779

c20a21b8

Emit word of zeroes after the last instruction as a start of the mandatory · 8c4b6a30

Roman Divacky authored Aug 28, 2012

traceback table on PowerPC64. This helps gdb handle exceptions. The other
mandatory fields are ignored by gdb and harder to implement, so just add
a FIXME there.

Patch by Bill Schmidt. PR13641.

llvm-svn: 162778

8c4b6a30