Commits · 37521aa89c2a34301dbb8c1afc82823862ca5194 · Roger Ferrer / llvm-epi-0.8

Sep 16, 2012

The PMOVZXWD family of instructions extends narrow vector types to wide vector types. · 37521aa8

Nadav Rotem authored Sep 16, 2012

It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast,
and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics.

rdar://11897677

llvm-svn: 163995

37521aa8
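
As a rough illustration of the load, bitcast, and extend sequence this commit message describes, the Python sketch below models the computation on plain integers. It is not the LLVM pattern itself: the function name `pmovzxwd_model`, the 64-bit load width, and the little-endian lane layout are assumptions made for this example.

```python
import struct

def pmovzxwd_model(memory: bytes) -> list:
    """Model of load + bitcast + zero-extend: 64 bits -> four 32-bit lanes."""
    # Load one wide 64-bit value (the "wide type" loaded by intrinsics code).
    (wide,) = struct.unpack_from("<Q", memory, 0)
    # Bitcast: reinterpret the same 64 bits as four unsigned 16-bit lanes.
    lanes16 = struct.unpack("<4H", struct.pack("<Q", wide))
    # Zero-extend each 16-bit lane to 32 bits (the upper 16 bits are zero).
    return [lane & 0xFFFFFFFF for lane in lanes16]

data = struct.pack("<4H", 1, 0xFFFF, 0x8000, 42)
print(pmovzxwd_model(data))
```

Lanes with the top bit set, such as 0x8000, stay positive after the extension, which is what distinguishes zero extension from sign extension.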

Sep 15, 2012
- X86: Emitting x87 fsin/fcos for sinf/cosf is not safe without unsafe fp math. · ece43425
  Benjamin Kramer authored Sep 15, 2012
```
This was only an issue if sse is disabled.

llvm-svn: 163967
```
  ece43425
- Handled unaligned load/stores properly in Mips16 · 189d0add
  Akira Hatanaka authored Sep 15, 2012
```
Patch by Reed Kotler.

llvm-svn: 163956
```
  189d0add
Sep 14, 2012
- Fix both the test for zero and what we do if we have a zero for · b83dba2b
  Eric Christopher authored Sep 13, 2012
```
umulo legalization.

Fixes PR13839

llvm-svn: 163856
```
  b83dba2b
Sep 13, 2012

Add wider vector/integer support for PR12312 · 137f8aed

Michael Liao authored Sep 13, 2012

- Enhance the fix to PR12312 to support wider integers, such as 256-bit
  integers. If more than one fully evaluated vector is found, POR them
  together first, followed by the final PTEST.

llvm-svn: 163832

137f8aed
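
The POR-then-PTEST idea can be sketched on plain integers. This is only a model, assuming the operation being lowered is a test of a wide integer against zero and that each 128-bit chunk stands in for one fully evaluated vector; the function name `is_zero_wide` and its parameters are hypothetical.

```python
def is_zero_wide(value: int, total_bits: int = 256, chunk_bits: int = 128) -> bool:
    """Test a wide integer against zero by OR-ing its chunks, then testing once."""
    mask = (1 << chunk_bits) - 1
    # Split the wide integer into vector-sized chunks.
    chunks = [(value >> shift) & mask for shift in range(0, total_bits, chunk_bits)]
    # OR all chunks together first (models the POR step).
    combined = 0
    for chunk in chunks:
        combined |= chunk
    # A single zero test on the combined value (models the final PTEST).
    return combined == 0

print(is_zero_wide(0), is_zero_wide(1 << 200))
```

OR-ing first means only one zero test is needed regardless of how many chunks the wide integer spans.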

Enhance type legalization on bitcast from vector to integer · 460fc46e

Michael Liao authored Sep 13, 2012

- Find a legal vector type before casting and extracting elements from it.
- As the new vector type may have more than 2 elements, build the final
  hi/lo pair by pairing them breadth-first from bottom to top.

llvm-svn: 163830

460fc46e
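
A minimal sketch of the bottom-up pairing, modeled on Python integers rather than SelectionDAG nodes. It assumes the extracted elements are listed from least to most significant and that their count is a power of two; `pair_bottom_up` is a hypothetical name.

```python
def pair_bottom_up(elements, element_bits):
    """Combine integer pieces level by level until one (lo, hi) pair remains."""
    parts = list(elements)
    bits = element_bits
    # This sketch requires a power-of-two element count of at least 2.
    assert len(parts) >= 2 and len(parts) & (len(parts) - 1) == 0
    while len(parts) > 2:
        # Pair neighbours on this level: the lower-indexed piece is the low half.
        parts = [parts[i] | (parts[i + 1] << bits) for i in range(0, len(parts), 2)]
        bits *= 2
    lo, hi = parts
    return lo, hi, bits

lo, hi, width = pair_bottom_up([1, 2, 3, 4, 5, 6, 7, 8], 8)
print(hex(lo), hex(hi), width)
```

Each pass halves the number of pieces and doubles their width, so the loop ends with exactly the two halves of the final integer.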

Fix test case to avoid PIC magic. · 32a56fa3
Jakob Stoklund Olesen authored Sep 13, 2012
```
llvm-svn: 163827
```
32a56fa3

Fix the TCRETURNmi64 bug differently. · 3cf3ffce

Jakob Stoklund Olesen authored Sep 13, 2012

Add a PatFrag to match X86tcret using 6 fixed registers or fewer. This
avoids folding loads into TCRETURNmi64 using 7 or more volatile
registers.

<rdar://problem/12282281>

llvm-svn: 163819

3cf3ffce

Revert r163761 "Don't fold indexed loads into TCRETURNmi64." · 78b9f8fc
Jakob Stoklund Olesen authored Sep 13, 2012
```
The patch caused "Wrong topological sorting" assertions.

llvm-svn: 163810
```
78b9f8fc
This patch introduces A15 as a target in LLVM. · b47bb94f
Silviu Baranga authored Sep 13, 2012
```
llvm-svn: 163803
```
b47bb94f

Fix a dagcombine optimization. The optimization attempts to optimize a bitcast of fneg to integers · 24a822a5

Nadav Rotem authored Sep 13, 2012

by xoring the high-bit. This fails if the source operand is a vector because we need to negate
each of the elements in the vector.

Fix rdar://12281066 PR13813.

llvm-svn: 163802

24a822a5
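
The difference between the scalar and vector cases can be checked with a small Python model. It assumes IEEE-754 single-precision elements and a little-endian packing of two floats into one 64-bit integer; the helper names are hypothetical and this is not the DAG combiner code.

```python
import struct

def f32_bits(x: float) -> int:
    """Bit pattern of a single-precision float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def fneg_bitcast_scalar(x: float) -> int:
    # For a scalar, bitcast(fneg(x)) equals bitcast(x) with the high bit flipped.
    return f32_bits(x) ^ 0x80000000

def fneg_bitcast_v2f32(x0: float, x1: float) -> int:
    # For a vector, the sign bit of every element must be flipped,
    # not just the high bit of the whole 64-bit integer.
    bits = f32_bits(x0) | (f32_bits(x1) << 32)
    return bits ^ 0x8000000080000000

packed = f32_bits(1.0) | (f32_bits(2.0) << 32)
wrong = packed ^ (1 << 63)  # flips only the sign of the second element
print(hex(fneg_bitcast_v2f32(1.0, 2.0)), hex(wrong))
```

Flipping only bit 63 negates the last element and leaves the first one unchanged, which is the failure mode the commit message describes.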

· 4e9ad066

Nadav Rotem authored Sep 13, 2012

Stack Coloring: We have code that checks that all of the uses of allocas
are within the lifetime zone. Sometimes legitimate uses of allocas are
hoisted outside of the lifetime zone. For example, GEPs may calculate the
address of a member of an allocated struct. This commit makes sure that
we only check (abort regions or assert) for instructions that read and write
memory using stack frames directly. Notice that by allowing legitimate
uses outside the lifetime zone we also stop checking for instructions
which use derivatives of allocas. We will catch fewer bugs in user code
and in the compiler itself.

llvm-svn: 163791

4e9ad066

Don't fold indexed loads into TCRETURNmi64. · bfacef45

Jakob Stoklund Olesen authored Sep 13, 2012

We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>

llvm-svn: 163761

bfacef45

Sep 12, 2012

Fix PR11985 · abb87d48

Michael Liao authored Sep 12, 2012

    
- BlockAddress has no support for the BA + offset form and there is no way to
  propagate that offset into a machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify forming target block addresses;
- All targets are modified to use the new interface and the X86 backend is
  enhanced to support BA + offset addressing.

llvm-svn: 163743

abb87d48

This patch corrects logic in PPCFrameLowering for save and restore of ... · c9e23d93

Roman Divacky authored Sep 12, 2012

This patch corrects logic in PPCFrameLowering for save and restore of
nonvolatile condition register fields across calls under the SVR4 ABIs.

 * With the 64-bit ABI, the save location is at a fixed offset of 8 from
the stack pointer.  The frame pointer cannot be used to access this
portion of the stack frame since the distance from the frame pointer may
change with alloca calls.

 * With the 32-bit ABI, the save location is just below the general
register save area, and is accessed via the frame pointer like the rest
of the save areas.  This is an optional slot, so it must only be created
if any of CR2, CR3, and CR4 were modified.

 * For both ABIs, save/restore logic is generated only if one of the
nonvolatile CR fields was modified.

I also took this opportunity to clean up an extra FIXME in
PPCFrameLowering.h.  Save area offsets for 32-bit GPRs are meaningless
for the 64-bit ABI, so I removed them for correctness and efficiency.


Fixes PR13708 and partially also PR13623. It lets us enable exception handling
on PPC64.

Patch by William J. Schmidt!

llvm-svn: 163713

c9e23d93

Fix constant folding through bitcasts by no longer relying on undefined... · e6b876f4

Kristof Beyls authored Sep 12, 2012

Fix constant folding through bitcasts by no longer relying on undefined behaviour (converting NaN values between float and double).

SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget);
should not be used when Val is not a simple constant (as the comment in
SelectionDAG.h indicates). This patch avoids using this function
when folding an unknown constant through a bitcast, where it cannot be
guaranteed that Val will be a simple constant.

llvm-svn: 163703

e6b876f4

Stack coloring: remove lifetime intervals which contain escaped allocas. · 8ff00989

Nadav Rotem authored Sep 12, 2012

The input program may contain instructions which are not inside lifetime
markers. This can happen due to a bug in the compiler or due to a bug in
user code (for example, returning a reference to a local variable).
This commit adds checks over all of the instructions in the function and
invalidates lifetime ranges which do not contain all of the instructions.

llvm-svn: 163678

8ff00989
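
The invalidation rule can be sketched as follows. This is a simplified model, not the StackColoring pass: it assumes instructions are numbered linearly, each stack slot has a single inclusive (begin, end) lifetime interval, and every use is recorded as an (instruction index, slot) pair. The function name `valid_lifetime_intervals` is hypothetical.

```python
def valid_lifetime_intervals(intervals, uses):
    """Drop the interval of any slot that is used outside its lifetime markers.

    intervals: dict mapping slot -> (begin, end), both inclusive
    uses:      iterable of (instruction_index, slot)
    """
    valid = dict(intervals)
    for index, slot in uses:
        if slot not in valid:
            continue  # already invalidated, or the slot has no markers
        begin, end = valid[slot]
        if not (begin <= index <= end):
            # A use escapes the markers, so the interval cannot be trusted.
            del valid[slot]
    return valid

intervals = {0: (2, 5), 1: (6, 9)}
uses = [(3, 0), (7, 1), (11, 1)]
print(valid_lifetime_intervals(intervals, uses))
```

In this model, a slot whose interval was dropped would simply not be a candidate for sharing stack space with other slots.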

Sep 11, 2012
- [ms-inline asm] Split the parsing of IR asm strings into GCC and MS variants. · 1778831a
  Chad Rosier authored Sep 11, 2012
```
Add support in the EmitMSInlineAsmStr() function for handling integer consts.

llvm-svn: 163645
```
  1778831a
- Formatting. No functional change intended. · ab51c9de
  Chad Rosier authored Sep 11, 2012
```
llvm-svn: 163627
```
  ab51c9de
- Stack Coloring: Don't crash on dbg values which use stack frames. · 65ba95eb
  Nadav Rotem authored Sep 11, 2012
```
llvm-svn: 163616
```
  65ba95eb
- test/CodeGen/X86/ms-inline-asm.ll: Relax for non-darwin x86 targets.... · 8c72306c
  NAKAMURA Takumi authored Sep 10, 2012
```
test/CodeGen/X86/ms-inline-asm.ll: Relax for non-darwin x86 targets. '##InlineAsm' could not be seen in other hosts.

llvm-svn: 163554
```
  8c72306c
Sep 10, 2012
- [ms-inline asm] Properly emit the asm directives when the AsmPrinterVariant · 7641f587
  Chad Rosier authored Sep 10, 2012
```
and InlineAsmVariant don't match.

llvm-svn: 163550
```
  7641f587
- Update test case for Release builds. · 1c1319b9
  Chad Rosier authored Sep 10, 2012
```
llvm-svn: 163549
```
  1c1319b9
- [ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function · db20a41d
  Chad Rosier authored Sep 10, 2012
```
and update the printOperand() function accordingly.

llvm-svn: 163544
```
  db20a41d
- Don't attempt to use flags from predicated instructions. · 8b9dce5c
  Jakob Stoklund Olesen authored Sep 10, 2012
```
The ARM backend can eliminate cmp instructions by reusing flags from a
nearby sub instruction with similar arguments.

Don't do that if the sub is predicated - the flags are not written
unconditionally.

<rdar://problem/12263428>

llvm-svn: 163535
```
  8b9dce5c
- Stack Coloring: Handle the case where END markers come before BEGIN markers properly. · 3c86b78a
  Nadav Rotem authored Sep 10, 2012
```
llvm-svn: 163530
```
  3c86b78a
- Enhance PR11334 fix to support extload from v2f32/v4f32 · 400f7ef8
  Michael Liao authored Sep 10, 2012
```
    
- Fix a remaining issue of PR11674 as well

llvm-svn: 163528
```
  400f7ef8
- Add boolean simplification support from CMOV · c3d5b21c
  Michael Liao authored Sep 10, 2012
```
- If a boolean value is generated from CMOV and tested as boolean value,
  simplify the use of test result by referencing the original condition.
  The RDRAND intrinsic is one such case.

llvm-svn: 163516
```
  c3d5b21c
- Fix an assertion failure when optimising a shufflevector incorrectly into... · 1e5c6118
  James Molloy authored Sep 10, 2012
```
Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types.

llvm-svn: 163511
```
  1e5c6118
- Stack Coloring: Add support for multiple regions of the same slot, within a single basic block. · 67313631
  Nadav Rotem authored Sep 10, 2012
```
llvm-svn: 163507
```
  67313631
- The VPSHUFB 256-bit instruction may be generated when one of the input vectors is... · 264fb021
  Elena Demikhovsky authored Sep 10, 2012
```
The VPSHUFB 256-bit instruction may be generated when one of the input vectors is undefined or zeroinitializer.
I've added the "zeroinitializer" case in this patch.

llvm-svn: 163506
```
  264fb021
- Teach the DAGBuilder about lifetime markers which are generated from PHINodes. · d753a952
  Nadav Rotem authored Sep 10, 2012
```
llvm-svn: 163494
```
  d753a952
- Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants. · 03f39773
  Craig Topper authored Sep 09, 2012
```
llvm-svn: 163483
```
  03f39773
Sep 08, 2012
- Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled. · 4ed79bd7
  Craig Topper authored Sep 08, 2012
```
llvm-svn: 163473
```
  4ed79bd7
- Add support for lowering FABS of vector types. · 98f2e861
  Craig Topper authored Sep 08, 2012
```
llvm-svn: 163461
```
  98f2e861
- Set operation action for FFLOOR to Expand for all vector types for X86. Set... · 3e41a5bb
  Craig Topper authored Sep 08, 2012
```
Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.

llvm-svn: 163458
```
  3e41a5bb
Sep 06, 2012

Allow overlaps between virtreg and physreg live ranges. · 866908c4

Jakob Stoklund Olesen authored Sep 06, 2012

The RegisterCoalescer understands overlapping live ranges where one
register is defined as a copy of the other. With this change, register
allocators using LiveRegMatrix can do the same, at least for copies
between physical and virtual registers.

When a physreg is defined by a copy from a virtreg, allow those live
ranges to overlap:

  %CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11
  %vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill>

We can assign %vreg11 to %ECX, overlapping the live range of %CL.

llvm-svn: 163336

866908c4

Disable stack coloring by default in order to resolve the i386 failures. · 9e3cc9f8
Nadav Rotem authored Sep 06, 2012
```
llvm-svn: 163316
```
9e3cc9f8

AVX2 optimization. · 42777877

Elena Demikhovsky authored Sep 06, 2012

Added generation of the VPSHUFB instruction for <32 x i8> vector shuffles when possible.

llvm-svn: 163312

42777877

Fix the test by specifying an exact cpu model. · ea0d36be
Nadav Rotem authored Sep 06, 2012
```
llvm-svn: 163307
```
ea0d36be