Commits · 727a771a5f8492665817729a84a8bc83d21f37c5 · Roger Ferrer / llvm-epi-0.8

Jun 26, 2012

There are a number of generic inline asm operand modifiers that · 5e69cffe

Jack Carter authored Jun 26, 2012

up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.
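
The fallback described in this message can be pictured with a small model. This is only a sketch in Python, not LLVM's C++ code: the class names, the method name, and the target-specific 'x' modifier are hypothetical. It assumes the GCC-style meaning of the two generic modifiers ('c' prints a constant without immediate punctuation, 'n' prints the negated constant).

```python
# Sketch only: models the dispatch described above, not LLVM's C++ API.
# All class and method names here are hypothetical.


class GenericAsmPrinter:
    """Stands in for the target-independent printer."""

    def print_asm_operand(self, imm, modifier):
        if modifier == "c":
            # 'c': print the constant with no immediate punctuation.
            return str(imm)
        if modifier == "n":
            # 'n': print the negated constant.
            return str(-imm)
        raise ValueError(f"unknown modifier {modifier!r}")


class TargetAsmPrinter(GenericAsmPrinter):
    """Stands in for a target printer with one target-specific modifier."""

    def print_asm_operand(self, imm, modifier):
        if modifier == "x":
            # Hypothetical target-specific modifier: hexadecimal output.
            return hex(imm)
        # No explicit code for this modifier: defer to the generic routine.
        return super().print_asm_operand(imm, modifier)
```

With this shape, a target only writes code for the modifiers it treats specially, and anything else reaches the shared routine.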


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributor: Jack Carter
llvm-svn: 159203

5e69cffe

Removed unused variable · 863d2d32
Elena Demikhovsky authored Jun 26, 2012
```
llvm-svn: 159197
```
863d2d32
Rename to match other X86_64* names. · 8ed44466
Bill Wendling authored Jun 26, 2012
```
llvm-svn: 159196
```
8ed44466

Shuffle optimization for AVX/AVX2. · 26088d2e

Elena Demikhovsky authored Jun 26, 2012

The current patch optimizes frequently used shuffle patterns, reducing the length of the generated instruction sequence.
Before:
      vshufps         $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
      vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
      vextractf128    $1, %ymm1, %xmm1
      vextractf128    $1, %ymm0, %xmm0
      vshufps         $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
      vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
      vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps         $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps         $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
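
As a sanity check, the two sequences can be replayed symbolically. The sketch below is not part of the patch. It takes the lane selections directly from the `##` comments in the listings and assumes that `vextractf128 $1` yields the upper four lanes and that `vinsertf128 $1` writes the upper half of the destination. The helper `pick` and the lane labels are illustrative names.

```python
# Lane selections are copied from the "##" comments in the listings above.
a = [f"a{i}" for i in range(8)]  # symbolic lanes of ymm0
b = [f"b{i}" for i in range(8)]  # symbolic lanes of ymm1


def pick(src, idx):
    """Return the lanes of src selected by the index list idx."""
    return [src[i] for i in idx]


# Before: shuffle the low halves, then the extracted high halves, then merge.
lo = pick(a, [1, 3]) + pick(b, [1, 3])        # vshufps on xmm0/xmm1
lo = pick(lo, [0, 2, 1, 3])                   # vpermilps
hi_a, hi_b = a[4:], b[4:]                     # vextractf128 $1 (assumed: upper half)
hi = pick(hi_a, [1, 3]) + pick(hi_b, [1, 3])  # vshufps
hi = pick(hi, [0, 2, 1, 3])                   # vpermilps
before = lo + hi                              # vinsertf128 $1

# After: two 256-bit shuffles and one unpack.
t1 = pick(b, [1, 3]) + pick(a, [0, 0]) + pick(b, [5, 7]) + pick(a, [4, 4])
t0 = pick(a, [1, 3, 0, 0, 5, 7, 4, 4])
after = [t0[0], t1[0], t0[1], t1[1], t0[4], t1[4], t0[5], t1[5]]  # vunpcklps

print(before == after)
```

Both sequences interleave the odd lanes of the two inputs, so the three-instruction form replaces the seven-instruction form without changing the result.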

llvm-svn: 159188

26088d2e

Remove some duplicate instructions that exist only to give different... · 94bf0f38

Craig Topper authored Jun 26, 2012

Remove some duplicate instructions that exist only to give different mnemonics for the assembler. Use InstAlias instead.

llvm-svn: 159184

94bf0f38

Make some ugly hacks for inline asm operands which name a specific register a... · bbcd09cc
Eli Friedman authored Jun 25, 2012
```
Make some ugly hacks for inline asm operands which name a specific register a bit more thorough.  PR13196.

llvm-svn: 159176
```
bbcd09cc

Jun 25, 2012

ARM: update peephole optimization. · 606953fb

Manman Ren authored Jun 25, 2012

More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".
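
The movhs-to-movls rewrite follows from swapping the compared operands: a condition tested on `cmp b, a` holds exactly when the mirrored condition holds on the flags of `sub a, b`. The table below is a sketch of that mapping for the ten condition codes named above. The names `SWAPPED` and `holds` are illustrative and are not taken from the patch.

```python
# Condition to use on the flags of "sub a, b" in place of a condition
# that was tested on "cmp b, a" (operands swapped).
SWAPPED = {
    "GE": "LE", "LE": "GE", "GT": "LT", "LT": "GT",  # signed
    "HS": "LS", "LS": "HS", "HI": "LO", "LO": "HI",  # unsigned
    "EQ": "EQ", "NE": "NE",                          # symmetric
}


def holds(cc, x, y):
    """Whether condition cc is true after comparing x against y.

    Plain Python integers are used, so the signed and unsigned families
    share the same operators here; callers supply values that are valid
    in the intended interpretation.
    """
    return {
        "GE": x >= y, "LE": x <= y, "GT": x > y, "LT": x < y,
        "HS": x >= y, "LS": x <= y, "HI": x > y, "LO": x < y,
        "EQ": x == y, "NE": x != y,
    }[cc]


print(SWAPPED["HS"])  # condition to use once the cmp is removed
```

Note that EQ and NE are unchanged by the swap, which is why they can be handled alongside the ordered conditions.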

rdar: 11725965
llvm-svn: 159166

606953fb

Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there... · 357de815

Craig Topper authored Jun 25, 2012

Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there are no patterns in the instruction.

llvm-svn: 159127

357de815

Remove codegen only instruction in favor of one that has the same definition.... · b6eb513c

Craig Topper authored Jun 25, 2012

Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.

llvm-svn: 159126

b6eb513c

Jun 24, 2012
- %RCX is not a function live-out in eh.return functions. · 2e22e6a3
  Jakob Stoklund Olesen authored Jun 24, 2012
```
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

llvm-svn: 159116
```
  2e22e6a3
- llvm/lib: [CMake] Add explicit dependency to intrinsics_gen. · 704de074
  NAKAMURA Takumi authored Jun 24, 2012
```
llvm-svn: 159112
```
  704de074
- Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns. · fd5e6e7d
  Craig Topper authored Jun 24, 2012
```
llvm-svn: 159109
```
  fd5e6e7d
- Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns. · b925230f
  Craig Topper authored Jun 24, 2012
```
llvm-svn: 159108
```
  b925230f
- Fix build failures from r159106. · f48ec7a7
  Craig Topper authored Jun 24, 2012
```
llvm-svn: 159107
```
  f48ec7a7
- Remove intrinsic specific instructions for CVTPD2PS and replace with just patterns. · bab2b899
  Craig Topper authored Jun 24, 2012
```
llvm-svn: 159106
```
  bab2b899
- Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns. · 3cee08ce
  Craig Topper authored Jun 24, 2012
```
llvm-svn: 159105
```
  3cee08ce
- Remove code I'd been testing with but didn't mean to commit. Oops · 3c680dec
  Pete Cooper authored Jun 24, 2012
```
llvm-svn: 159094
```
  3c680dec
- DAG legalisation can now handle illegal fma vector types by scalarisation · fe212e76
  Pete Cooper authored Jun 24, 2012
```
llvm-svn: 159092
```
  fe212e76
- Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead. · a899cc15
  Craig Topper authored Jun 23, 2012
```
llvm-svn: 159090
```
  a899cc15
Jun 23, 2012

Make CVTDQ2PS instruction use SSE2 predicate instead of SSE1. No functional... · 7e941522

Craig Topper authored Jun 23, 2012

Make CVTDQ2PS instruction use SSE2 predicate instead of SSE1. No functional change because there are no patterns in the instructions. Also fix a typo in a comment.

llvm-svn: 159087

7e941522

Move CVTPD2DQ to use SSE2 predicate instead of SSE3. Move DQ2PD and PD2DQ to... · 24e34182
Craig Topper authored Jun 23, 2012
```
Move CVTPD2DQ to use SSE2 predicate instead of SSE3. Move DQ2PD and PD2DQ to the SSE2 section of the file.

llvm-svn: 159086
```
24e34182
Add a microoptimization note. · 53ffe55a
Benjamin Kramer authored Jun 23, 2012
```
llvm-svn: 159082
```
53ffe55a

Extend the IL for selecting TLS models (PR9788) · cbe34b4c

Hans Wennborg authored Jun 23, 2012

This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.
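
One way to read the last paragraph is as a selection over an ordering of the models from most general to most restrictive. The sketch below is an assumption about the policy, not the code from this commit: the enum, the function `choose_model`, and the `supported` parameter are hypothetical names.

```python
from enum import IntEnum


class TLSModel(IntEnum):
    """TLS models ordered from most general to most restrictive."""
    GENERAL_DYNAMIC = 0
    LOCAL_DYNAMIC = 1
    INITIAL_EXEC = 2
    LOCAL_EXEC = 3


def choose_model(default, requested=None, supported=frozenset(TLSModel)):
    """Pick the model to use for one variable (assumed policy).

    default   -- what the compiler would choose on its own
    requested -- the model named in thread_local(...), if any
    supported -- the models the target can emit
    """
    if requested is None or requested not in supported:
        # Nothing requested, or the target cannot honor it.
        return default
    # Keep the compiler's choice when it is already more restrictive.
    return max(default, requested)


print(choose_model(TLSModel.GENERAL_DYNAMIC, TLSModel.INITIAL_EXEC).name)
```

Under this reading, a request such as `initialexec` only ever tightens the default and never loosens it.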

llvm-svn: 159077

cbe34b4c

Use correct memory types for (V)CVTDQ2PD instructions. · 8c03ea79
Craig Topper authored Jun 23, 2012
```
llvm-svn: 159075
```
8c03ea79
Silence an unused variable warning on release builds. · 2361cd98
Craig Topper authored Jun 23, 2012
```
llvm-svn: 159074
```
2361cd98
Compress flags in X86 op folding to reduce space in static tables. · 1cac50bc
Craig Topper authored Jun 23, 2012
```
llvm-svn: 159073
```
1cac50bc
Make helper method static since it doesn't use anything in the class. · d9c7d0dd
Craig Topper authored Jun 23, 2012
```
llvm-svn: 159071
```
d9c7d0dd

Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with... · 431f1e71

Craig Topper authored Jun 23, 2012

Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits.

llvm-svn: 159070

431f1e71

Handle aliases to tls variables in all architectures, not just x86. · a3088f09
Rafael Espindola authored Jun 23, 2012
```
llvm-svn: 159058
```
a3088f09

(sub X, imm) gets canonicalized to (add X, -imm) · 68c2f9a9

Evan Cheng authored Jun 23, 2012

There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1
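
The two cases above can be sketched as a decision on the negated immediate. This is an illustration rather than the patterns added by the commit. It assumes the ARM-mode immediate encoding (an 8-bit value rotated right by an even amount) and reuses the register names from the examples. Both function names are hypothetical.

```python
def is_arm_modified_imm(value):
    """True if value is an 8-bit constant rotated by an even amount
    within 32 bits (ARM-mode data-processing immediate, assumed)."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotate left by rot; if the result fits in 8 bits, the original
        # value is that 8-bit constant rotated right by rot.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


def lower_add_negative_imm(neg_imm):
    """Instruction sequence for (add x, neg_imm), where neg_imm < 0."""
    imm = -neg_imm
    if is_arm_modified_imm(imm):
        # Fits the immediate field: a single sub.
        return [f"sub r0, r0, #{imm}"]
    if imm <= 0xFFFF:
        # Does not fit, but a single movw can materialize it.
        return [f"movw r1, #{imm}", "sub r0, r0, r1"]
    # Otherwise a movw + movt pair is needed.
    return [f"movw r1, #{imm & 0xFFFF}", f"movt r1, #{imm >> 16}",
            "sub r0, r0, r1"]


print(lower_add_negative_imm(-123))
print(lower_add_negative_imm(-65535))
```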

rdar://11726136

llvm-svn: 159057

68c2f9a9

ARM: Add a better diagnostic for some out of range immediates. · 087affe2

Jim Grosbach authored Jun 22, 2012

As an example of how the custom DiagnosticType can be used to provide
better operand-mismatch diagnostics, add a custom diagnostic for
the imm0_15 operand class used for several system instructions.
Update the tests to expect the improved diagnostic.

rdar://8987109

llvm-svn: 159051

087affe2

Add support for the PPC isel instruction. · 460e94d8

Hal Finkel authored Jun 22, 2012

The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

llvm-svn: 159045

460e94d8

Whitespace. · f5cdea3d
Chad Rosier authored Jun 22, 2012
```
llvm-svn: 159035
```
f5cdea3d

Jun 22, 2012

Revert r158679 - use case is unclear (and it increases the memory footprint). · 8db55472

Hal Finkel authored Jun 22, 2012

Original commit message:
    Allow up to 64 functional units per processor itinerary.

    This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
    This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 159027

8db55472

Use "NoItineraries" for processors with no itineraries. · 9c302673

Andrew Trick authored Jun 22, 2012

This makes it explicit when ScoreboardHazardRecognizer will be used.
"GenericItineraries" would only make sense if it contained real
itinerary values and still required ScoreboardHazardRecognizer.

llvm-svn: 158963

9c302673

Functions calling __builtin_eh_return must have a frame pointer. · 321d41a8

Jakob Stoklund Olesen authored Jun 22, 2012

The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame
pointer exists, but the frame pointer was forced by the presence of
llvm.eh.unwind.init which isn't guaranteed.

If llvm.eh.unwind.init is actually required in functions calling
eh.return (is it?), we should diagnose that instead of emitting bad
machine code.

This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot.

llvm-svn: 158961

321d41a8

ARM scheduling fix: don't guess at implicit operand latency. · 77d0b889

Andrew Trick authored Jun 22, 2012

This is a minor drive-by fix with no robust way to unit test.
As an example see neon-div.ll:
SU(16):   %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill>
 val SU(1): Latency=2 Reg=%Q8
...should be latency=1

llvm-svn: 158960

77d0b889

ARM scheduling fix: compute predicated implicit use properly. · 3ccb1b8c

Andrew Trick authored Jun 22, 2012

Minor drive-by fix to clean up latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

llvm-svn: 158959

3ccb1b8c

Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a · b8650f10

Lang Hames authored Jun 22, 2012

boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (e.g. FMAs). The
behavior of this option is intended to match the behavior specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't affect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.
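
The three modes amount to a small decision function. The sketch below is a Python model of the rules stated in this message, not the LLVM implementation: the enum, the function `may_fuse`, and its keyword parameters are hypothetical names.

```python
from enum import Enum


class FPOpFusionMode(Enum):
    FAST = "fast"
    STANDARD = "standard"  # the default, per the message above
    STRICT = "strict"


def may_fuse(mode, *, is_blessed_op, profitable, proven_exact):
    """Whether an optimizer may form a fused FP op.

    is_blessed_op -- the op is a 'blessed' one (currently only fmuladd)
    profitable    -- fusing is considered profitable
    proven_exact  -- excess precision provably cannot change the result
    """
    if mode is FPOpFusionMode.FAST:
        return profitable
    if mode is FPOpFusionMode.STANDARD:
        return is_blessed_op
    return proven_exact  # STRICT


# Explicitly requested fused ops (e.g. llvm.fma.*) never consult this
# check; they are always honored.
print(may_fuse(FPOpFusionMode.STANDARD, is_blessed_op=True,
               profitable=True, proven_exact=False))
```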

llvm-svn: 158956

b8650f10

Convert the PPC backend to use the new FMA infrastructure. · 0a479ae7

Hal Finkel authored Jun 22, 2012

The existing contraction patterns are replaced with fma/fneg.
Overall functionality should be the same.

llvm-svn: 158955

0a479ae7