Commits · 78b9f8fc67fce74976ec1107ccd30789451aec8d · Roger Ferrer / llvm-epi-0.8

Sep 13, 2012

Revert r163761 "Don't fold indexed loads into TCRETURNmi64." · 78b9f8fc
Jakob Stoklund Olesen authored Sep 13, 2012
```
The patch caused "Wrong topological sorting" assertions.

llvm-svn: 163810
```
78b9f8fc

Add a new compression type to ModRM table that detects when the memory modRM... · 963305b4

Craig Topper authored Sep 13, 2012

Add a new compression type to the ModRM table that detects when the memory modRM byte represents 8 instructions and the reg modRM byte represents up to 64 instructions. Reduces the modRM table from 43k entries to 25k entries. Based on a patch from Manman Ren.

llvm-svn: 163774

963305b4
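
The compression described in this commit can be pictured as a 72-slot table per opcode instead of 256: 8 slots for memory forms, where only the `reg` field selects the instruction, followed by 64 slots for register forms (`mod == 3`). The following is a minimal sketch of that indexing idea, not code from the patch. The function name `splitModRMIndex` and the slot layout are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from the patch): maps a ModRM byte to a slot
// in a 72-entry table laid out as 8 memory-form slots followed by 64
// register-form slots.
inline unsigned splitModRMIndex(std::uint8_t ModRM) {
  const unsigned Mod = ModRM >> 6;          // top two bits
  const unsigned Reg = (ModRM >> 3) & 0x7;  // middle three bits
  if (Mod == 0x3)
    return 8 + (ModRM & 0x3f);  // register form: reg and rm both matter
  return Reg;                   // memory form: only reg selects the instruction
}
```

Under this layout, `splitModRMIndex(0x78)` lands in memory slot 7 and `splitModRMIndex(0xC0)` lands in the first register slot, index 8.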

Don't fold indexed loads into TCRETURNmi64. · bfacef45

Jakob Stoklund Olesen authored Sep 13, 2012

We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>

llvm-svn: 163761

bfacef45

Sep 12, 2012

Fix PR11985 · abb87d48

Michael Liao authored Sep 12, 2012

    
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.

llvm-svn: 163743

abb87d48

[ms-inline asm] Make the operand size directives case insensitive. · ab53b4f6
Chad Rosier authored Sep 12, 2012
```
llvm-svn: 163729
```
ab53b4f6
Add support for AMD Geode. · fd690094
Roman Divacky authored Sep 12, 2012
```
llvm-svn: 163710
```
fd690094
Indentation fixes. No functional change. · ad495964
Craig Topper authored Sep 12, 2012
```
llvm-svn: 163682
```
ad495964

Release build: guard dump functions with · 19f49ac6

Manman Ren authored Sep 11, 2012

"#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"

No functional change. Update r163339.

llvm-svn: 163653

19f49ac6
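
As a rough illustration of the guard quoted in this commit, the sketch below wraps a debug-only `dump()` method in that preprocessor condition. Only the condition itself comes from the commit message. The `Node` class and its members are hypothetical.

```cpp
#include <iostream>
#include <string>

// Hypothetical class; only the preprocessor condition below is from the commit.
class Node {
  int Value;

public:
  explicit Node(int V) : Value(V) {}

  // Always available: builds a printable description.
  std::string str() const { return "Node(" + std::to_string(Value) + ")"; }

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  // Debug convenience wrapper. Compiled out of release (NDEBUG) builds unless
  // LLVM_ENABLE_DUMP is defined.
  void dump() const { std::cerr << str() << '\n'; }
#endif
};
```

With this shape, release builds do not carry the `dump()` body, while a release build configured with `LLVM_ENABLE_DUMP` keeps it.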

Sep 11, 2012
- StringSwitchify. · b6b8e966
  Chad Rosier authored Sep 11, 2012
```
llvm-svn: 163649
```
  b6b8e966
- Simplify logic. No functional change intended. · 30888b17
  Chad Rosier authored Sep 11, 2012
```
llvm-svn: 163648
```
  30888b17
- Make a bunch of lowering helper functions static instead of member functions. No functional change. · a29ed865
  Craig Topper authored Sep 11, 2012
```
llvm-svn: 163596
```
  a29ed865
- Change unsigned to a uint16_t in static disassembler tables to reduce the table size. · 8702c5b7
  Craig Topper authored Sep 11, 2012
```
llvm-svn: 163594
```
  8702c5b7
- Update function names to conform to guidelines. No functional change intended. · 38e05a9e
  Chad Rosier authored Sep 10, 2012
```
llvm-svn: 163561
```
  38e05a9e
- Revert r163556. Missed updates to tablegen files. · 41ff85d7
  Chad Rosier authored Sep 10, 2012
```
llvm-svn: 163557
```
  41ff85d7
- Update function names to conform to guidelines. No functional change intended. · 2089c49d
  Chad Rosier authored Sep 10, 2012
```
llvm-svn: 163556
```
  2089c49d
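
The table-size change in 8702c5b7 above (`unsigned` to `uint16_t`) can be illustrated with a small sketch. The table names and the entry count are hypothetical, and the saving shown assumes `unsigned` is wider than 16 bits, which holds on common targets but is not guaranteed by the language.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical tables; the real disassembler tables are generated.
constexpr std::size_t NumEntries = 1024;
static const unsigned WideTable[NumEntries] = {};
static const std::uint16_t NarrowTable[NumEntries] = {};

// Narrowing is only safe when every stored value fits in 16 bits.
constexpr bool fitsInUint16(unsigned Value) { return Value <= 0xFFFFu; }

// Bytes saved by switching the element type.
constexpr std::size_t bytesSaved() {
  return sizeof(WideTable) - sizeof(NarrowTable);
}
```

On a target with a 4-byte `unsigned`, the narrow table is half the size of the wide one.
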
Sep 10, 2012
- Remove redundant semicolons which are null statements. · ca1e27be
  Dmitri Gribenko authored Sep 10, 2012
```
llvm-svn: 163547
```
  ca1e27be
- [ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function · db20a41d
  Chad Rosier authored Sep 10, 2012
```
and update the printOperand() function accordingly.

llvm-svn: 163544
```
  db20a41d
- [ms-inline asm] Add support for .att_syntax directive. · 6f8d8b24
  Chad Rosier authored Sep 10, 2012
```
llvm-svn: 163542
```
  6f8d8b24
- Enhance PR11334 fix to support extload from v2f32/v4f32 · 400f7ef8
  Michael Liao authored Sep 10, 2012
```
    
- Fix a remaining issue of PR11674 as well

llvm-svn: 163528
```
  400f7ef8
- Add boolean simplification support from CMOV · c3d5b21c
  Michael Liao authored Sep 10, 2012
```
- If a boolean value is generated from CMOV and tested as a boolean value,
  simplify the use of the test result by referencing the original condition.
  The RDRAND intrinsic is one such case.

llvm-svn: 163516
```
  c3d5b21c
- The VPSHUFB 256-bit instruction may be generated when one of the input vectors is... · 264fb021
  Elena Demikhovsky authored Sep 10, 2012
```
The VPSHUFB 256-bit instruction may be generated when one of the input vectors is undefined or zeroinitializer.
This patch adds the "zeroinitializer" case.

llvm-svn: 163506
```
  264fb021
- Add missing space before {. No functionality change. · 74bf42c9
  Nick Lewycky authored Sep 09, 2012
```
llvm-svn: 163484
```
  74bf42c9
Sep 08, 2012
- Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled. · 4ed79bd7
  Craig Topper authored Sep 08, 2012
```
llvm-svn: 163473
```
  4ed79bd7
- Use 256-bit alignment for constant pool value for 256-bit vector FNEG lowering. · 0955a9f4
  Craig Topper authored Sep 08, 2012
```
llvm-svn: 163463
```
  0955a9f4
- Add support for lowering FABS of vector types. · 98f2e861
  Craig Topper authored Sep 08, 2012
```
llvm-svn: 163461
```
  98f2e861
- Set operation action for FFLOOR to Expand for all vector types for X86. Set... · 3e41a5bb
  Craig Topper authored Sep 08, 2012
```
Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.

llvm-svn: 163458
```
  3e41a5bb
Sep 07, 2012

PR13754: llvm-mc/x86 crashes on .cfi directives without the % prefix for registers. · e3d658bb

Benjamin Kramer authored Sep 07, 2012

gas accepts this and it seems to be common enough to be worth supporting. This
doesn't affect the parsing of reg operands outside of .cfi directives.

llvm-svn: 163390

e3d658bb

Sep 06, 2012
- Release build: guard dump functions with "ifndef NDEBUG" · 742534c4
  Manman Ren authored Sep 06, 2012
```
No functional change.

llvm-svn: 163339
```
  742534c4
- AVX2 optimization. · 42777877
  Elena Demikhovsky authored Sep 06, 2012
```
Added generation of the VPSHUFB instruction for <32 x i8> vector shuffles when possible.

llvm-svn: 163312
```
  42777877
- Remove duplicated helper function · 2d95a2b5
  Michael Liao authored Sep 06, 2012
```
llvm-svn: 163295
```
  2d95a2b5
- Use iPTR instead of i32 for extract_subvector/insert_subvector index in... · f3e4aa8c
  Craig Topper authored Sep 06, 2012
```
Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder.

llvm-svn: 163293
```
  f3e4aa8c
- Add patterns for converting stores of subvector_extracts of lower 128-bits of... · daa5ed1e
  Craig Topper authored Sep 06, 2012
```
Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.

llvm-svn: 163292
```
  daa5ed1e
- Stop casting away const qualifier needlessly. · ad06cee2
  Roman Divacky authored Sep 05, 2012
```
llvm-svn: 163258
```
  ad06cee2
Sep 05, 2012

Use const properly so that we don't remove the const qualifier from region and MII · 6792380e
Roman Divacky authored Sep 05, 2012
```
by casting. Found with gcc48.

llvm-svn: 163247
```
6792380e

Remove some of the patterns added in r163196. Increasing the complexity on... · 81f06df6

Craig Topper authored Sep 05, 2012

Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing.

llvm-svn: 163198

81f06df6

Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads.... · f7c87d6e

Craig Topper authored Sep 05, 2012

Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS.

llvm-svn: 163196

f7c87d6e

Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build... · 2db2353b

Craig Topper authored Sep 05, 2012

Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.

llvm-svn: 163192

2db2353b

Fix function name per coding standard. · a05ea0f3
Chad Rosier authored Sep 05, 2012
```
llvm-svn: 163187
```
a05ea0f3

Sep 04, 2012

Generic Bypass Slow Div · cdf540d5

Preston Gurd authored Sep 04, 2012

- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2, for the 32-bit -> 8-bit mapping
- Replace IDIV with instructions which test its value and use DIVB if the value
  is positive and less than 256.
- When both the quotient and remainder of a divide are used, a DIV
  and a REM instruction will be present in the IR. In the non-Atom case
  they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
  using the quotient and remainder from the first IDIV. However,
  with this optimization CSE is not able to eliminate redundant
  IDIV instructions because they are located in different basic blocks.
  This is overcome by calculating both the quotient (DIV) and remainder (REM)
  in each basic block that is inserted by the optimization and reusing the result
  values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
  either the quotient, remainder, or both.

Patch by Tyler Nowicki!

llvm-svn: 163150

cdf540d5
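
The effect of the bypass described in this commit can be modeled in scalar C++. This is a sketch of the idea rather than the pass itself: it assumes the runtime test covers both operands, and the names `DivRem` and `bypassSlowDiv` are hypothetical. The quotient and remainder are computed together in each branch, mirroring the note about reusing results within a basic block.

```cpp
#include <cstdint>

// Hypothetical result type holding both values of one division.
struct DivRem {
  std::int32_t Quot;
  std::int32_t Rem;
};

// Scalar model of the transformed code. The caller must pass a non-zero
// divisor, as with a plain division.
inline DivRem bypassSlowDiv(std::int32_t Dividend, std::int32_t Divisor) {
  const std::uint32_t Bits = static_cast<std::uint32_t>(Dividend) |
                             static_cast<std::uint32_t>(Divisor);
  if ((Bits & ~0xFFu) == 0) {
    // Fast path: both operands are non-negative and below 256, so an 8-bit
    // unsigned divide (DIVB on x86) yields the same quotient and remainder.
    const std::uint8_t A = static_cast<std::uint8_t>(Dividend);
    const std::uint8_t B = static_cast<std::uint8_t>(Divisor);
    return {static_cast<std::int32_t>(A / B), static_cast<std::int32_t>(A % B)};
  }
  // Slow path: full-width signed divide (IDIV on x86).
  return {Dividend / Divisor, Dividend % Divisor};
}
```

For example, `bypassSlowDiv(200, 7)` takes the 8-bit path, while `bypassSlowDiv(1000, 7)` and `bypassSlowDiv(-9, 2)` fall back to the full-width divide.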

This patch optimizes shuffle instruction - generates 2 instructions instead of 4. · cbe99bbb

Elena Demikhovsky authored Sep 04, 2012

Since this specific shuffle is widely used in many workloads, we see a ~10% performance improvement on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

Before:

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

After:

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0

llvm-svn: 163134

cbe99bbb
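
To see why the shorter `vmovsldup`/`vblendps` sequence matches the shuffle mask in this commit, the sketch below models the three operations on plain arrays. It assumes `%A` is the vector loaded from `(%rcx)` and `%B` the one loaded from `(%rdx)`, which is a reading of the assembly rather than something the commit states. All function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8 = std::array<float, 8>;

// Reference semantics of shufflevector: indices 0-7 select from A, 8-15 from B.
inline V8 shuffle(const V8 &A, const V8 &B,
                  const std::array<std::size_t, 8> &Mask) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = Mask[I] < 8 ? A[Mask[I]] : B[Mask[I] - 8];
  return R;
}

// Scalar model of vmovsldup: each odd lane copies the even lane below it.
inline V8 movsldup(const V8 &X) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = X[I & ~std::size_t{1}];
  return R;
}

// Scalar model of vblendps: a set bit I in Imm takes lane I from OnSet,
// a clear bit takes it from OnClear.
inline V8 blend(const V8 &OnClear, const V8 &OnSet, std::uint8_t Imm) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = ((Imm >> I) & 1u) ? OnSet[I] : OnClear[I];
  return R;
}

// The two-instruction form: blend A into the duplicated even lanes of B.
// The immediate 85 is 0b01010101, so even lanes come from A.
inline V8 twoInstructionForm(const V8 &A, const V8 &B) {
  return blend(movsldup(B), A, 85);
}
```

Even result lanes come from `A` and each odd lane takes the even element of `B` just below it, which is the pattern the mask `<0, 8, 2, 10, 4, 12, 6, 14>` describes.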