Commits · 3854bad4a1936dcf481926dba0e2bd28a3573843 · Roger Ferrer / llvm-epi-0.8

Mar 24, 2011

Andrew Trick authored Mar 23, 2011

I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix.

llvm-svn: 128181

4ab9a165

Mar 23, 2011
- Reapply Eli's r127852 now that the pre-RA scheduler can spill EFLAGS. · 4046a0de
  Andrew Trick authored Mar 23, 2011
```
(target-specific branchless method for double-width relational comparisons on x86)

llvm-svn: 128175
```
  4046a0de
Mar 22, 2011
- Fix fast-isel address mode folding to avoid folding instructions · c1783b31
  Dan Gohman authored Mar 22, 2011
```
outside of the current basic block. This fixes PR9500, rdar://9156159.

llvm-svn: 128041
```
  c1783b31
Mar 21, 2011

We need to pass the TargetMachine object to the InstPrinter if we are printing · 00f0cddf

Bill Wendling authored Mar 21, 2011

the alias of an InstAlias instead of the thing being aliased, because we need to
know the features that are valid for an InstAlias.

This is part of a work-in-progress.

llvm-svn: 127986

00f0cddf

Re-apply r127953 with fixes: eliminate empty return block if it has no... · 0663f23b

Evan Cheng authored Mar 21, 2011

Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified.

llvm-svn: 127981

0663f23b

Mar 19, 2011

Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors · 327cd36f
Daniel Dunbar authored Mar 19, 2011
```
to canonicalize IR", it broke a lot of things.

llvm-svn: 127954
```
327cd36f

SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR · 824a7113

Evan Cheng authored Mar 19, 2011

to have a single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433

llvm-svn: 127953

824a7113

Add support for legalizing UINT_TO_FP of vectors on platforms which do · e7a101cc

Nadav Rotem authored Mar 19, 2011

not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.
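
As a rough illustration of how an unsigned conversion can be built from two signed ones, the sketch below splits each 32-bit lane into 16-bit halves, converts each half with a signed int-to-float conversion, and recombines them. The commit message only says that two vector INT_TO_FP operations are used; the 16-bit split, the scale factor, and the function name `uintToFpViaTwoSignedConversions` are assumptions made for this example, not the code from r127951.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not LLVM code): converts four unsigned 32-bit lanes to
// float using only signed int-to-float conversions, two per lane.
std::array<float, 4>
uintToFpViaTwoSignedConversions(const std::array<uint32_t, 4> &v) {
  std::array<float, 4> out{};
  for (std::size_t i = 0; i < v.size(); ++i) {
    // Each half fits in 16 bits, so it is non-negative as a signed integer.
    int32_t hi = static_cast<int32_t>(v[i] >> 16);
    int32_t lo = static_cast<int32_t>(v[i] & 0xFFFFu);
    assert(hi >= 0 && lo >= 0);
    // Both conversions are exact; scaling by 2^16 is exact as well.
    float fhi = static_cast<float>(hi) * 65536.0f;
    float flo = static_cast<float>(lo);
    // The only rounding happens in this final addition.
    out[i] = fhi + flo;
  }
  return out;
}
```

Because every step before the final addition is exact, the result should match a direct unsigned conversion under the default round-to-nearest mode.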

llvm-svn: 127951

e7a101cc

Mar 18, 2011

Revert r127852; it's apparently causing an ICE on mingw. · 59721e32
Eli Friedman authored Mar 18, 2011
```
llvm-svn: 127909
```
59721e32
Support explicit argument forms for the X86 string instructions. · 3fbfcc0e
Joerg Sonnenberger authored Mar 18, 2011
```
For now, only the default segments are supported.

llvm-svn: 127875
```
3fbfcc0e

Add a target-specific branchless method for double-width relational · 1a916a3c

Eli Friedman authored Mar 18, 2011

comparisons on x86.  Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.

This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.
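
The SUB+SBB idea can be sketched in plain C++ by emulating the flags for a 64-bit comparison built from 32-bit halves. This is only a model of the flag arithmetic, not the lowering code from the commit; the `Flags` struct and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the x86 flags that matter for relational comparisons.
struct Flags {
  bool carry;    // CF: borrow out of the high-half subtraction
  bool sign;     // SF: top bit of the high-half result
  bool overflow; // OF: signed overflow of the high-half subtraction
};

// Emulates SUB on the low halves followed by SBB on the high halves.
Flags subSbbFlags(uint32_t aLo, uint32_t aHi, uint32_t bLo, uint32_t bHi) {
  bool borrowLo = aLo < bLo; // CF produced by the SUB
  uint64_t wide = uint64_t(aHi) - uint64_t(bHi) - (borrowLo ? 1u : 0u);
  uint32_t res = static_cast<uint32_t>(wide);
  Flags f;
  f.carry = (wide >> 32) != 0; // the 64-bit value wrapped, so a borrow occurred
  f.sign = (res >> 31) != 0;
  f.overflow = (((aHi ^ bHi) & (aHi ^ res)) >> 31) != 0;
  return f;
}

// Unsigned a < b: same condition as "jb" after a double-width CMP.
bool ult64(uint64_t a, uint64_t b) {
  Flags f = subSbbFlags(uint32_t(a), uint32_t(a >> 32),
                        uint32_t(b), uint32_t(b >> 32));
  return f.carry;
}

// Signed a < b: same condition as "jl" after a double-width CMP.
bool slt64(int64_t a, int64_t b) {
  uint64_t ua = static_cast<uint64_t>(a), ub = static_cast<uint64_t>(b);
  Flags f = subSbbFlags(uint32_t(ua), uint32_t(ua >> 32),
                        uint32_t(ub), uint32_t(ub >> 32));
  return f.sign != f.overflow;
}
```

Note that the zero flag after SBB reflects only the high half, so this sketch covers the less-than style comparisons and leaves equality out.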

llvm-svn: 127852

1a916a3c

Mar 17, 2011
- Move more logic into getTypeForExtArgOrReturn. · 2ef0c69d
  Cameron Zwarich authored Mar 17, 2011
```
llvm-svn: 127809
```
  2ef0c69d
- Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn(). · 34e7b3f7
  Cameron Zwarich authored Mar 17, 2011
```
llvm-svn: 127807
```
  34e7b3f7
- A couple new README entries. · e8f2be0c
  Eli Friedman authored Mar 17, 2011
```
llvm-svn: 127786
```
  e8f2be0c
Mar 16, 2011

The x86-64 ABI says that a bool is only guaranteed to be zero-extended to a byte · ac106273

Cameron Zwarich authored Mar 16, 2011

rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

llvm-svn: 127766

ac106273

Mar 15, 2011

Enabled disassembler support for AVX instructions · b60b0bc4

Sean Callanan authored Mar 15, 2011

in the instruction tables and fixed a few bugs that
were causing decode conflicts.  Rudimentary tests
are coming up in the next patch.

llvm-svn: 127646

b60b0bc4

X86 table-generator and disassembler support for the AVX · c3fd5237

Sean Callanan authored Mar 15, 2011

instruction set.  This code adds support for the VEX prefix
and for the YMM registers accessible on AVX-enabled
architectures.  Instruction table support that enables AVX
instructions for the disassembler is in an upcoming patch.

llvm-svn: 127644

c3fd5237

Mar 11, 2011

Change the x86 32-bit scheduler to register pressure and fix up the · cf56a503

Eric Christopher authored Mar 11, 2011

corresponding testcases back to the previous versions.

Fixes some performance regressions only seen on 32-bit.

llvm-svn: 127441

cf56a503

Mar 10, 2011
- Revert 127359; it broke lencod. · d17ae4e9
  Stuart Hastings authored Mar 10, 2011
```
llvm-svn: 127382
```
  d17ae4e9
- Re-commit 127368 and 127371. They are exonerated. · b4c6a344
  Evan Cheng authored Mar 10, 2011
```
llvm-svn: 127380
```
  b4c6a344
- Revert 127368 and 127371 for now. · d4b3f8e0
  Evan Cheng authored Mar 09, 2011
```
llvm-svn: 127376
```
  d4b3f8e0
Mar 09, 2011
- Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more · ca9a9363
  Evan Cheng authored Mar 09, 2011
```
flexible.

If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.

llvm-svn: 127368
```
  ca9a9363
- Fix a pasto that broke all x86_64-elf targets. · 801c9afd
  Benjamin Kramer authored Mar 09, 2011
```
llvm-svn: 127365
```
  801c9afd
- X86 byval copies no longer always_inline. <rdar://problem/8706628> · 9955e2f9
  Stuart Hastings authored Mar 09, 2011
```
llvm-svn: 127359
```
  9955e2f9
- Add createELFObjectTargetWriter method to TargetAsmBackend, which enables... · 6348dc05
  Jan Sjödin authored Mar 09, 2011
```
Add createELFObjectTargetWriter method to TargetAsmBackend, which enables construction of non-standard ELFObjectWriters that can be used in MCJIT.

llvm-svn: 127346
```
  6348dc05
- Target/X86: Tweak va_arg for Win64 not to miss taking va_start when number of fixed args > 4. · 58d1f93b
  NAKAMURA Takumi authored Mar 09, 2011
```
llvm-svn: 127328
```
  58d1f93b
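
The three-way contract described for `TargetRegisterInfo::getCrossCopyRegClass` in r127368 above can be sketched as follows. Everything here is a toy model: `RegClass`, `CopyKind`, `classifyCopy`, and `toyGetCrossCopyRegClass` are hypothetical names for illustration, and the register classes do not correspond to any real target description.

```cpp
#include <cassert>

// Hypothetical stand-in for a target register class.
struct RegClass {
  const char *name;
};

enum class CopyKind { Normal, CrossClass, Impossible };

// Interprets the hook's return value according to the commit message:
//   null            -> registers of this class cannot be copied at all
//   same as input   -> a normal copy is enough
//   different class -> copy through the returned class
CopyKind classifyCopy(const RegClass *input, const RegClass *returned) {
  assert(input != nullptr);
  if (returned == nullptr)
    return CopyKind::Impossible;
  if (returned == input)
    return CopyKind::Normal;
  return CopyKind::CrossClass;
}

// Toy register classes and a toy hook, assumed purely for the example.
const RegClass GPR{"GPR"};
const RegClass FLAGS{"FLAGS"};
const RegClass SPECIAL{"SPECIAL"};

const RegClass *toyGetCrossCopyRegClass(const RegClass *rc) {
  if (rc == &FLAGS)
    return &GPR;    // flags must be copied through a general-purpose register
  if (rc == &SPECIAL)
    return nullptr; // this class cannot be copied
  return rc;        // ordinary copies work
}
```

A caller would pass the hook's result straight to `classifyCopy`, for example `classifyCopy(&FLAGS, toyGetCrossCopyRegClass(&FLAGS))`.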
Mar 08, 2011

X86: Fix the (saddo/ssub x, 1) -> incl/decl selection to check the right operand for 1. · 679cfb54
Benjamin Kramer authored Mar 08, 2011
```
Found by inspection.

llvm-svn: 127247
```
679cfb54

Turn on list-ilp scheduling by default on x86 and x86-64, fix up · eb19e9e9

Eric Christopher authored Mar 08, 2011

testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.

Performance results:

roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated

john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES

Small compile time impact.

llvm-svn: 127208

eb19e9e9

Mar 07, 2011
- Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo. · df616944
  Cameron Zwarich authored Mar 07, 2011
```
llvm-svn: 127175
```
  df616944
Mar 05, 2011

Increased the register pressure limit on x86_64 from 8 to 12 · 641e2d4f

Andrew Trick authored Mar 05, 2011

regs. This is the only change in this checkin that may affect the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.

llvm-svn: 127067

641e2d4f

whitespace · 27c079e1
Andrew Trick authored Mar 05, 2011
```
llvm-svn: 127065
```
27c079e1

Mar 04, 2011
- PR9377: Handle x86 str with register operand in a way consistent with gas. · f63614a9
  Eli Friedman authored Mar 04, 2011
```
llvm-svn: 126970
```
  f63614a9
Mar 03, 2011
- Use X86_thiscall calling convention for Win64 as well. · 3bc0bcf3
  Tilmann Scheller authored Mar 03, 2011
```
llvm-svn: 126934
```
  3bc0bcf3
Mar 02, 2011
- Add Win64 thiscall calling convention. · a3769f80
  Tilmann Scheller authored Mar 02, 2011
```
llvm-svn: 126862
```
  a3769f80
- [AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement · dd567b21
  David Greene authored Mar 02, 2011
```
      missing patterns for them.

      Add a SIMD test subdirectory to hold tests for SIMD instruction
      selection correctness and quality.

llvm-svn: 126845
```
  dd567b21
Mar 01, 2011
- Add datalayout information for the IEEE quad precision fp128 type. · c76ae9c8
  Duncan Sands authored Mar 01, 2011
```
llvm-svn: 126780
```
  c76ae9c8
Feb 28, 2011

fix a signed comparison warning. · c93d207e
Chris Lattner authored Feb 28, 2011
```
llvm-svn: 126682
```
c93d207e

· 20a1cbef

David Greene authored Feb 28, 2011

[AVX] Add decode support for VUNPCKLPS/D instructions, both 128-bit
      and 256-bit forms.  Because the number of elements in a vector
      does not determine the vector type (4 elements could be v4f32 or
      v4f64), pass the full type of the vector to decode routines.
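
To see why the element count alone is not enough, the sketch below computes an unpack-low shuffle mask from a full vector type. It follows the in-lane behaviour of the AVX unpack instructions (each 128-bit lane is interleaved independently) and is not the decode routine from the commit; `VecType` and `decodeUnpackLowMask` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of a vector type: element count and element width.
struct VecType {
  unsigned numElts;
  unsigned eltBits;
};

// Builds the shuffle mask for an unpack-low of two sources of type vt.
// Indices below numElts select from the first source, the rest from the second.
std::vector<unsigned> decodeUnpackLowMask(VecType vt) {
  unsigned totalBits = vt.numElts * vt.eltBits;
  assert(totalBits != 0 && totalBits % 128 == 0 &&
         "expected a 128-bit or 256-bit vector");
  unsigned numLanes = totalBits / 128;
  unsigned eltsPerLane = vt.numElts / numLanes;
  std::vector<unsigned> mask;
  for (unsigned lane = 0; lane < numLanes; ++lane) {
    unsigned base = lane * eltsPerLane;
    // Interleave the low half of this lane from both sources.
    for (unsigned i = 0; i < eltsPerLane / 2; ++i) {
      mask.push_back(base + i);              // element from the first source
      mask.push_back(base + i + vt.numElts); // element from the second source
    }
  }
  return mask;
}
```

With four elements, a v4f32 input (one 128-bit lane) and a v4f64 input (two 128-bit lanes) produce different masks, which is why the decode routine needs the element width as well as the count.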

llvm-svn: 126664

20a1cbef

Feb 27, 2011

Silence enum conversion warnings. · 25bddae4
Benjamin Kramer authored Feb 27, 2011
```
llvm-svn: 126578
```
25bddae4

Target/X86: Always emit "push/pop GPRs" in prologue/epilogue and emit... · d4e5003a

NAKAMURA Takumi authored Feb 27, 2011

Target/X86: Always emit "push/pop GPRs" in prologue/epilogue and emit "spill/reload frames" for XMMs.

It improves Win64's prologue/epilogue but it would not affect ia32 and amd64 (lack of nonvolatile XMMs).

llvm-svn: 126568

d4e5003a