- May 26, 2011
Stuart Hastings authored
llvm-svn: 132108
Stuart Hastings authored
rdar://problem/6920088 llvm-svn: 132105
- May 24, 2011
Evan Cheng authored
Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero. rdar://9490949 llvm-svn: 131948
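A minimal C++ illustration of the pattern this commit targets (function names are hypothetical): BSR/BSF leave their result undefined for a zero input, so a plain ctlz needs a CMOV to select the fallback value; once the input is provably non-zero, that CMOV can be dropped.

    #include <cstdint>

    // BSR is undefined for 0, so the generic lowering guards the zero
    // case, typically with a CMOV.
    unsigned clz_guarded(uint32_t x) {
        return x ? __builtin_clz(x) : 32;
    }

    // Here the source of the BSR is provably non-zero (x | 1), so the
    // optimization described above lets codegen drop the CMOV entirely.
    unsigned clz_nonzero(uint32_t x) {
        return __builtin_clz(x | 1);
    }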
- May 20, 2011
Chad Rosier authored
llvm-svn: 131709
- May 19, 2011
Eric Christopher authored
Fixes rdar://9218925 Fixes PR9601 llvm-svn: 131682
- May 18, 2011
Chad Rosier authored
Enables vararg functions that pass all arguments via registers to be optimized into tail calls when possible. llvm-svn: 131560
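A hedged sketch of the situation this enables (names illustrative): a call to a variadic function whose arguments all travel in registers can now be emitted as a tail call.

    #include <cstdarg>

    // A variadic callee (hypothetical).
    int vsum(int n, ...) {
        va_list ap;
        va_start(ap, n);
        int s = 0;
        for (int i = 0; i < n; ++i)
            s += va_arg(ap, int);
        va_end(ap);
        return s;
    }

    // Every argument to vsum fits in registers and the call sits in tail
    // position, so it can become a jmp rather than a call/ret pair.
    int wrapper(int a, int b) {
        return vsum(2, a, b);
    }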
- May 17, 2011
Eli Friedman authored
llvm-svn: 131471
Stuart Hastings authored
llvm-svn: 131469
Stuart Hastings authored
passed as the fifth parameter, ensure it's passed correctly (in R9). rdar://problem/6920088 llvm-svn: 131467
Nadav Rotem authored
Fix a bug in PerformEXTRACT_VECTOR_ELTCombine. The code created an ADD SDNode whose two operands had different types, in cases where the index and the pointer had different types. llvm-svn: 131461
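The shape of that fix in plain, standalone C++ terms (an analogy, not the actual SelectionDAG code): the two operands of an address addition must have the same width, so a narrower index gets extended to pointer width first.

    #include <cstdint>

    // A 32-bit index added to a 64-bit pointer: the index has to be
    // widened before the addition so both operands share a type, which is
    // what the combine now guarantees at the SDNode level.
    float load_lane(const float *base, uint32_t idx) {
        uint64_t wide = static_cast<uint64_t>(idx);  // match pointer width
        return base[wide];
    }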
- May 16, 2011
Eli Friedman authored
llvm-svn: 131424
- May 11, 2011
Nadav Rotem authored
Add custom lowering of X86 vector SRA/SRL/SHL when the shift amount is a splat vector. llvm-svn: 131179
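A sketch of source that gives rise to such shifts (assuming the loop vectorizes): every lane is shifted by the same run-time amount, so the shift-amount operand becomes a splat vector and can map onto a single x86 vector shift.

    // Each element is shifted by the same amount s; vectorized, the shift
    // amount is a splat vector, which the custom lowering can select as
    // one vector shift instead of scalarizing.
    void shift_left(int *a, int n, int s) {
        for (int i = 0; i < n; ++i)
            a[i] <<= s;
    }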
- May 06, 2011
Eli Friedman authored
llvm-svn: 131012
- Apr 20, 2011
Daniel Dunbar authored
triple component. llvm-svn: 129838
- Apr 19, 2011
Daniel Dunbar authored
llvm-svn: 129813
- Apr 15, 2011
Chris Lattner authored
Luis Felipe Strano Moraes! llvm-svn: 129558
- Mar 31, 2011
Evan Cheng authored
llvm-svn: 128586
- Mar 26, 2011
Benjamin Kramer authored
llvm-svn: 128338
- Mar 24, 2011
NAKAMURA Takumi authored
FIXME: Some cleanups would be needed. llvm-svn: 128206
Andrew Trick authored
I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix. llvm-svn: 128181
- Mar 23, 2011
Andrew Trick authored
(target-specific branchless method for double-width relational comparisons on x86) llvm-svn: 128175
- Mar 21, 2011
Evan Cheng authored
Re-apply r127953 with fixes: eliminate the empty return block if it has no predecessors; update the dominator tree if the CFG is modified. llvm-svn: 127981
- Mar 19, 2011
Daniel Dunbar authored
to canonicalize IR", it broke a lot of things. llvm-svn: 127954
Evan Cheng authored
to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this:

    int f1(void);
    int f2(void);
    int f3(void);
    int f4(void);
    int f5(void);
    int f6(void);

    int foo(int x) {
      switch(x) {
        case 1: return f1();
        case 2: return f2();
        case 3: return f3();
        case 4: return f4();
        case 5: return f5();
        case 6: return f6();
      }
    }

=>

    LBB0_2:   ## %sw.bb
      callq   _f1
      popq    %rbp
      ret
    LBB0_3:   ## %sw.bb1
      callq   _f2
      popq    %rbp
      ret
    LBB0_4:   ## %sw.bb3
      callq   _f3
      popq    %rbp
      ret

This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch:

    sw.bb7:                           ; preds = %entry
      %call8 = tail call i32 @f5() nounwind
      br label %return
    sw.bb9:                           ; preds = %entry
      %call10 = tail call i32 @f6() nounwind
      br label %return
    return:
      %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
      ret i32 %retval.0

This allows codegen to generate better code like this:

    LBB0_2:   ## %sw.bb
      jmp _f1   ## TAILCALL
    LBB0_3:   ## %sw.bb1
      jmp _f2   ## TAILCALL
    LBB0_4:   ## %sw.bb3
      jmp _f3   ## TAILCALL

rdar://9147433 llvm-svn: 127953
Nadav Rotem authored
not have native support for this operation (such as X86). The legalized code uses two vector INT_TO_FP operations and is faster than scalarizing. llvm-svn: 127951
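A hedged example of source that reaches this path: converting integer vectors to floating point on a target lacking a native conversion at that width now becomes two vector INT_TO_FP operations rather than one scalar conversion per element.

    // Vectorized, the conversion below is a vector INT_TO_FP; where no
    // single native instruction exists, the legalizer now splits it into
    // two vector conversions instead of scalarizing.
    void int_to_float(const int *in, float *out, int n) {
        for (int i = 0; i < n; ++i)
            out[i] = static_cast<float>(in[i]);
    }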
- Mar 18, 2011
Eli Friedman authored
llvm-svn: 127909
Eli Friedman authored
comparisons on x86. Essentially, the way this works is that SUB+SBB sets the relevant flags the same way a double-width CMP would. This is a substantial improvement over the generic lowering in LLVM. The output is also shorter than the gcc-generated output; I haven't done any detailed benchmarking, though. llvm-svn: 127852
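A minimal example of the comparison in question: on 32-bit x86, comparing two 64-bit integers is a double-width comparison. The lowering described above subtracts the low words and then SBBs the high words, leaving the flags exactly as a 64-bit CMP would, with no branch.

    #include <cstdint>

    // On a 32-bit target this is a double-width relational comparison;
    // the new lowering emits SUB on the low halves and SBB on the high
    // halves, then reads the flags, instead of branching on the high word.
    bool less_than(uint64_t a, uint64_t b) {
        return a < b;
    }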
- Mar 17, 2011
Cameron Zwarich authored
llvm-svn: 127809
Cameron Zwarich authored
llvm-svn: 127807
- Mar 16, 2011
Cameron Zwarich authored
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not generate incorrect code. This just fixes the zext at the return. We still insert an i32 ZextAssert when reading a function's arguments, but it is followed by a truncate and another i8 ZextAssert so it is not optimized. llvm-svn: 127766
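For context, a hedged sketch of the kind of function involved: an i8/bool return value is zero-extended per the ABI, and recording that extension with the correct type (i8 rather than i32) lets a caller fold away its own redundant zext.

    // The bool result is zero-extended at the return; modeling that zext
    // with the right width allows a caller's widening of the result to be
    // optimized away.
    bool is_even(int x) {
        return (x & 1) == 0;
    }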
- Mar 11, 2011
Eric Christopher authored
corresponding testcases back to the previous versions. Fixes some performance regressions only seen on 32-bit. llvm-svn: 127441
- Mar 10, 2011
Stuart Hastings authored
llvm-svn: 127382
- Mar 09, 2011
NAKAMURA Takumi authored
llvm-svn: 127328
- Mar 08, 2011
Benjamin Kramer authored
Found by inspection. llvm-svn: 127247
Eric Christopher authored
testcases accordingly. Some are currently xfailed and will be filed as bugs to be fixed or understood.

Performance results:
- roughly neutral on SPEC
- some micro benchmarks in the llvm suite are up between 100 and 150%; only a pair of regressions, which are due to be investigated
- john-the-ripper saw:
  - 10% improvement in traditional DES
  - 8% improvement in BSDI DES
  - 59% improvement in FreeBSD MD5
  - 67% improvement in OpenBSD Blowfish
  - 14% improvement in LM DES

Small compile time impact. llvm-svn: 127208
- Mar 07, 2011
Cameron Zwarich authored
llvm-svn: 127175
- Mar 05, 2011
Andrew Trick authored
regs. This is the only change in this checkin that may affect the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It currently defaults to 10 (the actual number doesn't matter much) but only takes effect on the non-default schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute-intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067