  1. Dec 11, 2012
    • Change TargetLowering::getRepRegClassFor to take an MVT, instead of EVT. · 57b1694d
      Patrik Hagglund authored

      Accordingly, change RegDefIter to contain MVTs instead of EVTs.
      
      llvm-svn: 169838
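
      A hedged sketch of the shape of the change (not the verbatim patch):

        // Before: the hook took an extended value type.
        // virtual const TargetRegisterClass *getRepRegClassFor(EVT VT) const;

        // After: the hook is only ever queried with legal (simple) types,
        // so it can take an MVT directly.
        virtual const TargetRegisterClass *getRepRegClassFor(MVT VT) const;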
    • Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. · 3708e548
      Patrik Hagglund authored
      Accordingly, add helper functions getSimpleValueType (in parallel to
      getValueType) in SDValue, SDNode, and TargetLowering.
      
      This is the first in a series of patches.
      
      llvm-svn: 169837
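
      A minimal sketch of the new helper, assuming it follows getValueType
      and unwraps the EVT via EVT::getSimpleVT, which asserts the type is
      simple:

        // Plausible shape of the helper (free-function form for
        // illustration; the real ones are members of SDValue, SDNode and
        // TargetLowering).
        #include "llvm/CodeGen/SelectionDAGNodes.h"

        static llvm::MVT getSimpleValueType(const llvm::SDValue &V) {
          // EVT::getSimpleVT() asserts the type is simple (legal).
          return V.getValueType().getSimpleVT();
        }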
    • Fix a miscompile in the DAG combiner. · b27041c5
      Chandler Carruth authored

      Previously, we would incorrectly try to reduce the width of this
      load, and would end up transforming:
      
        (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
      to
        (truncate (zextload i32 <ptr+4> as i64) to i32)
      
      We lost the sext attached to the load while building the narrower i32
      load, and replaced it with a zext because lshr always zero-extends its
      result. Instead, bail out of this combine when there is a conflict
      between a sextload and a zext narrowing. The rest of the DAG combiner
      still optimizes the code down to the proper single instruction:
      
        movswl 6(...),%eax
      
      Which is exactly what we wanted. Previously we read past the end *and*
      missed the sign extension:
      
        movl 6(...), %eax
      
      llvm-svn: 169802
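
      To make the lost sign bits concrete, here is a small standalone
      sketch, assuming a negative value was sign-extended from 48 to 64
      bits; the masked expression models the bogus zext-narrowed load:

        #include <cstdint>
        #include <cstdio>

        int main() {
          // Stand-in for (sextload i48 <ptr> as i64) of a negative value.
          int64_t wide = -2; // 0xFFFFFFFFFFFFFFFE after sign extension
          // Correct: (truncate (lshr wide, 32) to i32) keeps the sign bits.
          int32_t good = (int32_t)((uint64_t)wide >> 32);
          // Miscompile: a zext-narrowed load only sees the 48 stored bits,
          // so bits [48,64) come back as zero instead of copies of the sign.
          int32_t bad = (int32_t)(((uint64_t)wide & 0xFFFFFFFFFFFFull) >> 32);
          std::printf("correct: %d  miscompiled: %d\n", good, bad); // -1 vs 65535
        }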
    • Fall back to the selection dag isel to select tail calls. · df42cf39
      Chad Rosier authored
      This shouldn't affect codegen for -O0 compiles, as tail call markers are
      not emitted in unoptimized compiles.  Testing with the external/internal
      nightly test suite reveals no change in compile-time performance.  Testing
      with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time
      or execution-time failures.  All tests were performed on my x86 machine.
      I'll monitor our ARM testers to ensure no regressions occur there.
      
      In an upcoming clang patch I will be marking calls to
      objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as
      tail calls unconditionally.  While
      it's theoretically true that this is just an optimization, it's an
      optimization that we very much want to happen even at -O0, or else ARC
      applications become substantially harder to debug.
      
      Part of rdar://12553082
      
      llvm-svn: 169796
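
      A hypothetical sketch of the fallback shape, assuming a fast-isel
      call-selection routine; everything around the isTailCall() check is
      illustrative, not the committed code:

        #include "llvm/IR/Instructions.h"
        using namespace llvm;

        static bool fastIselSelectCall(const Instruction *I) {
          // Refuse to select tail calls in fast-isel; returning false makes
          // the caller fall back to SelectionDAG isel for this instruction.
          if (const CallInst *CI = dyn_cast<CallInst>(I))
            if (CI->isTailCall())
              return false;
          // ... normal fast-isel call handling would go here ...
          return true;
        }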
    • Refactor out the abbreviation handling into a separate class. · c8a310ed
      Eric Christopher authored

      The new class controls each of the abbreviation sets (only a single
      one at the moment) and also computes offsets separately for each set
      of DIEs.
      
      No real functional change; the ordering of abbreviations for the
      skeleton CU changed, but only because we now compute them in a
      separate order. Fix the testcase not to care.
      
      llvm-svn: 169793
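
      A hypothetical sketch of the holder's shape, assuming one object per
      abbreviation set; the class and member names here are illustrative,
      not the committed ones:

        #include <vector>

        // Illustrative only: DIEAbbrev and DIE stand for the AsmPrinter's
        // DIE abbreviation and DIE types.
        class DIEAbbrev;
        class DIE;

        // One holder per abbreviation set: interns abbreviations, assigns
        // numbers local to the set, and computes offsets for its own DIEs.
        class AbbreviationSet {
          std::vector<DIEAbbrev *> Abbreviations; // unique, in emission order
        public:
          void assignAbbrevNumber(DIEAbbrev &Abbrev);               // intern + number
          unsigned computeSizeAndOffset(DIE *Die, unsigned Offset); // per-set offsets
        };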
    • Some enhancements for memcpy / memset inline expansion. · 79e2ca90
      Evan Cheng authored
      1. Teach it to use overlapping unaligned load / store to copy / set the
         trailing bytes (see the sketch at the end of this entry). e.g. On x86,
         use two pairs of movups / movaps for 17-31 byte copies.
      2. Use f64 for memcpy / memset on targets where i64 is not legal but f64
         is. e.g. x86 and ARM.
      3. When expanding memcpy from a constant string, do *not* replace the load
         with a constant if it's not possible to materialize an integer
         immediate with a single instruction (this required a new target hook:
         TLI.isIntImmLegal()).
      4. Use unaligned load / stores more aggressively if target hooks indicate
         they are "fast".
      5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 /
         vst1.8. Also increase the threshold to something reasonable (8 for
         memset, 4 pairs for memcpy).
      
      This significantly improves Dhrystone, up to 50% on ARM iOS devices.
      
      rdar://12760078
      
      llvm-svn: 169791
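
      A minimal sketch of the overlapping-copy trick from item 1, in plain
      C++ and assuming unaligned 16-byte accesses are fast; each 16-byte
      std::memcpy stands in for one movups load or store:

        #include <cstddef>
        #include <cstdint>
        #include <cstring>

        // Copy n bytes, 17 <= n <= 31, with exactly two 16-byte load/store
        // pairs: one for the head and one, overlapping it, for the tail.
        void copy17to31(uint8_t *dst, const uint8_t *src, size_t n) {
          uint8_t head[16], tail[16];
          std::memcpy(head, src, 16);          // first unaligned 16-byte load
          std::memcpy(tail, src + n - 16, 16); // overlapping tail load
          std::memcpy(dst, head, 16);          // head store
          std::memcpy(dst + n - 16, tail, 16); // overlapping tail store
        }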
  2. Dec 07, 2012
    • Add higher-level API for dealing with bundled MachineInstrs. · fead62d4
      Jakob Stoklund Olesen authored
      This is still a work in progress. The purpose is to make bundling and
      unbundling operations explicit, and to catch errors where bundles are
      broken or created inadvertently.
      
      The old IsInsideBundle flag is replaced by two MI flags: BundledPred
      which has the same meaning as IsInsideBundle, and BundledSucc which is
      set on instructions that are bundled with a successor. Having two flags
      provides redundancy to detect when a bundle is inadvertently torn by a
      splice() or insert(), and it makes it possible to write bundle iterators
      that don't need to peek at adjacent instructions.
      
      The new flags can't be manipulated directly (once setIsInsideBundle is
      gone). Instead, there are MI functions to make and break bundle bonds.
      
      The setIsInsideBundle function will be removed in a future commit. It
      should be replaced by bundleWithPred().
      
      llvm-svn: 169583
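
      A hedged sketch of why two flags make iteration local, using accessors
      assumed to simply test the BundledPred/BundledSucc flags described
      above:

        #include "llvm/CodeGen/MachineInstr.h"

        // Bundle boundaries become decidable from a single instruction,
        // with no peeking at adjacent instructions.
        static bool startsBundle(const llvm::MachineInstr &MI) {
          return MI.isBundledWithSucc() && !MI.isBundledWithPred();
        }
        static bool endsBundle(const llvm::MachineInstr &MI) {
          return MI.isBundledWithPred() && !MI.isBundledWithSucc();
        }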