Commits · bed1abf9cafb5965fc1ec5b80c6030f1d3790f06 · Roger Ferrer / llvm-epi-0.8

Apr 08, 2012

Remove an overzealous assert. The assert was trying to catch places · bed1abf9

Chandler Carruth authored Apr 08, 2012

where a chain outside of the loop block-set ended up in the worklist for
scheduling as part of the contiguous loop. However, asserting the first
block in the chain is in the loop-set isn't a valid check -- we may be
forced to drag a chain into the worklist due to one block in the chain
being part of the loop even though the first block is *not* in the loop.
This occurs when we have been forced to form a chain early due to
un-analyzable branches.

No test case here as I have no idea how to even begin reducing one, and
it will be hopelessly fragile. We have to somehow end up with a loop
header of an inner loop which is a successor of a basic block with an
unanalyzable pair of branch instructions. Ow. Self-host triggers it so
it is unlikely it will regress.

This at least gets block placement back to passing selfhost and the test
suite. There are still a lot of slowdowns that I don't like coming out of
block placement, although there are now also a lot of speedups. =[ I'm
seeing swings in both directions up to 10%. I'm going to try to find
time to dig into this and see if we can turn this on for 3.1 as it does
a really good job of cleaning up after some loops that degraded with the
inliner changes.

llvm-svn: 154287

bed1abf9

Add a debug-only 'dump' method to the BlockChain structure to ease · 49158908
Chandler Carruth authored Apr 08, 2012
```
debugging.

llvm-svn: 154286
```
49158908

Teach InstCombine to nuke a common alloca pattern -- an alloca which has · f82b0e2d

Chandler Carruth authored Apr 08, 2012

GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).

llvm-svn: 154285

f82b0e2d
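
The pattern described in this message (an alloca reached only by GEPs, bit casts, and stores) can be illustrated with a toy model. This is a minimal sketch and not InstCombine's code: `Kind`, `Use`, and `onlyDeadUses` are hypothetical names, and the use tree stands in for real IR use lists.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of an instruction that uses the alloca
// (directly, or through a GEP / bit cast derived from it).
enum class Kind { GEP, BitCast, StoreTo, Load, Call };

struct Use {
  Kind K;
  std::vector<Use> Users; // only meaningful for GEP and BitCast
};

// Returns true if every use is a GEP or bit cast whose own users also
// qualify, or a store into the memory. Nothing ever reads the memory in
// that case, so the alloca and all of these users could be deleted.
bool onlyDeadUses(const std::vector<Use> &Uses) {
  for (const Use &U : Uses) {
    switch (U.K) {
    case Kind::GEP:
    case Kind::BitCast:
      if (!onlyDeadUses(U.Users))
        return false;
      break;
    case Kind::StoreTo:
      break;
    default:
      return false; // loads, calls, etc. may observe the memory
    }
  }
  return true;
}
```

A single load anywhere in the tree is enough to make the check fail, which matches the "no other instructions" condition in the message.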

AVX2: Build splat vectors by broadcasting a scalar from the constant pool. · 82609df6

Nadav Rotem authored Apr 08, 2012

Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.

llvm-svn: 154284

82609df6

Remove old 'grep' lines. · 8c783d41
Bill Wendling authored Apr 08, 2012
```
llvm-svn: 154283
```
8c783d41
Formatting changes. Don't put spaces in front of some code, which only makes it look 'off'. · ccf11090
Bill Wendling authored Apr 08, 2012
```
llvm-svn: 154282
```
ccf11090
FileCheckize these testcases. · 57f8e5eb
Bill Wendling authored Apr 08, 2012
```
llvm-svn: 154281
```
57f8e5eb

Remove the 'Parent' pointer from the MDNodeOperand class. · 5c0068f8

Bill Wendling authored Apr 08, 2012

An MDNode has a list of MDNodeOperands allocated directly after it as part of
its allocation. Therefore, the Parent of the MDNodeOperands can be found by
walking back through the operands to the beginning of that list. Mark the first
operand's value pointer as being the 'first' operand so that we know where the
beginning of said list is.

This saves a *lot* of space during LTO with -O0 -g flags.

llvm-svn: 154280

5c0068f8
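
The walk-back idea in this message can be sketched as follows. This is an illustration under stated assumptions, not LLVM's implementation: `Node`, `Operand`, `createNode`, and `parentOf` are hypothetical stand-ins for `MDNode` and `MDNodeOperand`, and the 'first' mark is assumed to live in the low bit of the stored pointer word.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical operand: the low bit of the stored word marks the first
// operand of the trailing list. Pointees must be at least 2-byte aligned.
struct Operand {
  std::uintptr_t Bits = 0;

  void set(const int *V, bool IsFirst) {
    Bits = reinterpret_cast<std::uintptr_t>(V) | (IsFirst ? 1u : 0u);
  }
  const int *value() const {
    return reinterpret_cast<const int *>(Bits & ~std::uintptr_t(1));
  }
  bool isFirst() const { return (Bits & 1) != 0; }
};

// Hypothetical node header; its operands are laid out directly after it.
struct alignas(Operand) Node {
  unsigned NumOperands;
  Operand *operands() { return reinterpret_cast<Operand *>(this + 1); }
};

// Allocate a Node and its N operands in a single block of memory.
Node *createNode(unsigned N) {
  assert(N > 0 && "sketch assumes at least one operand");
  void *Mem = std::malloc(sizeof(Node) + N * sizeof(Operand));
  assert(Mem && "allocation failed");
  Node *Parent = new (Mem) Node{N};
  for (unsigned I = 0; I != N; ++I) {
    Operand *Op = new (Parent->operands() + I) Operand();
    Op->set(nullptr, /*IsFirst=*/I == 0);
  }
  return Parent;
}

// Recover the parent without a stored back-pointer: step back to the
// flagged first operand, then step back once more over the header.
Node *parentOf(Operand *Op) {
  while (!Op->isFirst())
    --Op;
  return reinterpret_cast<Node *>(Op) - 1;
}

void destroyNode(Node *N) { std::free(N); }
```

The trade-off is a linear walk on each parent lookup in exchange for one pointer saved per operand.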

Allow subclasses of the ValueHandleBase to store information as part of the · 9b2503a0
Bill Wendling authored Apr 08, 2012
```
value pointer by making the value pointer into a pointer-int pair with 2 bits
available for flags.

llvm-svn: 154279
```
9b2503a0
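
A pointer-int pair with two flag bits, as mentioned in this message, can be sketched like this. `PtrFlags2` is a hypothetical minimal class written for illustration; it is not the class used by `ValueHandleBase`, and it assumes the pointee type is at least 4-byte aligned so the two low bits of the pointer are always zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal pointer-int pair with two flag bits.
template <typename T> class PtrFlags2 {
  static_assert(alignof(T) >= 4, "need two free low bits in the pointer");
  static constexpr std::uintptr_t FlagMask = 3;
  std::uintptr_t Bits = 0;

public:
  // Replace the pointer, keeping the current flags.
  void setPointer(T *P) {
    auto Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & FlagMask) == 0 && "pointer is not sufficiently aligned");
    Bits = Raw | (Bits & FlagMask);
  }
  // Replace the flags, keeping the current pointer.
  void setFlags(unsigned F) {
    assert(F <= FlagMask && "only two flag bits are available");
    Bits = (Bits & ~FlagMask) | F;
  }
  T *getPointer() const { return reinterpret_cast<T *>(Bits & ~FlagMask); }
  unsigned getFlags() const { return static_cast<unsigned>(Bits & FlagMask); }
};
```

The pair occupies the same space as a bare pointer, which is why a subclass can attach flags without growing the handle.
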
Don't forget to evaluate the subexpression in a null pointer cast. If we're · 4051ff76
Richard Smith authored Apr 08, 2012
```
converting from std::nullptr_t, the subexpression might have side-effects.

llvm-svn: 154278
```
4051ff76
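
A small program shows why the subexpression of a null pointer conversion still has to be evaluated. The names `Calls`, `bump`, and `convert` are hypothetical and chosen only for this sketch.

```cpp
#include <cassert>
#include <cstddef>

int Calls = 0;

// Returns std::nullptr_t, but calling it has an observable side effect.
std::nullptr_t bump() {
  ++Calls;
  return nullptr;
}

// The converted result is always a null int*, yet bump() must still run.
int *convert() {
  int *P = bump();
  assert(P == nullptr);
  return P;
}
```

An evaluator that folds the cast to a null pointer without visiting `bump()` would leave `Calls` unchanged, which is the kind of mistake the message describes.
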
[docs] Add more open projects. · d73a53f1
Michael J. Spencer authored Apr 08, 2012
```
llvm-svn: 154277
```
d73a53f1
[docs] Add documentation todos. · 00d9e87c
Michael J. Spencer authored Apr 08, 2012
```
llvm-svn: 154276
```
00d9e87c
[docs] Make the index page ReST based instead of html based. · d01c8fe7
Michael J. Spencer authored Apr 08, 2012
```
llvm-svn: 154275
```
d01c8fe7
[docs] Add open projects page that includes the TODO.txt files. · f9bc125c
Michael J. Spencer authored Apr 07, 2012
```
llvm-svn: 154274
```
f9bc125c

ext_reserved_user_defined_literal must not default to Error in MicrosoftMode.... · 7ebc4c19

Francois Pichet authored Apr 07, 2012

ext_reserved_user_defined_literal must not default to Error in MicrosoftMode. Hence create ext_ms_reserved_user_defined_literal that doesn't default to Error; otherwise MSVC headers won't parse.

Fixes PR12383.

llvm-svn: 154273

7ebc4c19

Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and... · d024cef2

Craig Topper authored Apr 07, 2012

Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. The same was already done for avx1.

llvm-svn: 154272

d024cef2

MIPS: Pass -mabi option to the assembler when compiling for MIPS targets. · 571d7bde
Simon Atanasyan authored Apr 07, 2012
```
llvm-svn: 154270
```
571d7bde
MIPS: Move the code that calculates CPU and ABI names into a separate function so it can be reused later. · 3b7589a6
Simon Atanasyan authored Apr 07, 2012
```
llvm-svn: 154269
```
3b7589a6

Apr 07, 2012

Move vinsertf128 patterns near the instruction definitions. Add... · aa9aab5a

Craig Topper authored Apr 07, 2012

Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.

llvm-svn: 154268

aa9aab5a

Remove 'else' after 'if' that ends in return. · e09d1c5c
Craig Topper authored Apr 07, 2012
```
llvm-svn: 154267
```
e09d1c5c

1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new · 71d07ae5

Nadav Rotem authored Apr 07, 2012

   shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.

llvm-svn: 154266

71d07ae5
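
The restrictive case in point 2 can be illustrated with single-input shuffle masks. This is a toy sketch, not the DAG combiner's code: `Mask`, `shuffle`, `compose`, and `secondReversesFirst` are hypothetical names, and masks are assumed to contain only valid in-range indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mask = std::vector<int>;

// Apply a single-input shuffle: Result[i] = In[M[i]].
std::vector<int> shuffle(const std::vector<int> &In, const Mask &M) {
  std::vector<int> Out(M.size());
  for (std::size_t I = 0; I != M.size(); ++I) {
    assert(M[I] >= 0 && static_cast<std::size_t>(M[I]) < In.size());
    Out[I] = In[M[I]];
  }
  return Out;
}

// Mask equivalent to applying First and then Second:
// shuffle(shuffle(V, First), Second) == shuffle(V, compose(First, Second)).
Mask compose(const Mask &First, const Mask &Second) {
  Mask Out(Second.size());
  for (std::size_t I = 0; I != Second.size(); ++I) {
    assert(Second[I] >= 0 &&
           static_cast<std::size_t>(Second[I]) < First.size());
    Out[I] = First[Second[I]];
  }
  return Out;
}

// True when the second shuffle undoes the first, so both can be dropped.
bool secondReversesFirst(const Mask &First, const Mask &Second) {
  if (First.size() != Second.size())
    return false;
  Mask C = compose(First, Second);
  for (std::size_t I = 0; I != C.size(); ++I)
    if (C[I] != static_cast<int>(I))
      return false;
  return true;
}
```

Because the combined mask is the identity, this case never introduces a new shuffle node that the target might handle poorly.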

Convert floating point division by a constant into multiplication by the · 5f8397a9

Duncan Sands authored Apr 07, 2012

reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.

llvm-svn: 154265

5f8397a9
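
The exactness condition in this message can be sketched for `double`. This is an illustration and not the code from the commit: `getExactReciprocal` and `divideBy` are hypothetical helpers, and the sketch assumes IEEE 754 binary64, where a reciprocal is exact when the constant is a power of two and the reciprocal is a normal number.

```cpp
#include <cassert>
#include <cmath>

// Returns true and sets Recip when 1/C is exactly representable as a
// normal double, so X * Recip gives the same result as X / C.
bool getExactReciprocal(double C, double &Recip) {
  if (C == 0.0 || !std::isfinite(C))
    return false;
  int Exp;
  double Mant = std::frexp(C, &Exp); // C == Mant * 2^Exp, 0.5 <= |Mant| < 1
  if (std::fabs(Mant) != 0.5)
    return false; // not a power of two, so 1/C would be rounded
  double R = 1.0 / C;
  if (!std::isnormal(R))
    return false; // reject reciprocals that overflow or are subnormal
  Recip = R;
  return true;
}

// Divide by a constant, using a multiply only when that is exact.
double divideBy(double X, double C) {
  double Recip;
  if (getExactReciprocal(C, Recip))
    return X * Recip;
  return X / C;
}
```

Dividing by 3.0 stays a division here; only a fast-math style mode, as the message notes, would accept the rounded reciprocal.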

Perform partial SROA on the helper hashing structure. I really wish the · 75a1cf32

Chandler Carruth authored Apr 07, 2012

optimizers could do this for us, but expecting partial SROA of classes
with template methods through cloning is probably expecting too much
heroics. With this change, the begin/end pointer pairs which indicate
the status of each loop iteration are actually passed directly into each
layer of the combine_data calls, and the inliner has a chance to see
when most of the combine_data function could be deleted by inlining.
Similarly for 'length'.

We have to be careful to limit the places where in/out reference
parameters are used, as those will also prevent the inliner and
optimizers from properly propagating constants.

With this change, LLVM is able to fully inline and unroll the hash
computation of small sets of values, such as two or three pointers.
These now decompose into essentially straight-line code with no loops or
function calls.

There is still one code quality problem to be solved with the hashing --
LLVM is failing to nuke the alloca. It removes all loads from the
alloca, leaving only lifetime intrinsics and dead(!!) stores to the
alloca. =/ Very unfortunate.

llvm-svn: 154264

75a1cf32

Fix ValueTracking to conclude that debug intrinsics are safe to · 28192c93

Chandler Carruth authored Apr 07, 2012

speculate. Without this, loop rotate (among many other places) would
suddenly stop working in the presence of debug info. I found this
looking at loop rotate, and have augmented its tests with a reduction
out of a very hot loop in yacr2 where failing to do this rotation
sometimes costs more than 10% in runtime performance, perturbing numerous
downstream optimizations.

This should have no impact on performance without debug info, but the
change in performance when debug info is enabled can be extreme. As
a consequence (and this is how I got to this yak) any profiling of
performance problems should be treated with deep suspicion -- they may
have been wildly inaccurate if debug info was enabled for profiling. =/
Just a heads up.

llvm-svn: 154263

28192c93

SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW. · e1f4ca1b
Benjamin Kramer authored Apr 07, 2012
```
Found by inspection.

llvm-svn: 154262
```
e1f4ca1b

Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543> · 6f9be7e2

Bob Wilson authored Apr 07, 2012

The tLDRr instruction with the last register operand set to the zero register
prints in assembly as if no register was specified, and the assembler encodes
it as a tLDRi instruction with a zero immediate. With the integrated assembler,
that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which
is broken. Emit the instruction as tLDRi with a zero immediate. I don't
know if there's a good way to write a testcase for this. Suggestions welcome.

Opportunities for follow-up work:
1) The asm printer should complain if a non-optional register operand is set
to the zero register, instead of silently dropping it.
2) The integrated assembler should complain in the same situation, instead of
silently emitting the operand as "r0".

llvm-svn: 154261

6f9be7e2

Rewritten expandRegion to clarify the intention and improve · ed986ab6
Hongbin Zheng authored Apr 07, 2012
```
  performance, patched by Johannes Doerfert <johannes@jdoerfert.de>.

llvm-svn: 154260
```
ed986ab6
ScopDetection: Add some comments to function "expandRegion". · 3a2d6035
Hongbin Zheng authored Apr 07, 2012
```
llvm-svn: 154259
```
3a2d6035
Speed up SCoP detection time by checking the exit of the region first, · 94868e6c
Hongbin Zheng authored Apr 07, 2012
```
  patched by Johannes Doerfert <johannes@jdoerfert.de>.

llvm-svn: 154258
```
94868e6c
Linux/ProcessMonitor: include sys/user.h for user_regs_struct and user_fpregs_struct. · c2b5c67d
Benjamin Kramer authored Apr 07, 2012
```
llvm-svn: 154255
```
c2b5c67d
[Cygwin] Work around to flush stdout in a thread, or stdout in threads won't be flushed at exit. · a7d49883
NAKAMURA Takumi authored Apr 07, 2012
```
llvm-svn: 154254
```
a7d49883
Version bump to lldb-138. · 7a2f4333
Jason Molenda authored Apr 07, 2012
```
llvm-svn: 154252
```
7a2f4333

CodeGen: Allow Polly to do 'grouped unrolling', but no vector generation. · 84ecc47e

Tobias Grosser authored Apr 07, 2012

Grouped unrolling means that we unroll a loop such that the different instances
of a certain statement are scheduled right after each other, but we do
not generate any vector code. The idea here is that we can schedule the
bb vectorizer right afterwards and use its heuristics to decide when
vectorization should be performed.

llvm-svn: 154251

84ecc47e
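
A hand-written example may help picture grouped unrolling as described in this message. It is a sketch only and does not show Polly's output: the function names and the unroll factor of 4 are assumptions, and the arrays are assumed not to overlap so that reordering the statement instances is legal.

```cpp
#include <cassert>

// Original loop with two statements per iteration.
void original(const float *A, const float *B, float *C, float *D, int N) {
  for (int I = 0; I < N; ++I) {
    C[I] = A[I] + B[I]; // S1
    D[I] = A[I] * B[I]; // S2
  }
}

// Unrolled by 4 with instances of the same statement placed together.
// The code stays scalar; a later pass can decide whether to vectorize.
void groupedUnrolled(const float *A, const float *B, float *C, float *D,
                     int N) {
  assert(N % 4 == 0 && "sketch assumes N is a multiple of the unroll factor");
  for (int I = 0; I < N; I += 4) {
    // Four instances of S1.
    C[I + 0] = A[I + 0] + B[I + 0];
    C[I + 1] = A[I + 1] + B[I + 1];
    C[I + 2] = A[I + 2] + B[I + 2];
    C[I + 3] = A[I + 3] + B[I + 3];
    // Four instances of S2.
    D[I + 0] = A[I + 0] * B[I + 0];
    D[I + 1] = A[I + 1] * B[I + 1];
    D[I + 2] = A[I + 2] * B[I + 2];
    D[I + 3] = A[I + 3] * B[I + 3];
  }
}
```

Placing the four `S1` instances next to each other gives a basic-block vectorizer adjacent, isomorphic operations to consider.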

Fix an integer truncation issue - calculating the current time in · b9e88d41

Jason Molenda authored Apr 07, 2012

nanoseconds in a 32-bit expression would cause pthread_cond_timedwait
to time out immediately.  Add explicit casts to the TimeValue::TimeValue
ctor that takes a struct timeval and change the NanoSecsPerSec etc
constants defined in TimeValue to be uint64_t so any other calculations
involving these should be promoted to 64-bit even when lldb is built
for 32-bit.

<rdar://problem/11204073>, <rdar://problem/11179821>, <rdar://problem/11194705>.

llvm-svn: 154250

b9e88d41
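
The truncation described in this message can be reproduced in a few lines. The constants and function names below are hypothetical and only mirror the idea; they are not lldb's `TimeValue` definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants for illustration.
constexpr std::uint32_t NanoSecsPerSec32 = 1000000000u;
constexpr std::uint64_t NanoSecsPerSec64 = 1000000000ull;
constexpr std::uint64_t NanoSecsPerMicroSec64 = 1000ull;

// 32-bit arithmetic: the product wraps modulo 2^32 once Secs exceeds 4.
std::uint32_t toNanosTruncated(std::uint32_t Secs) {
  return Secs * NanoSecsPerSec32;
}

// A 64-bit constant promotes each product to 64 bits, so seconds and
// microseconds (as in a struct timeval) convert without wrapping.
std::uint64_t toNanos(std::uint32_t Secs, std::uint32_t MicroSecs) {
  return Secs * NanoSecsPerSec64 + MicroSecs * NanoSecsPerMicroSec64;
}
```

A wrapped value read as an absolute deadline lies far in the past, which is consistent with the immediate timeout the message reports.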

Refactor: Use positive field names in VectorizeConfig. · 5758f495
Hongbin Zheng authored Apr 07, 2012
```
llvm-svn: 154249
```
5758f495

Fix several problems with protected access control: · 5dadb65e

John McCall authored Apr 07, 2012

  - The [class.protected] restriction is non-trivial for any instance
    member, even if the access lacks an object (for example, if it's
    a pointer-to-member constant).  In this case, it is equivalent to
    requiring the naming class to equal the context class.
  - The [class.protected] restriction applies to accesses to constructors
    and destructors.  A protected constructor or destructor can only be
    used to create or destroy a base subobject, as a direct result.
  - Several places were dropping or misapplying object information.

The standard could really be much clearer about what the object type is
supposed to be in some of these accesses.  Usually it's easy enough to
find a reasonable answer, but still, the standard makes a very confident
statement about accesses to instance members only being possible in
either pointer-to-member literals or member access expressions, which
just completely ignores concepts like constructor and destructor
calls, using declarations, unevaluated field references, etc.

llvm-svn: 154248

5dadb65e
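
The first two bullets can be illustrated with a small class hierarchy. This is a sketch with hypothetical names (`Base`, `Derived`, `Field`); the lines that the [class.protected] rule is expected to reject are left as comments so the example still compiles.

```cpp
#include <cassert>

class Base {
protected:
  Base() = default;
  ~Base() = default;
  int Field = 0;
};

class Derived : public Base {
public:
  // OK: the protected constructor creates a base subobject.
  Derived() : Base() {}

  // OK: the object expressions have type Derived.
  int viaThis() { return Field; }
  int viaDerived(Derived &D) { return D.Field; }

  // OK: the member is named through Derived, the context class.
  int Base::*pointerToMember() { return &Derived::Field; }

  // Expected to be ill-formed: the object type is Base, not Derived.
  //   int viaBase(Base &B) { return B.Field; }
  // Expected to be ill-formed: the naming class is Base, not Derived.
  //   int Base::*bad() { return &Base::Field; }
  // Expected to be ill-formed: this would create a complete Base object.
  //   Base *make() { return new Base; }
};
```

Uncommenting any of the three rejected members should produce an access error in a compiler that implements these rules.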

Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming. · b95f6413

NAKAMURA Takumi authored Apr 07, 2012

Cygwin-1.7 supports dw2. Some recent mingw distros support it, too.
I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.

llvm-svn: 154247

b95f6413

Add to-do lists · 5a1528f4
Nick Kledzik authored Apr 07, 2012
```
llvm-svn: 154246
```
5a1528f4
Make the test for r154235 more platform-independent with a shorter · 78fce432
Alexis Hunt authored Apr 07, 2012
```
string.

llvm-svn: 154243
```
78fce432

First implementation of Darwin Platform. It is rich enough to generate · b334be1e

Nick Kledzik authored Apr 07, 2012

a hello world executable from atoms.  There is still much to be fleshed out.
Added one test case, test/darwin/hello-world.objtxt, which exercises the
darwin platform.

Added -platform option to lld-core tool to dynamically select platform.

llvm-svn: 154242

b334be1e