Commits · dc118270ac8e189c75cb9e90b5b0ecd8e191f53c · Roger Ferrer / llvm-epi-0.8

Jul 29, 2013

Make file_status::getUniqueID const. · d123099a
Rafael Espindola authored Jul 29, 2013
```
llvm-svn: 187383
```
d123099a
Include st_dev to make the result of getUniqueID actually unique. · 7f822a93
Rafael Espindola authored Jul 29, 2013
```
This will let us use getUniqueID instead of st_dev directly on clang.

llvm-svn: 187378
```
7f822a93
[mips] Add comment and simplify function. · 52dd808b
Akira Hatanaka authored Jul 29, 2013
```
llvm-svn: 187371
```
52dd808b
SLPVectorizer: update the debug location for the new instructions. · d9c74cc6
Nadav Rotem authored Jul 29, 2013
```
llvm-svn: 187363
```
d9c74cc6

Use proper section suffix for COFF weak symbols · 7fdaee8f

Nico Rieck authored Jul 29, 2013

32-bit symbols have "_" as global prefix, but when forming the name of
COMDAT sections this prefix is ignored. The current behavior assumes that
this prefix is always present, which is not the case for 64-bit targets,
so names are truncated there.

llvm-svn: 187356

7fdaee8f
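
The truncation described above can be illustrated with a small sketch. It assumes the fix amounts to stripping the global prefix only when the symbol actually carries it; `stripGlobalPrefix` is a hypothetical helper, not the function touched by the commit.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop Prefix from Name only when Name really starts
// with it. An unconditional Name.substr(1) would turn "foo" into "oo" on
// targets that have no global prefix.
inline std::string stripGlobalPrefix(const std::string &Name,
                                     const std::string &Prefix) {
  if (!Prefix.empty() && Name.compare(0, Prefix.size(), Prefix) == 0)
    return Name.substr(Prefix.size());
  return Name;
}
```

With a `"_"` prefix the name `_foo` becomes `foo`, while with an empty prefix `foo` is returned unchanged.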

Proper va_arg/va_copy lowering on win64 · 06d17c80

Nico Rieck authored Jul 29, 2013

Win64 uses CharPtrBuiltinVaList instead of X86_64ABIBuiltinVaList like
other 64-bit targets.

llvm-svn: 187355

06d17c80

Add support for the 's' operation to llvm-ar. · b6b5f52e

Rafael Espindola authored Jul 29, 2013

If no other operation is specified, 's' becomes an operation instead of a
modifier. The s operation just creates a symbol table. It is the same as
running ranlib.

We assume the archive was created by a sane ar (like llvm-ar or gnu ar) and
if the symbol table is present, then it is current. We use that to optimize
the most common case: a broken build system that thinks it has to run ranlib.

llvm-svn: 187353

b6b5f52e

MC: Support larger COFF string tables · 2c9c89b2

Nico Rieck authored Jul 29, 2013

Single-slash encoded entries do not require a terminating null. This bumps
the maximum table size from ~1MB to ~9.5MB.

llvm-svn: 187352

2c9c89b2
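
The size figures above follow from the 8-byte name field of a COFF section header, where a long name is stored as `/` followed by the decimal offset into the string table. The sketch below only reproduces that arithmetic, assuming the old encoding reserved one byte for a terminating null; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of the name field in a COFF section header.
const unsigned NameFieldSize = 8;

// Largest decimal value that fits in `Digits` ASCII digits.
inline std::uint64_t maxDecimal(unsigned Digits) {
  std::uint64_t V = 0;
  for (unsigned I = 0; I < Digits; ++I)
    V = V * 10 + 9;
  return V;
}

// "/" + 6 digits + NUL: one byte is spent on the terminator.
inline std::uint64_t maxOffsetWithNull() { return maxDecimal(NameFieldSize - 2); }

// "/" + 7 digits: no terminator, so one more digit is available.
inline std::uint64_t maxOffsetWithoutNull() { return maxDecimal(NameFieldSize - 1); }
```

That gives 999,999 bytes (roughly 1MB) with the terminator and 9,999,999 bytes (roughly 9.5MB) without it.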

Some Intel Penryn CPUs come with SSE4 disabled. Detect them as core 2. · fb34989a
Benjamin Kramer authored Jul 29, 2013
```
PR16721.

llvm-svn: 187350
```
fb34989a

Allow generation of vmla.f32 instructions when targeting Cortex-A15. The patch... · 91ddaa1b

Silviu Baranga authored Jul 29, 2013

Allow generation of vmla.f32 instructions when targeting Cortex-A15. The patch also adds the VFP4 feature to Cortex-A15 and fixes the DontUseFusedMAC predicate so that we can still generate vmla.f32 instructions on non-darwin targets with VFP4.

llvm-svn: 187349

91ddaa1b

test commit · 862b0451
Robert Lytton authored Jul 29, 2013
```
llvm-svn: 187348
```
862b0451

Teach the AllocaPromoter which is wrapped around the SSAUpdater · cd7c8cdf

Chandler Carruth authored Jul 29, 2013

infrastructure to do promotion without a domtree the same smarts about
looking through GEPs, bitcasts, etc., that I just taught mem2reg about.
This way, if SROA chooses to promote an alloca which still has some
noisy instructions this code can cope with them.

I've not used as principled of an approach here for two reasons:
1) This code doesn't really need it as we were already set up to zip
   through the instructions used by the alloca.
2) I view the code here as more of a hack, and hopefully a temporary one.

The SSAUpdater path in SROA is a real sore point for me. It doesn't make
a lot of architectural sense for many reasons:
- We're likely to end up needing the domtree anyways in a subsequent
  pass, so why not compute it earlier and use it.
- In the future we'll likely end up needing the domtree for parts of the
  inliner itself.
- If we need to we could teach the inliner to preserve the domtree. Part
  of the re-work of the pass manager will allow this to be very powerful
  even in large SCCs with many functions.
- Ultimately, computing a domtree has gotten significantly faster since
  the original SSAUpdater-using code went into ScalarRepl. We no longer
  use domfrontiers, and much of domtree is lazily done based on queries
  rather than eagerly.
- At this point keeping the SSAUpdater-based promotion saves a total of
  0.7% on a build of the 'opt' tool for me. That's not a lot of
  performance given the complexity!

So I'm leaving this a bit ugly in the hope that eventually we just
remove all of this nonsense.

I can't even readily test this because this code isn't reachable except
through SROA. When I re-instate the patch that fast-tracks allocas
already suitable for promotion, I'll add a testcase there that failed
before this change. Before that, SROA will fix any test case I give it.

llvm-svn: 187347

cd7c8cdf

Don't vectorize when the attribute NoImplicitFloat is used. · 750e42cb
Nadav Rotem authored Jul 29, 2013
```
llvm-svn: 187340
```
750e42cb
Fix -Wdocumentation warnings. · caa776be
Rafael Espindola authored Jul 28, 2013
```
llvm-svn: 187336
```
caa776be

Update comments for SSAUpdater to use the modern doxygen comment · 6b55dbea

Chandler Carruth authored Jul 28, 2013

standards for LLVM. Remove duplicated comments on the interface from the
implementation file (implementation comments are left there of course).
Also clean up, re-word, and fix a few typos and errors in the comments
spotted along the way.

This is in preparation for changes to these files and to keep the
uninteresting tidying in a separate commit.

llvm-svn: 187335

6b55dbea

Jul 28, 2013

Temporarily revert r187323 until I update SSAUpdater to match mem2reg. · d31370e0
Chandler Carruth authored Jul 28, 2013
```
I forgot that we had two totally independent things here. :: sigh ::

llvm-svn: 187327
```
d31370e0
Added encoding prefixes for KNL instructions (EVEX). · 003e7d73
Elena Demikhovsky authored Jul 28, 2013
```
Added 512-bit operands printing.
Added instruction formats for KNL instructions.

llvm-svn: 187324
```
003e7d73

Now that mem2reg understands how to cope with a slightly wider set of · 9d96100f

Chandler Carruth authored Jul 28, 2013

uses of an alloca, we can pre-compute promotability while analyzing an
alloca for splitting in SROA. That lets us short-circuit the common case
of a bunch of trivially promotable allocas. This cuts 20% to 30% off the
run time of SROA for typical frontend-generated IR sequences I'm seeing.
It gets the new SROA to within 20% of ScalarRepl for such code. My
current benchmark for these numbers is PR15412, but it fits the general
pattern of IR emitted by Clang so it should be widely applicable.

llvm-svn: 187323

9d96100f

Thread DataLayout through the callers and into mem2reg. This will be · d5b806a2

Chandler Carruth authored Jul 28, 2013

useful in a subsequent patch, but causes an unfortunate amount of noise,
so I pulled it out into a separate patch.

llvm-svn: 187322

d5b806a2

[PowerPC] Add comment explaining preprocessor directive. · 40f78a2a
Bill Schmidt authored Jul 28, 2013
```
llvm-svn: 187320
```
40f78a2a
Revert 187318 · 20573225
Bill Schmidt authored Jul 28, 2013
```
llvm-svn: 187319
```
20573225

[PowerPC] Remove unnecessary preprocessor checking. · f5b32e39

Bill Schmidt authored Jul 28, 2013

The tests !defined(__ppc__) && !defined(__powerpc__) are not needed
or helpful when verifying that code is being compiled for a 64-bit
target.  The simpler test provided by this revision is sufficient to
tell if the target is 64-bit.

llvm-svn: 187318

f5b32e39

Update the comment · 3e50c689
Nadav Rotem authored Jul 27, 2013
```
llvm-svn: 187316
```
3e50c689

Jul 27, 2013

[APFloat] Make all arithmetic operations with NaN produce positive NaNs. · b0e688e8

Michael Gottesman authored Jul 27, 2013

IEEE-754R 1.4 Exclusions states that IEEE-754R does not specify the
interpretation of the sign of NaNs. In order to remove an irrelevant
variable that most floating point implementations do not use,
standardize add, sub, mul, div, mod so that operating anything with
NaN always yields a positive NaN.

In a later commit I am going to update the APIs for creating NaNs so
that one cannot even create a negative NaN.

llvm-svn: 187314

b0e688e8
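
The intended behavior can be sketched with ordinary `double` arithmetic instead of APFloat. This is an illustration under that assumption, not the APFloat code; `canonicalizeNaN` and `addPositiveNaN` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: if R is a NaN, return a NaN with the sign bit clear;
// any other value passes through unchanged.
inline double canonicalizeNaN(double R) {
  if (std::isnan(R))
    return std::copysign(std::numeric_limits<double>::quiet_NaN(), 1.0);
  return R;
}

// Addition whose NaN results are always positive, whatever the operand signs.
inline double addPositiveNaN(double A, double B) {
  return canonicalizeNaN(A + B);
}
```

The same wrapper shape would apply to sub, mul, div and mod: compute the result, then clear the sign if it is a NaN.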

[APFloat] Move setting fcNormal in zeroSignificand() to calling code. · 30a90eb1

Michael Gottesman authored Jul 27, 2013

Zeroing the significand of a floating point number does not necessarily cause a
floating point number to become finite non zero. For instance, if one has a NaN,
zeroing the significand will cause it to become +/- infinity.

llvm-svn: 187313

30a90eb1
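
The NaN case can be checked on the bit pattern of a `double`. The sketch assumes the host uses IEEE-754 binary64 for `double`; APFloat stores its category and significand differently, and `zeroFractionBits` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Clear the 52 fraction bits of a binary64 value, keeping the sign and
// exponent bits as they are.
inline double zeroFractionBits(double X) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);
  Bits &= ~((std::uint64_t(1) << 52) - 1);
  std::memcpy(&X, &Bits, sizeof X);
  return X;
}
```

A NaN has an all-ones exponent and a non-zero fraction, so clearing the fraction leaves the encoding of an infinity rather than a finite non-zero number.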

Minor code simplification suggested by Duncan · 517cf483
Matt Arsenault authored Jul 27, 2013
```
llvm-svn: 187309
```
517cf483
DwarfDebug: MD5 is always little endian, bswap on big endian platforms. · 409afcf1
Benjamin Kramer authored Jul 27, 2013
```
This makes LLVM emit the same signature regardless of host and target endianness.

llvm-svn: 187304
```
409afcf1
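
One way to get a host-independent value out of little-endian bytes is sketched below. Note that this is a different technique from the commit, which byte-swaps on big-endian platforms: the sketch assembles the value byte by byte so no swap is needed. `readLE64` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assemble a 64-bit value from 8 bytes stored least-significant byte first.
// The shifts work on values, not memory, so the result is the same on
// little-endian and big-endian hosts.
inline std::uint64_t readLE64(const unsigned char *P) {
  std::uint64_t V = 0;
  for (std::size_t I = 0; I < 8; ++I)
    V |= std::uint64_t(P[I]) << (8 * I);
  return V;
}
```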

Create a constant pool symbol for the GOT in the ARMCGBR the same way we · 26ad41ed

Chandler Carruth authored Jul 27, 2013

do in the SDag when lowering references to the GOT: use
ARMConstantPoolSymbol rather than creating a dummy global variable. The
computation of the alignment still feels weird (it uses IR types and
datalayout) but it preserves the exact previous behavior. This change
fixes the memory leak of the global variable detected on the valgrind
leak checking bot.

Thanks to Benjamin Kramer for pointing me at ARMConstantPoolSymbol to
handle this use case.

llvm-svn: 187303

26ad41ed

Fix yet another memory leak found by the vg-leak bot. Folks (including · 1c82d331

Chandler Carruth authored Jul 27, 2013

me) should start watching this bot more as it's catching lots of bugs.

The fix here is to not construct the global if we aren't going to need
it. That's cheaper anyways, and globals have highly predictable types in
practice. I've added an assert to catch skew between our manual testing
of the type and the actual type just for paranoia's sake.

Note that this pattern is actually fine in most globals because when you
build a global with a module it automatically is moved to be owned by
that module. But here, we're in isel and don't really want to do that.
The solution of not creating a global is simpler anyways.

llvm-svn: 187302

1c82d331

Fix a memory leak in the debug emission by simply not allocating memory. · 2a1c0d2c

Chandler Carruth authored Jul 27, 2013

There doesn't appear to be any reason to put this variable on the heap.
I'm suspicious of the LexicalScope above that we stuff in a map and then
delete afterward, but I'm just trying to get the valgrind bot clean.

llvm-svn: 187301

2a1c0d2c

Fix a memory leak in the hexagon scheduler. We call initialize here more · c18e39ca

Chandler Carruth authored Jul 27, 2013

than once, and the second time through we leaked memory. Found thanks to
the vg-leak bot, but I can't locally reproduce it with valgrind. The
debugger confirms that it is in fact leaking here.

This whole code is totally gross. Why is initialize being called on each
runOnFunction??? Why aren't these OwningPtr<>s, and why aren't their
lifetimes better defined? Anyways, this is just a surgical change to
help out the leak checking bots.

llvm-svn: 187299

c18e39ca
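
The leak pattern and the owning-pointer alternative the message asks about can be sketched as follows. `std::unique_ptr` stands in for the `OwningPtr<>` mentioned above, and `Resource`, `Scheduler` and `liveCount` are hypothetical names, not the Hexagon scheduler's.

```cpp
#include <cassert>
#include <memory>

// Counts live Resource objects so that a leak shows up as a non-zero count.
inline int &liveCount() {
  static int N = 0;
  return N;
}

struct Resource {
  Resource() { ++liveCount(); }
  ~Resource() { --liveCount(); }
};

class Scheduler {
  std::unique_ptr<Resource> Res;

public:
  // Safe to call repeatedly: reset() destroys the previous Resource, whereas
  // assigning `new Resource()` to a raw pointer would leak it.
  void initialize() { Res.reset(new Resource()); }
};
```

Calling `initialize()` twice on one `Scheduler` leaves a single live `Resource`, and none once the scheduler is destroyed.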

Don't use all the #ifdefs to hide the stats counters and instead rely on · 8e3c4dc5

Chandler Carruth authored Jul 27, 2013

their being optimized out in debug mode. Realistically, this just isn't
going to be the slow part anyways. This also fixes unused variable
warnings that are breaking LLD build bots. =/ I didn't see these at
first, and kept losing track of the fact that they were broken.

llvm-svn: 187297

8e3c4dc5

Merge the removal of dead instructions and lifetime markers with the · e8f5812a

Chandler Carruth authored Jul 27, 2013

analysis of the alloca. We don't need to visit all the users twice for
this. We build up a kill list during the analysis and then just process
it afterward. This recovers the tiny bit of performance lost by moving
to the visitor based analysis system as it removes one entire use-list
walk from mem2reg. In some cases, this is now faster than mem2reg was
previously.

llvm-svn: 187296

e8f5812a

Debug Info Verifier: verify SPs in llvm.dbg.sp. · 921382ed

Manman Ren authored Jul 27, 2013

Also always add DIType, DISubprogram and DIGlobalVariable to the list
in DebugInfoFinder without checking them, so we can verify them later
on.

llvm-svn: 187285

921382ed

Also update CMakeLists.txt for r187283. · cd1e8930
Nick Lewycky authored Jul 27, 2013
```
llvm-svn: 187284
```
cd1e8930

Reimplement isPotentiallyReachable to make nocapture deduction much stronger. · 0b68245e

Nick Lewycky authored Jul 27, 2013

Adds unit tests for it too.

Split BasicBlockUtils into an analysis-half and a transforms-half, and put the
analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable
into llvm::isPotentiallyReachable and move it into Analysis/CFG.

llvm-svn: 187283

0b68245e
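
The core idea of a CFG reachability query can be sketched with a worklist over successor lists. This is a generic graph search, not the `llvm::isPotentiallyReachable` implementation or its signature; `Graph` and `isReachable` are hypothetical, and blocks are plain indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Succs[B] lists the successor blocks of block B.
using Graph = std::vector<std::vector<std::size_t>>;

// Returns true if To can be reached from From by following successor edges.
// In this sketch a block trivially reaches itself.
inline bool isReachable(const Graph &Succs, std::size_t From, std::size_t To) {
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::size_t> Worklist{From};
  while (!Worklist.empty()) {
    std::size_t BB = Worklist.back();
    Worklist.pop_back();
    if (BB == To)
      return true;
    if (Visited[BB])
      continue; // already expanded, avoids looping on cycles
    Visited[BB] = true;
    for (std::size_t S : Succs[BB])
      Worklist.push_back(S);
  }
  return false;
}
```

A conservative analysis may answer "potentially reachable" whenever it cannot prove otherwise; the sketch above simply gives the exact answer for a small graph.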

SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions · 8b1e021e

Tom Stellard authored Jul 27, 2013

Merge consecutive if-regions if they contain identical statements.
Both transformations reduce number of branches.  The transformation
is guarded by a target-hook, and is currently enabled only for +R600,
but the correctness has been tested on X86 target using a variety of
CPU benchmarks.

Patch by: Mei Ye

llvm-svn: 187278

8b1e021e

SLP Vectorizer: Don't vectorize really short chains because they are already... · cfd40da9

Nadav Rotem authored Jul 26, 2013

SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.

llvm-svn: 187267

cfd40da9

SLP Vectorizer: Disable the vectorization of non power of two chains, such as... · 9ce0f779

Nadav Rotem authored Jul 26, 2013

SLP Vectorizer: Disable the vectorization of non power of two chains, such as <3 x float>, because we don't have a good cost model for these types.

llvm-svn: 187265

9ce0f779
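
A check of this kind can be sketched in one line. The helper below is hypothetical and only shows the power-of-two test on an element count, not how the SLP vectorizer applies it.

```cpp
#include <cassert>
#include <cstddef>

// True when N is a power of two: exactly one bit is set in N.
inline bool isPowerOfTwo(std::size_t N) {
  return N != 0 && (N & (N - 1)) == 0;
}
```

Under this test a chain of 3 elements, as in `<3 x float>`, would be skipped, while chains of 2 or 4 elements would still be considered.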

Revert "[PowerPC] Improve consistency in use of __ppc__, __powerpc__, etc." · 05b5a46e
Rafael Espindola authored Jul 26, 2013
```
This reverts commit r187248. It broke many bots.

llvm-svn: 187254
```
05b5a46e