- Oct 29, 2013
Alexey Samsonov authored
Summary: Use the DWARF4 table of form classes to fetch attributes from a DIE in a more consistent way. This shouldn't change the functionality and serves as a refactoring for an upcoming change: DW_AT_high_pc has different semantics depending on its form class.

Reviewers: dblaikie, echristo

Reviewed By: echristo

CC: echristo, llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1961

llvm-svn: 193553
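For context on the DW_AT_high_pc point, a small self-contained sketch of the DWARF4 rule; the types below are hypothetical stand-ins, not LLVM's DebugInfo API:

    #include <cstdint>

    // DWARF4: DW_AT_high_pc of form class "address" is an absolute address,
    // while form class "constant" holds an offset to add to DW_AT_low_pc.
    enum class FormClass { Address, Constant, Other };

    struct AttrValue {
      FormClass Class;
      uint64_t Value;
    };

    uint64_t getHighPC(uint64_t LowPC, AttrValue HighPC) {
      switch (HighPC.Class) {
      case FormClass::Address:  return HighPC.Value;          // absolute address
      case FormClass::Constant: return LowPC + HighPC.Value;  // offset form
      default:                  return 0;                     // unexpected class
      }
    }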
- Oct 28, 2013
Akira Hatanaka authored
No functionality change. llvm-svn: 193540
Lang Hames authored
When the unconditional branch target is an MCExpr, avoid writing an encoded zero value in the immediate field.

When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value, as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier.

Fixes <rdar://problem/15155975>.

llvm-svn: 193535
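To make the "polluted bits" failure mode concrete, a toy example with a made-up encoding in which zero encodes to a non-zero bit pattern; this is purely illustrative and not the real ARM branch encoding:

    #include <cassert>
    #include <cstdint>

    // Made-up 4-bit field where the value V is encoded as V ^ 0b11,
    // so an "encoded zero" is the non-zero pattern 0b11.
    constexpr uint32_t encodeImm(uint32_t V) { return V ^ 0x3; }

    int main() {
      uint32_t Polluted = 0;
      Polluted |= encodeImm(0); // writing encoded zero now sets stray bits
      Polluted |= encodeImm(5); // a later fixup ORs in the real target
      assert((Polluted & 0xF) != encodeImm(5)); // the stray bits corrupt the value

      uint32_t Clean = 0;       // leave the field untouched instead
      Clean |= encodeImm(5);
      assert((Clean & 0xF) == encodeImm(5));    // the fixup lands correctly
      return 0;
    }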
Logan Chien authored
This commit allows the ARM integrated assembler to parse and assemble code with the .eabi_attribute, .cpu, and .fpu directives.

To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added.

Besides, this commit also changes Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of the .fpu directive.

This commit changes the test cases:

* Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already covers the functionality.

* In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has been changed from 1 to 2, which is more precise.

llvm-svn: 193524
Nuno Lopes authored
llvm-svn: 193523
Richard Sandiford authored
useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here.

The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that the optimization works top-down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store).

I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test.

llvm-svn: 193521
Richard Sandiford authored
Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518
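The conservative rule being added is, roughly, the following; a self-contained sketch with simplified types, not the actual DAGCombiner alias-analysis code:

    #include <cstdint>

    struct MemAccess {
      uint64_t Addr;
      uint64_t Size;
      bool IsVolatile;
    };

    // Volatile accesses must never be reported as non-aliasing, so the chain
    // rewriting enabled by useAA() has to leave them alone.
    bool mayAlias(const MemAccess &A, const MemAccess &B) {
      if (A.IsVolatile || B.IsVolatile)
        return true; // assume the worst for volatile accesses
      // Non-volatile accesses can be disambiguated by offset and size.
      return A.Addr < B.Addr + B.Size && B.Addr < A.Addr + A.Size;
    }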
Richard Sandiford authored
Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over.

The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info.

llvm-svn: 193517
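A minimal, self-contained sketch of the two styles described above, using simplified stand-in types rather than the real SelectionDAG/MachineMemOperand API:

    #include <cstdint>

    struct MemOperand {          // bundles size, flags, alignment and the TBAA tag
      uint64_t Size;
      unsigned Align;
      bool IsVolatile, IsNonTemporal;
      const void *TBAATag;       // opaque TBAA metadata
    };

    // Recreating a load unchanged: hand over the whole operand, TBAA included.
    MemOperand recreateLoad(const MemOperand &Orig) { return Orig; }

    // Adjusting the load (e.g. narrowing it): recompute the changed fields,
    // but still carry the TBAA tag forward.
    MemOperand narrowLoad(const MemOperand &Orig, uint64_t NewSize,
                          unsigned NewAlign) {
      MemOperand M = Orig;
      M.Size = NewSize;
      M.Align = NewAlign;
      return M;                  // M.TBAATag deliberately preserved
    }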
Benjamin Kramer authored
We can't do this for the general case, as claiming that a GEP with a negative index doesn't have unsigned wrap isn't valid:

    %gep = getelementptr inbounds i32* %p, i64 -1

But an inbounds GEP cannot run past the end of the address space. So we check for the very common case of a positive index and make GEPs derived from that NUW. Together with Andy's recent non-unit stride work this lets us analyze loops like:

    void foo3(int *a, int *b) {
      for (; a < b; a++) {}
    }

PR12375, PR12376.

Differential Revision: http://llvm-reviews.chandlerc.com/D2033

llvm-svn: 193514
NAKAMURA Takumi authored
llvm-svn: 193512
NAKAMURA Takumi authored
llvm-svn: 193511
NAKAMURA Takumi authored
llvm-svn: 193510
- Oct 27, 2013
Reed Kotler authored
Before, I had just ported the shell of the pass. I've tried to keep everything nearly identical to the ARM version. I think it will be very easy to eventually merge these two and create a new, more general pass that other targets can use. I have some improvements I would like to make to allow pools to be shared across functions, and some other things. When I'm all done we can think about making a more general pass. More is still to be ported, but the basic mechanism now works almost as well as gcc mips16.

llvm-svn: 193509
Benjamin Kramer authored
llvm-svn: 193500
Benjamin Kramer authored
llvm-svn: 193499
Elena Demikhovsky authored
Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193497
Shuxin Yang authored
llvm-svn: 193489
- Oct 26, 2013
Wan Xiaofei authored
This patch implements quick look-up for blocks in a loop by maintaining a hash set of blocks. It improves the efficiency of loop analysis a lot; the biggest improvement could be 5-6% (458.sjeng). Below are the compilation times for our benchmarks in llc before and after the patch.

Benchmark     llc - trunk           llc - patched
401.bzip2     0.339081   100.00%    0.329657   102.86%
403.gcc       19.853966  100.00%    19.605466  101.27%
429.mcf       0.049823   100.00%    0.048451   102.83%
433.milc      0.514898   100.00%    0.510217   100.92%
444.namd      1.109328   100.00%    1.103481   100.53%
445.gobmk     4.988028   100.00%    4.929114   101.20%
456.hmmer     0.843871   100.00%    0.825865   102.18%
458.sjeng     0.754238   100.00%    0.714095   105.62%
464.h264ref   2.9668     100.00%    2.90612    102.09%
471.omnetpp   4.556533   100.00%    4.511886   100.99%
bitmnp01      0.038168   100.00%    0.0357     106.91%
idctrn01      0.037745   100.00%    0.037332   101.11%
libquake2     3.78689    100.00%    3.76209    100.66%
libquake_     2.251525   100.00%    2.234104   100.78%
linpack       0.033159   100.00%    0.032788   101.13%
matrix01      0.045319   100.00%    0.043497   104.19%
nbench        0.333161   100.00%    0.329799   101.02%
tblook01      0.017863   100.00%    0.017666   101.12%
ttsprk01      0.054337   100.00%    0.053057   102.41%

Reviewer: Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
Approver: Andrew Trick <atrick@apple.com>
Test: Passes make check-all & the llvm test-suite.

llvm-svn: 193460
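A minimal sketch of the data-structure change, assuming the usual vector-plus-hash-set pattern; this is illustrative and not the actual LoopBase implementation:

    #include <unordered_set>
    #include <vector>

    struct BasicBlock;

    class LoopBlocks {
      std::vector<BasicBlock *> Blocks;           // keeps iteration order
      std::unordered_set<BasicBlock *> BlockSet;  // fast membership test

    public:
      void addBlock(BasicBlock *BB) {
        Blocks.push_back(BB);
        BlockSet.insert(BB);
      }

      // Hot query in loop analysis: previously a linear scan over Blocks.
      bool contains(const BasicBlock *BB) const {
        return BlockSet.count(const_cast<BasicBlock *>(BB)) != 0;
      }
    };

The vector preserves the original block order for iteration, while the set turns the frequently-used contains() query into an O(1) lookup instead of a linear search.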
- Oct 25, 2013
Andrew Trick authored
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3).

When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would have to guarantee that the chained AddRecs are in a perfectly reduced form.

llvm-svn: 193438
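For reference on why binomial coefficients come up: a chained add recurrence {A_0,+,A_1,+,...,+,A_n} evaluated at iteration i is

    A_0*C(i,0) + A_1*C(i,1) + ... + A_n*C(i,n)

so the affine case {A_0,+,A_1} reduces to A_0 + A_1*i, but once the stride is itself a recurrence there is no single "stride" to scale by.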
Andrew Trick authored
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3).

ScalarEvolutionNormalization was attempting to normalize by adding and subtracting strides. Chained recurrences don't work that way.

llvm-svn: 193437
Rafael Espindola authored
This patch teaches GlobalStatus to analyze a call that uses the global value as a callee, not as an argument. With this change, internalize can handle the common use of linkonce_odr functions. This reduces the number of linkonce_odr functions in an LTO build of clang (checked with the emit-llvm gold plugin option) from 1730 to 60.

llvm-svn: 193436
Hal Finkel authored
The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this:

    %58 = extractelement <2 x i32> %15, i32 0
    %59 = extractelement i32 %58, i32 0

where the second extractelement is illegal because its first operand is not a vector.

llvm-svn: 193434
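A sketch of the stricter legality check implied here, written against the ordinary LLVM IR API; this is an illustration, not necessarily the exact LoopVectorize code:

    #include "llvm/IR/Instruction.h"

    using namespace llvm;

    // Reject an instruction if it produces *or consumes* a vector value, so
    // extractelement (vector operand, scalar result) is also caught.
    static bool touchesVectorValues(const Instruction &I) {
      if (I.getType()->isVectorTy())
        return true;
      for (const Value *Op : I.operands())
        if (Op->getType()->isVectorTy())
          return true;
      return false;
    }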
David Blaikie authored
llvm-svn: 193432
Rafael Espindola authored
llvm-svn: 193429
David Blaikie authored
llvm-svn: 193427
Quentin Colombet authored
Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193422
Quentin Colombet authored
Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193421
Rafael Espindola authored
This fixes a memory leak found by valgrind. Calling it from the base class destructor would not destroy the BasicCallGraph bits.

FIXME: BasicCallGraph is the only thing that inherits from CallGraph. Can we merge the two?

llvm-svn: 193412
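The CallGraph specifics aside, the underlying C++ rule is that virtual dispatch inside a base-class destructor never reaches the derived override, because the derived part has already been destroyed; a generic illustration:

    #include <iostream>

    struct Base {
      virtual void releaseResources() { std::cout << "Base cleanup\n"; }
      // By the time ~Base runs, the Derived part is gone, so this virtual
      // call binds to Base::releaseResources, not the override.
      virtual ~Base() { releaseResources(); }
    };

    struct Derived : Base {
      void releaseResources() override { std::cout << "Derived cleanup\n"; }
      ~Derived() override {} // derived cleanup has to happen here (or earlier)
    };

    int main() {
      Base *P = new Derived();
      delete P; // prints "Base cleanup" only; the Derived bits are never released
    }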
Tim Northover authored
When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403
Yaron Keren authored
llvm-svn: 193402
Tim Northover authored
There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399
Tim Northover authored
ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions. The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable. llvm-svn: 193398
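For the alternative mentioned above (expanding in terms of cmpxchg), a generic, target-independent sketch using C++ standard atomics; this illustrates the expansion pattern only and is not LLVM's lowering code:

    #include <atomic>

    // Atomic max built from a compare-exchange loop.
    int atomicMax(std::atomic<int> &A, int V) {
      int Old = A.load(std::memory_order_relaxed);
      while (Old < V &&
             !A.compare_exchange_weak(Old, V, std::memory_order_seq_cst,
                                      std::memory_order_relaxed)) {
        // On failure, Old is reloaded with the current value; retry.
      }
      return Old;
    }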
Nadav Rotem authored
This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393
Yuchen Wu authored
llvm-svn: 193390
Yuchen Wu authored
llvm-cov will now be able to read program counts from the GCDA file and output them in the same format as gcov. The program summary tag was identified from gcov-io.h as "\0\0\0\a3". There is currently a bug in GCOVProfiling.cpp which does not generate the run- or program-counting IR, so this change was tested manually by modifying the GCDA file and comparing the gcov and llvm-cov outputs.

llvm-svn: 193389
Jim Grosbach authored
Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381
David Blaikie authored
MCStreamer: Reimplement the virtual EmitRawText as a protected member, EmitRawTextImpl, to avoid string literal ambiguities.

Also improve the implementation of EmitRawText(Twine) so it doesn't bother using the SmallString buffer if the Twine is a simple StringRef anyway.

llvm-svn: 193378
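A simplified sketch of the Twine handling described above (not necessarily the exact MCStreamer code); Twine::toStringRef only copies into the buffer when the Twine is not already a single contiguous StringRef:

    #include "llvm/ADT/SmallString.h"
    #include "llvm/ADT/Twine.h"

    using namespace llvm;

    void emitRawText(const Twine &T) {
      SmallString<128> Buf;
      StringRef Str = T.toStringRef(Buf); // no copy if T wraps a plain StringRef
      // ... forward Str to the protected EmitRawTextImpl(StringRef) hook ...
      (void)Str;
    }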
David Blaikie authored
The default case at the end of the switch handles this just fine. llvm-svn: 193374
- Oct 24, 2013
Eric Christopher authored
llvm-svn: 193373
Eric Christopher authored
llvm-svn: 193372