Commits · 3ef5e46b6d7ef95435328936864f509fa7f62880 · Roger Ferrer / llvm-epi-0.8

Mar 10, 2014

MemCpyOpt: When merging memsets also merge the trivial case of two memsets... · 3ef5e46b

Benjamin Kramer authored Mar 10, 2014

MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination.

The testcase is from PR19092, but I think the bug described there is actually a clang issue.
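
As an illustration of what merging memsets with the same destination means, here is a minimal sketch in Python. It is a toy model over `(start, length)` byte ranges, not the pass's actual code; the function name `merge_memsets` is hypothetical, and the sketch assumes every range stores the same fill byte, which is what makes reordering and merging safe.

```python
def merge_memsets(memsets):
    """Merge (start, length) byte ranges that all store the same fill value.

    Ranges that overlap, touch, or share a start offset collapse into one.
    """
    merged = []  # list of [start, end) pairs, kept sorted by start
    for start, length in sorted(memsets):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent: extend the previous range.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


# Two memsets with the same destination: the larger one subsumes the smaller.
print(merge_memsets([(0, 4), (0, 8)]))  # prints [(0, 8)]
```

Ranges separated by a gap, such as `(0, 4)` and `(8, 4)`, are left as two separate memsets in this model.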

llvm-svn: 203489

3ef5e46b

For functions with ARM target specific calling convention, when simplify-libcall · 0e8f4612

Evan Cheng authored Mar 10, 2014

optimizes a call to an LLVM intrinsic into something that involves a call to a C
library function, make sure it sets the right calling convention on the call.

e.g.
extern double pow(double, double);
double t(double x) {
  return pow(10, x);
}

Compiles to something like this for AAPCS-VFP:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
  ret double %0
}

declare double @llvm.pow.f64(double, double) #1

Simplify libcall (part of instcombine) will turn the above into:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %__exp10 = call double @__exp10(double %x) #1
  ret double %__exp10
}

declare double @__exp10(double)

The pre-instcombine code works because calls to LLVM builtins are special.
Instruction selection will choose the right calling convention for the call.
However, the code after instcombine is wrong. The call to __exp10 will use
the C calling convention.

I can think of 3 options to fix this.

1. Make "C" calling convention just work since the target should know what CC
   is being used.

   This doesn't work because each function can use a different CC with the "pcs"
   attribute.

2. Have Clang add the right CC keyword on the calls to LLVM builtin.

   This will work but it doesn't match the LLVM IR specification which states
   these are "Standard C Library Intrinsics".

3. Fix simplify libcall so the resulting calls to the C routines will have the
   proper CC keyword. e.g.
   %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1

   This works and is the solution I implemented here.

Both solutions #2 and #3 would work. After carefully considering the pros and
cons, I decided to implement #3 for the following reasons.

1. It doesn't change the "spec" of the intrinsics.
2. It's a self-contained fix.

There are a couple of potential downsides.
1. There could be other places in the optimizer that are broken in the same way
   and are not addressed by this.
2. There could be other calling conventions that need to be propagated by
   simplify-libcall and are not handled.

But for now, this is the fix that I'm most comfortable with.
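
Option 3 can be sketched with a small Python toy model. This is not the simplify-libcall code: the `Call` class and the `simplify_pow` and `render` helpers are hypothetical, and the sketch assumes the replacement call takes the calling convention of the enclosing function, as in the `arm_aapcs_vfpcc` example above.

```python
from dataclasses import dataclass


@dataclass
class Call:
    callee: str
    args: tuple
    cc: str = ""  # empty string models the default C calling convention


def simplify_pow(call, caller_cc):
    """Rewrite pow(10, x) to __exp10(x), carrying over the caller's CC."""
    if call.callee == "llvm.pow.f64" and call.args[0] == 10.0:
        # The new library call is an ordinary call, so it needs an explicit CC.
        return Call("__exp10", (call.args[1],), cc=caller_cc)
    return call


def render(call):
    """Print the call roughly the way it would appear in textual IR."""
    cc = call.cc + " " if call.cc else ""
    args = ", ".join("double " + str(a) for a in call.args)
    return "call " + cc + "double @" + call.callee + "(" + args + ")"


before = Call("llvm.pow.f64", (10.0, "%x"))
after = simplify_pow(before, "arm_aapcs_vfpcc")
print(render(after))  # prints call arm_aapcs_vfpcc double @__exp10(double %x)
```

Calls that do not match the pattern are returned unchanged, so the intrinsic call keeps its empty (default) convention in this model.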

llvm-svn: 203488

0e8f4612

Followup to r203483 - add test. · d47a5c2d
Eli Bendersky authored Mar 10, 2014
```
[forgot to 'svn add' before committing r203483]

llvm-svn: 203485
```
d47a5c2d

[mips] Implement NaCl sandboxing of loads, stores and SP changes: · 5fddf610

Sasa Stankovic authored Mar 10, 2014

  * Add masking instructions before loads and stores (in MC layer).
  * Add masking instructions after SP changes (in MC layer).
  * Forbid loads, stores and SP changes in delay slots (in MI layer).

Differential Revision: http://llvm-reviews.chandlerc.com/D2904

llvm-svn: 203484

5fddf610

[bugpoint] Add testcase for r203343. · 47492919
Adam Nemet authored Mar 10, 2014
```
llvm-svn: 203472
```
47492919
Fix regression with -O0 for mips. · 96b7402b
Reed Kotler authored Mar 10, 2014
```
llvm-svn: 203469
```
96b7402b
Add test for LinkModules warning on triple, modified by r203009. Datalayout is already tested. · 76086c66
JF Bastien authored Mar 10, 2014
```
llvm-svn: 203468
```
76086c66
[mips] Assembly parser must invoke the target streamer to handle .set reorder macro. · 64459d29
Matheus Almeida authored Mar 10, 2014
```
llvm-svn: 203459
```
64459d29

AArch64: fix LowerCONCAT_VECTORS for new CodeGen. · 2a661f3f

Tim Northover authored Mar 10, 2014

The function was making too many assumptions about its input:

1. The NEON_VDUP optimisation was far too aggressive, assuming (I
think) that the input would always be BUILD_VECTOR.

2. We were treating most unknown concats as legal (by returning Op
rather than SDValue()). I think only concats of pairs of vectors are
actually legal.

http://llvm.org/PR19094

llvm-svn: 203450

2a661f3f

[Sparc] Add support for decoding 'swap' instruction. · f703132b
Venkatraman Govindaraju authored Mar 09, 2014
```
llvm-svn: 203424
```
f703132b

Mar 09, 2014

Revert r203230, "CodeGenPrep: sink extends of illegal types into use block." · 1783e1e9
NAKAMURA Takumi authored Mar 09, 2014
```
It choked i686 stage2.

llvm-svn: 203386
```
1783e1e9

IR: Change inalloca's grammar a bit · c4ab61cb

David Majnemer authored Mar 09, 2014

The grammar for LLVM IR is not well specified in any document but seems
to obey the following rules:

 - Attributes which have parenthesized arguments are never preceded by
   commas.  This form of attribute is the only one which ever has
   optional arguments.  However, not all of these attributes support
   optional arguments: 'thread_local' supports an optional argument but
   'addrspace' does not.  Interestingly, 'addrspace' is documented as
   being a "qualifier".  What constitutes a qualifier?  I cannot find a
   definition.

 - Some attributes use a space between the keyword and the value.
   Examples of this form are 'align' and 'section'.  These are always
   preceded by a comma.

 - Otherwise, the attribute has no argument.  These attributes do not
   have a preceding comma.

Sometimes an attribute goes before the instruction, between the
instruction and its type, or after its type.  'atomicrmw' has
'volatile' between the instruction and the type while 'call' has 'tail'
preceding the instruction.

With all this in mind, it seems most consistent for 'inalloca' on an
'alloca' instruction to occur between the instruction and the
type.  Unlike the current formulation, there would be no preceding
comma.  The combination 'alloca inalloca' doesn't look particularly
appetizing; perhaps a better spelling of 'inalloca' is down the road.

llvm-svn: 203376

c4ab61cb

Mar 08, 2014

Update comment from r203315 based on review · 42030397
Adam Nemet authored Mar 08, 2014
```
llvm-svn: 203361
```
42030397
DebugInfo: further improvements to test following up on r203329 · 078278fe
David Blaikie authored Mar 08, 2014
```
llvm-svn: 203337
```
078278fe
DebugInfo: Fix test fallout from r203323 · f528f054
David Blaikie authored Mar 08, 2014
```
Will fix this harder in a moment.

llvm-svn: 203329
```
f528f054
DebugInfo: Use DW_FORM_data4 for DW_AT_high_pc in DW_TAG_lexical_blocks · 26ab6c6d
David Blaikie authored Mar 08, 2014
```
Suggested by Adrian Prantl in code review for r203187

llvm-svn: 203323
```
26ab6c6d
Add support for hashing location information for CU level hashes. · 4f17ee09
Eric Christopher authored Mar 08, 2014
```
Add a testcase based on sret.cpp where we can now hash the entire
compile unit.

llvm-svn: 203319
```
4f17ee09

[DAGCombiner] Recognize another rotation idiom · 5117f5df

Adam Nemet authored Mar 07, 2014

This is the new idiom:

  x<<(y&31) | x>>((0-y)&31)

which is recognized as:

  x ROTL (y&31)

The change refines matchRotateSub.  In
Neg & (OpSize - 1) == (OpSize - Pos) & (OpSize - 1), if Pos is
Pos' & (OpSize - 1) we can just use Pos' instead of Pos.
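
The equivalence behind this idiom can be checked with a short Python sketch. It models 32-bit values by masking; the helper names `rotl32` and `rotate_idiom` are made up for illustration and are not taken from the DAGCombiner.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as a uint32_t would


def rotl32(x, n):
    """Reference rotate-left of a 32-bit value by n (0 <= n <= 31)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def rotate_idiom(x, y):
    """The source-level idiom: x<<(y&31) | x>>((0-y)&31)."""
    return ((x << (y & 31)) | (x >> ((0 - y) & 31))) & MASK32


samples = [0, 1, 0x80000001, 0xDEADBEEF, MASK32]
# The idiom agrees with a rotate by (y & 31) for every shift amount tried.
assert all(rotate_idiom(x, y) == rotl32(x, y & 31)
           for x in samples for y in range(64))
print(hex(rotate_idiom(0x80000001, 1)))  # prints 0x3
```

Masking both shift amounts with 31 keeps each shift in range, including the `y == 0` case where an unmasked `32 - y` would shift by the full width.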

llvm-svn: 203315

5117f5df

ISel: Make VSELECT selection terminate in cases where the condition type has to · d33e9429

Arnold Schwaighofer authored Mar 07, 2014

be split and the result type widened.

When the condition of a vselect has to be split it makes no sense widening the
vselect and thereby widening the condition. We end up in an endless loop of
widening (vselect result type) and splitting (condition mask type) doing this.
Instead, split both the condition and the vselect and widen the result.

I ran this over the test suite with i686 and mattr=+sse and saw no regressions.

Fixes PR18036.

llvm-svn: 203311

d33e9429

Remove unnecessary test for Darwin and update testcase to be a little less · 887e7078
Adrian Prantl authored Mar 07, 2014
```
horrible/fragile.
rdar://problem/16264854

llvm-svn: 203309
```
887e7078

Mar 07, 2014

Moved test file from test/MC/Mips to test/CodeGen/Mips. · 1e50b46b
Sasa Stankovic authored Mar 07, 2014
```
llvm-svn: 203298
```
1e50b46b
DebugInfo: Use DW_FORM_data4 for DW_AT_high_pc in inlined functions · 555e79a3
David Blaikie authored Mar 07, 2014
```
Suggested by Adrian Prantl in code review for r203187.

llvm-svn: 203296
```
555e79a3
DebugInfo: Update test to cover linux (with a FIXME...) too · 3e4ff7a9
David Blaikie authored Mar 07, 2014
```
llvm-svn: 203295
```
3e4ff7a9
R600/SI: Using SGPRs is illegal for instructions that read carry-out from VCC · e28859f8
Tom Stellard authored Mar 07, 2014
```
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 203281
```
e28859f8

R600/SI: Custom lower i1 stores · 1c8788ef

Tom Stellard authored Mar 07, 2014



These are sometimes created by the shrink to boolean optimization in the
globalopt pass.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 203280

1c8788ef

DebugInfo: Restrict DW_AT_high_pc encoding as data4 offset to DWARF 4 as per spec · d723f518
David Blaikie authored Mar 07, 2014
```
Code review feedback to r203187 from Oliver Stannard. Thanks!

llvm-svn: 203256
```
d723f518
ARM: Make .unreq directives case-insensitive · 29db0eb8
Duncan P. N. Exon Smith authored Mar 07, 2014
```
Be case-insensitive when processing .unreq directives.

Patch by Lin Zuojian!

llvm-svn: 203251
```
29db0eb8

CodeGenPrep: sink extends of illegal types into use block. · ad3d81d3

Tim Northover authored Mar 07, 2014

This helps the instruction selector to lower an i64 * i64 -> i128
multiplication into a single instruction on targets which support it.

Patch by Manuel Jacob.

llvm-svn: 203230

ad3d81d3

InstCombine: form shuffles from wider range of insert/extractelements · fad2761c

Tim Northover authored Mar 07, 2014

Sequences of insertelement/extractelement instructions are sometimes used to
build vectors; this code tries to put them back together into shuffles, but
could only produce completely uniform shuffle types (<N x T> from two
<N x T> sources).

This should allow shuffles with different numbers of elements on the
input and output sides as well.

llvm-svn: 203229

fad2761c

Replace PROLOG_LABEL with a new CFI_INSTRUCTION. · b1f25f1b

Rafael Espindola authored Mar 07, 2014

The old system was fairly convoluted:
* A temporary label was created.
* A single PROLOG_LABEL was created with it.
* A few MCCFIInstructions were created with the same label.

The semantics were that the cfi instructions were mapped to the PROLOG_LABEL
via the temporary label. The output position was that of the PROLOG_LABEL.
The temporary label itself was used only for doing the mapping.

The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to
one by holding an index into the CFI instructions of this function.

I did consider removing MMI.getFrameInstructions completely and having
CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non-trivial
constructors and destructors and are somewhat big, so this setup
is probably better.

The net result is that we don't create temporary labels that are never used.
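
The index-based scheme can be pictured with a small Python sketch. It is only a toy model of the data layout described above: the class `FunctionFrameInfo`, the function `emit`, and the string payloads are hypothetical stand-ins, not LLVM types.

```python
class FunctionFrameInfo:
    """Per-function side table that owns the CFI records."""

    def __init__(self):
        self.frame_instructions = []

    def add_frame_inst(self, cfi):
        # Store the record once and hand back its index.
        self.frame_instructions.append(cfi)
        return len(self.frame_instructions) - 1


def emit(instructions, frame_info):
    """Walk the instruction stream, expanding each CFI_INSTRUCTION in place."""
    out = []
    for opcode, operand in instructions:
        if opcode == "CFI_INSTRUCTION":
            # The operand is an index into the side table: a 1:1 mapping.
            out.append(frame_info.frame_instructions[operand])
        else:
            out.append(opcode)
    return out


info = FunctionFrameInfo()
idx = info.add_frame_inst(".cfi_def_cfa_offset 16")
stream = [("push", None), ("CFI_INSTRUCTION", idx), ("mov", None)]
print(emit(stream, info))
```

Because the output position comes from where the `CFI_INSTRUCTION` entry sits in the stream, no temporary label is needed to tie a record to a location in this model.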

llvm-svn: 203204

b1f25f1b

Allow constant folding of round function whenever feasible · b67688a8
Karthik Bhat authored Mar 07, 2014
```
llvm-svn: 203198
```
b67688a8
DebugInfo: Limit r203187 to non-darwin as lldb can't handle this yet · 479323a6
David Blaikie authored Mar 07, 2014
```
llvm-svn: 203192
```
479323a6
DebugInfo: Emit DW_TAG_subprogram's DW_AT_high_pc as an offset from the low_pc · 48b1bdcf
David Blaikie authored Mar 07, 2014
```
This removes a relocation from each subprogram, reducing link times,
etc.

llvm-svn: 203187
```
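
A rough Python sketch of the encoding choice follows. It combines this change with the later restriction to DWARF 4 (r203256) and the use of DW_FORM_data4 mentioned in the follow-up commits; the function names are hypothetical and the forms are modelled as plain strings rather than real DWARF emission.

```python
def encode_high_pc(low_pc, high_pc, dwarf_version):
    """Pick a form for DW_AT_high_pc given the target DWARF version."""
    if dwarf_version >= 4:
        offset = high_pc - low_pc
        if not 0 <= offset < 2**32:
            raise ValueError("offset does not fit in data4")
        # A constant offset needs no relocation against the code address.
        return ("DW_FORM_data4", offset)
    # Older DWARF versions only allow an address here.
    return ("DW_FORM_addr", high_pc)


def decode_high_pc(low_pc, form, value):
    """Recover the absolute high_pc from either encoding."""
    return low_pc + value if form == "DW_FORM_data4" else value


form, value = encode_high_pc(0x1000, 0x1040, dwarf_version=4)
print(form, hex(value), hex(decode_high_pc(0x1000, form, value)))
```

A consumer has to check the form before interpreting the value, which is why a debugger that does not yet understand the offset form needs the address form instead.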
48b1bdcf
DebugInfo: Refactor test to not rely on fixed DIE offsets · f5040a64
David Blaikie authored Mar 07, 2014
```
llvm-svn: 203186
```
f5040a64
DebugInfo: Improve test to not depend on the specific naming of temporary symbols · b9a0265c
David Blaikie authored Mar 07, 2014
```
llvm-svn: 203184
```
b9a0265c

Mar 06, 2014

Remove shouldEmitUsedDirectiveFor. · 3b30cb41
Rafael Espindola authored Mar 06, 2014
```
Clang now uses llvm.compiler.used for these cases.

llvm-svn: 203174
```
3b30cb41
Convert test to FileCheck. · 123256a4
Rafael Espindola authored Mar 06, 2014
```
llvm-svn: 203173
```
123256a4

[X86] Teach the DAGCombiner how to fold an OR of two shufflevector nodes. · 6292a140

Andrea Di Biagio authored Mar 06, 2014

This patch teaches the DAGCombiner how to fold a binary OR between two
shufflevector nodes into a single shufflevector node when possible.

The rules are:
  1. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask1)
  2. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf B, A, Mask2)

The DAGCombiner can take advantage of the fact that OR is commutative and
compute two possible shuffle masks (Mask1 and Mask2) for the resulting
shuffle node.

Before folding a dag according to either rule 1 or 2, DAGCombiner verifies
that the resulting shuffle mask is legal for the target.
DAGCombiner first tries to fold according to rule 1; if that is not possible,
it then tries to fold according to rule 2.
If both Mask1 and Mask2 are illegal then we conservatively don't fold
the OR instruction.
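
The two rules can be illustrated with a Python sketch over plain lists. This is a simplified model, not the DAGCombiner code: it assumes a mask index below `n` selects from the first operand and an index of `n` or more selects from the second, it ignores undef lanes, and it only folds when exactly one side of each lane reads the zero vector. The names `shuffle`, `fold_or_of_shuffles`, and the `is_legal` callback are hypothetical.

```python
def shuffle(a, b, mask):
    """Model of shufflevector: indices address the concatenation of a and b."""
    both = list(a) + list(b)
    return [both[m] for m in mask]


def fold_or_of_shuffles(ma, mb, n, is_legal=lambda mask: True):
    """Try to fold (or (shuf A, 0, ma), (shuf B, 0, mb)) into one shuffle.

    Returns ("AB", mask1) for rule 1, ("BA", mask2) for rule 2, or None.
    """
    mask1, mask2 = [], []
    for m0, m1 in zip(ma, mb):
        # Exactly one side of each lane must come from the zero vector.
        if (m0 < n) == (m1 < n):
            return None
        mask1.append(m0 if m0 < n else m1 + n)  # operands ordered (A, B)
        mask2.append(m1 if m1 < n else m0 + n)  # operands ordered (B, A)
    if is_legal(mask1):
        return ("AB", mask1)
    if is_legal(mask2):
        return ("BA", mask2)
    return None  # neither mask is legal: keep the OR


A, B, Z = [1, 2, 3, 4], [5, 6, 7, 8], [0, 0, 0, 0]
ma, mb = [0, 4, 2, 4], [4, 1, 4, 3]
ored = [x | y for x, y in zip(shuffle(A, Z, ma), shuffle(B, Z, mb))]
order, mask = fold_or_of_shuffles(ma, mb, 4)
assert shuffle(A, B, mask) == ored
print(order, mask)  # prints AB [0, 5, 2, 7]
```

Passing an `is_legal` callback that rejects the first mask makes the sketch fall back to the second rule, mirroring the order described above.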

llvm-svn: 203156

6292a140

Fix the printing of n_type. · 1194e69f

Rafael Espindola authored Mar 06, 2014

Despite the name, n_type contains not only the type of the symbol but also
whether it is extern or private extern.
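
For reference, the fields packed into n_type can be decoded as in the Python sketch below. The mask values are the ones defined in the Mach-O `nlist.h` header; the function `decode_n_type` and its dictionary layout are hypothetical and are not the printing code touched by this commit.

```python
# Bit masks for the Mach-O n_type byte.
N_STAB = 0xE0  # any bit set: the entry is a debugging (stab) symbol
N_PEXT = 0x10  # private extern
N_TYPE = 0x0E  # the actual type bits
N_EXT = 0x01   # extern

TYPE_NAMES = {0x0: "N_UNDF", 0x2: "N_ABS", 0xA: "N_INDR",
              0xC: "N_PBUD", 0xE: "N_SECT"}


def decode_n_type(n_type):
    """Split n_type into its type field and its extern flags."""
    type_bits = n_type & N_TYPE
    return {
        "stab": bool(n_type & N_STAB),
        "private_extern": bool(n_type & N_PEXT),
        "type": TYPE_NAMES.get(type_bits, hex(type_bits)),
        "extern": bool(n_type & N_EXT),
    }


print(decode_n_type(0x0F))  # an extern symbol defined in a section
```

When any of the N_STAB bits are set, the whole byte is interpreted as a stab type, so the other fields of this decoding are not meaningful for such entries.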

llvm-svn: 203154

1194e69f

R600: Fix extloads from i8 / i16 to i64. · f9a995d6

Matt Arsenault authored Mar 06, 2014

This appears to only be working for global loads. Private
and local break for other reasons.

llvm-svn: 203135

f9a995d6