Commits · 0d28f80bd1f0ec511fe608bfdc58c927d1a6ef9c · Lorenzo Albano / LLVM bpEVL

Aug 05, 2015

Rename all references to old mailing lists to new lists.llvm.org address. · 0d28f80b

Tanya Lattner authored Aug 05, 2015

llvm-svn: 243999

0d28f80b

Aug 04, 2015

wrap OptSize and MinSize attributes for easier and consistent access (NFCI) · 924879ad

Sanjay Patel authored Aug 04, 2015

Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994

924879ad
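
The wrapper idea in the commit above can be sketched with a stand-in type. This is a minimal illustration, not LLVM's actual `Function` class: the struct name `FunctionSketch`, its two boolean fields, and the method names `optForMinSize` and `optForSize` are all assumptions made for the example.

```cpp
// Stand-in for a function carrying size attributes (hypothetical, not LLVM's class).
struct FunctionSketch {
  bool HasOptimizeForSize; // models the OptimizeForSize attribute (-Os)
  bool HasMinSize;         // models the MinSize attribute (-Oz)

  // Hypothetical wrapper: true only when minimum size (-Oz) is requested.
  constexpr bool optForMinSize() const { return HasMinSize; }

  // Hypothetical wrapper: true for -Os or -Oz, so callers do not depend on a
  // front-end setting both attributes together.
  constexpr bool optForSize() const {
    return HasOptimizeForSize || optForMinSize();
  }
};
```

With the "or" hidden inside `optForSize()`, a caller that only wants to know whether any size optimization is requested no longer needs to query both attributes itself.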

ARM: support windows division routines · 0a2672bb

Saleem Abdulrasool authored Aug 04, 2015

This adds the software division routines for the Windows RTABI.  These are not
expected to be used often, though, as most modern Windows ARM-capable targets
support hardware division.  If the target CPU doesn't support
hardware division, these routines will be the fallback.

llvm-svn: 243952

0a2672bb

ARM: make Darwin libcall registration table driven (NFC) · 67697a7e

Saleem Abdulrasool authored Aug 04, 2015

Make the libcall updating table driven, similar to the approach that the Linux
and Windows codepaths use below.  NFC.

llvm-svn: 243951

67697a7e

ARM: remove horrible printf left over from debugging · 9c340ec6

Tim Northover authored Aug 03, 2015

llvm-svn: 243907

9c340ec6

Aug 03, 2015

ARM: prefer allocating VFP regs at stride 4 on Darwin. · 910dde7a

Tim Northover authored Aug 03, 2015

This is necessary for WatchOS support, where the compact unwind format assumes
this kind of layout. For now we only want this on Swift-like CPUs though, where
it's been the Xcode behaviour for ages. Also, since it can expand the prologue
we don't want it at -Oz.

llvm-svn: 243884

910dde7a

[ARM] Make GlobalMerge merge extern globals by default · f3324cf1

John Brawn authored Aug 03, 2015

Enabling merging of extern globals appears to be generally either beneficial or
harmless. On some benchmark suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
it gives improvements in the 1-5% range, but in the rest the overall effect is
zero.

Differential Revision: http://reviews.llvm.org/D10966

llvm-svn: 243874

f3324cf1

Be less conservative about forming IT blocks. · 6967e5e4

James Molloy authored Aug 03, 2015

In http://reviews.llvm.org/rL215382, IT forming was made more conservative under
the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M.

But actually, ARMv6M doesn't even support IT blocks so that's impossible. The ARMARM for
v7M, v7AR and v8AR states that the semantics of such an instruction change inside an
IT block: it doesn't set the flags. So it is fine to use one inside an IT block
as long as the flags register is dead afterwards.

This gives significant performance improvements in a variety of MPEG based workloads.

Differential revision: http://reviews.llvm.org/D11680

llvm-svn: 243869

6967e5e4

Aug 02, 2015

De-constify pointers to Type since they can't be modified. NFC · e3dcce97

Craig Topper authored Aug 01, 2015

This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842

e3dcce97

Aug 01, 2015

-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 · a5fd382e

David Blaikie authored Aug 01, 2015

Various targets use std::swap on specific MCAsmOperands (ARM and
possibly Hexagon as well). It might be helpful to mark those subclasses
as final, to ensure that the availability of move/copy operations can't
lead to slicing. (same sort of requirements as the non-virtual dtor -
protected or a final class)

llvm-svn: 243820

a5fd382e

Jul 31, 2015

[ARM] Lower modulo operation to generate __aeabi_divmod on Android · 532a1369

Sumanth Gundapaneni authored Jul 31, 2015

    
For a modulo (remainder) operation,
clang -target armv7-none-linux-gnueabi generates "__modsi3"
clang -target armv7-none-eabi generates "__aeabi_idivmod"
clang -target armv7-linux-androideabi generates "__modsi3"
Android bionic libc doesn't provide __modsi3; instead it provides
"__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate
the correct call whenever there is a modulo operation.

Differential Revision: http://reviews.llvm.org/D11661

llvm-svn: 243717

532a1369
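
A minimal sketch of the libcall choice described in the commit above, assuming the two non-Android environments keep the behaviour the message lists. The enum names and both functions are hypothetical and only model the decision; they are not the code in ARMISelLowering.

```cpp
// Hypothetical tags for the three target environments listed in the commit.
enum class Env { GnuEABI, BareEABI, AndroidEABI };

// Hypothetical tags for the two runtime helpers named in the commit.
enum class RemLibcall { ModSI3, AeabiIDivMod };

// Selection after the patch (sketch): only the GNU environment keeps __modsi3,
// because Android's bionic provides __aeabi_idivmod instead.
constexpr RemLibcall signedRemLibcall(Env E) {
  return E == Env::GnuEABI ? RemLibcall::ModSI3 : RemLibcall::AeabiIDivMod;
}

// Symbol name a lowering would emit a call to for the chosen helper.
constexpr const char *libcallName(RemLibcall C) {
  return C == RemLibcall::ModSI3 ? "__modsi3" : "__aeabi_idivmod";
}
```

For example, `libcallName(signedRemLibcall(Env::AndroidEABI))` yields the AEABI helper name in this sketch, where the pre-patch behaviour listed above would have produced `__modsi3`.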

Jul 30, 2015

fix memcpy/memset/memmove lowering when optimizing for size · 1166f2ff

Sanjay Patel authored Jul 30, 2015

Fixing MinSize attribute handling was discussed in D11363. 
This is a prerequisite patch to doing that.

The handling of OptSize when lowering mem* functions was broken
on Darwin because it wants to ignore -Os for these cases, but the
existing logic also made it ignore -Oz (MinSize).

The Linux change demonstrates a widespread problem. The backend
doesn't usually recognize the MinSize attribute by itself; it
assumes that if the MinSize attribute exists, then the OptSize 
attribute must also exist. 

Fixing this more generally will be a follow-on patch or two.

Differential Revision: http://reviews.llvm.org/D11568

llvm-svn: 243693

1166f2ff
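
The Darwin versus Linux behaviour described in the commit above can be sketched as a single predicate. This is an illustration under stated assumptions: the function name and its three boolean parameters are hypothetical stand-ins for the target and attribute queries, and the non-Darwin branch reflects the message's point that MinSize should count on its own.

```cpp
// Sketch of the size decision when lowering memcpy/memset/memmove.
// All three parameters are hypothetical stand-ins for target/attribute queries.
constexpr bool lowerMemFuncForSize(bool IsDarwin, bool HasOptSize,
                                   bool HasMinSize) {
  // Darwin wants to ignore -Os for these calls, but -Oz (MinSize) must still
  // be honored. Elsewhere either attribute requests the size-oriented lowering.
  return IsDarwin ? HasMinSize : (HasOptSize || HasMinSize);
}
```

The key case is a function with only MinSize set: the predicate is true on both kinds of target, which is the behaviour the commit says the earlier logic missed.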

Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the... · c3890d29

Nick Lewycky authored Jul 29, 2015

Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.)

Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h.

llvm-svn: 243585

c3890d29

Jul 29, 2015

[ARM] Define subtarget feature strict-align. · 2670f4a5

Akira Hatanaka authored Jul 28, 2015

This commit defines subtarget feature strict-align and uses it instead of
cl::opt -arm-strict-align to decide whether strict alignment should be
forced. Also, remove the logic that was checking the OS and architecture
as clang is now responsible for setting strict-align based on the command
line options specified and the target architecture and OS.

rdar://problem/21529937

http://reviews.llvm.org/D11470

llvm-svn: 243493

2670f4a5

Jul 28, 2015

Implement target independent TLS compatible with glibc's emutls.c. · 1e859582

Chih-Hung Hsieh authored Jul 28, 2015

The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.

clang and driver changes in http://reviews.llvm.org/D10524

  Added -femulated-tls flag to select the emulated TLS model,
  which will be used for old targets like Android that do not
  support ELF TLS models.

Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.

Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variables.

TODO: Add proper DIE for emulated TLS variables.
      Added new unit tests with emulated TLS.

Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243438

1e859582

- Added support for parsing HWDiv features using Target Parser. · 4ea70755

Alexandros Lamprineas authored Jul 27, 2015

- Architecture extensions are represented as a bitmap.

Phabricator: http://reviews.llvm.org/D11457

llvm-svn: 243335

4ea70755

Jul 27, 2015

[llvm-mc] Pushing plumbing through for --fatal-warnings flag. · fe2c8b80

Colin LeMahieu authored Jul 27, 2015

llvm-svn: 243334

fe2c8b80

[ARM/AArch64] Fix cost model for interleaved accesses · 7581d225

Silviu Baranga authored Jul 27, 2015

Summary:
Fix the cost of interleaved accesses for ARM/AArch64.
We were calling getTypeAllocSize and using it to check
the number of bits, when we should have called
getTypeAllocSizeInBits instead.

This would potentially cause the vectorizer to
generate loads/stores and shuffles which cannot
be matched with an interleaved access instruction.

No performance changes are expected for now since
matching/generating interleaved accesses is still
disabled by default.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11524

llvm-svn: 243270

7581d225
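
The bytes-versus-bits mix-up in the commit above can be shown with a small sketch. Everything here is hypothetical: a type is reduced to its allocation size in bytes, the two helpers only mimic the roles of the size queries named in the message, and the 64-bit or 128-bit legality rule is an assumption chosen for illustration.

```cpp
#include <cstdint>

// Stand-ins for the two size queries; a "type" is just its size in bytes here.
constexpr std::uint64_t allocSizeInBytes(std::uint64_t Bytes) { return Bytes; }
constexpr std::uint64_t allocSizeInBits(std::uint64_t Bytes) { return Bytes * 8; }

// Hypothetical legality rule: an interleaved access is matched only for
// vectors that are 64 or 128 bits wide.
constexpr bool isLegalInterleavedWidth(std::uint64_t SizeInBits) {
  return SizeInBits == 64 || SizeInBits == 128;
}
```

Feeding the byte count into the bit-width check rejects a genuine 16-byte (128-bit) vector and accepts a 64-byte (512-bit) one, which is the kind of mismatch that could make the cost model disagree with what instruction matching can actually handle.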

Jul 24, 2015

[ARM] - Fix lowering of shufflevectors in AArch32 · 4d45ff2b

Luke Cheeseman authored Jul 24, 2015

Some shufflevectors are currently being incorrectly lowered in the AArch32
backend as the existing checks for detecting the NEON operations from the
shufflevector instruction expects the shuffle mask and the vector operands to be
of the same length.

This is not always the case as the mask may be twice as long as the operand;
here only the lower half of the shufflemask gets checked, so provided the lower
half of the shufflemask looks like a vector transpose (or even is just all -1
for undef) then the intrinsics may get incorrectly lowered into a vector
transpose (VTRN) instruction.

This patch fixes this by accommodating for both cases and adds regression tests.

Differential Revision: http://reviews.llvm.org/D11407

llvm-svn: 243103

4d45ff2b

When lowering vector shifts a check is performed to see if the value to shift by · b5c627ab

Luke Cheeseman authored Jul 24, 2015

is an immediate. In this check the value is negated and stored in an int64_t.
The value can be -2^63, yet its negation cannot be stored in an int64_t, which
is undefined behaviour and causes failures. The negation is only necessary
when the value is within a certain range, so -2^63 never needs to be
negated; this patch avoids that negation and also adds a regression test.

Differential Revision: http://reviews.llvm.org/D11408

llvm-svn: 243100

b5c627ab
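
A minimal sketch of the "check the range before negating" idea from the commit above. The function names, the `ElementBits` parameter, and the convention of returning 0 when no immediate form applies are assumptions for illustration; `ElementBits` is assumed to be a small positive element width.

```cpp
#include <cstdint>

// Hypothetical check: is Cnt a negated shift amount in [-ElementBits, -1]?
// The test is done on Cnt itself, so -Cnt is never computed for INT64_MIN.
constexpr bool isNegatedShiftImm(std::int64_t Cnt, std::int64_t ElementBits) {
  return Cnt >= -ElementBits && Cnt <= -1;
}

// Negate only after the range check has passed. Returning 0 means
// "no immediate form" in this sketch.
constexpr std::int64_t shiftAmount(std::int64_t Cnt, std::int64_t ElementBits) {
  return isNegatedShiftImm(Cnt, ElementBits) ? -Cnt : 0;
}
```

Because the negation sits behind the range test, a count of -2^63 is rejected without ever evaluating `-Cnt`, so the signed overflow the commit describes cannot occur.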

[ARM] Register (existing) ARMLoadStoreOpt pass with LLVM pass manager. · d9c1bc99

David Gross authored Jul 23, 2015

Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass.

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11373

llvm-svn: 243052

d9c1bc99

Jul 23, 2015

Test commit. · 2ad5d173

David Gross authored Jul 23, 2015

llvm-svn: 243046

2ad5d173

Jul 22, 2015

[ARM] Make the frame lowering code ready for shrink-wrapping. · 48b77200

Quentin Colombet authored Jul 22, 2015

Shrink-wrapping can now be tested on ARM with -enable-shrink-wrap.

Related to <rdar://problem/20821730>

llvm-svn: 242908

48b77200

Jul 21, 2015

[ARM] Define subtarget feature "reserve-r9", which is used to decide · 28581525

Akira Hatanaka authored Jul 21, 2015

whether register r9 should be reserved.

This recommits r242737, which broke bots because the number of subtarget
features went over the limit of 64.

This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.

Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11320

llvm-svn: 242756

28581525

ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code · a50d2203

Matthias Braun authored Jul 21, 2015

Re-apply of r241928 which had to be reverted because of the r241926
revert.

This commit factors out common code from MergeBaseUpdateLoadStore() and
MergeBaseUpdateLSMultiple() and introduces a new function
MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a
strd/ldrd instruction into an strd/ldrd instruction with writeback where
possible.

Differential Revision: http://reviews.llvm.org/D10676

llvm-svn: 242743

a50d2203

ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 · e40d89ef

Matthias Braun authored Jul 21, 2015

Re-apply r241926 with an additional check that r13 and r15 are not used
for LDRD/STRD. See http://llvm.org/PR24190. This also already includes
the fix from r241951.

Differential Revision: http://reviews.llvm.org/D10623

llvm-svn: 242742

e40d89ef

Revert r242737. · 42427d2c

Akira Hatanaka authored Jul 20, 2015

This caused builds to fail with the following error message:

error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES.

llvm-svn: 242740

42427d2c

[ARM] Define subtarget feature "reserve-r9", which is used to decide · 7482d40c

Akira Hatanaka authored Jul 20, 2015

whether register r9 should be reserved.

This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.

Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11320

llvm-svn: 242737

7482d40c

Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2" · 731e359e

Matthias Braun authored Jul 20, 2015

This reverts commit r241926. This caused http://llvm.org/PR24190

llvm-svn: 242735

731e359e

Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code" · 84e28970

Matthias Braun authored Jul 20, 2015

This reverts commit r241928. This caused http://llvm.org/PR24190

llvm-svn: 242734

84e28970

Revert "ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920" · 22f39607

Matthias Braun authored Jul 20, 2015

This reverts commit r241951. It caused http://llvm.org/PR24190

llvm-svn: 242733

22f39607

Targets: commonize some stack realignment code · e4d22d59

JF Bastien authored Jul 20, 2015

This patch does the following:
* Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`.
* Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute.

Multiple targets duplicated the same `needsStackRealignment` code:
 - AArch64.
 - ARM.
 - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has.
 - PowerPC.
 - WebAssembly.
 - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has.

The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects:
 - AMDGPU
 - BPF
 - CppBackend
 - MSP430
 - NVPTX
 - Sparc
 - SystemZ
 - XCore
 - Out-of-tree targets
This is a breaking change! `make check` passes.

The only implementation of the `virtual` function (besides the slight difference in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation.

`needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone.

Reviewers: sunfish

Subscribers: aemerson, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11160

llvm-svn: 242727

e4d22d59
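
The split between shared logic and an overridable hook in the commit above can be sketched as follows. This is a simplified model, not the code in `TargetRegisterInfo`: it uses plain functions rather than the `virtual` member the commit describes, the `FrameFacts` struct and its fields are hypothetical, and the "maximum alignment exceeds the stack alignment" condition is an assumed shape for the shared check.

```cpp
// Hypothetical inputs standing in for MachineFunction and attribute queries.
struct FrameFacts {
  unsigned MaxAlignment;      // largest alignment any stack object needs
  unsigned StackAlignment;    // alignment the target already guarantees
  bool ForceAlignStack;       // models the -force-align-stack option
  bool HasNoRealignStackAttr; // models the no-realign-stack function attribute
};

// Default hook: only the no-realign-stack attribute can veto realignment.
constexpr bool canRealignStack(const FrameFacts &F) {
  return !F.HasNoRealignStackAttr;
}

// Shared logic (assumed shape): realign when forced or when some object is
// over-aligned, provided the hook allows it.
constexpr bool needsStackRealignment(const FrameFacts &F) {
  return (F.ForceAlignStack || F.MaxAlignment > F.StackAlignment) &&
         canRealignStack(F);
}
```

In this model a target customizes behaviour only through the hook, which matches the commit's point that the shared `needsStackRealignment` is no longer meant to be overridden.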

Jul 20, 2015

[ARM] Refactor the prologue/epilogue emission to be more robust. · 71a71485

Quentin Colombet authored Jul 20, 2015

This is the first step toward supporting shrink-wrapping for this target.

The changes could be summarized by these items:
- Expand the tail-call return as part of the expand pseudo pass.
- Get rid of the assumptions that the epilogue is the exit block:
  * Do not assume which registers are free in the epilogue. (This indirectly
    improves the lowering of the code for the segmented stacks, see the test
    cases.)
  * Take into account that the basic block can be empty.

Related to <rdar://problem/20821730>

llvm-svn: 242714

71a71485

Jul 18, 2015

ARM: Enable MachineScheduler and disable PostRAScheduler for swift. · 9e859806

Matthias Braun authored Jul 17, 2015

Reapply r242500 now that the swift schedmodel includes LDRLIT.

This is mostly done to disable the PostRAScheduler, which optimizes for
instruction latencies and so isn't a good fit for out-of-order
architectures. This also allows leaving out the itinerary table in
swift in favor of the SchedModel ones.

This change leads to performance improvements/regressions by as much as
10% in some benchmarks; in fact we lose 0.4% performance over the
llvm-testsuite for reasons that appear to be unknown or out of the
compiler's control. rdar://20803802 documents the investigation of
these effects.

While it is probably a good idea to perform the same switch for the
other ARM out-of-order CPUs, I limited this change to swift as I cannot
perform the benchmark verification on the other CPUs.

Differential Revision: http://reviews.llvm.org/D10513

llvm-svn: 242588

9e859806

ARM: Add scheduling information for LDRLIT instructions to swift scheduling model · 141d1c9d

Matthias Braun authored Jul 17, 2015

These pseudo instructions are only lowered after register allocation and
are therefore still present when the machine scheduler runs.
Add a run: line to a testcase that uses the uncommon flags necessary to
actually produce a LDRLIT instruction on swift.

llvm-svn: 242587

141d1c9d

Jul 17, 2015

Revert "ARM: Enable MachineScheduler and disable PostRAScheduler for swift." · 5a6d5bc1

Adam Nemet authored Jul 17, 2015

This reverts commit r242500.

It broke some internal tests and Matthias asked me to revert it while he
is investigating.

llvm-svn: 242553

5a6d5bc1

[ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA · a6702e2f

James Molloy authored Jul 17, 2015

No functional change, but it preps codegen for the future when SABSDIFF
will start getting generated in anger.

llvm-svn: 242546

a6702e2f

ARM: Enable MachineScheduler and disable PostRAScheduler for swift. · 2d8315f8

Matthias Braun authored Jul 17, 2015

This is mostly done to disable the PostRAScheduler, which optimizes for
instruction latencies and so isn't a good fit for out-of-order
architectures. This also allows leaving out the itinerary table in
swift in favor of the SchedModel ones.

This change leads to performance improvements/regressions by as much as
10% in some benchmarks; in fact we lose 0.4% performance over the
llvm-testsuite for reasons that appear to be unknown or out of the
compiler's control. rdar://20803802 documents the investigation of
these effects.

While it is probably a good idea to perform the same switch for the
other ARM out-of-order CPUs, I limited this change to swift as I cannot
perform the benchmark verification on the other CPUs.

Differential Revision: http://reviews.llvm.org/D10513

llvm-svn: 242500

2d8315f8

Arm: Don't define a label twice with two setjmps in a function. · da3d0d73

Matthias Braun authored Jul 16, 2015

Constructing a name based on the function name didn't give us a unique
symbol if we had more than one setjmp in a function. Using
MCContext::createTempSymbol() always gives us a unique name.

Differential Revision: http://reviews.llvm.org/D9314

llvm-svn: 242482

da3d0d73

Fix __builtin_setjmp in combination with sjlj exception handling. · 3cd00c17

Matthias Braun authored Jul 16, 2015

llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling
style but is also used in clang to implement __builtin_setjmp.  The ARM
backend needs to output additional dispatch tables for the SjLj
exception handling style, these tables however can't be emitted if
llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual
landing pad blocks exist.

To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is
introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj
exception handling lowering, so we can differentiate between the case
where we actually need to setup a dispatch table and the case where we
just need the __builtin_setjmp semantic.

Differential Revision: http://reviews.llvm.org/D9313

llvm-svn: 242481

3cd00c17