Commits · 98aa7fab7edf836434c77d79d048fba66f5748b0 · Roger Ferrer / llvm-epi-0.8

Jan 24, 2014
- [Sparc] Correct quad register list in the asm parser. · 98aa7fab
  Venkatraman Govindaraju authored Jan 24, 2014
```
Add test cases to check parsing of v9 double registers and their aliased quad registers.

llvm-svn: 199974
```
  98aa7fab
- InitToTextSection is redundant with InitSections. Remove it. · f144034c
  Rafael Espindola authored Jan 23, 2014
```
llvm-svn: 199955
```
  f144034c
Jan 23, 2014

Update the X86 assembler for .intel_syntax to produce an error for invalid base · bc570f28

Kevin Enderby authored Jan 23, 2014

registers in memory addresses that do not match the index register. As it does
for .att_syntax.

rdar://15887380

llvm-svn: 199948

bc570f28

Update the X86 assembler for .intel_syntax to produce an error for invalid · 9d11702f

Kevin Enderby authored Jan 23, 2014

scale factors in memory addresses. As it does for .att_syntax.

It was producing:
Assertion failed: (((Scale == 1 || Scale == 2 || Scale == 4 || Scale == 8)) && "Invalid scale!"), function CreateMem, file /Volumes/SandBox/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp, line 1133.

rdar://14967214

llvm-svn: 199942

9d11702f
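
The change above amounts to checking the scale up front and reporting an error instead of hitting the assertion. A minimal sketch of that check, assuming a hypothetical helper `checkScale` and an invented error string (neither is taken from X86AsmParser.cpp):

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper, not the real parser code: returns an error message
// for an invalid scale factor, or std::nullopt when the scale is valid.
std::optional<std::string> checkScale(long long scale) {
  // These are the only values the quoted assertion accepts.
  if (scale == 1 || scale == 2 || scale == 4 || scale == 8)
    return std::nullopt;
  // Illustrative wording only; the real diagnostic text may differ.
  return std::string("scale factor in address must be 1, 2, 4 or 8");
}
```

A caller would surface the returned string as a parse error rather than constructing the memory operand.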

Fix out of bounds access to the double regs array. Given the · 7383d4a9
Eric Christopher authored Jan 23, 2014
```
code this looks correct, but could use review. The previous
was definitely not correct.

llvm-svn: 199940
```
7383d4a9
Add a few missing cases from r199933. Testcase coming shortly. · b1ce3337
Lang Hames authored Jan 23, 2014
```
llvm-svn: 199938
```
b1ce3337
Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator · 23de211c
Lang Hames authored Jan 23, 2014
```
loops. Writing back to the accumulator (231-type) allows the coalescer to
eliminate an extra copy.

llvm-svn: 199933
```
23de211c
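
To illustrate why the 231 form suits accumulator loops, here is a scalar model of the two operand orderings, following the Intel operand numbering. The function names are hypothetical, and the arithmetic is a plain multiply and add rather than a fused operation:

```cpp
#include <cassert>

// 213 form: op1 = op2 * op1 + op3 (the destination is a multiplicand).
void fma213(double &op1, double op2, double op3) { op1 = op2 * op1 + op3; }

// 231 form: op1 = op2 * op3 + op1 (the destination is the addend).
void fma231(double &op1, double op2, double op3) { op1 = op2 * op3 + op1; }

// Accumulator loop using the 231 form: acc is updated in place.
double dot231(const double *a, const double *b, int n) {
  double acc = 0.0;
  for (int i = 0; i < n; ++i)
    fma231(acc, a[i], b[i]);
  return acc;
}

// The same loop using the 213 form: the result lands in a temporary,
// so it has to be copied back into the accumulator on every iteration.
double dot213(const double *a, const double *b, int n) {
  double acc = 0.0;
  for (int i = 0; i < n; ++i) {
    double tmp = a[i];
    fma213(tmp, b[i], acc); // tmp = b[i] * tmp + acc
    acc = tmp;              // the extra copy the commit message refers to
  }
  return acc;
}
```

Both loops compute the same value; the difference is only in which operand is overwritten, which is what lets the coalescer drop the copy in the 231 case.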

[Thumbv8] Fix the value of BLXOperandIndex of isV8EligibleForIT · 5930ae6c

Weiming Zhao authored Jan 23, 2014

Originally, BLX was passed as operand #0 in MachineInstr and as operand
#2 in MCInst. But now, it's operand #2 in both cases.

This patch also removes unnecessary FileCheck in the test case added by r199127.

llvm-svn: 199928

5930ae6c

Add target analysis passes to the codegen pipeline for MCJIT. · 5fe955cb

Juergen Ributzka authored Jan 23, 2014

This patch adds the target analysis passes (usually TargetTransformInfo) to the
codegen pipeline. We now also expose the AddAnalysisPasses method through the C
API, because the optimizer passes would also benefit from better target-specific
cost models.

Reviewed by Andrew Kaylor

llvm-svn: 199926

5fe955cb

[AArch64] Added vselect patterns with float and double types · 5d31f694
Ana Pazos authored Jan 23, 2014
```
llvm-svn: 199925
```
5d31f694

R600: Remove successive JUMP in AnalyzeBranch when AllowModify is true · a64353e5

Tom Stellard authored Jan 23, 2014

This fixes a crash in the OpenCV OpenCL test suite.

There is no lit test for this, because the test would be very large
and could easily be invalidated by changes to the scheduler
or other parts of the compiler.

Patch by:  Vincent Lejeune

llvm-svn: 199919

a64353e5

R600: Disable the BFE pattern · a2a4b8ee

Tom Stellard authored Jan 23, 2014

This pattern uses an SDNodeXForm, which isn't being emitted for some
reason.  I can get it to work by attaching the PatLeaf that has the
XForm to the argument in the output pattern, but this results in an
immediate being used in a register operand, which the backend can't
handle yet.

llvm-svn: 199918

a2a4b8ee

R600: Correctly handle vertex fetch clauses that precede ENDIFs · 805890b2

Tom Stellard authored Jan 23, 2014

The control flow finalizer would sometimes use an ALU_POP_AFTER
instruction before the vertex fetch clause instead of using a POP
instruction after it.

llvm-svn: 199917

805890b2

R600: Unconditionally unroll loops that contain GEPs with alloca pointers · 8cce9bdf

Tom Stellard authored Jan 23, 2014

Implement the getUnrollingPreferences() function for
AMDGPUTargetTransformInfo so that loops that do address calculations
on pointers derived from alloca are unconditionally unrolled.

Unrolling these loops makes it more likely that SROA will be able to
eliminate the allocas, which is a big win for R600 since memory
allocated by alloca (private memory) is really slow.

llvm-svn: 199916

8cce9bdf

R600: Recommit 199842: Add work-around for the CF stack entry HW bug · 348273df

Tom Stellard authored Jan 23, 2014

The unit test is now disabled on non-asserts builds.

The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when the number of sub-entries modulo 8 is either 7 or 0).

We choose to be conservative and always apply the work-around when the
number of sub-entries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.

reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199905

348273df
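
The two conditions described in this commit message can be written out side by side. This is a sketch only: the function names and the `isCedar` flag are hypothetical and do not come from the R600 backend sources.

```cpp
#include <cassert>

// Exact trigger condition as stated in the commit message.
// isCedar selects the modulo-8 rule that the message gives for cedar.
bool hitsCFStackBug(unsigned subEntries, unsigned stackEntrySize,
                    bool isCedar) {
  if (subEntries < stackEntrySize)
    return false;
  if (isCedar) {
    unsigned r = subEntries % 8;
    return r == 7 || r == 0;
  }
  unsigned r = subEntries % 4;
  return r == 0 || r == 3;
}

// Conservative rule the commit says it applies instead: ignore the
// modulo test and work around the bug whenever the threshold is reached.
bool needsWorkaround(unsigned subEntries, unsigned stackEntrySize) {
  return subEntries >= stackEntrySize;
}
```

Every input that satisfies `hitsCFStackBug` also satisfies `needsWorkaround`, so the conservative rule stays safe even if the stack is over-allocated.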

AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions, · a5d38a39
Elena Demikhovsky authored Jan 23, 2014
```
they give better sequences than VPERMI

llvm-svn: 199893
```
a5d38a39

ARM: use litpools for normal i32 imms when compiling minsize. · 55c625f2

Tim Northover authored Jan 23, 2014

With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but
movw/movt pairs consume 8*N. This means litpools are better than movw/movt even
with just one use. Other materialisation strategies can still be better though,
so the logic is a little odd.

llvm-svn: 199891

55c625f2
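
The size comparison in this message is simple arithmetic. A small sketch, taking N as the number of uses of the constant (function names are hypothetical):

```cpp
#include <cassert>

// Code bytes for N uses of one shared literal-pool constant: 4 + N*2,
// per the formula in the commit message.
constexpr unsigned litpoolBytes(unsigned uses) { return 4 + 2 * uses; }

// Code bytes for N movw/movt pairs: 8*N, per the commit message.
constexpr unsigned movwMovtBytes(unsigned uses) { return 8 * uses; }

// With a single use the litpool already wins: 6 bytes against 8.
static_assert(litpoolBytes(1) == 6, "4 + 1*2");
static_assert(movwMovtBytes(1) == 8, "8*1");
static_assert(litpoolBytes(1) < movwMovtBytes(1), "litpool is smaller");
```

The gap widens with each additional use, since the litpool cost grows by 2 bytes per use and the movw/movt cost by 8.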

[mips][sched] Split IIStore into II_S[BHWD], II_S[WD][LR], and II_SAVE · 37463f72
Daniel Sanders authored Jan 23, 2014
```
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199876
```
37463f72

Add a variable to track whether or not we've used a unique section, · 15abef6d

Eric Christopher authored Jan 23, 2014

e.g. linkonce, to TargetMachine and set it when we've done so
for ELF targets currently. This involved making TargetMachine
non-const in a TLOF use and propagating that change around - I'm
open to other ideas.

This will be used in a future commit to handle emitting debug
information with ranges.

llvm-svn: 199871

15abef6d

Fix some spelling mistakes around 'ConcatVector' and 'ShuffleVector' in AArch64 backend. · 50944eb6
Kevin Qin authored Jan 23, 2014
```
llvm-svn: 199858
```
50944eb6
X86Disassembler.cpp: Fix @param introduced in r199804. [-Wdocumentation] · 372f05d5
NAKAMURA Takumi authored Jan 23, 2014
```
llvm-svn: 199855
```
372f05d5
[Mips] formatting through clang-format · 3b2c96ee
Jack Carter authored Jan 22, 2014
```
llvm-svn: 199853
```
3b2c96ee

[Mips] TargetStreamer Support for .set mips16. · 39536724

Jack Carter authored Jan 22, 2014

This patch updates .set mips16 support which
affects the ELF ABI and its flags. In addition, the patch uses
a common interface for both the MipsTargetStreamer and
MipsObjectStreamer that the assembler uses for
both ELF and ASCII output for these directives.

llvm-svn: 199851

39536724

Jan 22, 2014

Revert "R600: Add work-around for the CF stack entry HW bug" · 31e16388

Tom Stellard authored Jan 22, 2014

This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba.

The -debug-only flag for llc doesn't appear to be available in
all build configurations.

llvm-svn: 199845

31e16388

R600: Add work-around for the CF stack entry HW bug · e89373e0

Tom Stellard authored Jan 22, 2014

The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when the number of sub-entries modulo 8 is either 7 or 0).

We choose to be conservative and always apply the work-around when the
number of sub-entries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.

reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199842

e89373e0

R600: Add some missing CF instruction definitions to the .td files. · 59ed4794
Tom Stellard authored Jan 22, 2014
```
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199841
```
59ed4794
R600: Refactor stack size calculation · a40f9715
Tom Stellard authored Jan 22, 2014
```
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199840
```
a40f9715
R600: CF_PUSH is the same on Evergreen and Cayman · afbb697e
Tom Stellard authored Jan 22, 2014
```
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199839
```
afbb697e
R600: Add wavefront size property to the subtargets v2 · 8c347b02
Tom Stellard authored Jan 22, 2014
```
v2:
  - Initialize wavefront size to 0

reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199838
```
8c347b02
R600: Add stack size to .AMDGPUcsdata section · 08b6af91
Tom Stellard authored Jan 22, 2014
```
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199837
```
08b6af91

Fix pr18515. · 28a85a84

Rafael Espindola authored Jan 22, 2014

My understanding (from reading just the llvm code) is that
* most ppc cpus have a "sync n" instruction and an msync alias that is "sync 0".
* "book e" cpus instead have an msync instruction and not the more
general "sync n"

This patch reflects that in the .td files, allowing a single codepath for
asm and obj streamer and incidentally fixes a crash when EmitRawText was
called on an obj streamer.

llvm-svn: 199832

28a85a84

R600: MOVA is vector only · 476437cb
Tom Stellard authored Jan 22, 2014
```
llvm-svn: 199827
```
476437cb
R600: Take alignment into account when calculating the stack offset · 598f3945
Tom Stellard authored Jan 22, 2014
```
llvm-svn: 199826
```
598f3945
R600: Add support for global addresses with constant initializers · 04c0e985
Tom Stellard authored Jan 22, 2014
```
llvm-svn: 199825
```
04c0e985

R600: Begin private memory at the second GPR. · 27982b1d

Tom Stellard authored Jan 22, 2014

This way private memory does not over-write work group information
stored in GPRs 0 and 1.

llvm-svn: 199824

27982b1d

R600/SI: Add support for i8 and i16 private loads/stores · e9373605
Tom Stellard authored Jan 22, 2014
```
llvm-svn: 199823
```
e9373605

Fix inline assembly that switches between ARM and Thumb modes · 1f6a6086

Greg Fitzgerald authored Jan 22, 2014

This patch restores the ARM mode if the user's inline assembly
does not.  In the object streamer, it ensures that instructions
following the inline assembly are encoded correctly and that
correct mapping symbols are emitted.  For the asm streamer, it
emits a .arm or .thumb directive.

This patch does not ensure that the inline assembly contains
the ADR instruction to switch modes at runtime.

The problem we need to solve is code like this:

  int foo(int a, int b) {
    int r = a + b;
    asm volatile(
        ".align 2     \n"
        ".arm         \n"
        "add r0,r0,r0 \n"
    : : "r"(r));
    return r+1;
  }

If we compile this function in thumb mode then the inline assembly
will switch to arm mode. We need to make sure that we switch back to
thumb mode after emitting the inline assembly or we will incorrectly
encode the instructions that follow (i.e. the assembly instructions
for return r+1).

Based on patch by David Peixotto

Change-Id: Ib57f6d2d78a22afad5de8693fba6230ff56ba48b
llvm-svn: 199818

1f6a6086
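
The restore logic described in this message can be modelled in a few lines. This is a toy sketch and not how the patch is implemented: it scans the inline assembly text for `.arm` and `.thumb` tokens, ignores other mode directives such as `.code 16`, and uses hypothetical names throughout.

```cpp
#include <cassert>
#include <sstream>
#include <string>

enum class Mode { ARM, Thumb };

// Mode in effect after the inline assembly, judged only by whitespace
// separated .arm / .thumb tokens (a deliberate simplification).
Mode modeAfter(const std::string &asmText, Mode start) {
  Mode mode = start;
  std::istringstream in(asmText);
  std::string tok;
  while (in >> tok) {
    if (tok == ".arm")
      mode = Mode::ARM;
    else if (tok == ".thumb")
      mode = Mode::Thumb;
  }
  return mode;
}

// Directive to emit after the inline assembly so that the following
// instructions are encoded in the function's own mode; empty if the
// assembly left the mode unchanged.
std::string restoreDirective(const std::string &asmText, Mode functionMode) {
  if (modeAfter(asmText, functionMode) == functionMode)
    return "";
  return functionMode == Mode::Thumb ? ".thumb" : ".arm";
}
```

For the `foo` example in the message, compiled in thumb mode, the sketch yields `.thumb`, which corresponds to the directive the asm streamer is said to emit.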

Remove param doxygen comment for non-existing parameter. · f5f23b09
Benjamin Kramer authored Jan 22, 2014
```
Found by -Wdocumentation.

llvm-svn: 199814
```
f5f23b09
[x86] Silence unused diReg variable warning in non-asserting builds · 7a7c192e
David Woodhouse authored Jan 22, 2014
```
llvm-svn: 199812
```
7a7c192e
[x86] Fix uninitialized variable warning in translate{Src,Dst}Index · fee418c2
David Woodhouse authored Jan 22, 2014
```
llvm-svn: 199811
```
fee418c2