- Apr 10, 2013
-
-
Michel Danzer authored
21 more little piglits with radeonsi.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 179186
-
Reed Kotler authored
Mips32 code as Mips16 unless it can't be compiled as Mips16. For now this happens as long as floating point instructions are not needed. It would probably also make sense to compile as Mips32 if atomic operations are needed; there may be other cases too. A module pass prescans the IR and adds the mips16 or nomips16 attribute to functions depending on each function's needs. Mips16 mode can yield a 40% code compression by utilizing 16-bit encodings of many instructions.

The hope is for this to replace the traditional gcc way of dealing with Mips16 code that uses floating point, which essentially amounts to soft float but with a library implemented using Mips32 floating point. That gcc method also requires creating stubs so that Mips32 code can interact with Mips16 functions that have floating point needs. My conjecture is that in reality the traditional gcc method would never win over this new method. I will be implementing the traditional gcc method also; some of it is already done, but I needed the stubs to finish the work, and those required this mips16/32 mixed-mode capability. I have more ideas to make this new method much better, and I think the old method will just live in llvm for anyone that needs the backward compatibility, though I don't know for what reason that would be needed. llvm-svn: 179185
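As a rough illustration of the per-function choice the module pass models (a minimal sketch, assuming the GCC/Clang source-level attribute spellings; the function names here are hypothetical, and the pass itself sets IR attributes rather than source attributes):

    /* Integer-only code: eligible for the compact Mips16 encoding. */
    __attribute__((mips16)) int add16(int a, int b) { return a + b; }

    /* Uses floating point, so it must remain Mips32 (nomips16). */
    __attribute__((nomips16)) double scale(double x) { return x * 2.0; }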
-
Peter Collingbourne authored
symbol with multiple .type declarations. Differential Revision: http://llvm-reviews.chandlerc.com/D607 llvm-svn: 179184
-
Rafael Espindola authored
llvm-svn: 179179
-
Vincent Lejeune authored
llvm-svn: 179174
-
Tim Northover authored
These instructions aren't universally available, but depend on a specific extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new feature is appropriate. This also enables the feature by default on A-class cores which usually have these extensions, to avoid breaking existing code and act as a sensible default. llvm-svn: 179171
-
Joey Gouly authored
rather than checking if the source and destination have the same number of arguments and copying the attributes over directly. llvm-svn: 179169
-
Christian Konig authored
Depending on the number of bits set in the writemask.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179166
-
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165
-
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179164
-
Hal Finkel authored
Implement suggestions made by Bill Schmidt in post-commit review. Thanks! llvm-svn: 179162
-
Tobias Grosser authored
Contributed by: Star Tan <tanmx_star@yeah.net> llvm-svn: 179157
-
Hal Finkel authored
This adds in-principle support for if-converting the bctr[l] instructions. These instructions are used for indirect branching. It seems, however, that the current if converter will never actually predicate these. To do so, it would need the ability to hoist a few setup instructions out of the conditionally-executed block. For example, code like this:

    void foo(int a, int (*bar)()) {
      if (a != 0)
        bar();
    }

becomes:

    ...
    beq 0, .LBB0_2
    std 2, 40(1)
    mr 12, 4
    ld 3, 0(4)
    ld 11, 16(4)
    ld 2, 8(4)
    mtctr 3
    bctrl
    ld 2, 40(1)
    .LBB0_2:
    ...

and it would be safe to do all of this unconditionally with a predicated beqctrl instruction. llvm-svn: 179156
-
Rafael Espindola authored
For now they are still only used as little endian. llvm-svn: 179147
-
Evan Cheng authored
xmm0 / xmm1. rdar://13599493 llvm-svn: 179141
-
Andrew Trick authored
The target hooks are getting out of hand. What does it mean to run before or after regalloc anyway? Allowing either Pass* or AnalysisID pass identification should make it much easier for targets to use the substitutePass and insertPass APIs, and create less need for badly named target hooks. llvm-svn: 179140
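A rough sketch of how a target might use these APIs with AnalysisID-based pass identification (illustrative only: the class and pass-ID names below are hypothetical, and the exact signatures of substitutePass/insertPass may differ):

    // Hypothetical target pass config using AnalysisID-based substitution.
    class MyTargetPassConfig : public TargetPassConfig {
    public:
      MyTargetPassConfig(MyTargetMachine *TM, PassManagerBase &PM)
          : TargetPassConfig(TM, PM) {
        // Swap a standard pass for a target-specific replacement by ID.
        substitutePass(&PostRASchedulerID, &MyPostRASchedulerID);
        // Insert an extra pass relative to an existing pass's AnalysisID.
        insertPass(&MachineSchedulerID, &MyLateFixupsID);
      }
    };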
-
Jack Carter authored
Modifier 'D' is to use the second word of a double integer. We had previously implemented the pure register variant of the modifier, and this patch implements the memory reference:

    #include "stdio.h"
    int b[8] = {0,1,2,3,4,5,6,7};
    void main() {
      int i;
      // The first word. Notice, no 'D'
      {asm ( "lw %0,%1;" : "=r" (i) : "m" (*(b+4)) );}
      printf("%d\n",i);
      // The second word
      {asm ( "lw %0,%D1;" : "=r" (i) : "m" (*(b+4)) );}
      printf("%d\n",i);
    }

llvm-svn: 179135
-
Hal Finkel authored
This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr, which is a conditional return and loop-counter decrement all in one). At the moment, if-conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if-conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. llvm-svn: 179134
-
Bob Wilson authored
llvm-svn: 179132
-
- Apr 09, 2013
-
-
Chad Rosier authored
llvm-svn: 179129
-
Chad Rosier authored
llvm-svn: 179125
-
Rafael Espindola authored
llvm-svn: 179124
-
Chad Rosier authored
llvm-svn: 179120
-
Chandler Carruth authored
columns is essentially impossible to edit. llvm-svn: 179119
-
Reed Kotler authored
and mips16 on a per function basis. Because this patch is somewhat involved, I have provided an overview of its key pieces. The patch is written so as to not change the behavior of the non-mixed mode. We have tested this a lot, but switching subtargets is something new, so we don't want any chance of regression in the mainline compiler until we have more confidence in this.

Mips32/64 are very different from Mips16, as in the case of ARM vs. Thumb1. For that reason there are derived versions of the register info, frame info, instruction info, and instruction selection classes. We now register three separate passes for instruction selection: one which is used to switch subtargets (MipsModuleISelDAGToDAG.cpp), and then one for each of the current subtargets (Mips16ISelDAGToDAG.cpp and MipsSEISelDAGToDAG.cpp).

When the ModuleISel pass runs, it determines whether there is a need to switch subtargets and, if so, the owning pointers in MipsTargetMachine are changed accordingly. When 16Isel or SEIsel is run, they return immediately without doing any work if the current subtarget mode does not apply to them. In addition, MipsAsmPrinter needs to be reset on a per-function basis.

The pass BasicTargetTransformInfo is substituted with a null pass, since that pass is immutable and really needs to be a function pass for it to be used with changing subtargets. This will be fixed in a follow-on patch. llvm-svn: 179118
-
Nadav Rotem authored
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users:

1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).
2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.
3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.

This patch also includes a simple (100 LOC) bottom-up SLP vectorizer that uses the infrastructure, and can vectorize this code:

    void SAXPY(int *x, int *y, int a, int i) {
      x[i]   = a * x[i]   + y[i];
      x[i+1] = a * x[i+1] + y[i+1];
      x[i+2] = a * x[i+2] + y[i+2];
      x[i+3] = a * x[i+3] + y[i+3];
    }

llvm-svn: 179117
-
Chad Rosier authored
parse an identifier. Otherwise, parseExpression may parse multiple tokens, which makes it impossible to properly compute an immediate displacement. An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in the below example:

    __asm mov eax, [Symbol + ImmDisp]

The existing test cases exercise this patch. rdar://13611297 llvm-svn: 179115
-
Eric Christopher authored
therefore not at all) of the pc or statement list. We also don't need to emit the compilation dir, so save space and time and don't bother. Fix up the testcase accordingly and verify that we don't emit the attributes or the items that they use. llvm-svn: 179114
-
Hal Finkel authored
Some general cleanup, and only scan the end of a BB for branches (once we're done with the terminators and debug values, there should not be any other branches). These address post-commit review suggestions by Bill Schmidt. No functionality change intended. llvm-svn: 179112
-
Nadav Rotem authored
llvm-svn: 179111
-
Chad Rosier authored
rather than deriving the StringRef from the Start and End SMLocs. Using the Start and End SMLocs works fine for operands such as [Symbol], but not for operands such as [Symbol + ImmDisp]. All existing test cases that reference a variable exercise this patch. rdar://13602265 llvm-svn: 179109
-
Benjamin Kramer authored
This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl:

    _Store:
        ldr.w  r9, [sp]
        vmov   d17, r3, r9
        vmov   d16, r1, r2
        vst1.8 {d16, d17}, [r0]
        bx     lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647 llvm-svn: 179106
-
Hal Finkel authored
On PowerPC, non-vector loads and stores have r+i forms; however, in functions with large stack frames these were not being used to access slots far from the stack pointer because such slots were out of range for the signed 16-bit immediate offset field. This increases register pressure because we need a separate register for each offset (when the r+r form is used). By enabling virtual base registers, we can deal with large stack frames without unduly increasing register pressure. llvm-svn: 179105
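A minimal example of the situation described above (hypothetical source; the exact offsets at which the r+i form stops reaching depend on the frame layout):

    // A frame this large pushes some slots beyond the +/-32 KB reach of
    // the signed 16-bit offset in the PPC r+i load/store forms, so the
    // backend would otherwise need a separate register per far offset.
    void big_frame(void) {
      char buf[1 << 17];       // 128 KB of locals
      buf[0] = 1;              // near slot: the r+i form reaches it
      buf[(1 << 17) - 1] = 2;  // far slot: wants a (virtual) base register
    }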
-
Rafael Espindola authored
For now it is templated only on being 64 or 32 bits. I will add little/big endian next. llvm-svn: 179097
-
Alexey Samsonov authored
DWARF parser: Fix DWARF-2/3 incompatibility: size of DW_FORM_ref_addr is the same as DW_FORM_addr in DWARF2, and is 4/8 bytes on 32/64-bit DWARF starting from DWARF3. Adding a test for this is a huge pain - generating and uploading pre-built binary with DWARF3 debug info is way too ugly, and writing fine-grained unittests for DebugInfo is impossible, as it doesn't expose any headers in include/llvm. That said, I'm going to choose the second approach and submit the patch exposing DebugInfo headers for review soon enough. llvm-svn: 179095
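The size rule being fixed can be summarized in a small helper (a sketch only, not the parser's actual code; the name and signature are hypothetical):

    // DW_FORM_ref_addr is address-sized in DWARF 2; from DWARF 3 on it is
    // offset-sized: 4 bytes in 32-bit DWARF, 8 bytes in 64-bit DWARF.
    unsigned getRefAddrSize(unsigned AddrSize, unsigned DwarfVersion,
                            bool Is64BitDwarf) {
      if (DwarfVersion == 2)
        return AddrSize;
      return Is64BitDwarf ? 8 : 4;
    }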
-
Jakob Stoklund Olesen authored
llvm-svn: 179086
-
Nadav Rotem authored
llvm-svn: 179084
-
Jakob Stoklund Olesen authored
The save area is twice as big and there is no struct return slot. The stack pointer is always 16-byte aligned (after adding the bias). Also eliminate the stack adjustment instructions around calls when the function has a reserved stack frame. llvm-svn: 179083
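For context on "the bias" (background on the SPARC v9 ABI, not taken from this commit: %sp and %fp are biased by 2047 bytes), the alignment claim can be sketched as:

    // On SPARC v9 the real frame address is %sp + BIAS; it is that biased
    // address, not the raw %sp value, that is kept 16-byte aligned.
    const long kStackBias = 2047; // v9 ABI constant (background knowledge)
    bool isFrameAligned(long sp) { return ((sp + kStackBias) & 15) == 0; }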
-
Rafael Espindola authored
llvm-svn: 179076
-
Rafael Espindola authored
Use it when we don't need to know if we have a 32 or 64 bit SymbolTableEntry. llvm-svn: 179074
-