- Nov 24, 2013
-
-
Venkatraman Govindaraju authored
[Sparc] Emit large negative adjustments to SP/FP with sethi+xor instead of sethi+or. This generates correct code for both sparc32 and sparc64. llvm-svn: 195576
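The difference between the two sequences can be modelled with plain integers. The sketch below is illustrative only: it assumes the standard SPARC `%hi`/`%lo` and `%hix`/`%lox` operator definitions, and the helper names (`sethi`, `sethi_or`, `sethi_xor`, `as_signed64`) are hypothetical, not taken from the patch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sethi(imm22):
    """Model sethi: a 22-bit immediate lands in bits 31..10, all other bits are zero."""
    return (imm22 & 0x3FFFFF) << 10


def sign_extend13(imm13):
    """Sign-extend a 13-bit immediate (simm13) to a 64-bit register value."""
    imm13 &= 0x1FFF
    return (imm13 - 0x2000 if imm13 & 0x1000 else imm13) & MASK64


def sethi_or(value):
    """sethi %hi(value) followed by or %lo(value)."""
    hi = (value & MASK32) >> 10
    lo = value & 0x3FF
    return sethi(hi) | sign_extend13(lo)


def sethi_xor(value):
    """sethi %hix(value) followed by xor %lox(value)."""
    hix = (~value & MASK32) >> 10
    lox = (value & 0x3FF) | 0x1C00  # negative simm13, so the xor flips all upper bits
    return sethi(hix) ^ sign_extend13(lox)


def as_signed64(bits):
    """Interpret a 64-bit pattern as a signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


adjustment = -4200  # too large for a single simm13 immediate
print(as_signed64(sethi_or(adjustment)))
print(as_signed64(sethi_xor(adjustment)))
```

In this model both sequences agree in the low 32 bits, which is all a 32-bit register holds. Only the xor form also sets the upper 32 bits, so the 64-bit register contains the sign-extended negative value.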
-
Venkatraman Govindaraju authored
llvm-svn: 195575
-
Venkatraman Govindaraju authored
[SparcV9]: Do not emit .register directives for global registers that are clobbered by calls but not used in the function itself. llvm-svn: 195574
-
Venkatraman Govindaraju authored
llvm-svn: 195573
-
Reed Kotler authored
to what is needed for constant islands. The prescan method for Mips16 constant islands will eventually go away. It is only temporary and should be done earlier, when the instructions are first created or from the DAG. If we keep it here, we need to better handle the situation where constant islands is called multiple times, since we don't want to prescan more than once. llvm-svn: 195569
-
Reed Kotler authored
I had to move some code, and I moved a declaration forward past its first use in the function. By nutty coincidence, there was another variable of the same name and type, with a completely unrelated function, declared globally in the class, so no compilation error ensued. It required some unusual conditions for it to even matter. It caused the test case casts.c in test-suite to fail during compilation with a duplicate symbol error. I would have noticed it during final code review for this port. llvm-svn: 195565
-
- Nov 23, 2013
-
-
Tom Stellard authored
We were ignoring the ordered/unordered bits and also the signed/unsigned bits of condition codes when lowering the DAG to MachineInstrs. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195514
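Why those bits matter can be shown with a small model. This is a sketch of the general semantics, not of the backend code: the function names below are hypothetical and only loosely mirror the ordered/unordered and signed/unsigned flavours of a "less than" condition code.

```python
import math

MASK32 = 0xFFFFFFFF


def ordered_lt(a, b):
    """Ordered less-than: false whenever either operand is NaN."""
    return not (math.isnan(a) or math.isnan(b)) and a < b


def unordered_lt(a, b):
    """Unordered less-than: true whenever either operand is NaN, or a < b."""
    return math.isnan(a) or math.isnan(b) or a < b


def as_signed32(bits):
    """Interpret a 32-bit pattern as a signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & (1 << 31) else bits


def signed_lt(a_bits, b_bits):
    """Signed less-than on 32-bit patterns."""
    return as_signed32(a_bits) < as_signed32(b_bits)


def unsigned_lt(a_bits, b_bits):
    """Unsigned less-than on 32-bit patterns."""
    return (a_bits & MASK32) < (b_bits & MASK32)


nan = float("nan")
print(ordered_lt(1.0, nan), unordered_lt(1.0, nan))
print(signed_lt(0xFFFFFFFF, 1), unsigned_lt(0xFFFFFFFF, 1))
```

The two floating-point variants differ only when a NaN is involved, and the two integer variants differ only when the sign bit is set, so a lowering that drops these bits gives wrong results on exactly those inputs.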
-
- Nov 22, 2013
-
-
Jim Grosbach authored
Utilizing the 8 and 16 bit comparison instructions, even when an input can be folded into the comparison instruction itself, is typically not worth it. There are too many partial register stalls as a result, leading to significant slowdowns. By always performing comparisons on at least 32-bit registers, performance of the calculation chain leading to the comparison improves. Continue to use the smaller comparisons when minimizing size, as that allows better folding of loads into the comparison instructions. rdar://15386341 llvm-svn: 195496
-
Paul Robinson authored
Improvements over r195317:
  - Set/restore EnableFastISel flag instead of just running FastISel within SelectAllBasicBlocks; the flag is checked in various places, and FastISel won't run properly if those places don't do the right thing.
  - Test looks for normal ISel versus FastISel behavior, and not something more subtle that doesn't work everywhere.
Based on work by Andrea Di Biagio. llvm-svn: 195491
-
Michael Liao authored
- When simplifying the mask generation for BLEND, check whether that mask is also consumed by other non-BLEND insns. If true, skip that simplification. llvm-svn: 195476
-
Richard Sandiford authored
I've no idea why I decided to handle TMxx differently from all the other high/low logic operations, but it was a stupid thing to do. The high registers aren't available as separate 32-bit registers on z10, so subreg_h32 can't be used on a GR64 there. I've normally been testing with z196 and with -O3 and so hadn't noticed this until now. llvm-svn: 195473
-
Rafael Espindola authored
The callee will not pop the stack for us. llvm-svn: 195467
-
Daniel Sanders authored
Credit to Matheus Almeida for spotting it. llvm-svn: 195456
-
Daniel Sanders authored
lowerBUILD_VECTOR() was treating integer constant splats as being legal regardless of whether they had undef values. This caused instruction selection failures when the undefs were legalized to zero, making the constant non-splat. Fixed this by requiring HasAnyUndef to be false for an integer constant splat to be legal. If it is true, a new node is generated with the undefs replaced with the necessary values to remain a splat. llvm-svn: 195455
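The failure mode and the fix can be illustrated with lists standing in for vector constants. This is a minimal sketch under stated assumptions: `UNDEF`, `splat_value`, `fill_undefs_for_splat` and `legalize_undefs_to_zero` are hypothetical names for this example and do not correspond to functions in the Mips backend.

```python
UNDEF = None  # stand-in for an undef vector element


def splat_value(elements):
    """Return the common value if every defined element agrees, else None."""
    defined = [e for e in elements if e is not UNDEF]
    if not defined or any(e != defined[0] for e in defined):
        return None
    return defined[0]


def legalize_undefs_to_zero(elements):
    """Model a later legalization step that turns undef elements into zero."""
    return [0 if e is UNDEF else e for e in elements]


def fill_undefs_for_splat(elements):
    """If the vector is a splat apart from undefs, replace the undefs with the splat value."""
    value = splat_value(elements)
    if value is None:
        return list(elements)
    return [value] * len(elements)


vector = [7, UNDEF, 7, UNDEF]
print(legalize_undefs_to_zero(vector))
print(legalize_undefs_to_zero(fill_undefs_for_splat(vector)))
```

Without the fill step, zeroing the undefs yields a vector that is no longer a splat. With it, the vector stays a splat of 7 whatever happens to undefs afterwards.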
-
Richard Barton authored
Patch by Oliver Stannard! llvm-svn: 195448
-
Daniel Sanders authored
[mips][msa] Float vector constants cannot use ldi.[wd] directly. Bitcast from the appropriate integer vector type. Fixes an instruction selection failure detected by llvm-stress. llvm-svn: 195444
-
Kostya Serebryany authored
llvm-svn: 195439
-
Hao Liu authored
Fix a Cygwin build failure caused by enum values starting with '_', which conflict with some platform macros. This patch only renames variables; there is no functional change. llvm-svn: 195432
-
Hao Liu authored
e.g. "%tmp = load <2 x i64>* %ptr" can't be selected. "%tmp = bitcast i64 %in to <2 x i32>" can't be selected. llvm-svn: 195424
-
Hao Liu authored
llvm-svn: 195423
-
Hao Liu authored
Fix a Cygwin build failure caused by enum values starting with '_', which conflict with some platform macros. This solution only renames variables; there is no functional change. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195421
-
Jiangning Liu authored
llvm-svn: 195408
-
Lang Hames authored
<def,dead> ones. Add an assertion to make sure we catch this in the future. Fixes <rdar://problem/15464559>. llvm-svn: 195401
-
Tom Stellard authored
Splitting a basic block will create a new ALU clause, so we need to make sure we aren't moving uses of registers that are local to their current clause into a new one. I had a test case for this, but unfortunately unrelated schedule changes invalidated it, and I haven't been able to come up with another one. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195399
-
Ekaterina Romanova authored
SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. Generating SHLD/SHRD instructions is acceptable when optimizing for size, but when optimizing for speed on these platforms the shift should be implemented with alternative sequences of instructions composed of add, adc, shr, shl, or and lea, which are DirectPath instructions. These alternative instructions not only have a lower latency but also increase the decode bandwidth by allowing simultaneous decoding of a third DirectPath instruction. AMD's K7, K8, K10, K12, K15 and K16 processor families are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size. It might be beneficial to disable this folding for some of Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel. llvm-svn: 195383
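The pattern being folded is a double-width left shift, which the following sketch checks numerically. It is an illustration only: `shld64` is a hypothetical reference model of a 64-bit SHLD-style operation, and `shift_or_sequence` is the plain shift/or form quoted in the message, not the exact instruction sequence the backend emits.

```python
MASK64 = (1 << 64) - 1


def shld64(x, y, c):
    """Reference model: shift x left by c, filling the vacated low bits from the top of y."""
    assert 0 < c < 64
    combined = ((x & MASK64) << 64) | (y & MASK64)
    return ((combined << c) >> 64) & MASK64


def shift_or_sequence(x, y, c):
    """The unfolded form: (x << c) | (y >> (64 - c)) using two shifts and an or."""
    assert 0 < c < 64
    high = (x << c) & MASK64
    low = (y & MASK64) >> (64 - c)
    return high | low


x = 0x0123456789ABCDEF
y = 0xFEDCBA9876543210
print(hex(shld64(x, y, 8)))
print(hex(shift_or_sequence(x, y, 8)))
```

Both functions compute the same value for every shift amount from 1 to 63, which is why the choice between a single SHLD and the separate shifts is purely a matter of latency and code size.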
-
- Nov 21, 2013
-
-
Daniel Sanders authored
Mask == ~InvMask asserts if the widths of Mask and InvMask differ. The combine isn't valid (with two exceptions, see below) if the widths differ, so test for this before testing Mask == ~InvMask. In the specific cases of Mask=~0 and InvMask=0, as well as Mask=0 and InvMask=~0, the combine is still valid. However, there are more appropriate combines that could be used in these cases, such as folding x & 0 to 0, or x & ~0 to x. llvm-svn: 195364
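The ordering of the two checks can be sketched with a small fixed-width integer type. Everything here is an assumption made for illustration: `FixedInt`, `strict_equal` and `masks_are_complementary` are hypothetical stand-ins, with `strict_equal` modelling an equality comparison that asserts on mismatched widths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedInt:
    """A value together with its bit width."""
    width: int
    value: int

    def __invert__(self):
        # Bitwise NOT confined to this value's width.
        return FixedInt(self.width, ~self.value & ((1 << self.width) - 1))


def strict_equal(a, b):
    """Equality that refuses to compare values of different widths."""
    assert a.width == b.width, "comparison requires equal bit widths"
    return a.value == b.value


def masks_are_complementary(mask, inv_mask):
    """Check the widths first so the strict comparison is never reached on a mismatch."""
    if mask.width != inv_mask.width:
        return False
    return strict_equal(mask, ~inv_mask)


print(masks_are_complementary(FixedInt(32, 0xFFFF0000), FixedInt(32, 0x0000FFFF)))
print(masks_are_complementary(FixedInt(32, 0xFFFF0000), FixedInt(64, 0x0000FFFF)))
```

Returning early on a width mismatch treats the combine as not applicable, which matches the message's point that the ~0/0 special cases are better handled by simpler folds.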
-
Artyom Skrobov authored
llvm-svn: 195358
-
Daniel Sanders authored
Fixes a crash (null pointer dereferenced) when MSA is enabled. llvm-svn: 195343
-
NAKAMURA Takumi authored
llvm-svn: 195341
-
NAKAMURA Takumi authored
It broke at least the i686 target. It is reproducible with "llc -mtriple=i686-unknown". FYI, the failure did not appear when adding either "-O0" or "-fast-isel". llvm-svn: 195339
-
Ana Pazos authored
Fixed scalar dup alias and added test case. llvm-svn: 195330
-
Ana Pazos authored
Intrinsics implemented: vqdmull_lane, vqdmulh_lane, vqrdmulh_lane, vqdmlal_lane, vqdmlsl_lane scalar Neon intrinsics. llvm-svn: 195327
-
Bill Wendling authored
clang optimizes tail calls, as in this example:

    int foo(void);
    int bar(void) {
      return foo();
    }

where the call is transformed to:

    calll .L0$pb
    .L0$pb:
    popl %eax
    .Ltmp0:
    addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
    movl foo@GOT(%eax), %eax
    popl %ebp
    jmpl *%eax # TAILCALL

However, the GOT references must all be resolved at dlopen() time, and so this approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which usually populates the PLT with stubs that perform the actual resolving. This patch changes X86TargetLowering::LowerCall() to skip tail call optimization, if the called function is a global or external symbol. Patch by Dimitry Andric! PR15086 llvm-svn: 195318
-
Paul Robinson authored
Based on work by Andrea Di Biagio. llvm-svn: 195317
-
Reed Kotler authored
llvm-svn: 195312
-
- Nov 20, 2013
-
-
Hal Finkel authored
The instruction definitions incorrectly specified that popcntd and popcntw have record forms; they do not. This mistake was causing invalid code generation. llvm-svn: 195272
-
Daniel Sanders authored
There's no test case for this commit. This is because it is doubtful that the incorrect behaviour can actually trigger. When MSA is not enabled, the type legalizer should have eliminated all occurrences of patterns the affected pseudo-instruction could possibly match before instruction selection occurs. llvm-svn: 195252
-
Daniel Sanders authored
llvm-svn: 195245
-
NAKAMURA Takumi authored
llvm-svn: 195238
-
NAKAMURA Takumi authored
llvm-svn: 195237
-