- Mar 18, 2013
- Christian König authored
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177277
- Christian König authored
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177276
- Christian König authored
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177275
- Christian König authored
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177274
- Christian König authored
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177273
- Christian König authored
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177272
- Christian König authored
  Unfortunately the previous fix for inserting waits for unordered defines
  wasn't sufficient, because it's possible that even ordered defines are
  only partially used (or not used at all).
  Signed-off-by: Christian König <christian.koenig@amd.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 177271
- Anton Korobeynikov authored
  MinGW is almost completely compatible with MSVC, with the exception of
  the _tls_array global not being available. Patch by David Nadlinger!
  llvm-svn: 177257
- Craig Topper authored
  llvm-svn: 177243
- Craig Topper authored
  llvm-svn: 177242
- Mar 17, 2013
- Sylvestre Ledru authored
  llvm-svn: 177234
- Hal Finkel authored
  This change cleans up two issues with Altivec register spilling:
  1. The spilling code was inefficient (using two instructions, an add and
     a load, when just one would do).
  2. The code assumed that r0 would always be available (true for now, but
     this will change).
  The new code handles VR spilling just like GPR spills but forced into
  r+r mode. As a result, when any VR spills are present, we must now
  always allocate the register-scavenger spill slot.
  llvm-svn: 177231
- Mar 16, 2013
- Hal Finkel authored
  As a follow-up to r158719, remove PPCRegisterInfo::avoidWriteAfterWrite.
  Jakob pointed out in response to r158719 that this callback is currently
  unused, and so this has no effect (and the speedups that I thought I had
  observed as a result of implementing this function must have been
  noise).
  llvm-svn: 177228
- Craig Topper authored
  Previously we weren't skipping the VVVV encoded register. Based on patch
  by Michael Liao.
  llvm-svn: 177221
- Jakob Stoklund Olesen authored
  Since almost all X86 instructions can fold loads, use a multiclass to
  define register/memory pairs of SchedWrites. An X86FoldableSchedWrite
  represents the register version of an instruction. It holds a reference
  to the SchedWrite to use when the instruction folds a load. This will be
  used inside multiclasses that define rr and rm instruction versions
  together.
  llvm-svn: 177210
- Mar 15, 2013
- Arnold Schwaighofer authored
  I was too pessimistic in r177105. Vector selects that fit into a legal
  register type lower just fine. I was misled by the code fragment that I
  was using. The stores/loads that I saw in those cases came from lowering
  the conditional off an address. Changing the code fragment to:

      %T0_3 = type <8 x i18>
      %T1_3 = type <8 x i1>
      define void @func_blend3(%T0_3* %loadaddr, %T0_3* %loadaddr2,
                               %T1_3* %blend, %T0_3* %storeaddr) {
        %v0 = load %T0_3* %loadaddr
        %v1 = load %T0_3* %loadaddr2
      ==> FROM:
        ;%c = load %T1_3* %blend
      ==> TO:
        %c = icmp slt %T0_3 %v0, %v1
      ==> USE:
        %r = select %T1_3 %c, %T0_3 %v0, %T0_3 %v1
        store %T0_3 %r, %T0_3* %storeaddr
        ret void
      }

  revealed this mistake.
  radar://13403975
  llvm-svn: 177170
- Silviu Baranga authored
  Adding an A15-specific optimization pass for interactions between S/D/Q
  registers. The pass handles all the required transformations
  pre-regalloc.
  llvm-svn: 177169
- Benjamin Kramer authored
  Fixes PR15520.
  llvm-svn: 177167
- Hal Finkel authored
  Unaligned access is supported on PPC for non-vector types, and is
  generally more efficient than manually expanding the loads and stores. A
  few of the existing test cases were using expanded unaligned loads and
  stores to test other features (like load/store with update), and for
  these test cases, unaligned access remains disabled.
  llvm-svn: 177160
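  A change like this would typically flow through the era's TargetLowering
  unaligned-access hook; the sketch below is an assumption about the shape
  of that override, not the commit's actual code:

      // Hedged sketch: report unaligned scalar accesses as legal and fast
      // on PPC, while leaving vector types on the expanded path.
      bool PPCTargetLowering::allowsUnalignedMemoryAccesses(EVT VT,
                                                            bool *Fast) const {
        if (VT.isVector())
          return false; // vectors keep the manually expanded loads/stores
        if (Fast)
          *Fast = true;
        return true;
      }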
- Arnold Schwaighofer authored
  A vector fptrunc or fpext simply gets split into scalar instructions.
  radar://13192358
  llvm-svn: 177159
- Hal Finkel authored
  In preparation for the addition of other SIMD ISA extensions (such as
  QPX), we need to make sure that all Altivec patterns are properly
  predicated on having Altivec support. No functionality change intended
  (one test case needed to be updated because it assumed that Altivec
  intrinsics would be supported without enabling Altivec support).
  llvm-svn: 177152
- Hal Finkel authored
  For spills into a large stack frame, the FI-elimination code uses the
  register scavenger to obtain a free GPR for use with an r+r-addressed
  load or store. When there are no available GPRs, the scavenger gets one
  by using its spill slot. Previously, we were not always allocating that
  spill slot, and the RS would assert when the spill slot was needed. I
  don't currently have a small test that triggered the assert, but I've
  created a small regression test that verifies that the spill slot is now
  added when the stack frame is sufficiently large.
  llvm-svn: 177140
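  A minimal sketch of the scavenger call this path relies on, assuming the
  2013-era RegScavenger API (the register class and insertion point are
  illustrative, not taken from the commit):

      // During frame-index elimination, when the offset no longer fits in
      // the instruction's immediate field, obtain a free GPR:
      unsigned Reg = RS->scavengeRegister(&PPC::GPRCRegClass, II, SPAdj);
      // ...materialize the large offset into Reg, then emit the r+r-form
      // load or store. If no GPR is free, the scavenger spills one into
      // its dedicated spill slot -- the slot this commit ensures exists.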
- Eric Christopher authored
  llvm-svn: 177135
- Nadav Rotem authored
  llvm-svn: 177130
- David Blaikie authored
  (these were added in r177089)
  llvm-svn: 177129
- Akira Hatanaka authored
  llvm-svn: 177128
- Mar 14, 2013
- Jakob Stoklund Olesen authored
  The new InstrSchedModel is easier to use than the instruction
  itineraries. It will be used to model instruction latency and throughput
  in modern Intel microarchitectures like Sandy Bridge. InstrSchedModel
  should be able to coexist with instruction itinerary classes, but for
  cleanliness we should switch the Atom processor model to the new
  InstrSchedModel as well.
  llvm-svn: 177122
- Reed Kotler authored
  See the Mips16ISelLowering.cpp patch for a use of this. For now the
  extra code in Mips16ISelLowering.cpp is a nop, but it is used for test
  purposes. Mips32 registers are set up and then removed, and then the
  Mips16 registers are set up. Normally you need to add register classes
  and then call computeRegisterProperties.
  llvm-svn: 177120
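  A minimal sketch of the setup order described above, assuming the
  2013-era TargetLowering API (the specific register class is
  illustrative):

      // Register the Mips16 class for i32 values, then recompute the
      // derived register/type properties; each register class must be
      // added before computeRegisterProperties() runs.
      addRegisterClass(MVT::i32, &Mips::CPU16RegsRegClass);
      computeRegisterProperties();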
- Chad Rosier authored
  the win64 calling convention.
  rdar://13423768
  llvm-svn: 177113
- Hal Finkel authored
  This is a generic function (derived from PEI); moving it into
  MachineFrameInfo eliminates a current redundancy between the ARM and
  AArch64 backends, and will allow it to be used by the PowerPC target
  code. No functionality change intended.
  llvm-svn: 177111
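  The message doesn't name the function that moved; purely as a labeled
  guess, if it is the stack-size estimator that MachineFrameInfo exposes,
  a caller-side sketch would be:

      // Hypothetical use from PowerPC frame-lowering code: estimate the
      // frame size through the shared MachineFrameInfo helper instead of
      // a per-backend copy.
      uint64_t EstSize = MF.getFrameInfo()->estimateStackSize(MF);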
- Hal Finkel authored
  Add the current PEI register scavenger as a parameter to the
  processFunctionBeforeFrameFinalized callback. This change is necessary
  in order to allow the PowerPC target code to set the register scavenger
  frame index after the save-area offset adjustments performed by
  processFunctionBeforeFrameFinalized. Only after these adjustments have
  been made is it possible to estimate the size of the stack frame.
  llvm-svn: 177108
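  A sketch of the revised callback, assuming the pre-C++11 conventions
  LLVM used at the time (the default argument and constness are
  assumptions):

      // TargetFrameLowering: targets now receive the PEI register
      // scavenger so they can place frame indices after the save-area
      // offset adjustments.
      virtual void processFunctionBeforeFrameFinalized(MachineFunction &MF,
                                                       RegScavenger *RS = NULL) const;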
- Hal Finkel authored
  Make requiresFrameIndexScavenging return true, and create virtual
  registers in the spilling code instead of using the register scavenger
  directly. This makes the target-level code simpler and, importantly,
  delays the scavenging until after callee-saved register processing
  (which will be important for later changes). Also cleans up
  trackLivenessAfterRegAlloc (makes it inline in the header with the other
  related functions). This makes it clear that it always returns true. No
  functionality change intended.
  llvm-svn: 177107
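  A hedged sketch of the virtual-register approach (the register class is
  chosen for illustration):

      // Instead of scavenging a physical GPR on the spot, request a
      // virtual register and let scavenging resolve it after callee-saved
      // register processing:
      unsigned Reg = MF.getRegInfo().createVirtualRegister(&PPC::GPRCRegClass);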
- Hal Finkel authored
  We used to add a spill slot for the register scavenger whenever the
  function has a frame pointer. This is unnecessarily conservative: we may
  need the spill slot for dynamic stack allocations, and functions with
  dynamic stack allocations always have a FP, but we might also have a FP
  for other reasons (such as the user explicitly disabling frame-pointer
  elimination), and we don't necessarily need a spill slot for those
  functions. The structsinregs test needed adjustment because it disables
  FP elimination.
  llvm-svn: 177106
- Arnold Schwaighofer authored
  By terrible I mean we store/load from the stack. This matters on PAQp8
  in _Z5trainPsS_ii (which is inlined into Mixer::update), where we decide
  to vectorize a loop with a VF of 8, resulting in a 25% degradation on a
  cortex-a8.

      LV: Found an estimated cost of 2 for VF 8 For instruction: icmp slt i32
      LV: Found an estimated cost of 2 for VF 8 For instruction: select i1, i32, i32

  The bug that tracks the CodeGen part is PR14868.
  radar://13403975
  llvm-svn: 177105
- Akira Hatanaka authored
  No functionality changes.
  llvm-svn: 177104
- Jyotsna Verma authored
  We are warning the user about the alignment, so we should not assert.
  llvm-svn: 177103
- Akira Hatanaka authored
  llvm-svn: 177096
- Akira Hatanaka authored
  No intended functionality changes.
  llvm-svn: 177095
- Hal Finkel authored
  I don't think that it is otherwise clear how the overlapping offsets are
  processed into distinct spill slots. Comment that this is done in
  processFunctionBeforeFrameFinalized.
  llvm-svn: 177094
- Akira Hatanaka authored
  llvm-svn: 177092