- Oct 16, 2012
-
Stepan Dyatkovskiy authored
The stack is formed improperly for long structures passed as byval arguments in EABI mode. Consulting the AAPCS, we find the following statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have a structure of doubles (9 double fields) and 3 unused core registers (r1, r2, r3), the caller should use registers r2 and r3 only. Currently the set r1, r2, r3 is used, which is invalid. The callee's VA routine should likewise use only r2 and r3; that side already works, handled by rounding up the SP address with ADD+BFC operations.

Fix: the main fix is in ARMTargetLowering::HandleByVal: when AAPCS mode and 8-byte alignment are detected, the odd register is skipped.

P.S.: I also improved the LDRB_POST_IMM regression test, since the ldrb instruction will no longer be generated by that test after this patch. llvm-svn: 166018
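To illustrate the rule, a hypothetical example (not code from the patch):

```cpp
// Hedged sketch: per AAPCS 4.3.1 this struct is 8-byte aligned (its
// most-aligned members are doubles), so per 5.5 stage C.3 the NCRN must
// be rounded up to an even core register when it is passed byval.
struct Big { double d[9]; };

void callee(int a, Big b); // hypothetical signature

void caller(Big b) {
  // r0 holds 'a'. The byval copy of 'b' must start at an even register,
  // so r1 is skipped and r2/r3 carry the first 8 bytes; the rest of the
  // structure goes on the stack.
  callee(0, b);
}
```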
-
NAKAMURA Takumi authored
Original message: The attached is the fix to radar://11663049. The optimization can be outlined by the following rules: (select (x != c), e, c) -> (select (x != c), e, x), (select (x == c), c, e) -> (select (x == c), x, e), where <c> is an integer constant. The reason for this change is that, on x86, a conditional move from a constant needs two instructions, whereas a conditional move from a register needs only one. While LowerSELECT() seems to be the most convenient place for this optimization, it turns out to be a bad one: by replacing the constant <c> with a symbolic value, it obscures some instruction-combining opportunities that would otherwise be easy to spot. For that reason, the change is postponed to the last instruction-combining phase. The change passes "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource". Original message since r165661: My previous change had a bug: I negated the condition code of a CMOV, then went ahead creating the new CMOV using the *ORIGINAL* condition code. llvm-svn: 166017
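A worked example of the first rule in C++ (the constant 42 is illustrative):

```cpp
// cmov-from-constant: materializing 42 costs an extra instruction on x86.
int before(int x, int e) { return (x != 42) ? e : 42; }

// In the false branch the comparison guarantees x == 42, so the constant
// can be replaced by x itself, enabling a cmov-from-register.
int after(int x, int e) { return (x != 42) ? e : x; }
```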
-
Bill Wendling authored
llvm-svn: 166016
-
Craig Topper authored
llvm-svn: 166014
-
Bill Wendling authored
llvm-svn: 166013
-
Bill Wendling authored
llvm-svn: 166012
-
Bill Wendling authored
llvm-svn: 166011
-
Bill Wendling authored
Use the Attributes::get method which takes an AttrVal value directly to simplify the code a bit. No functionality change. llvm-svn: 166009
-
Bill Wendling authored
llvm-svn: 166008
-
Bill Wendling authored
llvm-svn: 166007
-
Craig Topper authored
llvm-svn: 166004
-
Andrew Trick authored
This is a medium term workaround until we have a more robust solution in the form of a register liveness utility for postRA passes. llvm-svn: 166001
-
Jakob Stoklund Olesen authored
llvm-svn: 165999
-
Jakob Stoklund Olesen authored
Clients can use the equivalent functions in MRI. llvm-svn: 165990
-
Michael Liao authored
- Besides its use in SjLj exception handling, __builtin_setjmp/__builtin_longjmp is also used as a lightweight replacement for setjmp/longjmp to implement continuations, user-level threading, and so on. The support added in this patch ONLY addresses this usage and is NOT intended to support SjLj exception handling, as zero-cost DWARF exception handling is used by default on X86. llvm-svn: 165989
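A minimal sketch of the lightweight usage the patch targets (standard GCC/Clang builtins; note the five-word buffer and the required jump value of 1):

```cpp
#include <cstdio>

static void *jump_buf[5]; // __builtin_setjmp requires a five-word buffer

// Kept out of line: __builtin_longjmp may not be called from the same
// function that called __builtin_setjmp.
__attribute__((noinline)) static void resume() {
  __builtin_longjmp(jump_buf, 1); // the value argument must be 1
}

int main() {
  if (__builtin_setjmp(jump_buf) == 0) {
    std::puts("first pass");
    resume(); // transfers control back to the setjmp site
  } else {
    std::puts("resumed");
  }
  return 0;
}
```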
-
Jakob Stoklund Olesen authored
All callers can simply use the corresponding MRI functions. llvm-svn: 165985
-
- Oct 15, 2012
-
Jakob Stoklund Olesen authored
Using the cached bit vector in MRI avoids constantly allocating and recomputing the reserved register bit vector. llvm-svn: 165983
-
Jakob Stoklund Olesen authored
Also provide an MRI::getReservedRegs() function to access the frozen register set, and isReserved() and isAllocatable() methods to test individual registers. The various implementations of TRI::getReservedRegs() are quite complicated, and many passes need to look at the reserved register set. This patch makes it possible for these passes to use the cached copy in MRI, avoiding a lot of malloc traffic and repeated calculations. llvm-svn: 165982
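A minimal sketch of the pattern this enables in a pass, assuming this era's headers and names (signatures approximate):

```cpp
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/Target/TargetRegisterInfo.h"

using namespace llvm;

// Count the registers a pass may legitimately touch, using the frozen
// set cached in MRI instead of recomputing TRI::getReservedRegs().
static unsigned countUsableRegs(const MachineFunction &MF,
                                const TargetRegisterInfo &TRI) {
  const MachineRegisterInfo &MRI = MF.getRegInfo();
  unsigned N = 0;
  for (unsigned Reg = 1, E = TRI.getNumRegs(); Reg != E; ++Reg)
    if (!MRI.isReserved(Reg) && MRI.isAllocatable(Reg))
      ++N; // neither reserved nor outside every allocatable class
  return N;
}
```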
-
Jim Grosbach authored
rdar://12502028 llvm-svn: 165981
-
Bill Wendling authored
Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change. llvm-svn: 165960
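A rough sketch of the renamed builder in use; the include paths, enumerator, and method names of this era's API are assumptions here:

```cpp
#include "llvm/Attributes.h"  // era include path
#include "llvm/LLVMContext.h" // era include path

using namespace llvm;

// Assumed era API: AttrBuilder accumulates attribute kinds, and
// Attributes::get materializes the uniqued Attributes object.
static Attributes makeNoUnwind(LLVMContext &Ctx) {
  AttrBuilder B;
  B.addAttribute(Attributes::NoUnwind);
  return Attributes::get(Ctx, B);
}
```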
-
Chad Rosier authored
llvm-svn: 165955
-
Rafael Espindola authored
follow in one sec. llvm-svn: 165951
-
Andrew Trick authored
llvm-svn: 165950
-
Chad Rosier authored
llvm-svn: 165947
-
Chad Rosier authored
inline assembly. For the time being, these will be called directly by clang. However, in the near future I expect these to be sunk back into the MC layer and more basic APIs (e.g., getClobbers(), getConstraints(), etc.) will be called by clang. llvm-svn: 165946
-
Chad Rosier authored
llvm-svn: 165945
-
Micah Villmow authored
Resubmit the changes to LLVM core to update the functions to support different pointer sizes on a per-address-space basis. llvm-svn: 165941
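An illustrative sketch of what the data-layout string can now express (the sizes and include path reflect this era and are assumptions):

```cpp
#include "llvm/DataLayout.h" // era path; later llvm/IR/DataLayout.h

using namespace llvm;

// "p:32:32:32" describes address space 0; "p1:64:64:64" gives address
// space 1 its own 64-bit pointers.
static void example() {
  DataLayout DL("e-p:32:32:32-p1:64:64:64");
  unsigned DefaultBits = DL.getPointerSizeInBits(0); // 32
  unsigned AS1Bits     = DL.getPointerSizeInBits(1); // 64
  (void)DefaultBits;
  (void)AS1Bits;
}
```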
-
Adhemerval Zanella authored
This patch replaces EmitRawText with an EmitTCEntry class (specialized for each Streamer) in PowerPC64 TOC entry creation. llvm-svn: 165940
-
Kostya Serebryany authored
[asan] Make AddressSanitizer a FunctionPass instead of a ModulePass. This will simplify chaining other FunctionPasses with asan. Also some minor cleanup. llvm-svn: 165936
-
Chandler Carruth authored
includes extracting ints for copying elsewhere and inserting ints when copying into the alloca. This should fix the CanSROA assertion coming out of Clang's regression test suite. llvm-svn: 165931
-
Chandler Carruth authored
and generally clean up the memset handling. It had rotted a bit as the other rewriting logic got polished more. llvm-svn: 165930
-
Silviu Baranga authored
Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node whose vector input size differed from the output size. This was because the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929
-
Chandler Carruth authored
cases where we have partial integer loads and stores to an otherwise promotable alloca, to widen[1] those loads and stores to cover the entire alloca and bitcast them into the appropriate type such that promotion can proceed.

These partial loads and stores stem from an annoying confluence of ARM's calling convention and ABI lowering and the FCA pre-splitting which takes place in SROA. Clang lowers a { double, double } in-register function argument as a [4 x i32] function argument to ensure it is placed into integer 32-bit registers (a really unnerving implicit contract between Clang and the ARM backend, I would add). This results in an FCA load of [4 x i32]* from the { double, double } alloca, and SROA decomposes this into a sequence of i32 loads and stores. Inlining proceeds, code gets folded, but at the end of the day, we still have i32 stores to the low and high halves of a double alloca. Widening these to i64 operations, and bitcasting them to double prior to loading or storing, allows promotion to proceed for these allocas.

I looked quite a bit at changing the IR which Clang produces for this case to be more friendly, but small changes seem unlikely to help. I think the best representation we could use currently would be to pass 4 i32 arguments, thereby avoiding any FCAs, but that would still require this fix. It seems like it might eventually be nice to somehow encode the ABI register selection choices outside of the parameter type system, so that the parameter could be a { double, double } while CC register annotations indicate that it should be passed via 4 integer registers.

This patch does not address the second problem in PR14059, which is the reverse: when a struct alloca is loaded as a *larger* single integer. This patch also does not address some of the code quality issues with the FCA-splitting; those don't actually impede any optimizations, but they're on my list to clean up.

[1]: Pedantic footnote: for those concerned about memory model issues here, this is safe. For the alloca to be promotable, it cannot escape or have any use of its address that could allow these loads or stores to be racing. Thus, widening is always safe. llvm-svn: 165928
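A minimal C++ reproducer of the pattern described above (hypothetical; the exact IR depends on Clang's ARM lowering):

```cpp
// On ARM (AAPCS), Clang may lower the two-double aggregate parameter as
// [4 x i32] to force it into the core registers r0-r3. The resulting FCA
// load from the { double, double } alloca is what SROA decomposes into
// the i32 operations that this patch widens and bitcasts.
struct DD { double lo, hi; };

double sum(DD d) {
  return d.lo + d.hi;
}
```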
-
Chandler Carruth authored
into static helper functions. They're really quite generic and are going to be needed elsewhere shortly. llvm-svn: 165927
-
Bill Wendling authored
Add an enum for the return and function indexes into the AttrListPtr object. This gets rid of some magic numbers. llvm-svn: 165924
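The indexes in question, as a sketch (the enumerator values follow long-standing LLVM convention, but treat them as assumptions here):

```cpp
// Named indexes into an attribute list, replacing magic numbers: index 0
// addresses the return value, ~0U addresses the function itself, and the
// arguments occupy the indexes in between.
enum AttrIndex {
  ReturnIndex   = 0U,
  FunctionIndex = ~0U
};
```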
-
Bill Wendling authored
llvm-svn: 165923
-
Bill Wendling authored
llvm-svn: 165920
-
Bill Wendling authored
Convert the internal representation of the Attributes class into a pointer to an opaque object that's uniqued by and stored in the LLVMContext object. The Attributes class then becomes a thin wrapper around this opaque object. Eventually, the internal representation will be expanded to include attributes that represent code generation options, etc. llvm-svn: 165917
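A generic sketch of the uniqued-impl pattern being described (simplified; not the actual class layout):

```cpp
#include "llvm/ADT/FoldingSet.h"
#include <stdint.h>

// The LLVMContext owns a FoldingSet of impl objects, so equal attribute
// sets share one AttributesImpl; Attributes is then a cheap pointer
// wrapper and equality is pointer equality.
class AttributesImpl : public llvm::FoldingSetNode {
  uint64_t Bits; // current representation, with room to grow later
public:
  explicit AttributesImpl(uint64_t B) : Bits(B) {}
  void Profile(llvm::FoldingSetNodeID &ID) const { ID.AddInteger(Bits); }
};

class Attributes {
  AttributesImpl *Attrs; // uniqued in (and owned by) the LLVMContext
public:
  explicit Attributes(AttributesImpl *A = 0) : Attrs(A) {}
  bool operator==(Attributes RHS) const { return Attrs == RHS.Attrs; }
};
```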
-
Meador Inge authored
This patch migrates the strcmp and strncmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165915
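Examples of the folds these simplifications perform (standard strcmp/strncmp folds; the cases shown are illustrative):

```cpp
#include <cstring>

int folds(const char *s, const char *t) {
  int a = std::strcmp(s, s);     // identical operands -> 0
  int b = std::strncmp(s, t, 0); // zero length -> 0
  int c = std::strcmp("", s);    // -> negated first character of s
  return a + b + c;
}
```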
-
- Oct 14, 2012
-
Benjamin Kramer authored
llvm-svn: 165904
-