- May 17, 2013
-
-
Benjamin Kramer authored
Shuffles that only move an element into position 0 of the vector are common in the output of the loop vectorizer and often generate suboptimal code when SSSE3 is not available. Lower them to vector shifts if possible. We still prefer palignr over psrldq because it has higher throughput on sandybridge. llvm-svn: 182102
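The lowering described above can be modelled outside the compiler. The following is a minimal sketch, not the X86ISelLowering code: it assumes a 128-bit register held as four 32-bit lanes and models a `psrldq`-style whole-register byte shift. The helper names `psrldq` and `move_lane_to_zero` are hypothetical.

```python
def psrldq(lanes, byte_count, lane_bytes=4):
    """Model a whole-register logical right shift by bytes (zero fill)."""
    # Lane 0 occupies the lowest bytes, so serialize lanes little-endian.
    raw = b"".join(x.to_bytes(lane_bytes, "little") for x in lanes)
    # Shifting right drops the low bytes and fills the top with zeros.
    shifted = raw[byte_count:] + bytes(byte_count)
    return [
        int.from_bytes(shifted[i:i + lane_bytes], "little")
        for i in range(0, len(shifted), lane_bytes)
    ]


def move_lane_to_zero(lanes, index, lane_bytes=4):
    """Bring lane `index` into position 0 with one byte shift.

    The remaining lanes are whatever the shift leaves behind, which is
    acceptable only when the shuffle does not care about them.
    """
    return psrldq(lanes, index * lane_bytes, lane_bytes)
```

For example, `move_lane_to_zero([10, 20, 30, 40], 2)` places the value 30 in lane 0.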
-
- May 05, 2013
-
-
David Majnemer authored
X86ISelLowering has support to treat: (icmp ne (and (xor %flags, -1), (shl 1, flag)), 0) as if it were actually: (icmp eq (and %flags, (shl 1, flag)), 0). However, r179386 has code at the InstCombine level to handle this. llvm-svn: 181145
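The equivalence of the two forms can be checked by brute force. This is an illustrative sketch on fixed-width integers, not compiler code; the function names and the `width` parameter are assumptions made for the example.

```python
def original_form(flags, flag, width=32):
    """(icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)"""
    mask = (1 << width) - 1
    inverted = (flags ^ mask) & mask   # xor with -1 is bitwise NOT
    bit = (1 << flag) & mask
    return (inverted & bit) != 0


def simplified_form(flags, flag, width=32):
    """(icmp eq (and %flags, (shl 1, flag)), 0)"""
    mask = (1 << width) - 1
    bit = (1 << flag) & mask
    return (flags & bit) == 0
```

Both functions answer the same question: is bit `flag` clear in `flags`? Testing a bit of the inverted value for "set" is the same as testing that bit of the original value for "clear".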
-
Nadav Rotem authored
llvm-svn: 181136
-
- May 02, 2013
-
-
Michael Liao authored
llvm-svn: 180915
-
Michael Liao authored
No functionality change llvm-svn: 180914
-
Michael Liao authored
No functionality change llvm-svn: 180912
-
- Apr 20, 2013
-
-
Tim Northover authored
I think it's almost impossible to fold atomic fences profitably under LLVM/C++11 semantics. As a result, this is now unused and just cluttering up the target interface. llvm-svn: 179940
-
Tim Northover authored
llvm-svn: 179939
-
Michael Liao authored
llvm-svn: 179901
-
- Apr 19, 2013
-
-
Michael Liao authored
llvm-svn: 179833
-
- Apr 18, 2013
-
-
Benjamin Kramer authored
This pattern started popping up in vectorized min/max reductions. llvm-svn: 179797
-
- Apr 11, 2013
-
-
Michael Liao authored
As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane, vector select could be simplified to AND/OR or removed if one or both values being selected is all 0s or all 1s. llvm-svn: 179267
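The simplification relies on each mask lane being all 0s or all 1s. A small sketch on 32-bit lanes, with hypothetical helper names, shows the identities involved; it models the arithmetic only and is not the DAG combine itself.

```python
ALL_ONES = 0xFFFFFFFF  # one 32-bit lane with every bit set


def vselect(mask, a, b):
    """Per-lane select: take a where the mask lane is all 1s, else b."""
    return [(m & x) | (~m & ALL_ONES & y) for m, x, y in zip(mask, a, b)]


def vand(a, b):
    return [x & y for x, y in zip(a, b)]


def vor(a, b):
    return [x | y for x, y in zip(a, b)]


# A comparison result: every lane is either all 0s or all 1s.
mask = [ALL_ONES, 0, ALL_ONES, 0]
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
zeros = [0] * 4
ones = [ALL_ONES] * 4

and_case = vselect(mask, a, zeros) == vand(mask, a)   # select(m, a, 0) is m & a
or_case = vselect(mask, ones, b) == vor(mask, b)      # select(m, -1, b) is m | b
removed_case = vselect(mask, ones, zeros) == mask     # select(m, -1, 0) is m
```

These identities hold only because of the all-0s/all-1s property; a mask lane with mixed bits would break them.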
-
Michael Liao authored
This patch is revised based on a patch from Victor Umansky <victor.umansky@intel.com>. More cases are handled in X86's bool simplification, i.e. SETCC_CARRY and values truncated to i1 with AND. As a by-product, PR5443 is also fixed. llvm-svn: 179265
-
- Apr 10, 2013
-
-
Evan Cheng authored
xmm0 / xmm1. rdar://13599493 llvm-svn: 179141
-
- Apr 05, 2013
-
-
Bill Wendling authored
During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon. llvm-svn: 178917
-
- Mar 31, 2013
-
-
Benjamin Kramer authored
A vector sext + sitofp is a lot cheaper than 8 scalar conversions. llvm-svn: 178448
-
- Mar 29, 2013
-
-
Benjamin Kramer authored
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1. llvm-svn: 178349
-
Michael Liao authored
llvm-svn: 178314
-
Michael Liao authored
- RDRAND always clears the destination value when a random value is not available (i.e. CF == 0). This value is truncated or zero-extended as the false boolean value to be returned. Boolean simplification needs to skip this 'zext' or 'trunc' node. llvm-svn: 178312
-
Michael Liao authored
To enable a load of a call address to be folded with that call, the load is moved from outside of the callseq into the callseq. This move adds a non-glued node (the load) into a glued sequence. The non-glued load is only removed when DAG selection folds it into a memory-form call instruction. When such instruction selection is disabled, this breaks DAG scheduling. To prevent that, the move is disabled when the target favors register-indirect calls. The previous workaround, which disabled CALL32m/CALL64m instruction selection, is removed. llvm-svn: 178308
-
- Mar 28, 2013
-
-
Timur Iskhodzhanov authored
llvm-svn: 178291
-
- Mar 27, 2013
-
-
Preston Gurd authored
For the current Atom processor, the fastest way to handle a call indirect through a memory address is to load the memory address into a register and then call indirect through the register. This patch implements this improvement by modifying SelectionDAG to force a function address which is a memory reference to be loaded into a virtual register. Patch by Sriram Murali. llvm-svn: 178171
-
Hal Finkel authored
Thanks to Bill Schmidt for pointing this out during code review! llvm-svn: 178170
-
- Mar 26, 2013
-
-
Michael Liao authored
llvm-svn: 178083
-
Michael Liao authored
- It's still considered aligned when the specified alignment is larger than the natural alignment; - The new alignment for the high 128-bit vector should be min(16, alignment) as the pointer is advanced by 16, a power-of-2 offset. llvm-svn: 177947
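The alignment rule for the upper half can be illustrated with a short sketch. It assumes a 256-bit access split into two 128-bit halves, the second at a 16-byte offset; `split_alignments` is a hypothetical helper, not an LLVM function.

```python
def split_alignments(alignment):
    """Return (low_half_alignment, high_half_alignment) in bytes.

    The low half uses the original pointer, so it keeps the original
    alignment. The high half sits 16 bytes further on, so it can only
    be assumed aligned to min(16, alignment).
    """
    return alignment, min(16, alignment)


def high_half_is_aligned(address, alignment):
    """Check the rule for one concrete address that honours `alignment`."""
    assert address % alignment == 0
    _, high = split_alignments(alignment)
    return (address + 16) % high == 0
```

A 32-byte aligned pointer plus 16 is only guaranteed to be 16-byte aligned, while an 8-byte aligned pointer plus 16 is still 8-byte aligned.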
-
- Mar 20, 2013
-
-
Michael Liao authored
- Move SRA/SRL/SHL lowering support from DAG combination to DAG lowering to support extended 256-bit integer in AVX but not AVX2. llvm-svn: 177478
-
Michael Liao authored
- Prepare moving logic from DAG combining into DAG lowering. There's no functionality change. llvm-svn: 177477
-
Michael Liao authored
- no functionality change llvm-svn: 177476
-
- Mar 19, 2013
-
-
Nadav Rotem authored
Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com> llvm-svn: 177421
-
- Mar 18, 2013
-
-
Anton Korobeynikov authored
MinGW is almost completely compatible with MSVC, with the exception of the _tls_array global not being available. Patch by David Nadlinger! llvm-svn: 177257
-
- Mar 14, 2013
-
-
Michael Liao authored
- Fix the typo on type checking llvm-svn: 177010
-
- Mar 08, 2013
-
-
Tom Stellard authored
LegalizeDAG.cpp uses the value of the comparison operands when checking the legality of BR_CC, so DAGCombiner should do the same. v2: - Expand more BR_CC value types for NVPTX v3: - Expand correct BR_CC value types for Hexagon, Mips, and XCore. llvm-svn: 176694
-
- Mar 07, 2013
-
-
Benjamin Kramer authored
That can usually be lowered efficiently and is common in sandybridge code. It would be nice to do this in DAGCombiner but we can't insert arbitrary BUILD_VECTORs this late. Fixes PR15462. llvm-svn: 176634
-
Michael Liao authored
- Phi nodes should be replaced/updated after lowering CMOV into branch because 'mainMBB' updating operand in Phi node is changed. - Add EFLAGS in livein before lowering the 2nd CMOV. It's necessary as we will reuse the EFLAGS generated before the 1st lowered CMOV, which won't clobber EFLAGS. However, we need to specify that explicitly. - '-attr=-cmov' test cases are added. llvm-svn: 176598
-
- Mar 06, 2013
-
-
Michael Liao authored
- Clear 'mayStore' flag when loading from the atomic variable before the spin loop - Clear kill flag from one use to multiple use in registers forming the address to that atomic variable - don't use a physical register as live-in register in BB (neither entry nor landing pad.) by copying it into virtual register (patch by Cameron Zwarich) llvm-svn: 176538
-
- Mar 04, 2013
-
-
Preston Gurd authored
* Only apply divide bypass optimization when not optimizing for size.
* Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead.
* For Atom x86-64, apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough.
* Added lit tests for 64-bit divide bypass.
Patch by Tyler Nowicki! llvm-svn: 176442
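The idea behind the divide bypass is a runtime check followed by a narrow or a wide divide. The sketch below models that check for unsigned operands only; it is not the LLVM pass, and `bypass_divide` and its `narrow_bits` parameter are names assumed for illustration.

```python
def bypass_divide(dividend, divisor, narrow_bits=16):
    """Return (quotient, remainder, used_narrow) for unsigned operands.

    If neither operand has bits set above `narrow_bits`, the division
    can be done at the narrow width; otherwise fall back to full width.
    A zero divisor raises ZeroDivisionError, as with any Python division.
    """
    narrow_mask = (1 << narrow_bits) - 1
    if (dividend | divisor) >> narrow_bits == 0:
        # Fast path: both operands fit in the narrow type.
        q = (dividend & narrow_mask) // (divisor & narrow_mask)
        r = (dividend & narrow_mask) % (divisor & narrow_mask)
        return q, r, True
    # Slow path: at least one operand needs the full width.
    return dividend // divisor, dividend % divisor, False
```

OR-ing the two operands before the shift tests both for high bits in one comparison, which keeps the runtime check cheap next to the cost of a wide divide.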
-
- Mar 01, 2013
-
-
Michael Liao authored
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands, but TLI.getShiftAmountTy() so far only returns a scalar type. As a result, backend logic assuming that breaks. - Rename the original TLI.getShiftAmountTy() to TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to return the target-specified scalar type or the same vector type as the 1st operand. - Fix most TICG logic assuming TLI.getShiftAmountTy() is a simple scalar type. llvm-svn: 176364
-
- Feb 26, 2013
-
-
Michael Liao authored
- Put expensive checking after simple one llvm-svn: 176060
-
Michael Liao authored
- Check whether SSE2 is available before lowering all 1s vector building with PCMPEQD, which is only available from SSE2 llvm-svn: 176058
-
- Feb 24, 2013
-
-
Nadav Rotem authored
Fix PR15239. llvm-svn: 175985
-