- Sep 05, 2012
-
-
Roman Divacky authored
llvm-svn: 163225
-
Jim Grosbach authored
Make sure to return a pointer into the target memory, not the local memory. Often they are the same, but we can't assume that. llvm-svn: 163217
-
Jim Grosbach authored
Simulate a remote target address space by allocating a seperate chunk of memory for the target and re-mapping section addresses to that prior to execution. Later we'll want to have a truly remote process, but for now this gets us closer to being able to test the remote target functionality outside LLDB. rdar://12157052 llvm-svn: 163216
-
Benjamin Kramer authored
It relies on clear() being fast and the cache rarely has more than 1 or 2 elements, so give it an inline capacity and always shrink it back down in case it grows. DenseMap will grow to 64 buckets which makes clear() a lot slower. llvm-svn: 163215
-
Pranav Bhandarkar authored
subreg_hireg of register pair Rp. * lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New DenseMap similar to PeepholeMap that additionally records subreg info too. (runOnMachineFunction): Record information in PeepholeDoubleRegsMap and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to the instruction Rx = COPY Rp1:logreg_subreg. * test/CodeGen/Hexagon/remove_lsr.ll: New test. llvm-svn: 163214
-
Kostya Serebryany authored
llvm-svn: 163205
-
Silviu Baranga authored
Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value. llvm-svn: 163203
-
Kostya Serebryany authored
llvm-svn: 163199
-
Craig Topper authored
Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing. llvm-svn: 163198
-
Craig Topper authored
Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS. llvm-svn: 163196
-
Chad Rosier authored
llvm-svn: 163195
-
Logan Chien authored
llvm-svn: 163194
-
Logan Chien authored
llvm-svn: 163193
-
Craig Topper authored
Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores. llvm-svn: 163192
-
Marshall Clow authored
llvm-svn: 163191
-
Richard Smith authored
llvm-svn: 163190
-
Chad Rosier authored
llvm-svn: 163187
-
Chad Rosier authored
llvm-svn: 163186
-
Chad Rosier authored
Reader/Writer. llvm-svn: 163185
-
Chad Rosier authored
llvm-svn: 163184
-
Chad Rosier authored
llvm-svn: 163181
-
Dan Gohman authored
pointers-to-strong-pointers may be in play. These can lead to retains and releases happening in unstructured ways, foiling the optimizer. This fixes rdar://12150909. llvm-svn: 163180
-
Jakub Staszak authored
llvm-svn: 163179
-
Jakob Stoklund Olesen authored
Implicit uses can be dynamically tied to defs. This will soon be used for predicated instructions on ARM. llvm-svn: 163177
-
Chad Rosier authored
class. llvm-svn: 163175
-
Chad Rosier authored
implementation does not co-exist well with how the sideeffect and alignstack attributes are handled. The reverts r161641. llvm-svn: 163174
-
David Blaikie authored
This doesn't seem ideal, perhaps we could just keep the llvm_site_cfg and have other config (clang and clang-tools-extra) derive their site_cfg from that. Suggestions/complaints/ideas welcome. llvm-svn: 163171
-
- Sep 04, 2012
-
-
Jakub Staszak authored
Doesn't set MadeChange to TRUE if BypassSlowDivision doesn't change anything. llvm-svn: 163165
-
Jakub Staszak authored
Also a few minor changes: - use pre-inc instead of post-inc - use isa instead of dyn_cast - 80 col - trailing spaces llvm-svn: 163164
-
Jakub Staszak authored
llvm-svn: 163160
-
Jakob Stoklund Olesen authored
llvm-svn: 163154
-
Jakob Stoklund Olesen authored
The MachineOperand::TiedTo field was maintained, but not used. This patch enables it in isRegTiedToDefOperand() and isRegTiedToUseOperand() which are the actual functions use by the register allocator. llvm-svn: 163153
-
Jakob Stoklund Olesen authored
llvm-svn: 163152
-
Jakob Stoklund Olesen authored
After much agonizing, use a full 4 bits of precious MachineOperand space to encode this. This uses existing padding, and doesn't grow MachineOperand beyond its current 32 bytes. This allows tied defs among the first 15 operands on a normal instruction, just like the current MCInstrDesc constraint encoding. Inline assembly needs to be able to tie more than the first 15 operands, and gets special treatment. Tied uses can appear beyond 15 operands, as long as they are tied to a def that's in range. llvm-svn: 163151
-
Preston Gurd authored
- CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presents of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150
-
Bob Wilson authored
Rationale: For each preprocessor macro, either the definedness is what's meaningful, or the value is what's meaningful, or both. If definedness is meaningful, we should use #ifdef. If the value is meaningful, we should use and #ifdef interchangeably for the same macro, seems ugly to me, even if undefined macros are zero if used. This also has the benefit that including an LLVM header doesn't prevent you from compiling with -Wundef -Werror. Patch by John Garvin! <rdar://problem/12189979> llvm-svn: 163148
-
Sergei Larin authored
Change current Hexagon MI scheduler to use new converging scheduler. Integrates DFA resource model into it. llvm-svn: 163137
-
Arnold Schwaighofer authored
This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136
-
Elena Demikhovsky authored
Since this specific shuffle is widely used in many workloads we have ~10% performance on them. shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14> vmovaps (%rdx), %ymm0 vshufps $8, %ymm0, %ymm0, %ymm0 vmovaps (%rcx), %ymm1 vshufps $8, %ymm0, %ymm1, %ymm1 vunpcklps %ymm0, %ymm1, %ymm0 vmovaps (%rcx), %ymm0 vmovsldup (%rdx), %ymm1 vblendps $85, %ymm0, %ymm1, %ymm0 llvm-svn: 163134
-
Nadav Rotem authored
Scan the body of the loop and find instructions that may trap. Use this information when deciding if it is safe to hoist or sink instructions. Notice that we can optimize the search of instructions that may throw in the case of nested loops. rdar://11518836 llvm-svn: 163132
-