- Jan 12, 2011
-
-
Chris Lattner authored
of the bootstrap miscompare issue. llvm-svn: 123299
-
Jason W Kim authored
R_ARM_MOVT_PREL and R_ARM_MOVW_PREL_NC. 2. Fix minor bug in ARMAsmPrinter - treat bitfield flag as a bitfield, not an enum. 3. Add support for 3 new elf section types (no-ops) llvm-svn: 123294
-
Jason W Kim authored
.s Test added. llvm-svn: 123292
-
Jakob Stoklund Olesen authored
llvm-svn: 123290
-
- Jan 11, 2011
-
-
Jakob Stoklund Olesen authored
llvm-svn: 123282
-
Venkatraman Govindaraju authored
llvm-svn: 123281
-
Chris Lattner authored
llvm-svn: 123242
-
Daniel Dunbar authored
carry setting flag from the mnemonic. Note that this currently involves me disabling a number of working cases in arm_instructions.s, this is a hopefully short term evil which will be rapidly fixed (and greatly surpassed), assuming my current approach flies. llvm-svn: 123238
-
Eric Christopher authored
llvm-svn: 123227
-
Chris Lattner authored
llvm-svn: 123220
-
Chris Lattner authored
neccesarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). llvm-svn: 123219
-
Chandler Carruth authored
point values to their integer representation through the SSE intrinsic calls. This is the last part of a README.txt entry for which I have real world examples. llvm-svn: 123206
-
Chandler Carruth authored
file and make it actually test something... llvm-svn: 123205
-
Owen Anderson authored
Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by a comparison against a constant. llvm-svn: 123203
-
Eric Christopher authored
restore the stack pointer from the frame pointer on thumbv6. Fixes rdar://8819685 llvm-svn: 123196
-
- Jan 10, 2011
-
-
Dale Johannesen authored
There's an inherent tension in DAGCombine between assuming that things will be put in canonical form, and the Depth mechanism that disables transformations when recursion gets too deep. It would not surprise me if there's a lot of little bugs like this one waiting to be discovered. The mechanism seems fragile and I'd suggest looking at it from a design viewpoint. llvm-svn: 123191
-
Daniel Dunbar authored
llvm-svn: 123189
-
Chandler Carruth authored
intrinsics element dependencies. Reviewed by Nick. llvm-svn: 123161
-
Chandler Carruth authored
them to FileCheck as well. llvm-svn: 123154
-
Chandler Carruth authored
llvm-svn: 123153
-
Chris Lattner authored
llvm-svn: 123148
-
Chris Lattner authored
back to life. llvm-svn: 123146
-
Chris Lattner authored
llvm-svn: 123144
-
Chris Lattner authored
llvm-svn: 123143
-
- Jan 09, 2011
-
-
Tobias Grosser authored
llvm-svn: 123121
-
Chris Lattner authored
with GEP instructions are always NUW, because PHIs cannot wrap the end of the address space. llvm-svn: 123105
-
Chris Lattner authored
that have the bit set. llvm-svn: 123104
-
- Jan 08, 2011
-
-
Chris Lattner authored
larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset: _mad_synth_mute: ## @mad_synth_mute ## BB#0: ## %entry pushq %rax movl $4096, %esi ## imm = 0x1000 callq ___bzero popq %rax ret llvm-svn: 123089
-
Chris Lattner authored
P and P+1 are relative to the same base pointer. llvm-svn: 123087
-
Chris Lattner authored
memset into a single larger memset. llvm-svn: 123086
-
Chris Lattner authored
llvm-svn: 123082
-
Chris Lattner authored
to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. llvm-svn: 123079
-
Chris Lattner authored
1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. llvm-svn: 123072
-
Rafael Espindola authored
Add a unnamed_addr bit to global variables and functions. This will be used to indicate that the address is not significant and therefore the constant or function can be merged with others. If an optimization pass can show that an address is not used, it can set this. Examples of things that can have this set by the FE are globals created to hold string literals and C++ constructors. Adding unnamed_addr to a non-const global should have no effect unless an optimization can transform that global into a constant. Aliases are not allowed to have unnamed_addr since I couldn't figure out any use for it. llvm-svn: 123063
-
Frits van Bommel authored
llvm-svn: 123061
-
Chris Lattner authored
them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch *often* turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060
-
Evan Cheng authored
llvm-svn: 123048
-
Evan Cheng authored
Instead encode llvm IR level property "HasSideEffects" in an operand (shared with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check the operand when the instruction is an INLINEASM. This allows memory instructions to be moved around INLINEASM instructions. llvm-svn: 123044
-
- Jan 07, 2011
-
-
Devang Patel authored
llvm-svn: 123039
-
Bob Wilson authored
Patch by Tim Northover. llvm-svn: 123035
-