- Nov 15, 2011
-
-
Craig Topper authored
Properly qualify AVX2-specific parts of the execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370. llvm-svn: 144622
-
Evan Cheng authored
integer variants. rdar://10437054 llvm-svn: 144608
-
Jim Grosbach authored
rdar://10435076 llvm-svn: 144606
-
Nick Lewycky authored
llvm-svn: 144603
-
Jakob Stoklund Olesen authored
Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix about instructions with partial register updates that cause unwanted false dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previous N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602
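The "written in the previous N instructions" rule can be illustrated with a small standalone sketch. It is not the ExecutionDepsFix code: `Instr`, `breakFalseDeps`, the `clearance` parameter, and the `break_dep` placeholder instruction are all hypothetical names, and distances are counted in output positions for simplicity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified instruction record (not LLVM's MachineInstr).
struct Instr {
  std::string name;
  int defReg;          // register written by the instruction, or -1 for none
  bool partialUpdate;  // true if the write leaves part of defReg unchanged
};

// Copy `in` to the result, inserting a placeholder dependency-breaking
// instruction before every partial update whose destination register was
// written at most `clearance` instructions earlier.
std::vector<Instr> breakFalseDeps(const std::vector<Instr> &in, int numRegs,
                                  long clearance) {
  const long never = -1;
  std::vector<long> lastDef(numRegs, never);  // output position of last write
  std::vector<Instr> out;
  for (const Instr &mi : in) {
    if (mi.defReg >= 0) {
      assert(mi.defReg < numRegs && "register index out of range");
      const long pos = static_cast<long>(out.size());
      const long last = lastDef[mi.defReg];
      // A recent write means the partial update would wait on the old value.
      if (mi.partialUpdate && last != never && pos - last <= clearance)
        out.push_back(Instr{"break_dep", mi.defReg, false});
    }
    out.push_back(mi);
    if (mi.defReg >= 0)
      lastDef[mi.defReg] = static_cast<long>(out.size()) - 1;
  }
  return out;
}
```

In this sketch a register that has never been written is treated as old enough to need no break; a real pass would have to choose what to assume at function entry.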
-
Jakob Stoklund Olesen authored
Keep track of the last instruction to define each register individually instead of per DomainValue. This lets us track more accurately when a register was last written. Also track register ages across basic blocks. When entering a new basic block, use the least stale predecessor def as a worst case estimate for register age. The register age is used to arbitrate between conflicting domains. The most recently defined register wins. llvm-svn: 144601
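A minimal sketch of the block-entry merge and the arbitration rule described above, assuming register ages are stored as "instructions since last write" per register. The names `Ages`, `mergePredecessorAges`, `pickDomainWinner`, and `unknownAge` are illustrative and do not come from the pass itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Age of a register = number of instructions since it was last written.
// A smaller age means a more recent (less stale) definition.
using Ages = std::vector<int>;

// On entry to a block, take the least stale definition over all predecessors
// as the worst-case estimate for each register. `unknownAge` is used for
// registers no predecessor reports (an assumption of this sketch).
Ages mergePredecessorAges(const std::vector<Ages> &preds, std::size_t numRegs,
                          int unknownAge) {
  Ages entry(numRegs, unknownAge);
  for (const Ages &p : preds) {
    assert(p.size() == numRegs && "one age per register expected");
    for (std::size_t r = 0; r < numRegs; ++r)
      entry[r] = std::min(entry[r], p[r]);
  }
  return entry;
}

// Arbitrate between two registers in conflicting domains: the most recently
// defined register wins. Ties go to `a`.
std::size_t pickDomainWinner(const Ages &ages, std::size_t a, std::size_t b) {
  return ages[b] < ages[a] ? b : a;
}
```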
-
Nick Lewycky authored
link it against llvm code, by making our definitions weak. "Some users." llvm-svn: 144596
-
Jim Grosbach authored
rdar://10435076 llvm-svn: 144593
-
Jim Grosbach authored
rdar://10435076 llvm-svn: 144592
-
Jim Grosbach authored
llvm-svn: 144589
-
Jim Grosbach authored
Make it easier to deal with aliases for instructions that do require a suffix but accept more specific variants of the same size. llvm-svn: 144588
-
Jim Grosbach authored
rdar://10435076 llvm-svn: 144587
-
Chad Rosier authored
violating a dependency is to emit all loads prior to stores. This would likely cause a great deal of spillage offsetting any potential gains. llvm-svn: 144585
-
Jim Grosbach authored
Canonicalize on the non-suffixed form, but continue to accept assembly that has any correctly sized type suffix. llvm-svn: 144583
-
- Nov 14, 2011
-
-
Nick Lewycky authored
and stores capture) to permit the caller to see each capture point and decide whether to continue looking. Use this inside memdep to do an analysis that basicaa won't do. This lets us solve another devirtualization case, fixing PR8908! llvm-svn: 144580
-
Chad Rosier authored
rdar://10412592 llvm-svn: 144578
-
Chad Rosier authored
into registers, rather than encoded directly in the load/store. llvm-svn: 144576
-
Jim Grosbach authored
rdar://10435076 llvm-svn: 144575
-
Evan Cheng authored
llvm-svn: 144569
-
Evan Cheng authored
"kill". This looks like a bug upstream. Since that's going to take some time to understand, loosen the assertion and disable the optimization when multiple kills are seen. llvm-svn: 144568
-
Nick Lewycky authored
These annotations are disabled entirely when either ENABLE_THREADS is off, or building a release build. When enabled, they add calls to functions with no statements to ManagedStatic's getters. Use these annotations to inform tsan that the race used inside ManagedStatic initialization is actually benign. Thanks to Kostya Serebryany for helping write this patch! llvm-svn: 144567
-
-
Chad Rosier authored
rdar://10412592 llvm-svn: 144565
-
Benjamin Kramer authored
llvm-svn: 144560
-
Evan Cheng authored
instructions of the two-address operands) in order to avoid inserting copies. This fixes the few regressions introduced when the two-address hack was disabled (without regressing the improvements). rdar://10422688 llvm-svn: 144559
-
Pete Cooper authored
Constant idx case is still done in tablegen, but other cases are then expanded. Fixes <rdar://problem/10435460> llvm-svn: 144557
-
Benjamin Kramer authored
llvm-svn: 144555
-
Akira Hatanaka authored
llvm-svn: 144554
-
Akira Hatanaka authored
N32/64 places all variable arguments in integer registers (or on the stack), regardless of their types, but follows the calling convention of non-vaarg functions when it handles fixed arguments. llvm-svn: 144553
-
Akira Hatanaka authored
llvm-svn: 144552
-
Justin Holewinski authored
PTX: Let LLVM use loads/stores for all mem* intrinsics, instead of relying on custom implementations. llvm-svn: 144551
-
Akira Hatanaka authored
argument registers on the callee's stack frame, along with functions that set and get it. It is not necessary to add the size of this area when computing stack size in emitPrologue, since it has already been accounted for in PEI::calculateFrameObjectOffsets. llvm-svn: 144549
-
Jakob Stoklund Olesen authored
I broke this in r144515, it affected most ARM testers. <rdar://problem/10441389> llvm-svn: 144547
-
Bob Wilson authored
rdar://problem/10441578 This still seems to be causing some failures. It needs more testing before it gets enabled again. llvm-svn: 144543
-
Jim Grosbach authored
llvm-svn: 144538
-
Benjamin Kramer authored
llvm-svn: 144536
-
Chandler Carruth authored
cleans up all the chains allocated during the processing of each function so that for very large inputs we don't just grow memory usage without bound. llvm-svn: 144533
-
Chandler Carruth authored
tests when I forcibly enabled block placement. It is apparently possible for an unanalyzable block to fall through to a non-loop block. I don't actually believe this is correct; I believe that 'canFallThrough' is returning true needlessly for the code construct, and I've left a bit of a FIXME on the verification code to try to track down why this is coming up. Anyway, removing the assert doesn't degrade the correctness of the algorithm. llvm-svn: 144532
-
Chandler Carruth authored
this pass. We're leaving already merged blocks on the worklist, and scanning them again and again only to determine each time through that indeed they aren't viable. We can instead remove them once we're going to have to scan the worklist. This is the easy way to implement removing them. If this remains on the profile (as I somewhat suspect it will), we can get a lot more clever here, as the worklist's order is essentially irrelevant. We can use swapping and fold the two loops to reduce overhead even when there are many blocks on the worklist but only a few of them are removed. llvm-svn: 144531
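The swapping idea mentioned as a possible follow-up could look roughly like the sketch below. It relies only on the stated fact that the worklist's order is irrelevant; `removeUnordered` and the `isStale` predicate are hypothetical names, and this is not the code the commit added.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Remove every element for which `isStale` returns true. Each removal swaps
// the element with the last one and pops it, so it costs O(1) and the whole
// pass is a single scan. Element order is not preserved.
template <typename T, typename Pred>
void removeUnordered(std::vector<T> &worklist, Pred isStale) {
  std::size_t i = 0;
  while (i < worklist.size()) {
    if (isStale(worklist[i])) {
      std::swap(worklist[i], worklist.back());
      worklist.pop_back();
      // Do not advance: slot i now holds an element not yet examined.
    } else {
      ++i;
    }
  }
}
```

Because each removal touches only two slots, the cost is proportional to the worklist length even when only a few entries are removed.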
-
Chandler Carruth authored
time it is queried to compute the probability of a single successor. This makes computing the probability of every successor of a block in sequence... really really slow. ;] This switches to a linear walk of the successors rather than a quadratic one. One of several quadratic behaviors slowing this pass down. I'm not really thrilled with moving the sum code into the public interface of MBPI, but I don't (at the moment) have ideas for a better interface. The direction I'm thinking in for a better interface is to have MBPI actually retain much more state and make *all* of these queries cheap. That's a lot of work, and would require invasive changes. Until then, this seems like the least bad (i.e., least quadratic) solution. Suggestions welcome. llvm-svn: 144530
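The difference between the quadratic and the linear walk can be sketched as follows, assuming edge weights are plain unsigned integers held in a vector. `sumWeights` and `successorProbabilities` are illustrative names and are not MBPI's interface.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Sum all successor edge weights once. Recomputing this inside every
// per-successor query is what makes a walk over all successors quadratic.
std::uint64_t sumWeights(const std::vector<std::uint32_t> &weights) {
  return std::accumulate(weights.begin(), weights.end(), std::uint64_t{0});
}

// Linear walk: compute the sum a single time, then reuse it for each
// successor's probability.
std::vector<double>
successorProbabilities(const std::vector<std::uint32_t> &weights) {
  std::vector<double> probs;
  probs.reserve(weights.size());
  const std::uint64_t sum = sumWeights(weights);
  for (std::uint32_t w : weights)
    probs.push_back(sum == 0 ? 0.0 : static_cast<double>(w) / sum);
  return probs;
}
```

Exposing the sum to the caller is the trade-off the message describes: the interface gets wider, but each block's successors are walked only a constant number of times.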
-