- Apr 09, 2012
-
Bill Wendling authored
llvm-svn: 154312
-
Nadav Rotem authored
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310
-
Craig Topper authored
Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309
-
Craig Topper authored
llvm-svn: 154308
-
Craig Topper authored
llvm-svn: 154307
-
Bill Wendling authored
llvm-svn: 154306
-
Craig Topper authored
llvm-svn: 154305
-
Chandler Carruth authored
x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed.
To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering).
I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304
-
Chandler Carruth authored
comprehensive testing of TLS codegen for x86. Convert all of the ones that were still using grep to use FileCheck. Remove some redundancies between them. Perhaps most interestingly expand the test cases so that they actually fully list the instruction snippet being tested. TLS operations are *very* narrowly defined, and so these seem reasonably stable. More importantly, the existing test cases already were crazy fine grained, expecting specific registers to be allocated. This just clarifies that no *other* instructions are expected, and fills in some crucial gaps that weren't being tested at all. This will make any subsequent changes to TLS much more clear during review. llvm-svn: 154303
-
Nick Kledzik authored
llvm-svn: 154302
-
Nick Kledzik authored
llvm-svn: 154301
-
Craig Topper authored
llvm-svn: 154299
-
- Apr 08, 2012
-
Chandler Carruth authored
case as we don't currently have any way of dumping target options or otherwise observing this. Another small step toward fixing PR12380. With this we generate TLS accesses using the static model instead of the dynamic model, but we're still generating suboptimal code under the mistaken assumption that the TLS offset might be greater than 2^32, and therefore not viable as an immediate offset of a segment register. llvm-svn: 154298
-
Benjamin Kramer authored
llvm-svn: 154297
-
Duncan Sands authored
when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296
-
Craig Topper authored
Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently. llvm-svn: 154295
-
Chandler Carruth authored
optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, it's just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. llvm-svn: 154294
-
Chandler Carruth authored
in TargetLowering. There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE. llvm-svn: 154292
-
Chandler Carruth authored
First, this patch cleans up the parsing of the PIC and PIE family of options in the driver. The existing logic failed to claim arguments all over the place resulting in kludges that marked the options as unused. Instead actually walk all of the arguments and claim them properly. We now treat -f{,no-}{pic,PIC,pie,PIE} as a single set, accepting the last one on the command line. Previously there were lots of ordering bugs that could creep in due to the nature of the parsing. Let me know if folks would like weird things such as "-fPIE -fno-pic" to turn on PIE, but disable full PIC. This doesn't make any sense to me, but we could in theory support it. Options that seem to have intentional "trump" status (-static, -mkernel, etc) continue to do so and are commented as such. Next, a -pie-level flag is threaded into the frontend, rigged to a language option, and handled in the preprocessor, setting up the appropriate defines. We'll now have the correct defines when compiling with -fpie. The one place outside of the preprocessor that was inspecting the PIC level (as opposed to the relocation model, which is set and handled separately, yay!) is in the GNU ObjC runtime. I changed it to exactly preserve existing behavior. If folks want to change its behavior in the face of PIE, they can do that in a separate patch. Essentially the only functionality changed here is the preprocessor defines and bug-fixes to the argument management. Tests have been updated and extended to test all of this a bit more thoroughly. llvm-svn: 154291
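The "treat the family as a single set, last one wins" rule described in this commit could look roughly like the sketch below. It is not Clang's driver code: `lastPicPieFlag` and `pieEnabled` are hypothetical names, and the sketch ignores the "trump" options (-static, -mkernel) the message mentions.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not Clang's driver code: returns the last argument
// that belongs to the -f{,no-}{pic,PIC,pie,PIE} family, or an empty string
// if no such argument is present.
std::string lastPicPieFlag(const std::vector<std::string> &args) {
  static const std::array<const char *, 8> family = {
      "-fpic",    "-fPIC",    "-fpie",    "-fPIE",
      "-fno-pic", "-fno-PIC", "-fno-pie", "-fno-PIE"};
  std::string last;
  for (const std::string &arg : args)   // walk every argument in order
    for (const char *flag : family)
      if (arg == flag)
        last = arg;                     // a later flag overrides earlier ones
  return last;
}

// True when the winning flag of the family requests PIE.
bool pieEnabled(const std::vector<std::string> &args) {
  const std::string last = lastPicPieFlag(args);
  return last == "-fpie" || last == "-fPIE";
}
```

Under this rule "-fPIE -fno-pic" leaves PIE off, because -fno-pic is the last member of the set on the command line.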
-
Chandler Carruth authored
testing any of the strange driver behavior. We already have some tiny tests for the driver behavior, and I'm going to expand them greatly in the next commit. llvm-svn: 154290
-
Chandler Carruth authored
llvm-svn: 154289
-
Benjamin Kramer authored
EngineBuilder::create is expected to take ownership of the TargetMachine passed to it. Delete it on error or when we create an interpreter that doesn't need it. llvm-svn: 154288
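A minimal sketch of the ownership rule in this commit, using hypothetical stand-in types (`Machine`, `Engine`, `createEngine`) rather than the LLVM classes. It also swaps in `std::unique_ptr` where the commit describes explicit deletion, so that every exit path releases the object.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; these are not the LLVM classes.
struct Machine {
  explicit Machine(int *liveCount) : live(liveCount) { ++*live; }
  ~Machine() { --*live; }
  int *live; // counts live Machine objects so leaks are observable
};

struct Engine {
  std::unique_ptr<Machine> machine; // stays empty for the interpreter
};

// Takes ownership of `tm` unconditionally. On every path where no engine
// ends up holding the machine, the unique_ptr destroys it.
std::unique_ptr<Engine> createEngine(Machine *tm, bool useInterpreter,
                                     bool failSetup) {
  std::unique_ptr<Machine> owned(tm);
  if (failSetup)
    return nullptr;                       // error path: `owned` deletes tm
  auto engine = std::make_unique<Engine>();
  if (!useInterpreter)
    engine->machine = std::move(owned);   // the JIT keeps the machine
  return engine;                          // interpreter: `owned` deletes tm
}
```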
-
Chandler Carruth authored
where a chain outside of the loop block-set ended up in the worklist for scheduling as part of the contiguous loop. However, asserting the first block in the chain is in the loop-set isn't a valid check -- we may be forced to drag a chain into the worklist due to one block in the chain being part of the loop even though the first block is *not* in the loop. This occurs when we have been forced to form a chain early due to un-analyzable branches. No test case here as I have no idea how to even begin reducing one, and it will be hopelessly fragile. We have to somehow end up with a loop header of an inner loop which is a successor of a basic block with an unanalyzable pair of branch instructions. Ow. Self-host triggers it so it is unlikely it will regress. This at least gets block placement back to passing selfhost and the test suite. There are still a lot of slowdowns that I don't like coming out of block placement, although there are now also a lot of speedups. =[ I'm seeing swings in both directions up to 10%. I'm going to try to find time to dig into this and see if we can turn this on for 3.1 as it does a really good job of cleaning up after some loops that degraded with the inliner changes. llvm-svn: 154287
-
Chandler Carruth authored
debugging. llvm-svn: 154286
-
Chandler Carruth authored
GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). llvm-svn: 154285
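The condition described here (only GEPs, bit casts, and stores reach the alloca) can be sketched as a walk over the users of a value. The `Op`, `Inst`, and `onlyFeedsDeadStores` names below are a hypothetical toy model, not LLVM's IR classes, and the model assumes every store uses the pointer as its destination rather than as the stored value.

```cpp
#include <cassert>
#include <vector>

// Toy model of a use graph; these types are hypothetical, not LLVM's IR.
enum class Op { Alloca, GEP, BitCast, Store, Load, Call };

struct Inst {
  Op op;
  std::vector<const Inst *> users; // instructions that use this value
};

// True when every transitive user of `value` is a GEP, a bitcast, or a
// store. Assumption of this toy model: a store uses the pointer only as
// its destination, so it never lets the address escape.
bool onlyFeedsDeadStores(const Inst &value) {
  for (const Inst *user : value.users) {
    switch (user->op) {
    case Op::GEP:
    case Op::BitCast:
      // Derived pointers must satisfy the same condition.
      if (!onlyFeedsDeadStores(*user))
        return false;
      break;
    case Op::Store:
      break; // writes into memory nobody reads; produces no value
    default:
      return false; // a load, call, or anything else observes the memory
    }
  }
  return true;
}
```

When the check succeeds for an alloca, nothing ever reads the memory, so the alloca and all of the instructions reached by the walk can be deleted together.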
-
Nadav Rotem authored
Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. llvm-svn: 154284
-
Bill Wendling authored
llvm-svn: 154283
-
Bill Wendling authored
llvm-svn: 154282
-
Bill Wendling authored
llvm-svn: 154281
-
Bill Wendling authored
An MDNode has a list of MDNodeOperands allocated directly after it as part of its allocation. Therefore, the Parent of the MDNodeOperands can be found by walking back through the operands to the beginning of that list. Mark the first operand's value pointer as being the 'first' operand so that we know where the beginning of said list is. This saves a *lot* of space during LTO with -O0 -g flags. llvm-svn: 154280
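The walk-back described in this commit can be illustrated with a toy layout. `Node`, `Operand`, `createNode`, and `parentOf` are hypothetical names, not the MDNode implementation, and the sketch keeps the 'first' marker in a plain `bool` for readability, whereas the commit stores it in the operand's value pointer.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical toy layout, not LLVM's MDNode/MDNodeOperand.
struct Operand {
  void *value;
  bool isFirst; // simplification: the commit keeps this mark in the pointer
};

// Aligned like Operand so the operand array can start right after the node.
struct alignas(Operand) Node {
  unsigned numOperands;
  Operand *operands() { return reinterpret_cast<Operand *>(this + 1); }
};
static_assert(sizeof(Node) % alignof(Operand) == 0, "operands must be aligned");

// Allocates the node and its operands as one block of memory.
Node *createNode(unsigned n) {
  assert(n > 0 && "the walk-back needs at least one operand");
  void *mem = std::malloc(sizeof(Node) + n * sizeof(Operand));
  assert(mem && "allocation failed");
  Node *node = new (mem) Node{n};
  for (unsigned i = 0; i < n; ++i)
    new (node->operands() + i) Operand{nullptr, i == 0};
  return node;
}

// Finds the owning node without a stored parent pointer: step back to the
// operand marked 'first'; the node sits immediately before it.
Node *parentOf(Operand *op) {
  while (!op->isFirst)
    --op;
  return reinterpret_cast<Node *>(op) - 1;
}

void destroyNode(Node *node) { std::free(node); }
```

The trade-off is a walk that is linear in the operand index in exchange for dropping one pointer per operand.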
-
Bill Wendling authored
value pointer by making the value pointer into a pointer-int pair with 2 bits available for flags. llvm-svn: 154279
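As an illustration of a pointer-int pair with two flag bits, here is a minimal sketch. `TaggedPtr` is a hypothetical class written for this example; LLVM's own `PointerIntPair` template is more general. The sketch assumes the pointee is at least 4-byte aligned, so the two low bits of any valid pointer are zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal tagged pointer, written for illustration only.
template <typename T> class TaggedPtr {
  static_assert(alignof(T) >= 4, "need two spare low bits in the pointer");
  static constexpr std::uintptr_t FlagMask = 0x3;
  std::uintptr_t bits = 0; // pointer in the high bits, flags in the low two

public:
  TaggedPtr() = default;
  TaggedPtr(T *ptr, unsigned flags) { set(ptr, flags); }

  void set(T *ptr, unsigned flags) {
    auto raw = reinterpret_cast<std::uintptr_t>(ptr);
    assert((raw & FlagMask) == 0 && "pointer is not 4-byte aligned");
    assert(flags <= FlagMask && "only two flag bits are available");
    bits = raw | flags;
  }

  // Mask the flags off before handing the pointer back.
  T *pointer() const { return reinterpret_cast<T *>(bits & ~FlagMask); }
  unsigned flags() const { return static_cast<unsigned>(bits & FlagMask); }
};
```

The flags cost no extra storage because they occupy bits that alignment already guarantees to be zero.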
-
Richard Smith authored
converting from std::nullptr_t, the subexpression might have side-effects. llvm-svn: 154278
-
Michael J. Spencer authored
llvm-svn: 154277
-
Michael J. Spencer authored
llvm-svn: 154276
-
Michael J. Spencer authored
llvm-svn: 154275
-
Michael J. Spencer authored
llvm-svn: 154274
-
Francois Pichet authored
ext_reserved_user_defined_literal must not default to Error in MicrosoftMode. Hence create ext_ms_reserved_user_defined_literal that doesn't default to Error; otherwise MSVC headers won't parse. Fixes PR12383. llvm-svn: 154273
-
Craig Topper authored
Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. The same was already done for avx1. llvm-svn: 154272
-
Simon Atanasyan authored
llvm-svn: 154270
-
Simon Atanasyan authored
llvm-svn: 154269
-