- Mar 26, 2014
-
-
Quentin Colombet authored
Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table. That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are across domain (int <-> float). In particular, prior to this patch we were generating: vpbroadcastd LCPI1_0(%rip), %ymm2 vpand %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty Now, we generate the following nice sequence where everything is in the float domain: vbroadcastss LCPI1_0(%rip), %ymm2 vandps %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 <rdar://problem/16354675> llvm-svn: 204770
-
Rafael Espindola authored
We need .symtab_shndxr if and only if a symbol references a section with an index >= 0xff00. The old code was trying to figure out if the section was needed ahead of time, making it a fairly dependent on the code actually writing the table. It was also somewhat conservative and would create the section in cases where it was not needed. If I remember correctly, the old structure was there so that the sections were created in the same order gas creates them. That was valuable when MC's support for ELF was new and we tested with elf-dump.py. This patch refactors the symbol table creation to another class and makes it obvious that .symtab_shndxr is really only created when we are about to output a reference to a section index >= 0xff00. While here, also improve the tests to use macros. One file is one section short of needing .symtab_shndxr, the second one has just the right number. llvm-svn: 204769
-
Hal Finkel authored
The VSX instruction set has two types of FMA instructions: A-type (where the addend is taken from the output register) and M-type (where one of the product operands is taken from the output register). This adds a small pass that runs just after MI scheduling (and, thus, just before register allocation) that mutates A-type instructions (that are created during isel) into M-type instructions when: 1. This will eliminate an otherwise-necessary copy of the addend 2. One of the product operands is killed by the instruction The "right" moment to make this decision is in between scheduling and register allocation, because only there do we know whether or not one of the product operands is killed by any particular instruction. Unfortunately, this also makes the implementation somewhat complicated, because the MIs are not in SSA form and we need to preserve the LiveIntervals analysis. As a simple example, if we have: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9 %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16 ... %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19, %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19 ... We can eliminate the copy by changing from the A-type to the M-type instruction. This means: %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16 is replaced by: %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9, %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9 and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9 llvm-svn: 204768
-
Bob Wilson authored
Conceptually one of these loops is just a while-loop, but the actual code-gen is more complicated. We don't instrument all the different control flow edges to get accurate counts for each conditional branch, nor do I think it makes sense to do so. Instead, make the simplifying assumption that the loop behaves like a while-loop. Use the same branch weights for the first check for an empty collection as would be used for the back-edge of a while loop, and use that same weighting for the innermost loop, ignoring the possibility that there may be some extra code to go fetch more elements. llvm-svn: 204767
-
NAKAMURA Takumi authored
llvm-svn: 204766
-
- Mar 25, 2014
-
-
Rafael Espindola authored
While at it, factor some logic into FragmentWriter. This will allow more code to be factored out of the fairly large ELFObjectWriter. llvm-svn: 204765
-
Sean Callanan authored
if they didn't change, just like it does for registers. This makes life easier for kernel debugging and any other situation where values are read-only. <rdar://problem/16367795> llvm-svn: 204764
-
Enrico Granata authored
llvm-svn: 204763
-
rdar://problem/14862302Enrico Granata authored
For small structs, the frame format now prints them as one-liners This follows the same definition that frame variable does for deciding what a "small struct" is, and as such should be fairly consistent with the variable display in general llvm-svn: 204762
-
Enrico Granata authored
Make sure this test has a looser dependency on the exact class generated here.. it is going to be some sort of NS-provided String, but let's not bet on the details llvm-svn: 204761
-
Meador Inge authored
When cross-compiling LLVM itself the configure/make scripts get confused when creating the needed build host tools. For example, building and configuring like: CC_FOR_BUILD='i686-pc-linux-gnu-gcc' CXX_FOR_BUILD='i686-pc-linux-gnu-g++' CXX='i686-mingw32-g++' CC='i686-mingw32-gcc' LD='i686-mingw32-ld' /scratch /meadori/llvm-trunk/src/trunk/configure --host=i686-mingw32 CC_FOR_BUILD='i686-pc-linux-gnu-gcc' CXX_FOR_BUILD='i686-pc-linux-gnu-g++' CXX='i686-mingw32-g++' CC='i686-mingw32-gcc' LD='i686-mingw32-ld' make causes the following build break: checking whether the C compiler works... configure: error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details. The 'config.log' shows that i686-mingw32-gcc is being used to create executables for the build host. This patch fixes the problem by propogating the names of the build host tools via BUILD_* when configuring/making BuildTools. Original patch by Ekaterina Sanina. llvm-svn: 204760
-
Andrew MacPherson authored
llvm-svn: 204759
-
Juergen Ributzka authored
llvm-svn: 204758
-
Richard Smith authored
in a lambda capture. llvm-svn: 204757
-
rdar://problem/14515139Enrico Granata authored
Add a GetFoundationVersion() to AppleObjCRuntime This API is used to return and cache the major version of Foundation.framework, which is potentially a useful piece of data to key off of to enable or disable certain ObjC related behaviors (especially in data formatters) llvm-svn: 204756
-
David Blaikie authored
Review feedback from Reid Kleckner on r203603. llvm-svn: 204755
-
Rui Ueyama authored
If a response file is given via command line, the final command line arguments will not appear in the log because the actual arguments are in the given file. This patch is to show the final command line if /verbose is specified to help users. llvm-svn: 204754
-
Reid Kleckner authored
If there are any scope specifiers, then a base class must be named or the symbol isn't from a base class. Fixes PR19233. llvm-svn: 204753
-
Andrew MacPherson authored
Move calls to DisableAllBreakpointSites() and m_thread_list.DiscardThreadPlans() into base Process::Destroy() instead of in subclass DoDestroy() methods. llvm-svn: 204752
-
Sean Callanan authored
when other cases get added. llvm-svn: 204751
-
Sean Callanan authored
that call debug-information intrinsics. llvm-svn: 204750
-
Todd Fiala authored
This change makes significant improvements in the performance of calculating a UUID within ObjectFileELF, and handles both running processes and core files correctly. This does lazy evaluation of UUID generation and caches the result when calculated. Change by Piotr Rak. llvm-svn: 204749
-
Hal Finkel authored
Although the first two operands are the ones that can be swapped, the tied input operand is listed before them, so we need to adjust for that. I have a test case for this, but it goes along with an upcoming commit (so it will come soon). llvm-svn: 204748
-
Todd Fiala authored
Also added 'import sys' on some tests that are using non-standard unittest2.skipUnless blocks with code that is intended to do things that we have more specializes @* attributes for. These skip conditions were failing to execute due to missing import, causing darwin-only tests to run on Linux regardless. Will file a bug for that separately. llvm-svn: 204747
-
Hal Finkel authored
TableGen will create a lookup table for the A-type FMA instructions providing their corresponding M-form opcodes. This will be used by upcoming commits. llvm-svn: 204746
-
http://llvm.org/bugs/show_bug.cgi?id=19241Greg Clayton authored
When there was no process, the expression options were set to not ignore breakpoints. This causes debug info to be generated and causes errors when evaluating simple expressions. llvm-svn: 204745
-
Reid Kleckner authored
Fixes PR19240. In retrospect, this is a fairly obvious bug. :) llvm-svn: 204744
-
Matt Arsenault authored
Remove handling of select_cc, since it makes no sense to be there. This now does nothing, but I'll be adding some handling of other target nodes soon. llvm-svn: 204743
-
Benjamin Kramer authored
Fix an logic error in the clang driver preventing crtfastmath.o from linking when -Ofast is used without -ffast-math In gcc using -Ofast forces linking of crtfastmath.o. In the current clang crtfastmath.o is only linked when -ffast-math/-funsafe-math-optimizations passed. It can lead to performance issues, when using only -Ofast without explicit -ffast-math (I faced with it). My patch fixes inconsistency with gcc behaviour and also introduces few tests on it. Patch by Zinovy Nis! Differential Revision: http://llvm-reviews.chandlerc.com/D3114 llvm-svn: 204742
-
Duncan P. N. Exon Smith authored
Implement Pass::releaseMemory() in BlockFrequencyInfo and MachineBlockFrequencyInfo. Just delete the private implementation when not in use. Switch to a std::unique_ptr to make the logic more clear. <rdar://problem/14292693> llvm-svn: 204741
-
Duncan P. N. Exon Smith authored
<rdar://problem/14292693> llvm-svn: 204740
-
Juergen Ributzka authored
If getElementPtr uses a constant as base pointer, then make the constant opaque. This prevents constant folding it with the offset. The offset can usually be encoded in the load/store instruction itself and the base address doesn't have to be rematerialized several times. llvm-svn: 204739
-
Juergen Ributzka authored
The cost for the first four stackmap operands was always TCC_Free. This is only true for the first two operands. All other operands are TCC_Free if they are within 64bit. llvm-svn: 204738
-
Juergen Ributzka authored
Usually opaque constants shouldn't be folded, unless they are simple unary operations that don't create new constants. Although this shouldn't drop the opaque constant flag. This commit fixes this. Related to <rdar://problem/14774662> llvm-svn: 204737
-
Hans Wennborg authored
llvm-svn: 204736
-
Adam Nemet authored
This used to resort to splitting the 256-bit operation into two 128-bit shuffles and then recombining the results. Fixes <rdar://problem/16167303> llvm-svn: 204735
-
Adam Nemet authored
I found three implementations of this. This splits it out into a new function and uses it from the three places. My plan is to add a fourth use when lowering a vector_shuffle:v16i16. Compared the assembly output of test/CodeGen/X86 before and after. The only change is due to how the first PSHUFB was generated in LowerVECTOR_SHUFFLEv8i16. If the shuffle mask specified undef (i.e. -1), the old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the control mask. Now we write 0x80. These are of course interchangeable since bit 7 decides if a constant zero is written in the result byte. The other instances of this code use 0x80 consistently. Related to <rdar://problem/16167303> llvm-svn: 204734
-
Richard Osborne authored
Summary: Previously the code didn't check if the before and after types for the store were pointers to different address spaces. This resulted in instcombine using a bitcast to convert between pointers to different address spaces, causing an assertion due to the invalid cast. It is not be appropriate to use addrspacecast this case because it is not guaranteed to be a no-op cast. Instead bail out and do not do the transformation. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3117 llvm-svn: 204733
-
Richard Osborne authored
No functionality change. llvm-svn: 204732
-
Benjamin Kramer authored
PR19187. llvm-svn: 204731
-