- Nov 08, 2013
-
-
Arnaud A. de Grandmaison authored
Based on discussions with Lang Hames and Jakob Stoklund Olesen at the hacker's lab, and in the light of upcoming work on the PBQP register allocator, it was though that CalcSpillWeights does not need to be a pass. This change will enable to customize / tune the spill weight computation depending on the allocator. Update the documentation style while there. No functionnal change. llvm-svn: 194269
-
Arnaud A. de Grandmaison authored
llvm-svn: 194259
-
- Nov 05, 2013
-
-
Andrew Trick authored
MorphNodeTo is not safe to call during DAG building. It eagerly deletes dependent DAG nodes which invalidates the NodeMap. We could expose a safe interface for morphing nodes, but I don't think it's worth it. Just create a new MachineNode and replaceAllUsesWith. My understaning of the SD design has been that we want to support early target opcode selection. That isn't very well supported, but generally works. It seems reasonable to rely on this feature even if it isn't widely used. llvm-svn: 194102
-
- Nov 02, 2013
-
-
Eric Christopher authored
llvm-svn: 193894
-
- Nov 01, 2013
-
-
Juergen Ributzka authored
llvm-svn: 193871
-
Rafael Espindola authored
linkonce_odr_auto_hide was in incomplete attempt to implement a way for the linker to hide symbols that are known to be available in every TU and whose addresses are not relevant for a particular DSO. It was redundant in that it all its uses are equivalent to linkonce_odr+unnamed_addr. Unlike those, it has never been connected to clang or llvm's optimizers, so it was effectively dead. Given that nothing produces it, this patch just nukes it (other than the llvm-c enum value). llvm-svn: 193865
-
Aaron Ballman authored
Commenting out this assert because it is causing the build bots to fail. This effectively reverts r193861, but needs to be fixed as part of r193769. llvm-svn: 193862
-
Aaron Ballman authored
llvm-svn: 193861
-
David Blaikie authored
Drive by space optimization. Also makes the DIEs more regular which might speed up DWARF parsing. llvm-svn: 193835
-
- Oct 31, 2013
-
-
Andrew Trick authored
llvm-svn: 193819
-
Andrew Trick authored
Originally implemented by Lang Hames. llvm-svn: 193811
-
Manman Ren authored
We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs, that is why we keep the maps in DwarfDebug instead of CompileUnit. We make the assumption that if a DIE is not added to an owner yet, we assume it belongs to the current CU. Since DIEs for the type system are added to their owners immediately after creation, and other DIEs belong to the current CU, the assumption should be true. A testing case is added to show that we only create a single DIE for a type MDNode and we use ref_addr to refer to the type DIE. We also add a testing case to show ref_addr relocations for non-darwin platforms. llvm-svn: 193779
-
Andrew Trick authored
llvm-svn: 193769
-
Andrew Trick authored
llvm-svn: 193765
-
Rafael Espindola authored
It had no tests, was unused and was "experimental at best". llvm-svn: 193749
-
Jim Grosbach authored
When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and the continue legalizing the rest of the operation normally. The result is that this operates as a new, more effecient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727
-
Matt Arsenault authored
llvm-svn: 193721
-
- Oct 30, 2013
-
-
Rafael Espindola authored
With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr if they are also unnamed_addr or don't have their address taken. There is not a lot of documentation about .weak_def_can_be_hidden, but from the old discussion about linkonce_odr_auto_hide and the name of the directive this looks correct: these symbols can be hidden. Testing this with the ld64 in Xcode 5 linking clang reduces the number of exported symbols from 21053 to 19049. llvm-svn: 193718
-
David Blaikie authored
This is a preliminary step to handling type units by abstracting over all (type or compile) units. llvm-svn: 193714
-
David Blaikie authored
llvm-svn: 193699
-
Juergen Ributzka authored
Now Hexagon and SystemZ are not happy with it :-( llvm-svn: 193677
-
Juergen Ributzka authored
The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask has usually the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 193676
-
Josh Magee authored
Differential Revision: http://llvm-reviews.chandlerc.com/D2057 llvm-svn: 193672
-
Manman Ren authored
Use EmitLabelOffsetDifference for handling on darwin platform when non-darwin platforms use EmitLabelPlusOffset. Also fix a bug in EmitLabelOffsetDifference where the size is hard-coded to 4 even though Size is passed in as an argument. llvm-svn: 193660
-
- Oct 29, 2013
-
-
Manman Ren authored
To support ref_addr, we calculate the section offset of a DIE (i.e. offset of a DIE from beginning of the debug info section). The Offset field in DIE is currently CU-relative. To calculate the section offset, we add a DebugInfoOffset field in CompileUnit to store the offset of a CU from beginning of the debug info section. We set the value in DwarfUnits::computeSizeAndOffset for each CompileUnit. A helper function DIE::getCompileUnit is added to return the CU DIE that the input DIE belongs to. We also add a map CUDieMap in DwarfDebug to help finding the CU for a given CU DIE. For a cross-referenced DIE, we first find the CU DIE it belongs to with getCompileUnit, then we use CUDieMap to get the corresponding CU for the CU DIE. Adding the section offset of the CU with the CU-relative offset of a DIE gives us the seciton offset of the DIE. We correctly emit ref_addr with relocation using EmitLabelPlusOffset when doesDwarfUseRelocationsAcrossSections is true. This commit handles the emission of DW_FORM_ref_addr when we have an attribute with FORM_ref_addr. A follow-on patch will start using ref_addr when adding a DIEEntry. This commit will be tested and verified in the follow-on patch. Reviewed off-list by Eric, Thanks. llvm-svn: 193658
-
Manman Ren authored
after the DIE creation, we construct the context first. Ensure that we create the context before we create a type so that we can add the newly created type to the parent. Remove last use of addToContextOwner now that it's not needed. We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs should be added to their parents right after the creation. Reviewed off-list by Eric, Thanks. llvm-svn: 193657
-
Josh Magee authored
This modifies the pass to classify every SSP-triggering AllocaInst according to an SSPLayoutKind (LargeArray, SmallArray, AddrOf). This analysis is collected by the pass and made available for use, but no other pass uses it yet. The next patch will make use of this analysis in PEI and StackSlot passes. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D1789 llvm-svn: 193653
-
Rafael Espindola authored
This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630
-
Rafael Espindola authored
llvm-svn: 193627
-
Manman Ren authored
after the DIE creation, we construct the context first. This touches creation of namespaces and global variables. The purpose is to handle all DIE creations similarly: constructs the context first, then creates the DIE and immediately adds the DIE to its parent. We use createAndAddDIE to wrap around "new DIE(". llvm-svn: 193589
-
Alp Toker authored
llvm-svn: 193579
-
Manman Ren authored
This commit ensures DIEs are constructed within a compile unit and immediately added to their parents. Reviewed off-list by Eric. llvm-svn: 193568
-
Manman Ren authored
More patches will be submitted to convert "new DIE(" to use createAddAndDIE in DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where we have to decide between ref4 and ref_addr, because DIEs that can be shared across CU will be added to a CU already. Reviewed off-list by Eric. llvm-svn: 193567
-
Manman Ren authored
It wraps around "new DIE(" and handles the bookkeeping part of the newly-created DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes sure that bookkeeping is done at the earliest time and we should not see parentless DIEs if all constructions of DIEs go through this helper function. Later on, we can use an allocator for DIE allocation, and will only need to change createAndAddDIE instead of modifying all the "new DIE(". Reviewed off-list by Eric. llvm-svn: 193566
-
- Oct 28, 2013
-
-
Richard Sandiford authored
Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518
-
Richard Sandiford authored
Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517
-
- Oct 25, 2013
-
-
David Blaikie authored
llvm-svn: 193432
-
David Blaikie authored
llvm-svn: 193427
-
Tim Northover authored
ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions. The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable. llvm-svn: 193398
-
Nadav Rotem authored
This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393
-