  1. Apr 22, 2013
    • Revert "Revert "PR14606: debug info imported_module support"" · f55abeaf
      David Blaikie authored
      This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll
      
      I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
      though the debug info was clearly invalid on all of them, but this ought to fix
      it.
      
      llvm-svn: 179996
    • Legalize vector truncates by parts rather than just splitting. · 563983c8
      Jim Grosbach authored
      Rather than just splitting the input type and hoping for the best, apply
      a bit more cleverness. Just splitting the types until the source is
      legal often leads to an illegal result type, which is then widened and a
      scalarization step is introduced, leading to truly horrible code
      generation. With the loop vectorizer, these sorts of operations are much
      more common, so it's worth the extra effort to do them well.
      
      Add a legalization hook for the operands of a TRUNCATE node, which will
      be encountered after the result type has been legalized, but if the
      operand type is still illegal. If simple splitting of both types
      ends up with the result type of each half still being legal, just
      do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
      result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
      we can get more clever with power-of-two vectors. Specifically,
      split the input type, but also widen the result element size, then
      concatenate the halves and truncate again. For example, on ARM,
      to perform "%res = v8i8 trunc v8i32 %in" we transform it to:
        %inlo = v4i32 extract_subvector %in, 0
        %inhi = v4i32 extract_subvector %in, 4
        %lo16 = v4i16 trunc v4i32 %inlo
        %hi16 = v4i16 trunc v4i32 %inhi
        %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
        %res = v8i8 trunc v8i16 %in16
      
      This allows instruction selection to generate three VMOVN instructions
      instead of a sequence of moves, stores, and loads.
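
      As a standalone illustration of the semantics (a minimal C++ sketch of
      the element-wise behavior, not LLVM API code), the three narrowing
      steps correspond to:

        // Element-wise model of the v8i32 -> v8i8 truncate decomposition:
        // two half-width narrows (the VMOVN.I32 pair) followed by one
        // narrow of the concatenated halves (the VMOVN.I16).
        #include <cstdint>
        #include <cstdio>

        int main() {
          uint32_t in[8] = {1, 2, 3, 4, 300, 400, 500, 70000};

          uint16_t in16[8];                    // %in16 = concat(%lo16, %hi16)
          for (int i = 0; i < 4; ++i) {
            in16[i]     = (uint16_t)in[i];     // %lo16 = trunc v4i32 %inlo
            in16[i + 4] = (uint16_t)in[i + 4]; // %hi16 = trunc v4i32 %inhi
          }

          uint8_t res[8];
          for (int i = 0; i < 8; ++i)
            res[i] = (uint8_t)in16[i];         // %res = trunc v8i16 %in16

          for (int i = 0; i < 8; ++i)
            std::printf("%u ", res[i]);
          std::printf("\n");
          return 0;
        }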
      
      Update the ARMTargetTransformInfo to take this improved legalization
      into account.
      
      Consider the simplified IR:
      
      define <16 x i8> @test1(<16 x i32>* %ap) {
        %a = load <16 x i32>* %ap
        %tmp = trunc <16 x i32> %a to <16 x i8>
        ret <16 x i8> %tmp
      }
      
      define <8 x i8> @test2(<8 x i32>* %ap) {
        %a = load <8 x i32>* %ap
        %tmp = trunc <8 x i32> %a to <8 x i8>
        ret <8 x i8> %tmp
      }
      
      Previously, we would generate the truly hideous:
      	.syntax unified
      	.section	__TEXT,__text,regular,pure_instructions
      	.globl	_test1
      	.align	2
      _test1:                                 @ @test1
      @ BB#0:
      	push	{r7}
      	mov	r7, sp
      	sub	sp, sp, #20
      	bic	sp, sp, #7
      	add	r1, r0, #48
      	add	r2, r0, #32
      	vld1.64	{d24, d25}, [r0:128]
      	vld1.64	{d16, d17}, [r1:128]
      	vld1.64	{d18, d19}, [r2:128]
      	add	r1, r0, #16
      	vmovn.i32	d22, q8
      	vld1.64	{d16, d17}, [r1:128]
      	vmovn.i32	d20, q9
      	vmovn.i32	d18, q12
      	vmov.u16	r0, d22[3]
      	strb	r0, [sp, #15]
      	vmov.u16	r0, d22[2]
      	strb	r0, [sp, #14]
      	vmov.u16	r0, d22[1]
      	strb	r0, [sp, #13]
      	vmov.u16	r0, d22[0]
      	vmovn.i32	d16, q8
      	strb	r0, [sp, #12]
      	vmov.u16	r0, d20[3]
      	strb	r0, [sp, #11]
      	vmov.u16	r0, d20[2]
      	strb	r0, [sp, #10]
      	vmov.u16	r0, d20[1]
      	strb	r0, [sp, #9]
      	vmov.u16	r0, d20[0]
      	strb	r0, [sp, #8]
      	vmov.u16	r0, d18[3]
      	strb	r0, [sp, #3]
      	vmov.u16	r0, d18[2]
      	strb	r0, [sp, #2]
      	vmov.u16	r0, d18[1]
      	strb	r0, [sp, #1]
      	vmov.u16	r0, d18[0]
      	strb	r0, [sp]
      	vmov.u16	r0, d16[3]
      	strb	r0, [sp, #7]
      	vmov.u16	r0, d16[2]
      	strb	r0, [sp, #6]
      	vmov.u16	r0, d16[1]
      	strb	r0, [sp, #5]
      	vmov.u16	r0, d16[0]
      	strb	r0, [sp, #4]
      	vldmia	sp, {d16, d17}
      	vmov	r0, r1, d16
      	vmov	r2, r3, d17
      	mov	sp, r7
      	pop	{r7}
      	bx	lr
      
      	.globl	_test2
      	.align	2
      _test2:                                 @ @test2
      @ BB#0:
      	push	{r7}
      	mov	r7, sp
      	sub	sp, sp, #12
      	bic	sp, sp, #7
      	vld1.64	{d16, d17}, [r0:128]
      	add	r0, r0, #16
      	vld1.64	{d20, d21}, [r0:128]
      	vmovn.i32	d18, q8
      	vmov.u16	r0, d18[3]
      	vmovn.i32	d16, q10
      	strb	r0, [sp, #3]
      	vmov.u16	r0, d18[2]
      	strb	r0, [sp, #2]
      	vmov.u16	r0, d18[1]
      	strb	r0, [sp, #1]
      	vmov.u16	r0, d18[0]
      	strb	r0, [sp]
      	vmov.u16	r0, d16[3]
      	strb	r0, [sp, #7]
      	vmov.u16	r0, d16[2]
      	strb	r0, [sp, #6]
      	vmov.u16	r0, d16[1]
      	strb	r0, [sp, #5]
      	vmov.u16	r0, d16[0]
      	strb	r0, [sp, #4]
      	ldm	sp, {r0, r1}
      	mov	sp, r7
      	pop	{r7}
      	bx	lr
      
      Now, however, we generate the much more straightforward:
      	.syntax unified
      	.section	__TEXT,__text,regular,pure_instructions
      	.globl	_test1
      	.align	2
      _test1:                                 @ @test1
      @ BB#0:
      	add	r1, r0, #48
      	add	r2, r0, #32
      	vld1.64	{d20, d21}, [r0:128]
      	vld1.64	{d16, d17}, [r1:128]
      	add	r1, r0, #16
      	vld1.64	{d18, d19}, [r2:128]
      	vld1.64	{d22, d23}, [r1:128]
      	vmovn.i32	d17, q8
      	vmovn.i32	d16, q9
      	vmovn.i32	d18, q10
      	vmovn.i32	d19, q11
      	vmovn.i16	d17, q8
      	vmovn.i16	d16, q9
      	vmov	r0, r1, d16
      	vmov	r2, r3, d17
      	bx	lr
      
      	.globl	_test2
      	.align	2
      _test2:                                 @ @test2
      @ BB#0:
      	vld1.64	{d16, d17}, [r0:128]
      	add	r0, r0, #16
      	vld1.64	{d18, d19}, [r0:128]
      	vmovn.i32	d16, q8
      	vmovn.i32	d17, q9
      	vmovn.i16	d16, q8
      	vmov	r0, r1, d16
      	bx	lr
      
      llvm-svn: 179989
  2. Apr 11, 2013
    • Add braces around || in && to pacify GCC. · e7c45bc6
      Benjamin Kramer authored
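
      The warning being silenced is GCC's -Wparentheses; a minimal
      reproduction (hypothetical code, not the actual LLVM expression):

        // gcc -Wparentheses: "suggest parentheses around '&&' within '||'",
        // because the operator precedence here is easy to misread.
        bool before(bool a, bool b, bool c) { return a || b && c; }

        // Explicit grouping states the intent and silences the warning.
        bool after(bool a, bool b, bool c) { return a || (b && c); }
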
      llvm-svn: 179275
    • Manually remove successors in if conversion when CopyAndPredicateBlock is used · 95081bff
      Hal Finkel authored
      In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is
      used because the to-be-predicated block has other predecessors, we need to
      explicitly remove the old copied block from the successors list. Normally,
      if conversion relies on TII->AnalyzeBranch combined with
      BB->CorrectExtraCFGEdges to clean up the successors list, but if the
      predicated block contained an
      un-analyzable branch (such as a now-predicated return), then this will fail.
      
      These extra successors were causing a problem on PPC because they caused
      later passes (such as PPCEarlyReturn) to leave dead return-only basic
      blocks in the code.
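
      To see why the stale edge matters, here is a toy model in plain C++
      (simple adjacency lists, not the MachineBasicBlock API): a block that
      is still on a predecessor's successor list looks reachable, so a later
      cleanup will not delete it even though it is dead.

        #include <algorithm>
        #include <cstdio>
        #include <vector>

        int main() {
          // Toy CFG: block 0 conditionally branched to block 1, whose body
          // has since been copied into block 0 and predicated.
          std::vector<std::vector<int>> succs = {{1, 2}, {2}, {}};

          auto hasEdge = [&](int from, int to) {
            const auto &s = succs[from];
            return std::find(s.begin(), s.end(), to) != s.end();
          };
          std::printf("0 -> 1: %d\n", hasEdge(0, 1)); // 1: stale edge survives

          // The fix: explicitly drop the copied block from the successor
          // list, since branch analysis cannot do it for an un-analyzable
          // (now-predicated) branch.
          auto &s0 = succs[0];
          s0.erase(std::remove(s0.begin(), s0.end(), 1), s0.end());
          std::printf("0 -> 1: %d\n", hasEdge(0, 1)); // 0: block 1 is dead
          return 0;
        }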
      
      llvm-svn: 179227
  3. Apr 10, 2013
    • Generalize the PassConfig API and remove addFinalizeRegAlloc(). · e220323c
      Andrew Trick authored
      The target hooks are getting out of hand. What does it mean to run
      before or after regalloc anyway? Allowing either Pass* or AnalysisID
      pass identification should make it much easier for targets to use the
      substitutePass and insertPass APIs, and create less need for badly
      named target hooks.
      
      llvm-svn: 179140
  4. Apr 06, 2013
    • typo · c4bd84c1
      Nadav Rotem authored
      llvm-svn: 178949
    • Dwarf: use utostr on CUID to append to SmallString. · 5b22f9fe
      Manman Ren authored
      We used to do "SmallString += CUID", which is incorrect, since CUID will
      be truncated to a char.
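
      A minimal std::string analogue of the pitfall (illustrative only; the
      SmallString append selected the char overload in the same way):

        #include <cstdio>
        #include <string>

        int main() {
          unsigned CUID = 65;
          std::string Name = "cu_";
          Name += CUID;                        // operator+=(char): appends 'A'
          std::printf("%s\n", Name.c_str());   // "cu_A", not "cu_65"

          Name = "cu_" + std::to_string(CUID); // the utostr-style fix
          std::printf("%s\n", Name.c_str());   // "cu_65"
          return 0;
        }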
      
      rdar://problem/13573833
      
      llvm-svn: 178941
    • Reapply r178845 with fix - Fix bug in PEI's virtual-register scavenging · 3005c299
      Hal Finkel authored
      This fixes PEI as previously described, but correctly handles the case where
      the instruction defining the virtual register to be scavenged is the first in
      the block. Arnold provided me with a bugpoint-reduced test case, but even that
      seems too large to use as a regression test. If I'm successful in cleaning it
      up then I'll commit that as well.
      
      Original commit message:
      
          This change fixes a bug that I introduced in r178058. After a register is
          scavenged using one of the available spill slots, the instruction defining the
          virtual register needs to be moved to after the spill code. The scavenger has
          already processed the defining instruction so that registers killed by that
          instruction are available for definition in that same instruction. Unfortunately,
          after this, the scavenger needs to iterate through the spill code and then
          visit, again, the instruction that defines the now-scavenged register. In order
          to avoid confusion, the register scavenger needs the ability to 'back up'
          through the spill code so that it can again process the instructions in the
          appropriate order. Prior to this fix, once the scavenger reached the
          just-moved instruction, it would assert if it killed any registers because,
          having already processed the instruction, it believed they were undefined.
      
          Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
          for diagnosing the problem and testing this fix.
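
      As a toy model of the hazard (plain C++, not the RegScavenger API): a
      tracker that has already processed a def asserts if that instruction's
      kills are replayed, unless it first backs up to a consistent state.

        #include <cassert>

        struct Tracker {
          bool live = false;
          void def()  { live = true; }
          void kill() { assert(live && "kill of undefined register"); live = false; }
        };

        int main() {
          Tracker t;
          t.def();   // defining instruction processed once
          t.kill();  // its kills processed

          // The def is moved after newly inserted spill code, so it must be
          // visited again; replaying t.kill() here would assert. The fix is
          // to back up and re-process the instructions in their new order.
          t = Tracker(); // rewind to the pre-def state
          t.def();
          t.kill();      // consistent again
          return 0;
        }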
      
      llvm-svn: 178919
  5. Apr 05, 2013
    • Use the target options specified on a function to reset the back-end. · eb108bad
      Bill Wendling authored
      During LTO, the target options on functions within the same Module may
      change. This would necessitate resetting some of the back-end. Do this for X86,
      because it's a Friday afternoon.
      
      llvm-svn: 178917
    • Revert r178845 - Fix bug in PEI's virtual-register scavenging · 81c46d08
      Hal Finkel authored
      Reverting because this breaks one of the LTO builders. Original commit message:
      
          This change fixes a bug that I introduced in r178058. After a register is
          scavenged using one of the available spill slots, the instruction defining the
          virtual register needs to be moved to after the spill code. The scavenger has
          already processed the defining instruction so that registers killed by that
          instruction are available for definition in that same instruction. Unfortunately,
          after this, the scavenger needs to iterate through the spill code and then
          visit, again, the instruction that defines the now-scavenged register. In order
          to avoid confusion, the register scavenger needs the ability to 'back up'
          through the spill code so that it can again process the instructions in the
          appropriate order. Prior to this fix, once the scavenger reached the
          just-moved instruction, it would assert if it killed any registers because,
          having already processed the instruction, it believed they were undefined.
      
          Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
          for diagnosing the problem and testing this fix.
      
      llvm-svn: 178916
    • Fix bug in PEI's virtual-register scavenging · e6f48e4e
      Hal Finkel authored
      This change fixes a bug that I introduced in r178058. After a register is
      scavenged using one of the available spill slots, the instruction defining the
      virtual register needs to be moved to after the spill code. The scavenger has
      already processed the defining instruction so that registers killed by that
      instruction are available for definition in that same instruction. Unfortunately,
      after this, the scavenger needs to iterate through the spill code and then
      visit, again, the instruction that defines the now-scavenged register. In order
      to avoid confusion, the register scavenger needs the ability to 'back up'
      through the spill code so that it can again process the instructions in the
      appropriate order. Prior to this fix, once the scavenger reached the
      just-moved instruction, it would assert if it killed any registers because,
      having already processed the instruction, it believed they were undefined.
      
      Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
      for diagnosing the problem and testing this fix.
      
      llvm-svn: 178845
    • RegisterPressure heuristics currently require signed comparisons. · 80e66ce0
      Andrew Trick authored
      llvm-svn: 178823
    • Disable DFSResult for ConvergingScheduler. · 96ce3848
      Andrew Trick authored
      For now, just save the compile time since the ConvergingScheduler
      heuristics don't use this analysis. We'll probably enable it later
      after compile-time investigation.
      
      llvm-svn: 178822