  1. Feb 10, 2015
  2. Feb 09, 2015
    • Fix a bug in DemoteRegToStack where a reload instruction was inserted into the · 8d3cb829
      Akira Hatanaka authored
      wrong basic block.
      
      This would happen when the result of an invoke was used by a phi instruction
      in the invoke's normal destination block. An instruction to reload the invoke's
      value would get inserted before the critical edge was split and a new basic
      block (which is the correct insertion point for the reload) was created. This
      commit fixes the bug by splitting the critical edge before all the reload
      instructions are inserted.
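
      A minimal IR sketch of the problematic shape (a hypothetical reduction
      in current syntax, not the original test case):

          define i32 @f(i1 %c) personality ptr @__gxx_personality_v0 {
          entry:
            br i1 %c, label %call, label %merge
          call:
            %r = invoke i32 @g() to label %merge unwind label %lpad
          merge:
            ; 'call' has two successors and 'merge' has two predecessors, so
            ; call->merge is a critical edge; the reload of a demoted %r must
            ; be placed in the block created by splitting that edge.
            %p = phi i32 [ %r, %call ], [ 0, %entry ]
            ret i32 %p
          lpad:
            %lp = landingpad { ptr, i32 } cleanup
            ret i32 0
          }
          declare i32 @g()
          declare i32 @__gxx_personality_v0(...)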
      
      Also, hoist the code which computes the insertion point up to the only
      place that needs that computation.
      
      rdar://problem/15978721
      
      llvm-svn: 228566
  3. Feb 08, 2015
    • ARM & AArch64: teach LowerVSETCC that output type size may differ from input. · 45aa89c9
      Tim Northover authored
      While various DAG combines try to guarantee that a vector SETCC
      operation will have the same output size as input, there's nothing
      intrinsic to either creation or LegalizeTypes that actually guarantees
      it, so the function needs to be ready to handle a mismatch.
      
      Fortunately this is easy enough, just extend or truncate the naturally
      compared result.
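
      A hedged sketch of the kind of mismatch (hypothetical, not the PR21645
      reproducer): the compare is naturally as wide as its inputs, while the
      requested result elements are narrower, so the result is truncated:

          define <4 x i16> @cmp(<4 x i32> %a, <4 x i32> %b) {
            ; the SETCC naturally yields a <4 x i32> mask; producing the
            ; <4 x i16> result requires truncating that mask
            %c = icmp eq <4 x i32> %a, %b
            %r = sext <4 x i1> %c to <4 x i16>
            ret <4 x i16> %r
          }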
      
      I couldn't reproduce the failure in other backends that I know have
      SIMD, so it's probably only an issue for these two due to shared
      heritage.
      
      Should fix PR21645.
      
      llvm-svn: 228518
  4. Feb 07, 2015
  5. Feb 05, 2015
  6. Feb 04, 2015
    • Adding support to LLVM for targeting Cortex-A72 · 60885044
      Renato Golin authored
      Currently, Cortex-A72 is modelled as a Cortex-A57, except that the FP
      load balancing pass isn't enabled for Cortex-A72, as it's not
      profitable to have it enabled for this core.
      
      Patch by Ranjeet Singh.
      
      llvm-svn: 228140
    • Reverting VLD1/VST1 base-updating/post-incrementing combining · 2a5c0a51
      Renato Golin authored
      This reverts patches 223862, 224198, 224203, and 224754, which were all
      related to the vector load/store combining and were reverted/reapplied
      a few times due to the same alignment problems we're seeing now.
      
      Further tests, mainly self-hosting Clang, will be needed to reapply this
      patch in the future.
      
      llvm-svn: 228129
  7. Feb 02, 2015
  8. Jan 31, 2015
    • ARM: support stack probe size on Windows on ARM · fb8a66fb
      Saleem Abdulrasool authored
      Now that -mstack-probe-size is piped through to the backend via the function
      attribute as on Windows x86, honour the value to permit handling of non-default
      values for stack probes.  This is needed for /Gs with the clang-cl driver or
      -mstack-probe-size with the clang driver when targeting Windows on ARM.
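
      As a hedged sketch, the value reaches the backend as a string function
      attribute (the function body here is purely illustrative):

          define void @probe_me() #0 {
            ; a frame this large needs probing at the chosen granularity
            %buf = alloca [8192 x i8]
            ret void
          }
          attributes #0 = { "stack-probe-size"="8192" }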
      
      llvm-svn: 227667
  9. Jan 29, 2015
  10. Jan 23, 2015
  11. Jan 21, 2015
  12. Jan 19, 2015
    • Bring r226038 back. · 12ca34f5
      Rafael Espindola authored
      No change in this commit, but clang was changed to also produce trivial comdats when
      needed.
      
      Original message:
      
      Don't create new comdats in CodeGen.
      
      This patch stops the implicit creation of comdats during codegen.
      
      Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
      now produce the same result in pr19848.
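
      A hedged IR sketch of what setting the comdat explicitly looks like
      (hypothetical symbol name):

          $foo = comdat any

          define linkonce_odr void @foo() comdat($foo) {
            ret void
          }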
      
      llvm-svn: 226467
  13. Jan 16, 2015
    • Revert r226242 - Revert Revert Don't create new comdats in CodeGen · 60b72136
      Timur Iskhodzhanov authored
      This breaks AddressSanitizer (ninja check-asan) on Windows
      
      llvm-svn: 226251
    • Revert "Revert Don't create new comdats in CodeGen" · 67a79e72
      Rafael Espindola authored
      This reverts commit r226173, adding r226038 back.
      
      No change in this commit, but clang was changed to also produce trivial comdats for
      constructors, destructors and vtables when needed.
      
      Original message:
      
      Don't create new comdats in CodeGen.
      
      This patch stops the implicit creation of comdats during codegen.
      
      Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
      now produce the same result in pr19848.
      
      llvm-svn: 226242
  14. Jan 15, 2015
    • Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"" · 5ef58eb8
      Hal Finkel authored
      Reapply r226071 with fixes. Two fixes:
      
       1. We need to manually remove the old and create the new 'dead defs'
          associated with physical register definitions when we move the definition of
          the physical register from the copy point to the point of the original vreg def.
      
          This problem was picked up by the machine instruction verifier, and could trigger a
          verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've
          turned on the verifier in the tests.
      
       2. When moving the def point of the phys reg up, we need to make sure that it
          is neither defined nor read in between the two instructions. We don't, however,
          extend the live ranges of phys reg defs to cover uses, so just checking for
          live-range overlap between the pair interval and the phys reg aliases won't
          pick up reads. As a result, we manually iterate over the range and check for
          reads.
      
          A test soon to be committed to the PowerPC backend will test this change.
      
      Original commit message:
      
      [RegisterCoalescer] Remove copies to reserved registers
      
      This allows the RegisterCoalescer to join "non-flipped" range pairs with a
      physical destination register -- which allows the RegisterCoalescer to remove
      copies like this:
      
      <vreg> = something (maybe a load, for example)
      ... (things that don't use PHYSREG)
      PHYSREG = COPY <vreg>
      
      (with all of the restrictions normally applied by the RegisterCoalescer: having
      compatible register classes, etc.)
      
      Previously, the RegisterCoalescer handled only the opposite case (copying
      *from* a physical register). I don't handle the problem fully here, but try to
      get the common case where there is only one use of <vreg> (the COPY).
      
      An upcoming commit to the PowerPC backend will make this pattern much more
      common on PPC64/ELF systems.
      
      llvm-svn: 226200
    • Revert Don't create new comdats in CodeGen · f5adf13f
      Timur Iskhodzhanov authored
      It breaks AddressSanitizer on Windows.
      
      llvm-svn: 226173
    • Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers" · dd669615
      Hal Finkel authored
      Reverting this while I investigate some bad behavior this is causing. As a
      possibly-related issue, one of the test cases now fails with
      -verify-machineinstrs because of this change:
      
        llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs
      
      *** Bad machine code: No instruction at def index ***
      - function:    foo
      - basic block: BB#0 return (0x10007e21f10) [0B;736B)
      - liverange:   [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0)  0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
      - register:    %DS
      Valno #3 is defined at 624r
      
      *** Bad machine code: Live segment doesn't end at a valid instruction ***
      - function:    foo
      - basic block: BB#0 return (0x10007e21f10) [0B;736B)
      - liverange:   [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0)  0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
      - register:    %DS
      [624r,624d:3)
      LLVM ERROR: Found 2 machine code errors.
      
      where 624r corresponds exactly to the interval combining change:
      
      624B    %RSP<def> = COPY %vreg16; GR64:%vreg16
              Considering merging %vreg16 with %RSP
                      RHS = %vreg16 [608r,624r:0)  0@608r
                      updated: 608B   %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1]
              Success: %vreg16 -> %RSP
              Result = %RSP
      
      llvm-svn: 226086
    • [RegisterCoalescer] Remove copies to reserved registers · 82996462
      Hal Finkel authored
      This allows the RegisterCoalescer to join "non-flipped" range pairs with a
      physical destination register -- which allows the RegisterCoalescer to remove
      copies like this:
      
      <vreg> = something (maybe a load, for example)
      ... (things that don't use PHYSREG)
      PHYSREG = COPY <vreg>
      
      (with all of the restrictions normally applied by the RegisterCoalescer: having
      compatible register classes, etc.)
      
      Previously, the RegisterCoalescer handled only the opposite case (copying
      *from* a physical register). I don't handle the problem fully here, but try to
      get the common case where there is only one use of <vreg> (the COPY).
      
      An upcoming commit to the PowerPC backend will make this pattern much more
      common on PPC64/ELF systems.
      
      llvm-svn: 226071
  15. Jan 14, 2015
    • IR: Move MDLocation into place · 98854699
      Duncan P. N. Exon Smith authored
      This commit moves `MDLocation`, finishing off PR21433.  There's an
      accompanying clang commit for frontend testcases.  I'll attach the
      testcase upgrade script I used to PR21433 to help out-of-tree
      frontends/backends.
      
      This changes the schema for `DebugLoc` and `DILocation` from:
      
          !{i32 3, i32 7, !7, !8}
      
      to:
      
          !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)
      
      Note that empty fields (line/column: 0 and inlinedAt: null) don't get
      printed by the assembly writer.
      
      llvm-svn: 226048
    • Don't create new comdats in CodeGen. · fad1639a
      Rafael Espindola authored
      This patch stops the implicit creation of comdats during codegen.
      
      Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
      now produce the same result in pr19848.
      
      llvm-svn: 226038
    • ARM: add test for crc32 instructions in CodeGen. · a203ca61
      Tim Northover authored
      Somehow we seem to have ended up without any actual tests of the
      CodeGen side. Easy enough to fix.
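
      Presumably along these lines (a hedged sketch; the committed test may
      differ):

          declare i32 @llvm.arm.crc32b(i32, i32)

          define i32 @test_crc32b(i32 %acc, i32 %x) {
            ; should select the crc32b instruction
            %r = call i32 @llvm.arm.crc32b(i32 %acc, i32 %x)
            ret i32 %r
          }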
      
      llvm-svn: 225930
  16. Jan 12, 2015
    • Debug info: Factor out the creation of DWARF expressions from AsmPrinter · b16d9ebb
      Adrian Prantl authored
      into a new class DwarfExpression that can be shared between AsmPrinter
      and DwarfUnit.
      
      This is the first step towards unifying the two entirely redundant
      implementations of dwarf expression emission in DwarfUnit and AsmPrinter.
      
      Almost no functional change: testcases were updated because asm comments
      that used to be on two lines now appear on the same line, which is
      actually preferable.
      
      llvm-svn: 225706
  17. Jan 08, 2015
    • Fix large stack alignment codegen for ARM and Thumb2 targets · 933de7aa
      Kristof Beyls authored
      This partially fixes PR13007 (ARM CodeGen fails with large stack
      alignment): for ARM and Thumb2 targets, but not for Thumb1, as it
      seems stack alignment for Thumb1 targets hasn't been supported at
      all.
      
      Producing an aligned stack pointer is done by zeroing out the lower
      bits of the stack pointer. The BIC instruction was used for this.
      However, the immediate field of the BIC instruction only allows
      encoding an immediate that can zero out at most the 8 lower bits.
      When a larger alignment is requested, a BIC instruction cannot be
      used; LLVM was silently producing incorrect code in this case.
      
      This commit fixes code generation for large stack alignments by
      using the BFC instruction instead, when the BFC instruction is
      available.  When it is not, it uses two instructions: a right shift,
      followed by a left shift to zero out the lower bits.
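
      A hedged sketch of a trigger (hypothetical): an alloca whose alignment
      requires clearing more than the 8 lower bits, e.g. 2^16 bytes:

          define void @f() {
            ; clearing the low 16 bits of sp doesn't fit a BIC immediate,
            ; so BFC (or a shift pair) is needed
            %buf = alloca i8, align 65536
            call void @use(ptr %buf)
            ret void
          }
          declare void @use(ptr)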
      
      The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code
      that unconditionally uses BIC to realign the stack pointer, so it
      very likely has the same problem. However, I wasn't able to
      produce a test case for that. This commit adds an assert so that
      the compiler will fail the assert instead of silently generating
      wrong code if this is ever reached.
      
      llvm-svn: 225446
  18. Jan 07, 2015
  19. Jan 05, 2015
    • Emit the build attribute Tag_conformance. · 8b2caa45
      Charlie Turner authored
      Claim conformance to version 2.09 of the ARM ABI.
      
      This build attribute must be emitted first amongst the build attributes when
      written to an object file. This is to simplify conformance detection by
      consumers.
      
      Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e
      llvm-svn: 225166
  20. Jan 03, 2015
    • ARM: permit tail calls to weak externals on COFF · 67f72993
      Saleem Abdulrasool authored
      Weak externals are resolved statically, so we can actually generate the tail
      call on PE/COFF targets without breaking the requirements.  It is questionable
      whether we want to propagate the current behaviour for MachO, as the requirements
      are part of the ARM ELF specifications, and it seems that prior to SVN
      r215890 we would have tail-called.  For now, be conservative and only
      permit it on PE/COFF where the call will always be fully resolved.
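
      A hedged sketch (hypothetical names): a tail call to an extern_weak
      function, which PE/COFF resolves statically:

          declare extern_weak void @callee()

          define void @caller() {
            tail call void @callee()
            ret void
          }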
      
      llvm-svn: 225119
  21. Dec 23, 2014
    • [ARM] Don't break alignment when combining base updates into load/stores. · 4553bff4
      Ahmed Bougacha authored
      r223862/r224203 tried to also combine base-updating load/stores.
      There was a mistake there: the alignment was added as is as an operand to
      the ARMISD::VLD/VST node.  However, the VLD/VST selection logic doesn't care
      about less-than-standard alignment attributes.
      For example, no matter the alignment of a v2i64 load (say 1), SelectVLD picks
      VLD1q64 (because of the memory type).  But VLD1q64 ("vld1.64 {dXX, dYY}") is
      8-aligned, per ARMARMv7a 3.2.1.
      For the 1-aligned load, what we really want is VLD1q8.
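
      For instance (a hedged sketch, in current syntax):

          define <2 x i64> @f(ptr %p) {
            ; align 1 on a v2i64 load: selecting the 8-aligned vld1.64 would
            ; be wrong; with a bitcast, vld1.8 can be used instead
            %v = load <2 x i64>, ptr %p, align 1
            ret <2 x i64> %v
          }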
      
      This commit introduces bitcasts if necessary, and changes the vld/vst type to
      one whose standard alignment matches the original load/store alignment.
      
      Differential Revision: http://reviews.llvm.org/D6759
      
      llvm-svn: 224754
  22. Dec 22, 2014
  23. Dec 18, 2014
    • Add a new string member to the TargetOptions struct for the name · 661f2d1c
      Eric Christopher authored
      of the ABI we should be using. For targets that don't use the
      option there's no change; otherwise, this allows external users
      to set the ABI via a string and avoid some of the -backend-option
      pain in clang.
      
      Use this option to move the ABI for the ARM port from the
      Subtarget to the TargetMachine and update the testcases
      accordingly since it's no longer valid to set via -mattr.
      
      llvm-svn: 224492
    • Model ARM backend ABI selection after the front end code doing the · 1971c350
      Eric Christopher authored
      same. This will change the "bare metal" ABI from APCS to AAPCS.
      
      The only difference between the front end and back end code is that
      handling for the Triple::GNU environment was added. That will migrate
      to the front end shortly.
      
      Tests updated with the ABI they were originally testing in the case
      of bare metal (e.g. -mtriple armv7) or with a -gnu for arm-linux
      triples.
      
      llvm-svn: 224489
  24. Dec 16, 2014
  25. Dec 15, 2014
    • IR: Make metadata typeless in assembly · be7ea19b
      Duncan P. N. Exon Smith authored
      Now that `Metadata` is typeless, reflect that in the assembly.  These
      are the matching assembly changes for the metadata/value split in
      r223802.
      
        - Only use the `metadata` type when referencing metadata from a call
          intrinsic -- i.e., only when it's used as a `Value`.
      
        - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode`
          when referencing it from call intrinsics.
      
      So, assembly like this:
      
          define void @foo(i32 %v) {
            call void @llvm.foo(metadata !{i32 %v}, metadata !0)
            call void @llvm.foo(metadata !{i32 7}, metadata !0)
            call void @llvm.foo(metadata !1, metadata !0)
            call void @llvm.foo(metadata !3, metadata !0)
            call void @llvm.foo(metadata !{metadata !3}, metadata !0)
            ret void, !bar !2
          }
          !0 = metadata !{metadata !2}
          !1 = metadata !{i32* @global}
          !2 = metadata !{metadata !3}
          !3 = metadata !{}
      
      turns into this:
      
          define void @foo(i32 %v) {
            call void @llvm.foo(metadata i32 %v, metadata !0)
            call void @llvm.foo(metadata i32 7, metadata !0)
            call void @llvm.foo(metadata i32* @global, metadata !0)
            call void @llvm.foo(metadata !3, metadata !0)
            call void @llvm.foo(metadata !{!3}, metadata !0)
            ret void, !bar !2
          }
          !0 = !{!2}
          !1 = !{i32* @global}
          !2 = !{!3}
          !3 = !{}
      
      I wrote an upgrade script that handled almost all of the tests in llvm
      and many of the tests in cfe (even handling many `CHECK` lines).  I've
      attached it (or will attach it in a moment if you're speedy) to PR21532
      to help everyone update their out-of-tree testcases.
      
      This is part of PR21532.
      
      llvm-svn: 224257
  26. Dec 14, 2014
    • Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores." · 0cb86163
      Ahmed Bougacha authored
      r223862 tried to also combine base-updating load/stores.
      r224198 reverted it, as "it created a regression on the test-suite
      on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order
      in which the words are shown."
      Reapply, with a fix to ignore non-normal load/stores.
      Truncstores are handled elsewhere (you can actually write a pattern for
      those, whereas for postinc loads you can't, since they return two values),
      but it should be possible to also combine extload base updates by checking
      that the memory (rather than result) type is of the same size as the addend.
      
      Original commit message:
      We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
      when the base pointer is incremented after the load/store.
      
      We can do the same thing for generic load/stores.
      
      Note that we can only combine the first load/store+adds pair in
      a sequence (as might be generated for a v16f32 load for instance),
      because other combines turn the base pointer addition chain (each
      computing the address of the next load, from the address of the last
      load) into independent additions (common base pointer + this load's
      offset).
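
      A hedged sketch of the generic pattern that can now be combined into a
      post-incrementing vld1 (hypothetical):

          define <4 x float> @f(ptr %p, ptr %pout) {
            ; the load plus the following pointer increment can fold into
            ; one base-updating VLD1_UPD
            %v = load <4 x float>, ptr %p, align 4
            %inc = getelementptr inbounds float, ptr %p, i32 4
            store ptr %inc, ptr %pout
            ret <4 x float> %v
          }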
      
      Differential Revision: http://reviews.llvm.org/D6585
      
      llvm-svn: 224203
  27. Dec 13, 2014
  28. Dec 12, 2014
    • Emit Tag_ABI_FP_16bit_format build attribute. · 1a53996c
      Charlie Turner authored
      The __fp16 type is unconditionally exposed. Since -mfp16-format is not yet
      supported, there is no user switch to change this behaviour. This build
      attribute should capture the default behaviour of the compiler, which is to
      expose the IEEE 754 version of __fp16.
      
      When -mfp16-format is supported, that will be the way to control the value of
      this build attribute.
      
      Change-Id: I8a46641ff0fd2ef8ad0af5f482a6d1af2ac3f6b0
      llvm-svn: 224115
  29. Dec 11, 2014
    • ARM: correctly expand LDR-lit based globals. · 2ac7e4b3
      Tim Northover authored
      Quite a major error here: the expansions for the Pseudos with and without
      folded load were mixed up. Fortunately it only affects ARM-mode, when not using
      movw/movt, on Darwin. I'm guessing no-one actually uses that combination.
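
      A hedged sketch of the kind of code affected (hypothetical): without
      movw/movt, a global access lowers to an LDR from a literal pool, with
      a pseudo variant that also folds the subsequent load:

          @g = external global i32

          define i32 @f() {
            %v = load i32, ptr @g
            ret i32 %v
          }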
      
      llvm-svn: 223986
  30. Dec 10, 2014
    • [ARM] Combine base-updating/post-incrementing vector load/stores. · 7efbac74
      Ahmed Bougacha authored
      We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
      when the base pointer is incremented after the load/store.
      
      We can do the same thing for generic load/stores.
      
      Note that we can only combine the first load/store+adds pair in
      a sequence (as might be generated for a v16f32 load for instance),
      because other combines turn the base pointer addition chain (each
      computing the address of the next load, from the address of the last
      load) into independent additions (common base pointer + this load's
      offset).
      
      Differential Revision: http://reviews.llvm.org/D6585
      
      llvm-svn: 223862
  31. Dec 09, 2014