  1. Mar 10, 2014
    • Benjamin Kramer's avatar
      MemCpyOpt: When merging memsets also merge the trivial case of two memsets... · 3ef5e46b
      Benjamin Kramer authored
      MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination.
      
      The testcase is from PR19092, but I think the bug described there is actually a clang issue.
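
      To illustrate the trivial case, here is a minimal C sketch. It assumes both
      memsets write the same byte value, which the message does not state, so that
      one memset over the longer length is equivalent. The function names are
      hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two memsets with the same destination, as they might look before merging. */
void fill_twice(unsigned char *dst, int value, size_t n1, size_t n2) {
    memset(dst, value, n1);
    memset(dst, value, n2);
}

/* Hypothetical merged form: a single memset covering the longer length.
   Assumes both original memsets store the same byte value. */
void fill_merged(unsigned char *dst, int value, size_t n1, size_t n2) {
    memset(dst, value, n1 > n2 ? n1 : n2);
}

/* Returns 1 when both forms leave a 32-byte buffer in the same state.
   n1 and n2 must not exceed 32. */
int merged_matches(int value, size_t n1, size_t n2) {
    unsigned char a[32], b[32];
    memset(a, 0xAA, sizeof a);
    memset(b, 0xAA, sizeof b);
    fill_twice(a, value, n1, n2);
    fill_merged(b, value, n1, n2);
    return memcmp(a, b, sizeof a) == 0;
}
```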
      
      llvm-svn: 203489
      3ef5e46b
    • Evan Cheng's avatar
      For functions with ARM target specific calling convention, when simplify-libcall · 0e8f4612
      Evan Cheng authored
      optimizes a call to an LLVM intrinsic into something that involves a C
      library call, make sure it sets the right calling convention on the call.
      
      e.g.
      extern double pow(double, double);
      double t(double x) {
        return pow(10, x);
      }
      
      Compiles to something like this for AAPCS-VFP:
      define arm_aapcs_vfpcc double @t(double %x) #0 {
      entry:
        %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
        ret double %0
      }
      
      declare double @llvm.pow.f64(double, double) #1
      
      Simplify libcall (part of instcombine) will turn the above into:
      define arm_aapcs_vfpcc double @t(double %x) #0 {
      entry:
        %__exp10 = call double @__exp10(double %x) #1
        ret double %__exp10
      }
      
      declare double @__exp10(double)
      
      The pre-instcombine code works because calls to LLVM builtins are special.
      Instruction selection will choose the right calling convention for the call.
      However, the code after instcombine is wrong. The call to __exp10 will use
      the C calling convention.
      
      I can think of 3 options to fix this.
      
      1. Make "C" calling convention just work since the target should know what CC
         is being used.
      
         This doesn't work because each function can use different CC with the "pcs"
         attribute.
      
      2. Have Clang add the right CC keyword on the calls to LLVM builtin.
      
         This will work but it doesn't match the LLVM IR specification which states
         these are "Standard C Library Intrinsics".
      
      3. Fix simplify libcall so the resulting calls to the C routines will have the
         proper CC keyword. e.g.
         %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1
      
         This works and is the solution I implemented here.
      
      Both solutions #2 and #3 would work. After carefully considering the pros and
      cons, I decided to implement #3 for the following reasons.
      
      1. It doesn't change the "spec" of the intrinsics.
      2. It's a self-contained fix.
      
      There are a couple of potential downsides.
      1. There could be other places in the optimizer that are broken in the same
         way and are not addressed by this.
      2. There could be other calling conventions that need to be propagated by
         simplify-libcall and are not handled.
      
      But for now, this is the fix that I'm most comfortable with.
      
      llvm-svn: 203488
      0e8f4612
    • Eli Bendersky's avatar
      Followup to r203483 - add test. · d47a5c2d
      Eli Bendersky authored
      [forgot to 'svn add' before committing r203483]
      
      llvm-svn: 203485
      d47a5c2d
    • Sasa Stankovic's avatar
      [mips] Implement NaCl sandboxing of loads, stores and SP changes: · 5fddf610
      Sasa Stankovic authored
        * Add masking instructions before loads and stores (in MC layer).
        * Add masking instructions after SP changes (in MC layer).
        * Forbid loads, stores and SP changes in delay slots (in MI layer).
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D2904
      
      llvm-svn: 203484
      5fddf610
    • Adam Nemet's avatar
      [bugpoint] Add testcase for r203343. · 47492919
      Adam Nemet authored
      llvm-svn: 203472
      47492919
    • Reed Kotler's avatar
      Fix regression with -O0 for mips. · 96b7402b
      Reed Kotler authored
      llvm-svn: 203469
      96b7402b
    • Tim Northover's avatar
      AArch64: fix LowerCONCAT_VECTORS for new CodeGen. · 2a661f3f
      Tim Northover authored
      The function was making too many assumptions about its input:
      
      1. The NEON_VDUP optimisation was far too aggressive, assuming (I
      think) that the input would always be BUILD_VECTOR.
      
      2. We were treating most unknown concats as legal (by returning Op
      rather than SDValue()). I think only concats of pairs of vectors are
      actually legal.
      
      http://llvm.org/PR19094
      
      llvm-svn: 203450
      2a661f3f
    • Venkatraman Govindaraju's avatar
      [Sparc] Add support for decoding 'swap' instruction. · f703132b
      Venkatraman Govindaraju authored
      llvm-svn: 203424
      f703132b
  2. Mar 09, 2014
    • NAKAMURA Takumi's avatar
      Revert r203230, "CodeGenPrep: sink extends of illegal types into use block." · 1783e1e9
      NAKAMURA Takumi authored
      It choked i686 stage2.
      
      llvm-svn: 203386
      1783e1e9
    • David Majnemer's avatar
      IR: Change inalloca's grammar a bit · c4ab61cb
      David Majnemer authored
      The grammar for LLVM IR is not well specified in any document but seems
      to obey the following rules:
      
       - Attributes which have parenthesized arguments are never preceded by
         commas.  This form of attribute is the only one which ever has
         optional arguments.  However, not all of these attributes support
         optional arguments: 'thread_local' supports an optional argument but
         'addrspace' does not.  Interestingly, 'addrspace' is documented as
         being a "qualifier".  What constitutes a qualifier?  I cannot find a
         definition.
      
       - Some attributes use a space between the keyword and the value.
         Examples of this form are 'align' and 'section'.  These are always
         preceded by a comma.
      
       - Otherwise, the attribute has no argument.  These attributes do not
         have a preceding comma.
      
      Sometimes an attribute goes before the instruction, between the
      instruction and its type, or after its type.  'atomicrmw' has
      'volatile' between the instruction and the type while 'call' has 'tail'
      preceding the instruction.
      
      With all this in mind, it seems most consistent for 'inalloca' on an
      'alloca' instruction to occur between the instruction and the
      type.  Unlike the current formulation, there would be no preceding
      comma.  The combination 'alloca inalloca' doesn't look particularly
      appetizing; perhaps a better spelling of 'inalloca' is down the road.
      
      llvm-svn: 203376
      c4ab61cb
  3. Mar 08, 2014
  4. Mar 07, 2014
  5. Mar 06, 2014
    • Rafael Espindola's avatar
      Remove shouldEmitUsedDirectiveFor. · 3b30cb41
      Rafael Espindola authored
      Clang now uses llvm.compiler.used for these cases.
      
      llvm-svn: 203174
      3b30cb41
    • Rafael Espindola's avatar
      Convert test to FileCheck. · 123256a4
      Rafael Espindola authored
      llvm-svn: 203173
      123256a4
    • Andrea Di Biagio's avatar
      [X86] Teach the DAGCombiner how to fold an OR of two shufflevector nodes. · 6292a140
      Andrea Di Biagio authored
      This patch teaches the DAGCombiner how to fold a binary OR between two
      shufflevector nodes into a single shufflevector node when possible.
      
      The rules are:
        1. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask1)
        2. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf B, A, Mask2)
      
      The DAGCombiner can take advantage of the fact that OR is commutative and
      compute two possible shuffle masks (Mask1 and Mask2) for the resulting
      shuffle node.
      
      Before folding a DAG according to either rule 1 or 2, DAGCombiner verifies
      that the resulting shuffle mask is legal for the target.
      DAGCombiner first tries to fold according to rule 1; if that is not
      possible, it tries to fold according to rule 2.
      If both Mask1 and Mask2 are illegal, we conservatively don't fold
      the OR instruction.
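
      Rule 1 can be sketched in C on 4-lane vectors. This is an illustration, not
      the DAGCombiner code. It assumes V_0 is the all-zero vector and that the fold
      applies only when, in every lane, exactly one of the two input shuffles reads
      its non-zero operand. The helper names are hypothetical, and the target
      legality check on the resulting mask is left out.

```c
#include <assert.h>

#define LANES 4

/* Two-input shuffle: mask index 0..3 selects a lane of a, 4..7 a lane of b. */
void shuffle(const unsigned *a, const unsigned *b, const int *mask,
             unsigned *out) {
    for (int i = 0; i < LANES; i++)
        out[i] = mask[i] < LANES ? a[mask[i]] : b[mask[i] - LANES];
}

/* Builds Mask1 for (shuf A, B, Mask1) from MA and MB.
   Assumption of this sketch: each lane must read the non-zero operand in
   exactly one of the two input shuffles. Returns 1 on success, 0 otherwise. */
int build_mask1(const int *ma, const int *mb, int *mask1) {
    for (int i = 0; i < LANES; i++) {
        int reads_a = ma[i] < LANES;
        int reads_b = mb[i] < LANES;
        if (reads_a == reads_b)
            return 0;
        /* Lanes of B sit at indices 4..7 in the combined shuffle. */
        mask1[i] = reads_a ? ma[i] : mb[i] + LANES;
    }
    return 1;
}

/* Compares the unfolded OR with the folded shuffle on one concrete input. */
int fold_preserves_result(void) {
    const unsigned a[LANES] = {1, 2, 3, 4};
    const unsigned b[LANES] = {10, 20, 30, 40};
    const unsigned zero[LANES] = {0, 0, 0, 0};
    const int ma[LANES] = {0, 1, 4, 4}; /* lanes 0-1 from A, rest zero */
    const int mb[LANES] = {4, 4, 2, 3}; /* lanes 2-3 from B, rest zero */
    unsigned sa[LANES], sb[LANES], folded[LANES];
    int mask1[LANES];

    if (!build_mask1(ma, mb, mask1))
        return 0;
    shuffle(a, zero, ma, sa);
    shuffle(b, zero, mb, sb);
    shuffle(a, b, mask1, folded);
    for (int i = 0; i < LANES; i++)
        if ((sa[i] | sb[i]) != folded[i])
            return 0;
    return 1;
}

/* Lane 0 reads a non-zero operand in both shuffles, so no mask is built. */
int rejects_overlapping_lanes(void) {
    const int ma[LANES] = {0, 1, 4, 4};
    const int mb[LANES] = {0, 4, 2, 3};
    int mask1[LANES];
    return build_mask1(ma, mb, mask1) == 0;
}
```

      Rule 2 is the same construction with the roles of A and B swapped, which is
      valid because OR is commutative.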
      
      llvm-svn: 203156
      6292a140
    • Rafael Espindola's avatar
      Fix the printing of n_type. · 1194e69f
      Rafael Espindola authored
      Despite the name, n_type contains not only the type of the symbol but also
      whether it is extern or private extern.
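
      As a rough sketch of what that means, the field can be split with bit masks.
      The mask values below mirror those in <mach-o/nlist.h>; the helper functions
      are hypothetical and are not the code changed by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Bit masks for the Mach-O n_type field, with the values from <mach-o/nlist.h>. */
#define N_STAB 0xe0 /* any of these bits set: symbolic debugging entry */
#define N_PEXT 0x10 /* private external symbol bit */
#define N_TYPE 0x0e /* mask for the type bits */
#define N_EXT  0x01 /* external symbol bit */

/* Hypothetical helpers that pull the separate pieces out of n_type. */
int symbol_type(uint8_t n_type)       { return n_type & N_TYPE; }
int is_extern(uint8_t n_type)         { return (n_type & N_EXT) != 0; }
int is_private_extern(uint8_t n_type) { return (n_type & N_PEXT) != 0; }
int is_stab(uint8_t n_type)           { return (n_type & N_STAB) != 0; }
```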
      
      llvm-svn: 203154
      1194e69f
    • Matt Arsenault's avatar
      R600: Fix extloads from i8 / i16 to i64. · f9a995d6
      Matt Arsenault authored
      This appears to only be working for global loads. Private
      and local break for other reasons.
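
      For readers unfamiliar with the term, an extending load reads a narrow value
      and widens it to the result type. The C sketch below only shows what the zero-
      and sign-extended results look like for i8 and i16 widened to i64. It is not
      the R600 lowering, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: the upper bits of the 64-bit result are cleared. */
uint64_t zext_i8_to_i64(uint8_t bits)   { return (uint64_t)bits; }
uint64_t zext_i16_to_i64(uint16_t bits) { return (uint64_t)bits; }

/* Sign extension: the upper bits copy the sign bit of the narrow value. */
int64_t sext_i8_to_i64(uint8_t bits) {
    return (bits & 0x80) ? (int64_t)bits - 0x100 : (int64_t)bits;
}

int64_t sext_i16_to_i64(uint16_t bits) {
    return (bits & 0x8000) ? (int64_t)bits - 0x10000 : (int64_t)bits;
}
```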
      
      llvm-svn: 203135
      f9a995d6