  1. Jul 17, 2014
    • Alp Toker's avatar
      Drop the udis86 wrapper from llvm::sys · 11698180
      Alp Toker authored
      This optional dependency on the udis86 library was added some time back to aid
      JIT development, but doesn't make much sense to link into LLVM binaries these
      days.
      
      llvm-svn: 213300
      11698180
    • Reid Kleckner's avatar
      TableGen: Add 'static' to a large array to avoid a huge stack allocation · 132c40fd
      Reid Kleckner authored
      Speculative fix for a -Wframe-larger-than warning from gcc.  Clang will
      implicitly promote such constant arrays to globals, so in theory it
      won't hit this.
      
      llvm-svn: 213298
      132c40fd
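As a rough illustration of the idea behind this fix (the function name, table size, and contents below are made up, not taken from TableGen), a large constant array declared `static` has static storage duration, so it does not count toward the enclosing function's stack frame:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table size; chosen only to be "large" for a stack frame.
constexpr std::size_t kTableSize = 64 * 1024;

std::uint8_t lookupFlag(std::size_t Index) {
  // 'static' gives the array static storage duration: it lives in the
  // program image instead of this function's stack frame, so the frame
  // stays small no matter how big the table is.
  static const std::uint8_t Table[kTableSize] = {1, 2, 3};
  assert(Index < kTableSize && "index out of range");
  return Table[Index];
}
```

Without `static`, a compiler that does not promote the constant array to a global may materialize all 64 KiB on the stack for every call, which is the kind of frame size -Wframe-larger-than reports.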
    • Suyog Sarda's avatar
      Rectify r213231. Use proper version of 'ComputeNumSignBits'. · 1a212203
      Suyog Sarda authored
      Earlier when the code was in InstCombine, we were calling the version of ComputeNumSignBits in InstCombine.h
      that automatically added the DataLayout* before calling into ValueTracking.
      When the code moved to InstSimplify, we are calling into ValueTracking directly without passing in the DataLayout*.
      This patch rectifies the same by passing DataLayout in ComputeNumSignBits.
      
      llvm-svn: 213295
      1a212203
    • Lang Hames's avatar
      [MCJIT] Significantly refactor the RuntimeDyldMachO class. · a521688c
      Lang Hames authored
      The previous implementation of RuntimeDyldMachO mixed logic for all targets
      within a single class, creating problems for readability, maintainability, and
      performance. To address these issues, this patch strips the RuntimeDyldMachO
      class down to just target-independent functionality, and moves all
      target-specific functionality into target-specific subclasses of RuntimeDyldMachO.
      
      The new class hierarchy is as follows:
      
      class RuntimeDyldMachO
      Implemented in RuntimeDyldMachO.{h,cpp}
      Contains logic that is completely independent of the target. This consists
      mostly of MachO helper utilities which the derived classes use to get their
      work done.
      
      
      template <typename Impl>
      class RuntimeDyldMachOCRTPBase<Impl> : public RuntimeDyldMachO
      Implemented in RuntimeDyldMachO.h
      Contains generic MachO algorithms/data structures that defer to the Impl class
      for target-specific behaviors.
      
      RuntimeDyldMachOARM : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM>
      RuntimeDyldMachOARM64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM64>
      RuntimeDyldMachOI386 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOI386>
      RuntimeDyldMachOX86_64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOX86_64>
      Implemented in their respective *.h files in lib/ExecutionEngine/RuntimeDyld/MachOTargets
      Each of these contains the relocation logic specific to their target architecture.
      
      llvm-svn: 213293
      a521688c
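The three-level hierarchy described in this commit can be sketched roughly as follows. This is a simplified illustration of the CRTP pattern only: the class names are shortened stand-ins, and `alignTo`, `targetName`, `relocAlignment`, and `describeRelocation` are hypothetical members, not the real RuntimeDyld API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Target-independent helpers (stands in for RuntimeDyldMachO).
class MachOBase {
public:
  // Illustrative utility shared by every target.
  static std::uint64_t alignTo(std::uint64_t Value, std::uint64_t Align) {
    assert(Align != 0 && "alignment must be non-zero");
    return (Value + Align - 1) / Align * Align;
  }
};

// Generic algorithms that defer to Impl for target-specific behavior
// (stands in for RuntimeDyldMachOCRTPBase<Impl>).
template <typename Impl>
class MachOCRTPBase : public MachOBase {
public:
  // The target hooks are resolved at compile time through static_cast,
  // so no virtual dispatch is involved.
  std::string describeRelocation(std::uint64_t Offset) {
    Impl &Self = static_cast<Impl &>(*this);
    return Self.targetName() + "@" +
           std::to_string(alignTo(Offset, Self.relocAlignment()));
  }
};

// Target-specific subclasses; the alignment values are made up.
class MachOX86_64 : public MachOCRTPBase<MachOX86_64> {
public:
  std::string targetName() const { return "x86_64"; }
  std::uint64_t relocAlignment() const { return 1; }
};

class MachOARM64 : public MachOCRTPBase<MachOARM64> {
public:
  std::string targetName() const { return "arm64"; }
  std::uint64_t relocAlignment() const { return 4; }
};
```

Each target class supplies only its own hooks, while the shared driver lives once in the CRTP base and the target-neutral utilities live in the plain base class.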
    • Alexey Samsonov's avatar
      [ASan] Don't instrument load/stores with !nosanitize metadata. · 535b6f93
      Alexey Samsonov authored
      This is used to avoid instrumentation of instructions added by UBSan
      in Clang frontend (see r213291). This fixes PR20085.
      
      Reviewed in http://reviews.llvm.org/D4544.
      
      llvm-svn: 213292
      535b6f93
    • Hans Wennborg's avatar
      Typo: exists -> exits · 0fc52d4e
      Hans Wennborg authored
      llvm-svn: 213290
      0fc52d4e
    • Justin Holewinski's avatar
      [NVPTX] Improve handling of FP fusion · 428cf0e4
      Justin Holewinski authored
      We now consider the FPOpFusion flag when determining whether
      to fuse ops.  We also explicitly emit add.rn when fusion is
      disabled to prevent ptxas from fusing the operations on its
      own.
      
      llvm-svn: 213287
      428cf0e4
    • Matt Arsenault's avatar
      Fix typos · 97483694
      Matt Arsenault authored
      llvm-svn: 213285
      97483694
    • Adam Nemet's avatar
      [X86] AVX512: Add disassembler support for compressed displacement · 5933c2f8
      Adam Nemet authored
      There are two parts here.  First is to modify tablegen to adjust the encoding
      type ENCODING_RM with the scaling factor.
      
      The second is to use the new encoding types to compute the correct
      displacement in the decoder.
      
      Fixes <rdar://problem/17608489>
      
      llvm-svn: 213281
      5933c2f8
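For readers unfamiliar with compressed displacement: in EVEX encodings an 8-bit displacement is sign-extended and multiplied by a scaling factor N that depends on the instruction. A minimal sketch of that arithmetic, assuming the scale has already been determined (the function names are hypothetical and this is not the decoder code from the patch):

```cpp
#include <cassert>
#include <cstdint>

// Decode: sign-extend the stored byte, then multiply by the scale N.
std::int32_t decodeCompressedDisp8(std::uint8_t Stored, std::int32_t Scale) {
  assert(Scale > 0 && "scale must be positive");
  return static_cast<std::int8_t>(Stored) * Scale;
}

// Encode check: a displacement can use the compressed form only if it is a
// multiple of the scale and the quotient fits in a signed byte. On success
// the byte to store is written to Stored.
bool fitsCompressedDisp8(std::int32_t Disp, std::int32_t Scale,
                         std::uint8_t &Stored) {
  assert(Scale > 0 && "scale must be positive");
  if (Disp % Scale != 0)
    return false;
  std::int32_t Quotient = Disp / Scale;
  if (Quotient < -128 || Quotient > 127)
    return false;
  Stored = static_cast<std::uint8_t>(static_cast<std::int8_t>(Quotient));
  return true;
}
```

With a scale of 64, for example, a stored byte of 0x02 decodes to a displacement of 128, while a displacement of 100 cannot be compressed because it is not a multiple of 64.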
    • Adam Nemet's avatar
      [X86] AVX512: Rename EVEX_CD8V to CD8_Form · 4c339aba
      Adam Nemet authored
      This is to match the naming of CD8_EltSize, CD8_Scale, etc.
      
      No functional change.
      
      llvm-svn: 213280
      4c339aba
    • Adam Nemet's avatar
      [X86] AVX512: Use the TD version of CD8_Scale in the assembler · 54adb0fc
      Adam Nemet authored
      Passes the computed scaling factor in TSFlags rather than the old attributes.
      
      Also removes the C++ version of computing the scaling factor (MemObjSize)
      along with the asserts added by the previous patch.
      
      No functional change.
      
      llvm-svn: 213279
      54adb0fc
    • Adam Nemet's avatar
      [X86] AVX512: Move compressed displacement logic to TD · 4dc92b9a
      Adam Nemet authored
      This does not actually move the logic yet but reimplements it in the Tablegen
      language.  Then asserts that the new implementation results in the same value.
      
      The next patch will remove the assert and the temporary use of the TSFlags and
      remove the C++ implementation.
      
      The formula requires a limited form of the logical left and right shift operators.
      I implemented these with the bit-extract/insert operator (i.e. blah{bits}).
      
      No functional change.
      
      llvm-svn: 213278
      4dc92b9a
    • Adam Nemet's avatar
      [TableGen] Allow shift operators to take bits<n> · 017fca02
      Adam Nemet authored
      Convert the operand to int if possible, i.e. if the value is properly
      initialized.  (I suppose there is further room for improvement here to also
      perform the shift if the uninitialized bits are shifted out.)
      
      With this little change we can now compute the scaling factor for compressed
      displacement with pure tablegen code in the X86 backend.  This is useful
      because both the X86-disassembler-specific part of tablegen and the assembler
      need this and TD is the natural sharing place.
      
      The patch also adds the missing documentation for the shift and add operator.
      
      llvm-svn: 213277
      017fca02
    • Justin Holewinski's avatar
      [NVPTX] Add missing .v4 qualifier on vector store instruction · e5a1173f
      Justin Holewinski authored
      llvm-svn: 213276
      e5a1173f
    • Saleem Abdulrasool's avatar
      MC: correct DWARF header for PE/COFF assembly input · 7d09530c
      Saleem Abdulrasool authored
      The header contains an offset to the DWARF abbreviations for the CU.  The offset
      must be section relative for COFF and absolute for others.  The non-assembly
      code path for the DWARF header generation already had the correct emission for
      the headers.  This corrects just the assembly path.  Due to the invalid
      relocation, processing of the debug information would halt previously on the
      first assembly input as the associated abbreviations would be out of range as
      they would have the location increased by image base and the section offset.
      
      This addresses PR20332.
      
      llvm-svn: 213275
      7d09530c
    • Saleem Abdulrasool's avatar
      MC: fix MCAsmInfo usage for windows-itanium · 862e60c7
      Saleem Abdulrasool authored
      Windows itanium uses the GNUCOFF assembly format, not ELF.
      
      llvm-svn: 213274
      862e60c7
    • Saleem Abdulrasool's avatar
      MC: collapse emission of producer · 19f8bc65
      Saleem Abdulrasool authored
      Rather than use three EmitBytes, concatenate the string at compile time,
      constructing a single StringRef and emitting the data in one shot.  This also
      creates nicer assembly output.  NFC.
      
      llvm-svn: 213273
      19f8bc65
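The technique mentioned here, joining string pieces at compile time so that one emit call suffices, can be sketched as below. The macros, the `ExampleStreamer` type, and the producer text are all hypothetical; a plain character array stands in for the StringRef used in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical build-time constants.
#define EXAMPLE_TOOL_NAME "example-as"
#define EXAMPLE_TOOL_VERSION "1.0"

// Adjacent string literals are merged by the compiler into one literal,
// so the full producer string exists as a single constant.
static const char Producer[] =
    EXAMPLE_TOOL_NAME " version " EXAMPLE_TOOL_VERSION;

// Stand-in for a streamer: appends bytes to Out and counts emit calls.
struct ExampleStreamer {
  std::string Out;
  unsigned NumEmits = 0;

  void emitBytes(const char *Data, std::size_t Size) {
    assert(Data != nullptr && "no data to emit");
    Out.append(Data, Size);
    ++NumEmits;
  }
};

void emitProducer(ExampleStreamer &S) {
  // sizeof counts the trailing NUL of the literal; drop it here.
  S.emitBytes(Producer, sizeof(Producer) - 1);
}
```

The output bytes are the same as with three separate emits; only the number of calls changes, which is why such a change can be labelled NFC.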
    • Justin Holewinski's avatar
      [NVPTX] Flag surface/texture query instructions with IsTexSurfQuery · 18cfe7d6
      Justin Holewinski authored
      Also, add some tests to make sure we can handle surface/texture
      queries on both Fermi and Kepler+.
      
      llvm-svn: 213268
      18cfe7d6
    • Justin Holewinski's avatar
      [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch · 9a2350e4
      Justin Holewinski authored
      This also uses TSFlags to mark machine instructions that are surface/texture
      accesses, as well as the vector width for surface operations.  This is used
      to simplify some of the switch statements that need to detect surface/texture
      instructions.
      
      llvm-svn: 213256
      9a2350e4
    • Tim Northover's avatar
      ARM: support direct f16 <-> f64 conversions · 53f3bcf0
      Tim Northover authored
      ARMv8 has instructions to handle it, otherwise a libcall is needed.
      
      llvm-svn: 213254
      53f3bcf0
    • Justin Holewinski's avatar
      8a5bf7fa
    • Tim Northover's avatar
      CodeGen: generate single libcall for fptrunc -> f16 operations. · 84ce0a64
      Tim Northover authored
      Previously we asserted on this code. Currently compiler-rt doesn't
      actually implement any of these new libcalls, but external help is
      pretty much the only viable option for LLVM.
      
      I've followed the much more generic "__truncST2" naming, as opposed to
      the odd name for f32 -> f16 truncation. This can obviously be changed
      later, or overridden by any targets that need to.
      
      llvm-svn: 213252
      84ce0a64
    • Tim Northover's avatar
      X86: support double extension of f16 type. · 21310448
      Tim Northover authored
      x86 has no native ability to extend an f16 to f64, but the same result
      is obtained if we expand it into two separate extensions: f16 -> f32
      -> f64.
      
      Unfortunately the same is not true for truncate, so that still results
      in a compilation failure.
      
      llvm-svn: 213251
      21310448
    • Tim Northover's avatar
      CodeGen: extend f16 conversions to permit types > float. · fd7e4249
      Tim Northover authored
      This makes the two intrinsics @llvm.convert.from.f16 and
      @llvm.convert.to.f16 accept types other than simple "float". This is
      only strictly needed for the truncate operation, since otherwise
      double rounding occurs and there's no way to represent the strict IEEE
      conversion. However, for symmetry we allow larger types in the extend
      too.
      
      During legalization, we can expand an "fp16_to_double" operation into
      two extends for convenience, but abort when the truncate isn't legal. A new
      libcall is probably needed here.
      
      Even after this commit, various target tweaks are needed to actually use the
      extended intrinsics. I've put these into separate commits for clarity, so there
      are no actual tests of f64 conversion here.
      
      llvm-svn: 213248
      fd7e4249
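The double-rounding problem mentioned above can be seen with a small experiment. The sketch below does not implement a real f16 conversion: `roundToHalfGrid` is a hypothetical helper that only handles values in [1, 2), where half-precision values are spaced 2^-10 apart, and it assumes the default round-to-nearest-even mode.

```cpp
#include <cassert>
#include <cmath>

// Round a value in [1, 2) to the nearest multiple of 2^-10 (the spacing of
// half-precision values in that range), ties to even. Simplified stand-in
// for a conversion to f16; not valid outside [1, 2).
double roundToHalfGrid(double X) {
  assert(X >= 1.0 && X < 2.0 && "sketch only handles [1, 2)");
  // Scaling by a power of two is exact; nearbyint rounds ties to even in
  // the default rounding mode.
  return std::nearbyint(X * 1024.0) / 1024.0;
}

// One rounding step: f64 -> f16 grid.
double truncateDirect(double X) { return roundToHalfGrid(X); }

// Two rounding steps: f64 -> f32, then f32 -> f16 grid.
double truncateViaFloat(double X) {
  float F = static_cast<float>(X);
  return roundToHalfGrid(static_cast<double>(F));
}

// A value just above the midpoint between 1 and 1 + 2^-10.
double doubleRoundingInput() {
  return 1.0 + std::ldexp(1.0, -11) + std::ldexp(1.0, -30);
}
```

For this input the direct conversion rounds up to 1 + 2^-10, but the intermediate float loses the 2^-30 term, lands exactly on the midpoint, and then rounds to even, giving 1.0. The extension direction has no such issue because every f16 value is exactly representable in f32 and f64.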
    • Yi Kong's avatar
      Port memory barriers intrinsics to AArch64 · 2355066e
      Yi Kong authored
      Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
      implement their corresponding ACLE and MSVC intrinsics.
      
      This patch ports the ARM dmb, dsb, and isb intrinsics to AArch64.
      
      Differential Revision: http://reviews.llvm.org/D4520
      
      llvm-svn: 213247
      2355066e
    • Daniel Sanders's avatar
      [mips] .reginfo is 8 byte aligned on N32. · 701e9616
      Daniel Sanders authored
      Differential Revision: http://reviews.llvm.org/D4540
      
      llvm-svn: 213246
      701e9616
    • Daniel Sanders's avatar
      [mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple · 7f70573e
      Daniel Sanders authored
      Summary:
      Generally speaking, mips-* vs mips64-* should not be used to make decisions
      about the content or format of the ELF. This should be based on the ABI
      and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64`
      should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`.
      Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as
      should `mips-linux-gnu-clang -mips64r2 -mabi=n32`.
      
      This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now
      since there is no apparent way to base this decision on the ABI and CPU.
      
      Differential Revision: http://reviews.llvm.org/D4539
      
      llvm-svn: 213244
      7f70573e
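The rule argued for in this summary, choosing the ELF class from the ABI rather than from the triple, might look like the sketch below. This only illustrates the intended mapping; the enum and function names are hypothetical, and the patch itself explicitly leaves the ELF32 vs ELF64 decision unchanged.

```cpp
#include <cassert>

// Hypothetical enums for illustration only.
enum class MipsABI { O32, N32, N64 };
enum class ELFClass { ELF32, ELF64 };

// Pick the ELF class from the ABI alone, ignoring whether the triple was
// spelled mips-* or mips64-*.
ELFClass elfClassForABI(MipsABI ABI) {
  switch (ABI) {
  case MipsABI::O32:
  case MipsABI::N32:
    return ELFClass::ELF32; // N32 uses 64-bit registers but a 32-bit ELF.
  case MipsABI::N64:
    return ELFClass::ELF64;
  }
  assert(false && "unknown ABI");
  return ELFClass::ELF32;
}
```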
    • Daniel Sanders's avatar
      [mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6 · 185f23ad
      Daniel Sanders authored
      Summary:
      The cpr1_size field describes the minimum register width to run the program
      rather than the size of the registers on the target. MIPS32r6 was acting
      as if -mfp64 had been given because it starts off with 64-bit FPU registers.
      
      Differential Revision: http://reviews.llvm.org/D4538
      
      llvm-svn: 213243
      185f23ad
    • Daniel Sanders's avatar
      [mips] Fix ELF e_flags related to -mabicalls and -mplt. · 16ec6c19
      Daniel Sanders authored
      Summary:
      These options are not implemented yet but we act as if they are always
      given.
      
      The integrated assembler is driven by the clang driver so the e_flag test
      cases should match the e_flags emitted by GCC+GAS rather than GAS
      by itself.
      
      Differential Revision: http://reviews.llvm.org/D4536
      
      llvm-svn: 213242
      16ec6c19
    • Yi Kong's avatar
      Fix the prefix for arm64 triple · 7d78ab57
      Yi Kong authored
      Triple.cpp still returns "arm64" as the prefix for the arm64 triple, which
      prevents Clang from selecting the correct GCCBuiltin IR.
      
      This patch changes the value to correct prefix "aarch64". Regression test will
      be added in the coming patch.
      
      Differential Revision: http://reviews.llvm.org/D4516
      
      llvm-svn: 213240
      7d78ab57
    • Evgeniy Stepanov's avatar
      [msan] Avoid redundant origin stores. · c8227aa1
      Evgeniy Stepanov authored
      Origin is meaningless for fully initialized values. Avoid
      storing origin for function arguments that are known to
      be always initialized (i.e. shadow is a compile-time null
      constant).
      
      This is not about correctness, but purely an optimization.
      Seems to affect compilation time of blacklisted functions
      significantly.
      
      llvm-svn: 213239
      c8227aa1
    • Suyog Sarda's avatar
      Move ashr optimization from InstCombineShift to InstSimplify. · 68862414
      Suyog Sarda authored
      Refactor code, no functionality change, test case moved from instcombine to instsimplify.
      
      Differential Revision: http://reviews.llvm.org/D4102
       
      
      llvm-svn: 213231
      68862414
    • Matt Arsenault's avatar
      Use range for · ac6e39cf
      Matt Arsenault authored
      llvm-svn: 213230
      ac6e39cf
    • Matt Arsenault's avatar
      R600: Short circuit alloca check if address space isn't private. · 5e2b0f51
      Matt Arsenault authored
      Skip calling GetUnderlyingObject in cases where it obviously
      isn't from an alloca. This should only be a compile time improvement.
      
      llvm-svn: 213229
      5e2b0f51
    • Suyog Sarda's avatar
      Fix Typo (first commit to test commit access) · de409fd7
      Suyog Sarda authored
      llvm-svn: 213228
      de409fd7
    • Eric Fiselier's avatar
      [lit] Add --show-unsupported flag to LIT · 5cfa2e46
      Eric Fiselier authored
      llvm-svn: 213227
      5cfa2e46
    • Saleem Abdulrasool's avatar
      MC: make WinEH opcode an opaque value · ab820860
      Saleem Abdulrasool authored
      This makes the opcode an opaque value (unsigned int) rather than the
      enumeration.  This permits the use of target specific operands.
      
      Split out the generic type into a MCWinEH header and add a supporting
      MCWin64EH::Instruction to abstract out the selection of the opcode and
      construction of the actual instruction.
      
      llvm-svn: 213221
      ab820860
    • Hal Finkel's avatar
      Improve BasicAA CS-CS queries (redux) · 354e23b0
      Hal Finkel authored
      This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
      causes PR20303." with a fix for the bug in pr20303. As it turned out, the
      relevant code was both wrong and over-conservative (because, as with the code
      it replaced, it would return the overall ModRef mask even if just Ref had been
      implied by the argument aliasing results). Hopefully, this correctly fixes both
      problems.
      
      Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
      cleaned up a little and added in DSE's test directory). The BasicAA test has
      also been updated to check for this error.
      
      Original commit message:
      
      BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
      and uses that information to form more-accurate answers to CallSite vs. Loc
      ModRef queries. Unfortunately, it did not use this information when answering
      CallSite vs. CallSite queries.
      
      Generically, when an intrinsic takes one or more pointers and the intrinsic is
      marked only to read/write from its arguments, the offset/size is unknown. As a
      result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
      Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
      arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
      size information for some intrinsics, it did not do the same for CallSite vs.
      CallSite queries.
      
      This change refactors the intrinsic-specific logic in BasicAA into a generic AA
      query function: getArgLocation, which is overridden by BasicAA to supply the
      intrinsic-specific knowledge, and used by AA's generic implementation. This
      allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
      CallSite vs. CallSite queries, and simplifies the BasicAA implementation.
      
      Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
      (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
      getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
      this function (which is an improvement).
      
      llvm-svn: 213219
      354e23b0
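The refactoring described in the original commit message, a generic query that a more knowledgeable analysis overrides, can be sketched as follows. All types here are hypothetical simplifications; the real `getArgLocation` works on LLVM call sites and has a different signature.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sentinel meaning "size not known" (mirrors the UnknownSize idea).
constexpr std::uint64_t UnknownSize =
    std::numeric_limits<std::uint64_t>::max();

// Simplified memory location: a pointer plus an access size.
struct Location {
  const void *Ptr;
  std::uint64_t Size;
};

// Simplified call: a kind, two pointer arguments, and a length operand.
struct ExampleCall {
  enum Kind { Generic, Memcpy } K;
  const void *Args[2];
  std::uint64_t Len;
};

// Generic analysis: knows nothing about the callee, so the size is unknown.
class ExampleAA {
public:
  virtual ~ExampleAA() = default;
  virtual Location getArgLocation(const ExampleCall &C,
                                  unsigned ArgIdx) const {
    assert(ArgIdx < 2 && "argument index out of range");
    return {C.Args[ArgIdx], UnknownSize};
  }
};

// Override with callee-specific knowledge: a memcpy-like call touches
// exactly Len bytes through each pointer argument.
class ExampleBasicAA : public ExampleAA {
public:
  Location getArgLocation(const ExampleCall &C,
                          unsigned ArgIdx) const override {
    if (C.K == ExampleCall::Memcpy) {
      assert(ArgIdx < 2 && "argument index out of range");
      return {C.Args[ArgIdx], C.Len};
    }
    return ExampleAA::getArgLocation(C, ArgIdx);
  }
};
```

Because both kinds of query go through the same virtual hook, any generic code that asks for an argument's location picks up the more precise size without needing its own intrinsic-specific logic.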