  1. Aug 09, 2018
    • Reid Kleckner's avatar
      [GlobalOpt] Don't apply fastcc if it would break inalloca invariants · 80c6ec11
      Reid Kleckner authored
      The inalloca parameter has to be the only parameter passed in memory.
      Changing the convention to fastcc can break that.
      
      At some point we should teach global opt how to optimize ABI attributes
      like inalloca and maybe byval. These attributes are mainly used to match
      C ABIs. They are harder for LLVM to optimize and they don't always
      generate the best code.
      
      Fixes PR38487
      
      llvm-svn: 339360
      80c6ec11
    • Sanjay Patel's avatar
      [SelectionDAG] try harder to convert funnel shift to rotate · 15d1501a
      Sanjay Patel authored
      Similar to rL337966: if the DAGCombiner's rotate matching were
      working as expected, I don't think we'd see any test diffs here.
      
      AArch64 only rotates right, and PPC only rotates left.
      x86 has both, so there are no diffs there.
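
As a rough illustration of why a funnel shift whose two inputs are the same value is a rotate, and how a left rotate can be expressed on a target that only rotates right, here is a small Python model on 32-bit values. The helper names (`fshl32`, `rotl32`, `rotr32`) are invented for this sketch and are not LLVM APIs.

```python
MASK32 = 0xFFFFFFFF


def fshl32(hi, lo, amt):
    """Funnel shift left: concatenate hi:lo, shift left by amt mod 32,
    and keep the upper 32 bits of the 64-bit result."""
    amt %= 32
    concat = ((hi & MASK32) << 32) | (lo & MASK32)
    return ((concat << amt) >> 32) & MASK32


def rotl32(x, amt):
    """Rotate a 32-bit value left by amt mod 32."""
    amt %= 32
    x &= MASK32
    return ((x << amt) | (x >> (32 - amt))) & MASK32


def rotr32(x, amt):
    """Rotate a 32-bit value right by amt mod 32."""
    amt %= 32
    x &= MASK32
    return ((x >> amt) | (x << (32 - amt))) & MASK32


x = 0x12345678
# With identical inputs, the funnel shift is a plain rotate left.
same_as_rotl = fshl32(x, x, 8) == rotl32(x, 8)
# A rotate-right-only target can use the complementary amount.
same_as_rotr = rotl32(x, 8) == rotr32(x, 32 - 8)
```

The second comparison is the reason a target with only one rotate direction can still cover both: the rotate amount is simply replaced by its complement modulo the bit width.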
      
      Differential Revision: https://reviews.llvm.org/D50091
      
      llvm-svn: 339359
      15d1501a
    • Michael Berg's avatar
      extend folding fsub/fadd to fneg for FMF · ca382546
      Michael Berg authored
      Summary: This change provides a common optimization path for both Unsafe and FMF-driven optimization of this fsub fold by adding a check for reassociation, as it is the flag that most closely represents the transformation.
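
A small Python sketch of why a fold like this needs a fast-math flag. It assumes the fold in question has the shape `x - (x + y) -> -y`; the listing here does not spell out the exact pattern, so treat that shape as an assumption. Python floats are IEEE doubles, so they show the rounding difference directly.

```python
def fsub_of_fadd(x, y):
    """Evaluate x - (x + y) with ordinary IEEE double rounding."""
    return x - (x + y)


def folded(y):
    """The reassociated form: the x terms cancel algebraically, leaving -y."""
    return -y


# For small, exactly representable values the two forms agree.
agree = fsub_of_fadd(1.0, 2.0) == folded(2.0)

# When x is large, x + y rounds back to x, so the unfolded form yields 0.0
# while the folded form yields -0.5. The results differ, which is why the
# fold is only valid when reassociation is permitted.
differ = fsub_of_fadd(1e16, 0.5) != folded(0.5)
```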
      
      Reviewers: spatel, wristow, arsenm
      
      Reviewed By: spatel
      
      Subscribers: wdng
      
      Differential Revision: https://reviews.llvm.org/D50195
      
      llvm-svn: 339357
      ca382546
    • Evandro Menezes's avatar
      [ARM] Adjust the feature set for Exynos · 8c436627
      Evandro Menezes authored
      Enable `FeatureZCZeroing`, `FeatureHasSlowFPVMLx`, `FeatureExpandMLx`,
      `FeatureProfUnpredicate`, `FeatureSlowVDUP32`, `FeatureSlowVGETLNi32`,
      `FeatureSplatVFPToNeon`, `FeatureHasRetAddrStack`, `FeatureSlowFPBrcc` for
      all Exynos processors.
      
      llvm-svn: 339356
      8c436627
    • Evandro Menezes's avatar
      [ARM] Replace processor check with feature · 9a92fe0c
      Evandro Menezes authored
      Add new feature, `FeatureUseWideStrideVFP`, that replaces the need for a
      processor check.  Otherwise, NFC.
      
      llvm-svn: 339354
      9a92fe0c
    • Andrea Di Biagio's avatar
      [MC][PredicateExpander] Extend the grammar to support simple switch and return statements. · f3bde048
      Andrea Di Biagio authored
      This patch introduces tablegen class MCStatement.
      
      Currently, an MCStatement can be either a return statement, or a switch
      statement.
      
      ```
      MCStatement:
         MCReturnStatement
         MCOpcodeSwitchStatement
      ```
      
      An MCReturnStatement expands to a return statement, and the boolean expression
      associated with the return statement is described by an MCInstPredicate.
      
      An MCOpcodeSwitchStatement is a switch statement where the condition is a check
      on the machine opcode. It allows the definition of multiple checks, as well as a
      default case. More details on the grammar implemented by these two new
      constructs can be found in the diff for TargetInstrPredicates.td.
      
      This patch makes it easier to read the body of auto-generated TargetInstrInfo
      predicates.
      
      In future, I plan to reuse/extend the MCStatement grammar to describe more
      complex target hooks. For now, this is just a first step (mostly a minor
      cosmetic change to polish the new predicates framework).
      
      Differential Revision: https://reviews.llvm.org/D50457
      
      llvm-svn: 339352
      f3bde048
    • Bjorn Pettersson's avatar
      [MC] Remove PhysRegSize from MCRegisterClass · c8b782ce
      Bjorn Pettersson authored
      Summary:
      The interface to get the size and spill size of a register
      was moved from MCRegisterInfo to TargetRegisterInfo over
      a year ago. AFAIK, the old interface has been kept around
      to give out-of-tree targets a chance to adapt to the
      new interface.
      
      One problem with the old MCRegisterClass::PhysRegSize was that
      it represented the size of a register as "size in bits" / 8.
      So a register had to be a multiple of eight bits wide for the
      size to be correct (and the byte size for the target needed to
      be eight bits).
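
The arithmetic behind that limitation can be sketched in a few lines of Python. The function name `phys_reg_size_in_bytes` is hypothetical and only mirrors the "size in bits" / 8 computation described above, using integer division.

```python
def phys_reg_size_in_bytes(size_in_bits):
    """Model of the old representation: register size stored as bits / 8,
    using truncating integer division."""
    return size_in_bits // 8


def round_trip_bits(size_in_bits):
    """Convert the stored byte count back to bits to expose the loss."""
    return phys_reg_size_in_bytes(size_in_bits) * 8


# Widths that are multiples of eight survive the round trip,
# other widths are silently truncated.
examples = {bits: round_trip_bits(bits) for bits in (32, 64, 1, 12)}
```

In this model a 1-bit register reports a size of zero bytes and a 12-bit register reports one byte, which is the kind of mismatch the commit message describes.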
      
      Reviewers: kparzysz, qcolombet
      
      Reviewed By: kparzysz
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D47199
      
      llvm-svn: 339350
      c8b782ce
    • Sanjay Patel's avatar
      [InstCombine] reduce code duplication; NFC · ebec4204
      Sanjay Patel authored
      llvm-svn: 339349
      ebec4204
    • Simon Pilgrim's avatar
      [TargetLowering] Add BuildSDIVPattern helper to BuildExactSDIV (NFCI). · a9f95429
      Simon Pilgrim authored
      As requested in D50392, pull the magic constant calculations out into a helper function.
      
      llvm-svn: 339346
      a9f95429
    • Sjoerd Meijer's avatar
      [ARM] FP16: codegen support for VTRN · 806f70d2
      Sjoerd Meijer authored
      Differential Revision: https://reviews.llvm.org/D50454
      
      llvm-svn: 339340
      806f70d2
    • Simon Pilgrim's avatar
      [X86][SSE] Remove PMULDQ/PMULUDQ by zero · 511c3fc5
      Simon Pilgrim authored
      Exposed by D50328
      
      Differential Revision: https://reviews.llvm.org/D50328
      
      llvm-svn: 339337
      511c3fc5
    • Simon Pilgrim's avatar
      [X86][SSE] Combine (some) target shuffles with multiple uses · 01ae462f
      Simon Pilgrim authored
      As discussed on D41794, we have many cases where we fail to combine shuffles as the input operands have other uses.
      
      This patch permits these shuffles to be combined as long as they don't introduce additional variable shuffle masks, which should reduce instruction dependencies and allow the total number of shuffles to still drop without increasing the constant pool.
      
      However, this may mean that some memory folds no longer occur, and pre-AVX targets may require the occasional extra register move.
      
      This also exposes some poor PMULDQ/PMULUDQ codegen that performs unnecessary upper/lower calculations which in fact fold to zero/undef. The fix will be added in a follow-up commit.
      
      Differential Revision: https://reviews.llvm.org/D50328
      
      llvm-svn: 339335
      01ae462f
    • Andrew V. Tischenko's avatar
      24f63bcb
    • Jonas Hahnfeld's avatar
      [NVPTX] Select atomic loads and stores · 20526bf4
      Jonas Hahnfeld authored
      According to the PTX ISA, .volatile has the same memory synchronization
      semantics as .relaxed.sys, so it can be used to implement monotonic
      atomic loads and stores. This is important for OpenMP's atomic
      construct, where
       - 'read's and 'write's are lowered to atomic loads and stores, and
       - an update of a float or double type is lowered into a cmpxchg loop.
      (Note that PTX could do better because it has atom.add.f{32,64}, but
      LLVM's atomicrmw instruction only allows integer types.)
      
      Higher levels of atomicity (like acquire and release) need additional
      synchronization properties which were added with PTX ISA 6.0 / sm_70.
      So using these instructions still results in an error.
      
      Differential Revision: https://reviews.llvm.org/D50391
      
      llvm-svn: 339316
      20526bf4
    • Roger Ferrer Ibanez's avatar
      [RISCV] Add "lla" pseudo-instruction to assembler · 577a97e2
      Roger Ferrer Ibanez authored
      This pseudo-instruction is similar to la but uses PC-relative addressing
      unconditionally. That is, la differs from lla only when using -fPIC. This
      pseudo-instruction seems to be often forgotten in specs, but it is definitely
      mentioned in binutils opcodes/riscv-opc.c. The semantics are defined both on
      page 37 of the "RISC-V Reader" book and in the function macro found in
      gas/config/tc-riscv.c.
      
      This is a first step towards adding PIC support for Linux in the RISC-V
      backend.
      
      The lla pseudo-instruction expands to a sequence of auipc + addi with a couple
      of pc-rel relocations where the second points to the first one. This is
      described in
      https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses
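
To make the auipc + addi pairing concrete, the following Python sketch splits a PC-relative offset into the upper 20-bit part consumed by auipc and the signed 12-bit part consumed by addi. The function names are hypothetical, and the sketch assumes the offset fits in a signed 32-bit range. The `+ 0x800` rounding compensates for addi sign-extending its 12-bit immediate.

```python
# Assembly shape being modelled (from the psABI document linked above):
#   1: auipc rd, %pcrel_hi(sym)
#      addi  rd, rd, %pcrel_lo(1b)


def split_pcrel(offset):
    """Split a signed PC-relative offset into (hi, lo) such that
    (hi << 12) + lo == offset and lo fits in a signed 12-bit immediate."""
    hi = (offset + 0x800) >> 12   # arithmetic shift; rounds to nearest 4 KiB
    lo = offset - (hi << 12)      # remainder, always in [-2048, 2047]
    return hi, lo


def resolve(pc, hi, lo):
    """Address computed by the auipc + addi pair for the given parts."""
    return pc + (hi << 12) + lo


pc, target = 0x10000, 0x11800
hi, lo = split_pcrel(target - pc)   # offset 0x1800 splits into hi=2, lo=-2048
reached = resolve(pc, hi, lo) == target
```

Note that an offset of 0x1800 does not split into hi=1, lo=0x800, because 0x800 is not representable as a signed 12-bit immediate. The rounding pushes the upper part to 2 and makes the lower part negative.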
      
      For now, this patch only introduces support for that pseudo-instruction in the
      assembler parser.
      
      Differential Revision: https://reviews.llvm.org/D49661
      
      llvm-svn: 339314
      577a97e2
    • JF Bastien's avatar
      [NFC] ConstantMerge: don't insert when find should be used · 3f270336
      JF Bastien authored
      Summary: DenseMap's operator[] performs an insertion if the entry isn't found. The second phase of ConstantMerge isn't trying to insert anything: it's just looking to see if the first phase performed an insertion. Use find instead, avoiding insertion of every single global initializer in the map of constants. This has the side effect of making all entries in CMap non-null (because only global declarations would have null initializers, and that would be a bug).
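
The same pitfall exists outside LLVM and can be shown with a loose Python analogy: indexing a `collections.defaultdict` inserts a default entry for a missing key, while `.get()` only looks the key up. This is not the ConstantMerge code; the function and key names below are made up for illustration.

```python
from collections import defaultdict


def entries_after_lookups(keys, use_indexing):
    """Probe a map for each key and return how many entries it ends up with."""
    table = defaultdict(lambda: None)
    table["merged"] = "canonical"   # the only entry inserted on purpose
    for key in keys:
        if use_indexing:
            _ = table[key]          # inserts None when the key is missing
        else:
            _ = table.get(key)      # pure lookup, never inserts
    return len(table)


probes = ["a", "b", "merged"]
grown = entries_after_lookups(probes, use_indexing=True)       # map gains entries
unchanged = entries_after_lookups(probes, use_indexing=False)  # map keeps one entry
```

With indexing, the map also ends up holding null-like values for keys that were only probed, which matches the side effect the commit message says is avoided by switching to find.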
      
      Subscribers: dexonsmith, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50476
      
      llvm-svn: 339309
      3f270336
    • Philip Reames's avatar
    • Paul Robinson's avatar
      [DWARF] Verifier now handles .debug_types sections. · 508b0815
      Paul Robinson authored
      Differential Revision: https://reviews.llvm.org/D50466
      
      llvm-svn: 339302
      508b0815
    • Sanjay Patel's avatar
      [DAGCombiner] loosen constraints for fsub+fadd fold · e47dc1a4
      Sanjay Patel authored
      isNegatibleForFree() should not matter here (as the test diffs show)
      because it's always a win to replace an fsub+fadd with fneg. The
      problem in D50195 persists because either (1) we are doing these
      folds in the wrong order or (2) we're missing another fold for fadd.
      
      llvm-svn: 339299
      e47dc1a4
    • Sanjay Patel's avatar
      [DAGCombiner] move fadd simplification ahead of other folds · e327266d
      Sanjay Patel authored
        
      I don't know if it's possible to expose this diff in a test,
      but we should always try simplifications (no new nodes created)
      before more complicated transforms for efficiency (similar to
      what we do in IR).
      
      llvm-svn: 339298
      e327266d
    • Petr Hosek's avatar
      [ADT] Normalize empty triple components · 7b274544
      Petr Hosek authored
      LLVM triple normalization handles "unknown" and empty components
      differently. For example, given "x86_64-unknown-linux-gnu" and
      "x86_64-linux-gnu", which should be equivalent, triple normalization
      returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's
      config.sub returns "x86_64-unknown-linux-gnu" for both
      "x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the
      triple normalization to behave the same way, replacing empty triple
      components with "unknown".
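
A minimal Python sketch of the final step described above, replacing empty components with "unknown". It is not the LLVM implementation: the function name is hypothetical, and it deliberately does not model the earlier part of normalization that places components such as the vendor into their slots (the step that turns "x86_64-linux-gnu" into "x86_64--linux-gnu").

```python
def fill_empty_components(triple):
    """Replace every empty dash-separated component with 'unknown'."""
    return "-".join(part if part else "unknown" for part in triple.split("-"))


# The intermediate form with an empty vendor slot now matches the
# spelled-out form, so both spellings compare equal after this step.
before = "x86_64--linux-gnu"
after = fill_empty_components(before)
matches_explicit = after == fill_empty_components("x86_64-unknown-linux-gnu")
```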
      
      This addresses PR37129.
      
      Differential Revision: https://reviews.llvm.org/D50219
      
      llvm-svn: 339294
      7b274544
  2. Aug 08, 2018