  1. Aug 12, 2013
  2. Aug 11, 2013
    • Reed Kotler's avatar
      Don't generate floating point stubs for mips16 code if the function · d265e888
      Reed Kotler authored
      is actually an intrinsic that will not occur in libc. This list here
      is not exhaustive but fixes the one place in test-suite where this occurs.
      I have filed a bug against myself to research the full list and add them
      to the array of such cases. In the future, actual stub generation will occur
      in a later phase and we won't need this code because we will know at that time
      during the compilation that in fact no helper function was even needed.
      
      llvm-svn: 188149
      d265e888
    • Elena Demikhovsky's avatar
      AVX-512: Added more tests for BROADCAST · 5fed3b95
      Elena Demikhovsky authored
      llvm-svn: 188148
      5fed3b95
    • Elena Demikhovsky's avatar
      AVX-512: Added VPERM* instructions and MOV* zmm-to-zmm instructions. · cf5b1458
      Elena Demikhovsky authored
      Added a test for shuffles using VPERM.
      
      llvm-svn: 188147
      cf5b1458
    • Chandler Carruth's avatar
      Re-instate r187323 which fast-tracks promotable allocas as soon as the · d7cd7e36
      Chandler Carruth authored
      SROA-based analysis has enough information. This should work now that
      both mem2reg *and* the SSAUpdater-based AllocaPromoter have been updated
      to be able to promote the types of allocas that the SROA analysis
      detects.
      
      I've included tests for the AllocaPromoter that were only possible to
      write once we fast-tracked promotable allocas without rewriting them.
      This includes a test both for r187347 and r188145.
      
      Original commit log for r187323:
      """
      Now that mem2reg understands how to cope with a slightly wider set of uses of
      an alloca, we can pre-compute promotability while analyzing an alloca for
      splitting in SROA. That lets us short-circuit the common case of a bunch of
      trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for
      typical frontend-generated IR sequences I'm seeing. It gets the new SROA to
      within 20% of ScalarRepl for such code. My current benchmark for these numbers
      is PR15412, but it fits the general pattern of IR emitted by Clang so it should
      be widely applicable.
      """
      
      llvm-svn: 188146
      d7cd7e36
    • Chandler Carruth's avatar
      Finish fixing the SSAUpdater-based AllocaPromoter strategy in SROA to cope with · c17283b4
      Chandler Carruth authored
      the more general set of patterns that are now handled by mem2reg and that we
      can detect quickly while doing SROA's initial analysis. Notably, this allows it
      to promote through no-op bitcast and GEP sequences. A core part of the
      SSAUpdater approach is the ability to test whether a particular instruction is
      part of the set being promoted. Testing this becomes significantly more complex
      in the world where the operand to every load and store isn't the alloca itself.
      I ended up using the approach of walking up the def-chain until we find the
      alloca. I benchmarked this against keeping a set of pointer operands and
      keeping a set of the loads and stores we care about, and this one seemed faster
      although the difference was very small.
      
      No test case yet because currently the rewriting always "fixes" the inputs to
      not require this. The next patch which re-enables early promotion of easy cases
      in SROA will include a test case that specifically exercises this aspect of the
      alloca promoter.
      
      llvm-svn: 188145
      c17283b4
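The def-chain walk described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions: `Value`, `Kind`, `findUnderlyingAlloca`, and `isInstInPromotedSet` are hypothetical names invented for illustration and are not the LLVM API or the code in SROA. The idea shown is that a load or store belongs to the promoted set if walking up its pointer operand through no-op bitcasts and GEPs reaches the alloca in question.

```cpp
// Hypothetical, simplified stand-in for IR values; not llvm::Value.
enum class Kind { Alloca, BitCast, GEP, Load, Store, Other };

struct Value {
  Kind kind;
  const Value *operand; // pointer operand; null for allocas
  bool isNoOp;          // for BitCast/GEP: true if the address is unchanged
};

// Walk up the def-chain from a pointer, looking through no-op bitcasts and
// GEPs. Returns the alloca that is reached, or nullptr if the chain hits
// anything else first.
const Value *findUnderlyingAlloca(const Value *ptr) {
  while (ptr) {
    if (ptr->kind == Kind::Alloca)
      return ptr;
    const bool passThrough =
        (ptr->kind == Kind::BitCast || ptr->kind == Kind::GEP) && ptr->isNoOp;
    if (!passThrough)
      return nullptr;
    ptr = ptr->operand;
  }
  return nullptr;
}

// True if `inst` is a load or store whose pointer operand resolves to `alloca`.
bool isInstInPromotedSet(const Value &inst, const Value &alloca) {
  if (inst.kind != Kind::Load && inst.kind != Kind::Store)
    return false;
  return findUnderlyingAlloca(inst.operand) == &alloca;
}
```

The alternatives the commit mentions (keeping a set of pointer operands, or a set of the loads and stores themselves) would replace the walk with a set lookup; the commit reports that the walk seemed slightly faster in its benchmark.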
    • Chandler Carruth's avatar
      Reformat some bits of AllocaPromoter and simplify the name and type of · 45b136f4
      Chandler Carruth authored
      our visiting datastructures in the AllocaPromoter/SSAUpdater path of
      SROA. Also shift the order of clears around to be more consistent.
      
      No functionality changed here, this is just a cleanup.
      
      llvm-svn: 188144
      45b136f4
    • Reed Kotler's avatar
      Incorrect JAL instruction attributes caused the optimizer to make a wrong · 705c5951
      Reed Kotler authored
      instruction move. Just affects static relocation. -static works fine now
      with mips16 for the most part.
      
      llvm-svn: 188143
      705c5951
  3. Aug 10, 2013
  4. Aug 09, 2013
    • Peter Collingbourne's avatar
      DataFlowSanitizer: Remove unreachable BBs so IR continues to verify · ae66d57b
      Peter Collingbourne authored
      under the args ABI.
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D1316
      
      llvm-svn: 188113
      ae66d57b
    • Reed Kotler's avatar
      Add another intrinsic that LLVM gives an incorrect prototype to. · be316cff
      Reed Kotler authored
      I need to go through the full runtime routine list and see if there
      are any more I need to add for mips16 floating point. Prototypes must
      be correct or else I don't know to add a helper function call.
      
      llvm-svn: 188106
      be316cff
    • Michael Gottesman's avatar
      [stackprotector] Simplify SP Pass so that we emit different fail basic blocks for each fail condition. · 8afcf3a4
      Michael Gottesman authored
      
      This patch decouples the stack protector pass so that we can support stack
      protector implementations that do not use the IR level generated stack protector
      fail basic block.
      
      No codesize increase is caused by this change since the MI level tail merge pass
      properly merges together the fail condition blocks (see the updated test).
      
      llvm-svn: 188105
      8afcf3a4
    • Jakub Staszak's avatar
      23ec6a97
    • Benjamin Kramer's avatar
      Add an overload to CostTable which allows it to infer the size of the table. · 21585fd9
      Benjamin Kramer authored
      Use it to avoid repeating ourselves too often. Also store MVT::SimpleValueType
      in the TTI tables so they can be statically initialized; MVT's constructors
      create bloated initialization code otherwise.
      
      llvm-svn: 188095
      21585fd9
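The size-inferring overload relies on a standard C++ technique: taking the table by reference to an array lets the compiler deduce its length. The sketch below assumes hypothetical names (`CostEntry`, `SimpleType`, `findInCostTable`); it is not the signature used in LLVM's cost table header, only an illustration of the pattern and of why a plain enum in the entry keeps the table statically initializable.

```cpp
#include <cstddef>

// Hypothetical plain enum standing in for MVT::SimpleValueType.
enum SimpleType { i32, i64, f32 };

// Hypothetical table entry. Every field is a plain scalar, so an array of
// these is an aggregate the compiler can initialize without running
// constructors at startup.
struct CostEntry {
  int opcode;
  SimpleType type;
  unsigned cost;
};

// Explicit-length lookup: returns the index of the matching entry, or -1.
inline int findInCostTable(const CostEntry *table, std::size_t len, int opcode,
                           SimpleType type) {
  for (std::size_t i = 0; i < len; ++i)
    if (table[i].opcode == opcode && table[i].type == type)
      return static_cast<int>(i);
  return -1;
}

// Overload that deduces N from the array reference, so callers do not have
// to repeat the table length at every call site.
template <std::size_t N>
int findInCostTable(const CostEntry (&table)[N], int opcode, SimpleType type) {
  return findInCostTable(table, N, opcode, type);
}
```

With a table declared as `static const CostEntry table[] = {{1, i32, 1}, {1, i64, 2}};`, a call such as `findInCostTable(table, 1, i64)` picks the template overload and needs no separate length argument.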
    • David Blaikie's avatar
      DebugInfo: provide the ability to add members to a class after it has been constructed · f103c2f9
      David Blaikie authored
      This is necessary to allow Clang to only emit implicit members when
      there is code generated for them, rather than whenever they are ODR
      used.
      
      llvm-svn: 188082
      f103c2f9
    • Benjamin Kramer's avatar
      Make helper static and fix formatting. · df03449a
      Benjamin Kramer authored
      llvm-svn: 188074
      df03449a
    • Mihai Popa's avatar
      This fixes the Thumb2 CPS assembly syntax. · 4c2801f7
      Mihai Popa authored
      In Thumb1, only one variant is supported: CPS{effect} {flags}
      
      Thumb2 supports three:
      CPS{effect}.W {flags}
      CPS{effect} {flags} {mode}
      CPS {mode}
      
      Canonically, .W should be used only when ambiguity is present between encodings of different width.
      The wide suffix is still accepted for the latter two forms via aliases.
      
      llvm-svn: 188071
      4c2801f7
    • Mihai Popa's avatar
      Fix assembling of Thumb2 branch instructions. · ad18d3ce
      Mihai Popa authored
      The long encoding for Thumb2 unconditional branches is broken.
      Additionally, there is no range checking for target operands; as such 
      for instructions originating in assembly code, only short Thumb encodings
      are generated, regardless of the bitsize needed for the offset.
      
      Adding range checking is nontrivial due to the representation of Thumb
      branch instructions. There is no true difference between conditional and
      unconditional branches in terms of operands and syntax - even unconditional
      branches have a predicate which is expected to match that of the IT block
      they are in. Yet, the encodings and the permitted size of the offset differ.
      
      Due to this, for any mnemonic there are really 4 encodings to choose from.
      
      The problem cannot be handled in the parser alone or by manipulating td files.
      Because the parser first builds a set of match candidates and then checks them
      one by one, whatever tablegen-only solution might be found will ultimately be
      dependent on the parser's evaluation order. What's worse is that due to the fact
      that all branches have the same syntax and the same kinds of operands, that 
      order is governed by the lexicographical ordering of the names of operand 
      classes...
      
      To circumvent all this, any necessary disambiguation is added to the instruction
      validation pass.
      
      llvm-svn: 188067
      ad18d3ce
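The choice among the four encodings can be sketched as a range check on the branch offset. Everything here is an assumption made for illustration: the names `BranchEncoding`, `fitsSignedEven`, and `pickBranchEncoding` are hypothetical, the real disambiguation lives in the assembler's instruction validation and also has to account for IT blocks, and the field widths (9, 12, 21, and 25 bits including the implicit low zero bit) are quoted from the Thumb `B` encodings as I recall them from the ARM architecture manual, so they should be checked against it.

```cpp
#include <cstdint>

// Hypothetical labels for the four Thumb branch encodings.
enum class BranchEncoding {
  NarrowCond,   // 16-bit conditional
  NarrowUncond, // 16-bit unconditional
  WideCond,     // 32-bit conditional
  WideUncond,   // 32-bit unconditional
  OutOfRange
};

// True if `offset` is halfword-aligned and fits in a signed field that is
// `bits` wide once the implicit low zero bit is counted.
inline bool fitsSignedEven(std::int64_t offset, unsigned bits) {
  const std::int64_t limit = std::int64_t{1} << (bits - 1);
  return offset % 2 == 0 && offset >= -limit && offset < limit;
}

// Pick the shortest encoding whose offset field can hold `offset`.
// `conditional` means the instruction must use a conditional encoding.
inline BranchEncoding pickBranchEncoding(bool conditional,
                                         std::int64_t offset) {
  if (conditional) {
    if (fitsSignedEven(offset, 9))
      return BranchEncoding::NarrowCond;
    if (fitsSignedEven(offset, 21))
      return BranchEncoding::WideCond;
  } else {
    if (fitsSignedEven(offset, 12))
      return BranchEncoding::NarrowUncond;
    if (fitsSignedEven(offset, 25))
      return BranchEncoding::WideUncond;
  }
  return BranchEncoding::OutOfRange;
}
```

Doing the check after matching, rather than in the match tables, keeps the result independent of the order in which the parser tries its candidates, which is the concern the commit message raises.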