  1. Dec 18, 2018
    • Introduce control flow speculation tracking pass for AArch64 · e66bc1f7
      Kristof Beyls authored
      The pass implements tracking of control flow miss-speculation into a "taint"
      register. That taint register can then be used to mask off registers with
      sensitive data when executing under miss-speculation, a.k.a. "transient
      execution".
      This pass is aimed at mitigating SpectreV1-style vulnerabilities.
      
      At the moment, it implements the tracking of miss-speculation of control
      flow into a taint register, but doesn't implement a mechanism yet to then
      use that taint register to mask off vulnerable data in registers (something
      for a follow-on improvement). Possible strategies to mask out vulnerable
      data that can be implemented on top of this are:
      - speculative load hardening to automatically mask off data loaded
        in registers.
      - using intrinsics to mask off data in registers as indicated by the
        programmer (see https://lwn.net/Articles/759423/).
      
      For AArch64, the following implementation choices are made.
      Some of these are different than the implementation choices made in
      the similar pass implemented in X86SpeculativeLoadHardening.cpp, as
      the instruction set characteristics result in different trade-offs.
      - The speculation hardening is done after register allocation. With a
        relative abundance of registers, one register is reserved (X16) to be
        the taint register. X16 is expected to not clash with other register
        reservation mechanisms with very high probability because:
        . The AArch64 ABI doesn't guarantee X16 to be retained across any call.
        . The only way to request X16 to be used as a programmer is through
          inline assembly. In the rare case a function explicitly demands to
          use X16/W16, this pass falls back to hardening against speculation
          by inserting a DSB SYS/ISB barrier pair which will prevent control
          flow speculation.
      - It is easy to insert mask operations at this late stage as we have
        mask operations available that don't set flags.
      - The taint variable contains all-ones when no miss-speculation is detected,
        and contains all-zeros when miss-speculation is detected. Therefore, when
        masking, an AND instruction (which only changes the register to be masked,
        no other side effects) can easily be inserted anywhere that's needed.
      - The tracking of miss-speculation is done by using a data-flow conditional
        select instruction (CSEL) to evaluate the flags that were also used to
        make conditional branch direction decisions. Speculation of the CSEL
        instruction can be limited with a CSDB instruction - so the combination of
        CSEL + a later CSDB gives the guarantee that the flags as used in the CSEL
        aren't speculated. When conditional branch direction gets miss-speculated,
        the semantics of the inserted CSEL instruction is such that the taint
        register will contain all zero bits.
        One key requirement for this to work is that the conditional branch is
        followed by an execution of the CSEL instruction, where the CSEL
        instruction needs to use the same flags status as the conditional branch.
        This means that the conditional branches must not be implemented as one
        of the AArch64 conditional branches that do not use the flags as input
        (CB(N)Z and TB(N)Z). This is implemented by ensuring in the instruction
        selectors to not produce these instructions when speculation hardening
        is enabled. This pass will assert if it does encounter such an instruction.
      - On function call boundaries, the miss-speculation state is transferred from
        the taint register X16 to be encoded in the SP register as value 0.
      
      Future extensions/improvements could be:
      - Implement this functionality using full speculation barriers, akin to the
        x86-slh-lfence option. This may be more useful for the intrinsics-based
        approach than for the SLH approach to masking.
        Note that this pass already inserts the full speculation barriers if the
        function for some niche reason makes use of X16/W16.
      - No indirect branch misprediction is protected/instrumented yet; this
        could be done for some indirect branches, such as switch jump tables.
      
      Differential Revision: https://reviews.llvm.org/D54896
      
      llvm-svn: 349456
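
The taint-tracking scheme described in this commit message can be modelled in a few lines. This is only a toy sketch in Python, not the pass itself: `csel`, `track_branch` and `mask` are hypothetical helper names, and the 64-bit integer `taint` stands in for X16.

```python
ALL_ONES = (1 << 64) - 1   # taint value while no miss-speculation is detected
ALL_ZEROS = 0              # taint value once miss-speculation is detected


def csel(cond_holds, if_true, if_false):
    """Toy model of a conditional select: pure data flow, no branching."""
    return if_true if cond_holds else if_false


def track_branch(taint, cond_really_holds, executing_taken_path):
    """Update the taint at the start of the path that is being executed.

    On the taken path the select keeps the taint only if the branch
    condition really holds; on the fall-through path only if it does not.
    """
    path_is_correct = cond_really_holds == executing_taken_path
    return csel(path_is_correct, taint, ALL_ZEROS)


def mask(value, taint):
    """AND a register value with the taint; zeroes it under miss-speculation."""
    return value & taint


secret = 0xDEADBEEF
ok = track_branch(ALL_ONES, cond_really_holds=True, executing_taken_path=True)
bad = track_branch(ALL_ONES, cond_really_holds=False, executing_taken_path=True)
print(hex(mask(secret, ok)))   # prints 0xdeadbeef
print(hex(mask(secret, bad)))  # prints 0x0
```

Because the taint is all-ones in the normal case, a single AND suffices for masking, and a taint that has dropped to zero stays zero through later `track_branch` calls.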
    • Portable Python script across Python version · 3744de52
      Serge Guelton authored
      In Python2, division between integers yields an integer, while it yields a float in Python3.
      Use a combination of `from __future__ import division` and the `//` operator to get portable behavior.
      
      Differential Revision: https://reviews.llvm.org/D55204
      
      llvm-svn: 349455
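
A minimal sketch of the combination described above; `average` and `midpoint` are hypothetical helpers, not code from the patch. The `__future__` import has to be the first statement of the module.

```python
from __future__ import division  # no-op on Python 3, changes "/" on Python 2


def average(values):
    """True division: a float on both Python 2 and 3 thanks to the import."""
    return sum(values) / len(values)


def midpoint(lo, hi):
    """Floor division: an integer index on both Python 2 and 3."""
    return (lo + hi) // 2


print(average([1, 2]))   # prints 1.5
print(midpoint(0, 7))    # prints 3
```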
    • Portable Python script across Python version · c0ebe773
      Serge Guelton authored
      Using `from __future__ import print_function`, it is possible to get compatible behavior of `print(...)` across Python versions.
      
      Differential Revision: https://reviews.llvm.org/D55213
      
      llvm-svn: 349454
    • 85833393
    • [Driver] Automatically enable -munwind-tables if -fseh-exceptions is enabled · 56f9c81c
      Martin Storsjö authored
      For targets where SEH exceptions are used by default (on MinGW,
      only x86_64 so far), -munwind-tables is added automatically. If
      -fseh-exceptions is enabled on a target where SEH exceptions are
      available but not enabled by default yet (aarch64), we need to
      pass -munwind-tables if -fseh-exceptions was specified.
      
      Differential Revision: https://reviews.llvm.org/D55749
      
      llvm-svn: 349452
    • [AArch64] [MinGW] Allow enabling SEH exceptions · 8f0cb9c3
      Martin Storsjö authored
      The default still is dwarf, but SEH exceptions can now be enabled
      optionally for the MinGW target.
      
      Differential Revision: https://reviews.llvm.org/D55748
      
      llvm-svn: 349451
    • [X86] Add test cases to show isel failing to match BMI blsmsk/blsi/blsr when... · 284d426f
      Craig Topper authored
      [X86] Add test cases to show isel failing to match BMI blsmsk/blsi/blsr when the flag result is used.
      
      Similar things happen to TBM instructions, which we already have tests for.
      
      llvm-svn: 349450
    • Portable Python script across Python version · 73cf752f
      Serge Guelton authored
      The ConfigParser module has been renamed to configparser in Python3.
      
      Differential Revision: https://reviews.llvm.org/D55200
      
      llvm-svn: 349449
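
One common way to cope with the rename is a guarded import; this is a sketch of that idiom and not necessarily what the patch does. The section and option names are made up for illustration.

```python
try:
    import configparser                      # Python 3 name
except ImportError:
    import ConfigParser as configparser      # Python 2 name

# The rest of the script can use the Python 3 spelling on both versions.
parser = configparser.ConfigParser()
parser.add_section("build")
parser.set("build", "target", "AArch64")
print(parser.get("build", "target"))         # prints AArch64
```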
    • Portable Python script across Python version · c5d97e3e
      Serge Guelton authored
      Replace `xrange(...)` by either `range(...)` or `list(range(...))` depending on the context.
      
      Differential Revision: https://reviews.llvm.org/D55193
      
      llvm-svn: 349448
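
The two replacements mentioned above differ in whether a real list is needed. A small illustration with made-up variable names:

```python
# Iteration only: range() is enough on both versions (lazy on Python 3).
total = 0
for i in range(5):
    total += i

# A real list is required here because it is mutated afterwards,
# so the call is wrapped in list().
indices = list(range(3))
indices.append(10)

print("%d %s" % (total, indices))   # prints 10 [0, 1, 2, 10]
```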
    • Portable Python script across Python version · 366c089b
      Serge Guelton authored
      dict no longer has the `has_key` method in Python3. Instead, one can
      use the `in` keyword, which already works in Python2.
      
      llvm-svn: 349447
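
For illustration, with a made-up dictionary:

```python
opcodes = {"add": 1, "sub": 2}

# Python 2 only: opcodes.has_key("add")
# Portable spelling, valid on both Python 2 and Python 3:
has_add = "add" in opcodes
has_mul = "mul" in opcodes

print("%s %s" % (has_add, has_mul))   # prints True False
```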
    • [PowerPC][NFC]Update vabsd cases with vselect test cases · bbb461f7
      Kewen Lin authored
      Power9 VABSDU* instructions can be exploited for some special vselect sequences.
      Check in the original test case here; later the exploitation patch will update this
      and reviewers can check the differences easily.
      
      llvm-svn: 349446
    • [PowerPC] Exploit power9 new instruction setb · 44ace925
      Kewen Lin authored
      Check the expected patterns feeding to SELECT_CC like:
         (select_cc lhs, rhs,  1, (sext (setcc [lr]hs, [lr]hs, cc2)), cc1)
         (select_cc lhs, rhs, -1, (zext (setcc [lr]hs, [lr]hs, cc2)), cc1)
         (select_cc lhs, rhs,  0, (select_cc [lr]hs, [lr]hs,  1, -1, cc2), seteq)
         (select_cc lhs, rhs,  0, (select_cc [lr]hs, [lr]hs, -1,  1, cc2), seteq)
      Further transform the sequence to comparison + setb if one of them matches.
      
      Differential Revision: https://reviews.llvm.org/D53275
      
      llvm-svn: 349445
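
To see why such shapes reduce to a three-way comparison, which is (as far as I understand the instruction) what setb materializes from a compare result, here is a small Python model. It is an illustration only: `select_cc`, `setcc`, `sext` and `three_way` are hypothetical stand-ins for the DAG nodes, and the concrete condition codes (cc1 = setgt with cc2 = setlt, and cc2 = setgt in the nested form) are an assumed instance of the patterns, not taken from the patch.

```python
import operator

CC = {"seteq": operator.eq, "setlt": operator.lt, "setgt": operator.gt}


def setcc(lhs, rhs, cc):
    """Comparison producing an i1 value (0 or 1)."""
    return 1 if CC[cc](lhs, rhs) else 0


def sext(bit):
    """Sign-extend an i1: 1 becomes -1, 0 stays 0."""
    return -bit


def select_cc(lhs, rhs, if_true, if_false, cc):
    """Pick if_true when 'lhs cc rhs' holds, else if_false."""
    return if_true if CC[cc](lhs, rhs) else if_false


def three_way(lhs, rhs):
    """Reference result: -1, 0 or 1 for less, equal, greater."""
    return (lhs > rhs) - (lhs < rhs)


def pattern_sext(lhs, rhs):
    # (select_cc lhs, rhs, 1, (sext (setcc lhs, rhs, setlt)), setgt)
    return select_cc(lhs, rhs, 1, sext(setcc(lhs, rhs, "setlt")), "setgt")


def pattern_nested(lhs, rhs):
    # (select_cc lhs, rhs, 0, (select_cc lhs, rhs, 1, -1, setgt), seteq)
    return select_cc(lhs, rhs, 0, select_cc(lhs, rhs, 1, -1, "setgt"), "seteq")


for a, b in [(1, 2), (2, 2), (3, 2)]:
    print("%d %d %d" % (three_way(a, b), pattern_sext(a, b), pattern_nested(a, b)))
```

Each printed row shows the same value three times, which is the property that lets the whole select sequence collapse into a single comparison plus one instruction.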
    • [ExprConstant] Handle compound assignment when LHS has integral type and RHS... · 9f935e87
      Tan S. B. authored
      [ExprConstant] Handle compound assignment when LHS has integral type and RHS has floating point type
      
      Fixes PR39858
      
      Differential Revision: https://reviews.llvm.org/D55413
      
      llvm-svn: 349444
    • [NFC] Add new test to cover the lhs scheduling issue for P9. · ecdab5bd
      QingShan Zhang authored
      llvm-svn: 349443
    • Automatic variable initialization · 14daa20b
      JF Bastien authored
      Summary:
      Add an option to initialize automatic variables with either a pattern or with
      zeroes. The default is still that automatic variables are uninitialized. Also
      add attributes to request uninitialized on a per-variable basis, mainly to disable
      initialization of large stack arrays when deemed too expensive.
      
      This isn't meant to change the semantics of C and C++. Rather, it's meant to be
      a last-resort when programmers inadvertently have some undefined behavior in
      their code. This patch aims to make undefined behavior hurt less, which
      security-minded people will be very happy about. Notably, this means that
      there's no inadvertent information leak when:
      
        - The compiler re-uses stack slots, and a value is used uninitialized.
        - The compiler re-uses a register, and a value is used uninitialized.
        - Stack structs / arrays / unions with padding are copied.
      
      This patch only addresses stack and register information leaks. There are many
      more infoleaks that we could address, and much more undefined behavior that
      could be tamed. Let's keep this patch focused, and I'm happy to address related
      issues elsewhere.
      
      To keep the patch simple, only some `undef` is removed for now, see
      `replaceUndef`. The padding-related infoleaks are therefore not all gone yet.
      This will be addressed in a follow-up, mainly because addressing padding-related
      leaks should be a stand-alone option which is implied by variable
      initialization.
      
      There are three options when it comes to automatic variable initialization:
      
        0. Uninitialized
      
          This is C and C++'s default. It's not changing. Depending on code
          generation, a programmer who runs into undefined behavior by using an
          uninitialized automatic variable may observe any previous value (including
          program secrets), or any value which the compiler saw fit to materialize on
          the stack or in a register (this could be to synthesize an immediate, to
          refer to code or data locations, to generate cookies, etc).
      
        1. Pattern initialization
      
          This is the recommended initialization approach. Pattern initialization's
          goal is to initialize automatic variables with values which will likely
          transform logic bugs into crashes down the line, are easily recognizable in
          a crash dump, without being values which programmers can rely on for useful
          program semantics. At the same time, pattern initialization tries to
          generate code which will optimize well. You'll find the following details in
          `patternFor`:
      
          - Integers are initialized with repeated 0xAA bytes (infinite scream).
          - Vectors of integers are also initialized with infinite scream.
          - Pointers are initialized with infinite scream on 64-bit platforms because
            it's an unmappable pointer value on architectures I'm aware of. Pointers
            are initialized to 0x000000AA (small scream) on 32-bit platforms because
            32-bit platforms don't consistently offer unmappable pages. When they do
            it's usually the zero page. As people try this out, I expect that we'll
            want to allow different platforms to customize this, let's do so later.
          - Vectors of pointers are initialized the same way pointers are.
          - Floating point values and vectors are initialized with a negative quiet
            NaN with repeated 0xFF payload (e.g. 0xffffffff and 0xffffffffffffffff).
            NaNs are nice (here, anyways) because they propagate on arithmetic, making
            it more likely that entire computations become NaN when a single
            uninitialized value sneaks in.
          - Arrays are initialized to their homogeneous elements' initialization
            value, repeated. Stack-based Variable-Length Arrays (VLAs) are
            runtime-initialized to the allocated size (no effort is made for negative
            size, but zero-sized VLAs are untouched even if technically undefined).
          - Structs are initialized to their heterogeneous elements' initialization
            values. Zero-size structs are initialized as 0xAA since they're allocated
            a single byte.
          - Unions are initialized using the initialization for the largest member of
            the union.
      
          Expect the values used for pattern initialization to change over time, as we
          refine heuristics (both for performance and security). The goal is truly to
          avoid injecting semantics into undefined behavior, and we should be
          comfortable changing these values when there's a worthwhile point in doing
          so.
      
          Why so much infinite scream? Repeated byte patterns tend to be easy to
          synthesize on most architectures, and otherwise memset is usually very
          efficient. For values which aren't entirely repeated byte patterns, LLVM
          will often generate code which does memset + a few stores.
      
        2. Zero initialization
      
          Zero initialize all values. This has the unfortunate side-effect of
          providing semantics to otherwise undefined behavior, programs therefore
          might start to rely on this behavior, and that's sad. However, some
          programmers believe that pattern initialization is too expensive for them,
          and data might show that they're right. The only way to make these
          programmers wrong is to offer zero-initialization as an option, figure out
          where they are right, and optimize the compiler into submission. Until the
          compiler provides acceptable performance for all security-minded code, zero
          initialization is a useful (if blunt) tool.
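
The pattern values listed under option 1 can be reproduced with a short Python model. This is a sketch for illustration only, not the `patternFor` code in the patch; the function names are hypothetical and only 4- and 8-byte floating point sizes are covered.

```python
import math
import struct


def int_pattern(size):
    """Integer of `size` bytes with every byte set to 0xAA (infinite scream)."""
    return int.from_bytes(b"\xAA" * size, "little")


def pointer_pattern(pointer_size):
    """Infinite scream on 64-bit targets, 0x000000AA (small scream) on 32-bit."""
    return int_pattern(8) if pointer_size == 8 else 0x000000AA


def float_pattern(size):
    """Floating point value whose bytes are all 0xFF, which decodes as a NaN."""
    fmt = {4: "<f", 8: "<d"}[size]
    return struct.unpack(fmt, b"\xFF" * size)[0]


print(hex(int_pattern(4)))              # prints 0xaaaaaaaa
print(hex(pointer_pattern(8)))          # prints 0xaaaaaaaaaaaaaaaa
print(hex(pointer_pattern(4)))          # prints 0xaa
print(math.isnan(float_pattern(4)))     # prints True
print(math.isnan(float_pattern(8) + 1)) # prints True: NaN propagates
```

The last line shows the propagation property mentioned above: arithmetic on the pattern value still yields NaN.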
      
      I've been asked for a fourth initialization option: user-provided byte value.
      This might be useful, and can easily be added later.
      
      Why is an out-of-band initialization mechanism desired? We could instead use
      -Wuninitialized! Indeed we could, but then we're forcing the programmer to
      provide semantics for something which doesn't actually have any (it's
      uninitialized!). It's then unclear whether `int derp = 0;` lends meaning to `0`,
      or whether it's just there to shut that warning up. It's also way easier to use
      a compiler flag than it is to manually and intelligently initialize all values
      in a program.
      
      Why not just rely on static analysis? Because it cannot reason about all dynamic
      code paths effectively, and it has false positives. It's a great tool, could get
      even better, but it's simply incapable of catching all uses of uninitialized
      values.
      
      Why not just rely on memory sanitizer? Because it's not universally available,
      has a 3x performance cost, and shouldn't be deployed in production. Again, it's
      a great tool, it'll find the dynamic uses of uninitialized variables that your
      test coverage hits, but it won't find the ones that you encounter in production.
      
      What's the performance like? Not too bad! Previous publications [0] have cited
      2.7 to 4.5% averages. We've committed a few patches over the last few months to
      address specific regressions, both in code size and performance. In all cases,
      the optimizations are generally useful, but variable initialization benefits
      from them a lot more than regular code does. We've got a handful of other
      optimizations in mind, but the code is in good enough shape and has found enough
      latent issues that it's a good time to get the change reviewed, checked in, and
      have others kick the tires. We'll continue reducing overheads as we try this out
      on diverse codebases.
      
      Is it a good idea? Security-minded folks think so, and apparently so does the
      Microsoft Visual Studio team [1] who say "Between 2017 and mid 2018, this
      feature would have killed 49 MSRC cases that involved uninitialized struct data
      leaking across a trust boundary. It would have also mitigated a number of bugs
      involving uninitialized struct data being used directly.". They seem to use pure
      zero initialization, and claim to have taken the overheads down to within noise.
      Don't just trust Microsoft though, here's another relevant person asking for
      this [2]. It's been proposed for GCC [3] and LLVM [4] before.
      
      What are the caveats? A few!
      
        - Variables declared in unreachable code, and used later, aren't initialized.
          This includes goto, Duff's device, and other objectionable uses of switch.
          This should instead be a hard-error in any serious codebase.
        - Volatile stack variables are still weird. That's pre-existing, it's really
          the language's fault and this patch keeps it weird. We should deprecate
          volatile [5].
        - As noted above, padding isn't fully handled yet.
      
      I don't think these caveats make the patch untenable because they can be
      addressed separately.
      
      Should this be on by default? Maybe, in some circumstances. It's a conversation
      we can have when we've tried it out sufficiently, and we're confident that we've
      eliminated enough of the overheads that most codebases would want to opt-in.
      Let's keep our precious undefined behavior until that point in time.
      
      How do I use it:
      
        1. On the command-line:
      
          -ftrivial-auto-var-init=uninitialized (the default)
          -ftrivial-auto-var-init=pattern
          -ftrivial-auto-var-init=zero -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
      
        2. Using an attribute:
      
          int dont_initialize_me __attribute((uninitialized));
      
        [0]: https://users.elis.ugent.be/~jsartor/researchDocs/OOPSLA2011Zero-submit.pdf
        [1]: https://twitter.com/JosephBialek/status/1062774315098112001
        [2]: https://outflux.net/slides/2018/lss/danger.pdf
        [3]: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00615.html
        [4]: https://github.com/AndroidHardeningArchive/platform_external_clang/commit/776a0955ef6686d23a82d2e6a3cbd4a6a882c31c
        [5]: http://wg21.link/p1152
      
      I've also posted an RFC to cfe-dev: http://lists.llvm.org/pipermail/cfe-dev/2018-November/060172.html
      
      <rdar://problem/39131435>
      
      Reviewers: pcc, kcc, rsmith
      
      Subscribers: JDevlieghere, jkorous, dexonsmith, cfe-commits
      
      Differential Revision: https://reviews.llvm.org/D54604
      
      llvm-svn: 349442
    • [X86] Add test case for PR40060. NFC · 4adf9ca7
      Craig Topper authored
      llvm-svn: 349441
    • [X86] Const correct some helper functions X86InstrInfo.cpp. NFC · 1ff7356f
      Craig Topper authored
      llvm-svn: 349440
    • [NFC] fix test case issue that with wrong label check. · f5498125
      QingShan Zhang authored
      llvm-svn: 349439
    • [CaptureTracking] Pass MaxUsesToExplore from wrappers to the actual implementation · 2a0146e0
      Artur Pilipenko authored
          
      This is a follow up for rL347910. In the original patch I somehow forgot to pass
      the limit from wrappers to the function which actually does the job.
      
      llvm-svn: 349438
    • [PowerPC] Improve vec_abs on P9 · 3dac1252
      Kewen Lin authored
      Improve the current vec_abs support on P9: generate an ISD::ABS node for vector types,
      combine the ABS node into a VABSD node in some special cases to make use of P9 VABSD* insns,
      and custom-lower to vsub (vneg later) + vmax when there is no combination opportunity.
      
      Differential Revision: https://reviews.llvm.org/D54783
      
      llvm-svn: 349437
    • [COFF] Set the CPU string for LTO like ELF does · 0aa260d2
      Reid Kleckner authored
      Fixes PR40043
      
      llvm-svn: 349436
    • Call DeleteCurrentProcess before we replace the old process. · 362d022d
      Jim Ingham authored
      We need to ensure that Finalize gets called before we start
      to destroy the old Process or the weak_ptr->shared_ptr link
      from Threads to Target gets broken before the threads are 
      destroyed.
      
      <rdar://problem/43586979>
      
      Differential Revision: https://reviews.llvm.org/D55631
      
      llvm-svn: 349435
    • [Support] Fix GNU/kFreeBSD build · f4574702
      Eli Friedman authored
      Patch by James Clarke.
      
      Differential Revision: https://reviews.llvm.org/D55296
      
      llvm-svn: 349434
    • [codeview] Update comment on aligning symbol records · 4ab50b85
      Reid Kleckner authored
      llvm-svn: 349433
    • [FileCheck] Try to fix test on windows due to r349418 · c646b4b0
      Joel E. Denny authored
      llvm-svn: 349432
    • [codeview] Align symbol records to save 441MB during linking clang.pdb · 53ce0596
      Reid Kleckner authored
      In PDBs, symbol records must be aligned to four bytes. However, in the
      object file, symbol records may not be aligned. MSVC does not pad out
      symbol records to make sure they are aligned. That means the linker has
      to do extra work to insert the padding. Currently, LLD calculates the
      required space with alignment, and copies each record one at a time
      while padding them out to the correct size. It has a fast path that
      avoids this copy when the records are already aligned.
      
      This change fixes a bug in that codepath so that the copy is actually
      saved, and tweaks LLVM's symbol record emission to align symbol records.
      Here's how things compare when doing a plain clang Release+PDB build:
      - objs are 0.65% bigger (negligible)
      - link is 3.3% faster (negligible)
      - saves allocating 441MB
      - new LLD high water mark is ~1.05GB
      
      llvm-svn: 349431
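
The padding and fast path described in this commit can be sketched as follows. This is a simplified Python model and not LLD's code: the helper names are hypothetical, records are plain byte strings, and zero bytes are assumed as padding purely for illustration.

```python
RECORD_ALIGN = 4  # symbol records in a PDB must be aligned to four bytes


def aligned_size(size):
    """Round size up to the next multiple of RECORD_ALIGN."""
    return (size + RECORD_ALIGN - 1) & ~(RECORD_ALIGN - 1)


def pad_record(record):
    """Return a copy of record padded to an aligned size (zero fill assumed)."""
    return record + b"\x00" * (aligned_size(len(record)) - len(record))


def copy_records(records):
    """Return aligned records, skipping the copy when nothing needs padding."""
    if all(len(r) % RECORD_ALIGN == 0 for r in records):
        return records                      # fast path: reuse the input as is
    return [pad_record(r) for r in records]  # slow path: copy and pad


print(aligned_size(5))                       # prints 8
print(len(pad_record(b"abcde")))             # prints 8
```

When the producer already emits aligned records, every call takes the fast path, which is where the saved allocation comes from.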
    • Recommit r348806: DebugInfo: Use symbol difference for CU length to simplify... · c4e08feb
      David Blaikie authored
      Recommit r348806: DebugInfo: Use symbol difference for CU length to simplify assembly reading/editing
      
      While simplifying a test case ( https://reviews.llvm.org/D55261 ) I stumbled across something I've hit before: LLVM's assembly output (and GCC's too, FWIW) includes a hardcoded length for a DWARF unit in its header. Instead we could emit a label difference, making the assembly easier to read/edit (though potentially at a slight performance cost, which I haven't tried to observe, of delaying/sinking the length computation into the MC layer).
      
      Fix: Predicated all the changes (including creating the labels, even if they aren't used/needed) behind the NVPTX useSectionsAsReferences, avoiding emitting labels in NVPTX where ptxas can't parse them.
      
      Reviewers: JDevlieghere, probinson, ABataev
      
      Differential Revision: https://reviews.llvm.org/D55281
      
      llvm-svn: 349430
    • Add "dump" command as a custom "process plugin" subcommand when ProcessMinidump is used. · 48a28c16
      Greg Clayton authored
      Each process plug-in can create its own custom commands. I figured it would be nice to be able to dump things from the minidump file from the lldb command line, so I added the start of some custom commands.
      
      Currently you can dump:
      
      - the minidump stream directory
      - all Linux-specific streams, most of which are strings
      - each Linux stream individually if desired, or all with --linux
      
      The idea is we can expand the command set to dump more things, search for data in the core file, and much more. This patch gets us started.
      
      Differential Revision: https://reviews.llvm.org/D55727
      
      llvm-svn: 349429
    • hwasan: Allow range of frame descriptors to be empty. · 44ea4f57
      Peter Collingbourne authored
      As of r349413 it's now possible for a binary to contain an empty
      hwasan frame section. Handle that case simply by doing nothing.
      
      Differential Revision: https://reviews.llvm.org/D55796
      
      llvm-svn: 349428
    • [libcxx] Handle AppleClang 9 and 10 in XFAILs for aligned allocation tests · 06caa6d2
      Louis Dionne authored
      I forgot that those don't behave like Clang trunk, again.
      
      llvm-svn: 349427
    • [libcxx] Properly mark aligned allocation macro test as XFAIL on OS X · afb1d72e
      Louis Dionne authored
      This test was initially marked as XFAIL using `XFAIL: macosx10.YY`, and
      was then moved to `UNSUPPORTED: macosx10.YY`. The intent is to mark the
      test as XFAILing when a deployment target older than macosx10.14 is used,
      and the right way to do this is `XFAIL: availability=macosx10.YY`.
      
      llvm-svn: 349426
    • [FileCheck] Annotate input dump (final tweaks) · e2afb614
      Joel E. Denny authored
      Apply final suggestions from probinson for this patch series plus a
      few more tweaks:
      
      * Improve various docs, for MatchType in particular.
      
      * Rename some members of MatchType.  The main problem was that the
        term "final match" became a misnomer when CHECK-COUNT-<N> was
        created.
      
      * Split InputStartLine, etc. declarations into multiple lines.
      
      Differential Revision: https://reviews.llvm.org/D55738
      
      Reviewed By: probinson
      
      llvm-svn: 349425
    • [FileCheck] Annotate input dump (7/7) · 96f0e84c
      Joel E. Denny authored
      This patch implements annotations for diagnostics reporting CHECK-NOT
      failed matches.  These diagnostics are enabled by -vv.  As for
      diagnostics reporting failed matches for other directives, these
      annotations mark the search ranges using `X~~`.  The difference here
      is that failed matches for CHECK-NOT are successes not errors, so they
      are green not red when colors are enabled.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the only match result for a pattern of type T from line L of
                 the check file
        - T:L'N  labels the Nth match result for a pattern of type T from line L of
                 the check file
        - ^~~    marks good match (reported if -v)
        - !~~    marks bad match, such as:
                 - CHECK-NEXT on same line as previous match (error)
                 - CHECK-NOT found (error)
                 - CHECK-DAG overlapping match (discarded, reported if -vv)
        - X~~    marks search range when no match is found, such as:
                 - CHECK-NEXT not found (error)
                 - CHECK-NOT not found (success, reported if -vv)
                 - CHECK-DAG not found after discarded matches (error)
        - ?      marks fuzzy match when no match is found
        - colors success, error, fuzzy match, discarded match, unmatched input
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -vv -dump-input=always check5 < input5 |& sed -n '/^<<<</,$p'
      <<<<<<
               1: abcdef
      check:1     ^~~
      not:2          X~~
               2: ghijkl
      not:2       ~~~
      check:3        ^~~
               3: mnopqr
      not:4       X~~~~~
               4: stuvwx
      not:4       ~~~~~~
               5:
      eof:4       ^
      >>>>>>
      
      $ cat check5
      CHECK: abc
      CHECK-NOT: foobar
      CHECK: jkl
      CHECK-NOT: foobar
      
      $ cat input5
      abcdef
      ghijkl
      mnopqr
      stuvwx
      ```
      
      Reviewed By: george.karpenkov, probinson
      
      Differential Revision: https://reviews.llvm.org/D53899
      
      llvm-svn: 349424
    • [FileCheck] Annotate input dump (6/7) · f7c1c4d8
      Joel E. Denny authored
      This patch implements input annotations for diagnostics reporting
      CHECK-DAG discarded matches.  These diagnostics are enabled by -vv.
      These annotations mark discarded match ranges using `!~~` because they
      are bad matches even though they are not errors.
      
      CHECK-DAG discarded matches create another case where there can be
      multiple match results for the same directive.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the only match result for a pattern of type T from line L of
                 the check file
        - T:L'N  labels the Nth match result for a pattern of type T from line L of
                 the check file
        - ^~~    marks good match (reported if -v)
        - !~~    marks bad match, such as:
                 - CHECK-NEXT on same line as previous match (error)
                 - CHECK-NOT found (error)
                 - CHECK-DAG overlapping match (discarded, reported if -vv)
        - X~~    marks search range when no match is found, such as:
                 - CHECK-NEXT not found (error)
                 - CHECK-DAG not found after discarded matches (error)
        - ?      marks fuzzy match when no match is found
        - colors success, error, fuzzy match, discarded match, unmatched input
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -vv -dump-input=always check4 < input4 |& sed -n '/^<<<</,$p'
      <<<<<<
               1: abcdef
      dag:1       ^~~~
      dag:2'0       !~~~ discard: overlaps earlier match
               2: cdefgh
      dag:2'1     ^~~~
      check:3         X~ error: no match found
      >>>>>>
      
      $ cat check4
      CHECK-DAG: abcd
      CHECK-DAG: cdef
      CHECK: efgh
      
      $ cat input4
      abcdef
      cdefgh
      ```
      
      This shows that the line 3 CHECK fails to match even though its
      pattern appears in the input because its search range starts after the
      line 2 CHECK-DAG's match range.  The trouble might be that the line 2
      CHECK-DAG's match range is later than expected because its first match
      range overlaps with the line 1 CHECK-DAG match range and thus is
      discarded.
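
The discard behavior described above can be modeled with a short sketch. This is an illustration only, not FileCheck's implementation: the function name `match_dag_group` is hypothetical, patterns are treated as plain substrings, and resuming the search at the end of the overlapping earlier match is an assumption.

```python
# Illustrative model of CHECK-DAG overlap discarding; not FileCheck's code.

def match_dag_group(text, patterns, start=0):
    """Match each pattern in turn, discarding matches that overlap accepted ones.

    Returns (accepted, discarded, search_from). Both lists hold
    (pattern, begin, end) tuples; search_from is where the next
    directive would start searching.
    """
    accepted, discarded = [], []
    for pat in patterns:
        pos = start
        while True:
            begin = text.find(pat, pos)
            if begin == -1:
                raise LookupError(f"CHECK-DAG not found: {pat!r}")
            end = begin + len(pat)
            # Find an already accepted range that this candidate overlaps.
            clash = next(
                ((b, e) for _, b, e in accepted if begin < e and b < end), None
            )
            if clash is None:
                accepted.append((pat, begin, end))
                break
            discarded.append((pat, begin, end))
            pos = clash[1]  # assumption: retry after the overlapping match
    search_from = max((e for _, _, e in accepted), default=start)
    return accepted, discarded, search_from


text = "abcdef\ncdefgh\n"  # input4 from the example
accepted, discarded, search_from = match_dag_group(text, ["abcd", "cdef"])
print(accepted)    # [('abcd', 0, 4), ('cdef', 7, 11)]
print(discarded)   # [('cdef', 2, 6)]
print(text.find("efgh", search_from))  # -1: the line 3 CHECK has no match
```

In this model, `efgh` begins at offset 9, which lies inside the accepted `cdef` range, so a search starting at offset 11 cannot find it.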
      
      Because `!~~` for CHECK-DAG does not indicate an error, it is not
      colored red.  Instead, when colors are enabled, it is colored cyan,
      which suggests a match that went cold.
      
      Reviewed By: george.karpenkov, probinson
      
      Differential Revision: https://reviews.llvm.org/D53898
      
      llvm-svn: 349423
    • [FileCheck] Annotate input dump (5/7) · 7df86967
      Joel E. Denny authored
      This patch implements input annotations for diagnostics enabled by -v,
      which report good matches for directives.  These annotations mark
      match ranges using `^~~`.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the only match result for a pattern of type T from line L of
                 the check file
        - T:L'N  labels the Nth match result for a pattern of type T from line L of
                 the check file
        - ^~~    marks good match (reported if -v)
        - !~~    marks bad match, such as:
                 - CHECK-NEXT on same line as previous match (error)
                 - CHECK-NOT found (error)
        - X~~    marks search range when no match is found, such as:
                 - CHECK-NEXT not found (error)
        - ?      marks fuzzy match when no match is found
        - colors success, error, fuzzy match, unmatched input
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -v -dump-input=always check3 < input3 |& sed -n '/^<<<</,$p'
      <<<<<<
               1: abc foobar def
      check:1     ^~~
      not:2           !~~~~~     error: no match expected
      check:3                ^~~
      >>>>>>
      
      $ cat check3
      CHECK:     abc
      CHECK-NOT: foobar
      CHECK:     def
      
      $ cat input3
      abc foobar def
      ```
      
      -vv enables these annotations for FileCheck's implicit EOF patterns as
      well.  For an example where EOF patterns become relevant, see patch 7
      in this series.
      
      If colors are enabled, `^~~` is green to suggest success.
      
      -v plus color enables highlighting of input text that has no final
      match for any expected pattern.  The highlight uses a cyan background
      to suggest a cold section.  This highlighting can make it easier to
      spot text that was intended to be matched but that failed to be
      matched in a long series of good matches.
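
As a rough sketch of the idea behind that highlighting, the unmatched text is whatever remains once the good match ranges are removed from the input. The helper name `unmatched_ranges` is hypothetical, and the exact rules FileCheck applies (for example, how it treats bad matches) are not modeled here.

```python
# Illustrative only: compute the parts of the input not covered by good matches.

def unmatched_ranges(length, matches):
    """Return the (begin, end) gaps left in [0, length) by the given ranges."""
    gaps, cursor = [], 0
    for begin, end in sorted(matches):
        if begin > cursor:
            gaps.append((cursor, begin))
        cursor = max(cursor, end)
    if cursor < length:
        gaps.append((cursor, length))
    return gaps


line = "abc foobar def"  # input3 from the example
good = [(0, 3), (11, 14)]  # the `abc` and `def` matches
gaps = unmatched_ranges(len(line), good)
print([line[b:e] for b, e in gaps])  # [' foobar ']
```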
      
      CHECK-COUNT-<num> good matches are another case where there can be
      multiple match results for the same directive.
      
      Reviewed By: george.karpenkov, probinson
      
      Differential Revision: https://reviews.llvm.org/D53897
      
      llvm-svn: 349422
    • [FileCheck] Annotate input dump (4/7) · 0e7e3fa0
      Joel E. Denny authored
      This patch implements input annotations for diagnostics that report
      unexpected matches for CHECK-NOT.  Like wrong-line matches for
      CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY, these annotations mark match
      ranges using red `!~~` to indicate bad matches that are errors.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the only match result for a pattern of type T from line L of
                 the check file
        - T:L'N  labels the Nth match result for a pattern of type T from line L of
                 the check file
        - !~~    marks bad match, such as:
                 - CHECK-NEXT on same line as previous match (error)
                 - CHECK-NOT found (error)
        - X~~    marks search range when no match is found, such as:
                 - CHECK-NEXT not found (error)
        - ?      marks fuzzy match when no match is found
        - colors error, fuzzy match
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -v -dump-input=always check3 < input3 |& sed -n '/^<<<</,$p'
      <<<<<<
             1: abc foobar def
      not:2         !~~~~~     error: no match expected
      >>>>>>
      
      $ cat check3
      CHECK:     abc
      CHECK-NOT: foobar
      CHECK:     def
      
      $ cat input3
      abc foobar def
      ```
      
      Reviewed By: george.karpenkov, probinson
      
      Differential Revision: https://reviews.llvm.org/D53896
      
      llvm-svn: 349421
    • [FileCheck] Annotate input dump (3/7) · cadfcef4
      Joel E. Denny authored
      This patch implements input annotations for diagnostics that report
      wrong-line matches for the directives CHECK-NEXT, CHECK-SAME, and
      CHECK-EMPTY.  Instead of the usual `^~~`, which is used by later
      patches for good matches, these annotations use `!~~` to mark the bad
      match ranges so that this category of errors is visually distinct.
      Because such matches are errors, these annotations are red when colors
      are enabled.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the only match result for a pattern of type T from line L of
                 the check file
        - T:L'N  labels the Nth match result for a pattern of type T from line L of
                 the check file
        - !~~    marks bad match, such as:
                 - CHECK-NEXT on same line as previous match (error)
        - X~~    marks search range when no match is found, such as:
                 - CHECK-NEXT not found (error)
        - ?      marks fuzzy match when no match is found
        - colors error, fuzzy match
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -v -dump-input=always check2 < input2 |& sed -n '/^<<<</,$p'
      <<<<<<
              1: foo bar
      next:2         !~~ error: match on wrong line
      >>>>>>
      
      $ cat check2
      CHECK: foo
      CHECK-NEXT: bar
      
      $ cat input2
      foo bar
      ```
      
      Reviewed By: george.karpenkov, probinson
      
      Differential Revision: https://reviews.llvm.org/D53894
      
      llvm-svn: 349420
    • [FileCheck] Annotate input dump (2/7) · 2c007c80
      Joel E. Denny authored
      This patch implements input annotations for diagnostics that suggest
      fuzzy matches for directives for which no matches were found.  Instead
      of using the usual `^~~`, which is used by later patches for good
      matches, these annotations use `?` so that fuzzy matches are visually
      distinct.  No tildes are included as these diagnostics (independently
      of this patch) currently identify only the start of the match.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the only match result for a pattern of type T from line L of
                 the check file
        - T:L'N  labels the Nth match result for a pattern of type T from line L of
                 the check file
        - X~~    marks search range when no match is found
        - ?      marks fuzzy match when no match is found
        - colors error, fuzzy match
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -v -dump-input=always check1 < input1 |& sed -n '/^<<<</,$p'
      <<<<<<
                1: ; abc def
                2: ; ghI jkl
      next:3'0     X~~~~~~~~ error: no match found
      next:3'1       ?       possible intended match
      >>>>>>
      
      $ cat check1
      CHECK: abc
      CHECK-SAME: def
      CHECK-NEXT: ghi
      CHECK-SAME: jkl
      
      $ cat input1
      ; abc def
      ; ghI jkl
      ```
      
      This patch introduces the concept of multiple "match results" per
      directive.  In the above example, the first match result for the
      CHECK-NEXT directive is the failed match, for which the annotation
      shows the search range.  The second match result is the fuzzy match.
      Later patches will introduce other cases of multiple match results per
      directive.
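
To make the `T:L` and `T:L'N` labels concrete, here is a minimal sketch that rebuilds the two annotation lines from the example. The helper names and the fixed text column (13, read off the example dump) are assumptions made for illustration; FileCheck computes its layout itself.

```python
# Illustrative only: format annotation lines in the style of the dump above.

def annotation_label(kind, check_line, index=None):
    """Build `T:L`, or `T:L'N` when a directive has several match results."""
    label = f"{kind}:{check_line}"
    return label if index is None else f"{label}'{index}"


def annotation_line(label, text_column, start, width, lead, note=""):
    """Place a marker `width` characters wide at offset `start` of the input line."""
    marker = lead + "~" * (width - 1)
    line = label.ljust(text_column) + " " * start + marker
    return f"{line} {note}" if note else line


# First match result: the failed search range over "; ghI jkl" (9 characters).
print(annotation_line(annotation_label("next", 3, 0), 13, 0, 9, "X",
                      "error: no match found"))
# Second match result: the fuzzy match, which marks only its start.
print(annotation_line(annotation_label("next", 3, 1), 13, 2, 1, "?"))
```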
      
      When colors are enabled, `?` is colored magenta.  That is, it doesn't
      indicate the actual error, which a red `X~~` marker indicates, but its
      color suggests it's closely related.
      
      Reviewed By: george.karpenkov, probinson
      
      Differential Revision: https://reviews.llvm.org/D53893
      
      llvm-svn: 349419
    • [FileCheck] Annotate input dump (1/7) · 3c5d267e
      Joel E. Denny authored
      Extend FileCheck to dump its input annotated with FileCheck's
      diagnostics: errors, good matches if -v, and additional information if
      -vv.  The goal is to make it easier to visualize FileCheck's matching
      behavior when debugging.
      
      Each patch in this series implements input annotations for a
      particular category of FileCheck diagnostics.  While the first few
      patches alone are somewhat useful, the annotations become much more
      useful as later patches implement annotations for -v and -vv
      diagnostics, which show the matching behavior leading up to the error.
      
      This first patch implements boilerplate plus input annotations for
      error diagnostics reporting that no matches were found for a
      directive.  These annotations mark the search ranges of the failed
      directives.  Instead of using the usual `^~~`, which is used by later
      patches for good matches, these annotations use `X~~` so that this
      category of errors is visually distinct.
      
      For example:
      
      ```
      $ FileCheck -dump-input=help
      The following description was requested by -dump-input=help to
      explain the input annotations printed by -dump-input=always and
      -dump-input=fail:
      
        - L:     labels line number L of the input file
        - T:L    labels the match result for a pattern of type T from line L of
                 the check file
        - X~~    marks search range when no match is found
        - colors error
      
      If you are not seeing color above or in input dumps, try: -color
      
      $ FileCheck -v -dump-input=always check1 < input1 |& sed -n '/^Input file/,$p'
      Input file: <stdin>
      Check file: check1
      
      -dump-input=help describes the format of the following dump.
      
      Full input was:
      <<<<<<
              1: ; abc def
              2: ; ghI jkl
      next:3     X~~~~~~~~ error: no match found
      >>>>>>
      
      $ cat check1
      CHECK: abc
      CHECK-SAME: def
      CHECK-NEXT: ghi
      CHECK-SAME: jkl
      
      $ cat input1
      ; abc def
      ; ghI jkl
      ```
      
      Some additional details related to the boilerplate:
      
      * Enabling: The annotated input dump is enabled by `-dump-input`,
        which can also be set via the `FILECHECK_OPTS` environment variable.
        Accepted values are `help`, `always`, `fail`, or `never`.  As shown
        above, `help` describes the format of the dump.  `always` is helpful
        when you want to investigate a successful FileCheck run, perhaps for
        an unexpected pass. `-dump-input-on-failure` and
        `FILECHECK_DUMP_INPUT_ON_FAILURE` remain as deprecated aliases for
        `-dump-input=fail`.
      
      * Diagnostics: The usual diagnostics are not suppressed in this mode
        and are printed first.  For brevity in the example above, I've
        omitted them using a sed command.  Sometimes they're perfectly
        sufficient, and then they make debugging quicker than if you were
        forced to hunt through a dump of long input looking for the error.
        If you think they'll get in the way sometimes, keep in mind that
        it's pretty easy to grep for the start of the input dump, which is
        `<<<`.
      
      * Colored Annotations: The annotated input is colored if colors are
        enabled (enabling colors can be forced using -color).  For example,
        errors are red.  However, as in the above example, colors are not
        vital to reading the annotations.
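
A test driver might combine the points above roughly as follows. This is a sketch under stated assumptions: the helper names `filecheck_env` and `input_dump` are hypothetical, the environment helper overwrites any existing `FILECHECK_OPTS` value, and the dump is located by looking for the first line that starts with `<<<`, as suggested above.

```python
# Illustrative helpers; they do not invoke FileCheck themselves.
import os

DUMP_INPUT_VALUES = ("help", "always", "fail", "never")


def filecheck_env(dump_input="fail", base=None):
    """Return an environment that sets -dump-input through FILECHECK_OPTS."""
    if dump_input not in DUMP_INPUT_VALUES:
        raise ValueError(f"unsupported -dump-input value: {dump_input!r}")
    env = dict(os.environ if base is None else base)
    env["FILECHECK_OPTS"] = f"-dump-input={dump_input}"
    return env


def input_dump(output):
    """Return the lines of FileCheck output from the start of the input dump."""
    lines = output.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("<<<"):
            return lines[i:]
    return []


sample = "check1:3:13: error: ...\n<<<<<<\n        1: ; abc def\n>>>>>>\n"
print(filecheck_env("always", base={}))
print(input_dump(sample))
```

The returned environment could then be passed as the `env` argument of `subprocess.run` when launching a test that calls FileCheck.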
      
      I don't know how to test color in the output, so any hints here would
      be appreciated.
      
      Reviewed By: george.karpenkov, zturner, probinson
      
      Differential Revision: https://reviews.llvm.org/D52999
      
      llvm-svn: 349418
    • A few small updates to the testsuite for running against an iOS device. · f47c734e
      Jason Molenda authored
      Remove the expected-fails for 34538611; using an alternate platform
      implementation handles these correctly.
      
      llvm-svn: 349417