- Nov 17, 2013
-
-
Hal Finkel authored
llvm-svn: 194941
-
Matt Arsenault authored
llvm-svn: 194940
-
Hal Finkel authored
This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this:

    for (int i = 0; i < 3200; i += 5) {
      a[i] += alpha * b[i];
      a[i + 1] += alpha * b[i + 1];
      a[i + 2] += alpha * b[i + 2];
      a[i + 3] += alpha * b[i + 3];
      a[i + 4] += alpha * b[i + 4];
    }

and turn them into this:

    for (int i = 0; i < 3200; ++i) {
      a[i] += alpha * b[i];
    }

and loops like this:

    for (int i = 0; i < 500; ++i) {
      x[3*i] = foo(0);
      x[3*i+1] = foo(0);
      x[3*i+2] = foo(0);
    }

and turn them into this:

    for (int i = 0; i < 1500; ++i) {
      x[i] = foo(0);
    }

There are two motivations for this transformation:

1. Code-size reduction (especially relevant, obviously, when compiling for code size).

2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target.

The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases.

This pass is not currently enabled by default at any optimization level.

llvm-svn: 194939
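The equivalence that rerolling relies on can be checked by hand in plain C. The sketch below is only an illustration of the two example loops from the message above, not code from the pass itself. The function names, the sample data, and the body of `foo` (a call counter standing in for a function with side effects) are assumptions made for the example.

```c
enum { N = 3200, M = 1500 };

/* First example: manually unrolled by 5. */
void axpy_unrolled(float *a, const float *b, float alpha) {
    for (int i = 0; i < N; i += 5) {
        a[i]     += alpha * b[i];
        a[i + 1] += alpha * b[i + 1];
        a[i + 2] += alpha * b[i + 2];
        a[i + 3] += alpha * b[i + 3];
        a[i + 4] += alpha * b[i + 4];
    }
}

/* The same computation after rerolling: one statement, unit stride. */
void axpy_rerolled(float *a, const float *b, float alpha) {
    for (int i = 0; i < N; ++i)
        a[i] += alpha * b[i];
}

/* Hypothetical stand-in for foo(): counts calls so side effects are visible. */
static int foo_calls;
static int foo(int v) { ++foo_calls; return v + 42; }

/* Second example: 500 iterations, three stores each. */
void fill_unrolled(int *x) {
    for (int i = 0; i < 500; ++i) {
        x[3 * i]     = foo(0);
        x[3 * i + 1] = foo(0);
        x[3 * i + 2] = foo(0);
    }
}

/* Rerolled form: the trip count grows from 500 to 1500. */
void fill_rerolled(int *x) {
    for (int i = 0; i < M; ++i)
        x[i] = foo(0);
}

/* Returns 1 if both axpy forms give identical arrays on sample data. */
int axpy_forms_agree(void) {
    static float a1[N], a2[N], b[N];
    for (int i = 0; i < N; ++i) {
        a1[i] = a2[i] = 1.0f;
        b[i] = (float)i; /* small integers keep the arithmetic exact */
    }
    axpy_unrolled(a1, b, 2.0f);
    axpy_rerolled(a2, b, 2.0f);
    for (int i = 0; i < N; ++i)
        if (a1[i] != a2[i]) return 0;
    return 1;
}

/* Returns 1 if both fill forms store the same values with 1500 calls each. */
int fill_forms_agree(void) {
    static int x1[M], x2[M];
    foo_calls = 0;
    fill_unrolled(x1);
    int calls_unrolled = foo_calls;
    foo_calls = 0;
    fill_rerolled(x2);
    int calls_rerolled = foo_calls;
    if (calls_unrolled != M || calls_rerolled != M) return 0;
    for (int i = 0; i < M; ++i)
        if (x1[i] != x2[i]) return 0;
    return 1;
}
```

Both checkers return 1 when the unrolled and rerolled forms agree. In the second pair the number and order of `foo` calls is preserved, which is the kind of side-effect constraint the message mentions.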
-
Fariborz Jahanian authored
CF objects with objc_bridge'ing annotation. // rdar://15454846 llvm-svn: 194938
-
- Nov 16, 2013
-
-
Juergen Ributzka authored
llvm-svn: 194936
-
Hal Finkel authored
InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls:

    (fptrunc (sqrt (fpext x))) -> (sqrtf x)

but does not apply the same optimization to llvm.sqrt. This is a problem because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in fast-math mode, and because this optimization is being applied to sqrt and not applied to llvm.sqrt, sometimes the fast-math code is slower.

This change makes InstCombine apply this optimization to llvm.sqrt as well.

This fixes the specific problem in PR17758, although the same underlying issue (optimizations applied to libcalls are not applied to intrinsics) exists for other optimizations in SimplifyLibCalls.

llvm-svn: 194935
-
Matt Arsenault authored
llvm-svn: 194934
-
Matt Arsenault authored
llvm-svn: 194932
-
Tobias Grosser authored
llvm-svn: 194931
-
Fariborz Jahanian authored
of ObjectiveC objects to CF types when CF type has the objc_bridge attribute. llvm-svn: 194930
-
Benjamin Kramer authored
This was a source of bugs in the past. llvm-svn: 194929
-
Benjamin Kramer authored
Fix ScalarEvolution bugs uncovered by this. llvm-svn: 194928
-
Vincent Lejeune authored
llvm-svn: 194927
-
Duncan P. N. Exon Smith authored
Per Rafael's review of r194514. llvm-svn: 194926
-
Benjamin Kramer authored
This is common in bitfield code. llvm-svn: 194925
-
Duncan P. N. Exon Smith authored
llvm-svn: 194924
-
Sebastian Pop authored
llvm-svn: 194923
-
Sebastian Pop authored
to be able to call the same functionality from registerPollyEarlyAsPossiblePasses and registerPollyOptLevel0Passes. llvm-svn: 194922
-
Sebastian Pop authored
llvm-svn: 194921
-
Benjamin Kramer authored
llvm-svn: 194920
-
Alp Toker authored
clang -cc1 skips the driver so it never made sense to include these with the Driver tests. Basic type tests and flag tests generally both go in Frontend. Now that the final -cc1 tests have been moved out of test/Driver, add a local substitution to enforce and detect future mistakes. These miscategorized tests were probably the source of confusion in r194817. llvm-svn: 194919
-
NAKAMURA Takumi authored
llvm-svn: 194918
-
Manman Ren authored
No functionality change. llvm-svn: 194917
-
Richard Smith authored
it's also __attribute__((used)), since that undoes the problematic part of 'inline'. llvm-svn: 194916
-
Fariborz Jahanian authored
// rdar://15454846. llvm-svn: 194915
-
Rui Ueyama authored
No functionality change. llvm-svn: 194914
-
Rui Ueyama authored
llvm-svn: 194913
-
Jason Molenda authored
(and same thing to Thread base class) which can be used when looking at an ExtendedBacktrace thread; it will try to find the IndexID() of the original thread that was executing this backtrace when it was recorded. If lldb can't find a record of that thread, it will return the same value as IndexID() for the ExtendedBacktrace thread. llvm-svn: 194912
-
Rui Ueyama authored
llvm-svn: 194911
-
Tobias Grosser authored
llvm-svn: 194910
-
Rui Ueyama authored
llvm-svn: 194909
-
Rui Ueyama authored
llvm-svn: 194908
-
Jim Grosbach authored
Teach the '-arch' command line option to enable the compiler-friendly features of core-avx2 CPUs on Darwin. Pass the information along in the target triple like Darwin+ARM does. llvm-svn: 194907
-
Jim Grosbach authored
llvm-svn: 194906
-
Richard Smith authored
projects are relying on such (questionable) practices, so we should give them a way to opt out of this diagnostic. llvm-svn: 194905
-
Matt Arsenault authored
llvm-svn: 194904
-
Matt Arsenault authored
The tests just hit this with a different sized address space since I haven't figured out how to use this to break it. I thought I committed this a long time ago, and I'm not sure why missing this hasn't caused any problems. llvm-svn: 194903
-
David Blaikie authored
llvm-svn: 194902
-
David Blaikie authored
llvm-svn: 194901
-
DeLesley Hutchins authored
Earlier versions discarded the state too soon, and did not track state changes, e.g. when passing a temporary to a move constructor. Patch by chris.wailes@gmail.com; review and minor fixes by delesley. llvm-svn: 194900
-