- Feb 10, 2011
-
Douglas Gregor authored
AST/PCH files more lazy:
- Don't preload all of the file source-location entries when reading the AST file. Instead, load them lazily, when needed.
- Only look up header-search information (whether a header was already #import'd, how many times it's been included, etc.) when it's needed by the preprocessor, rather than pre-populating it.

Previously, we would pre-load all of the file source-location entries, which also populated the header-search information structure. This was a relatively minor performance issue, since we would end up stat()'ing all of the headers stored within an AST/PCH file when the AST/PCH file was loaded. In the normal PCH use case, the stat()s were cached, so the cost (preloading ~860 source-location entries in the Cocoa.h case) was relatively low.

However, the recent optimization that replaced stat+open with open+fstat turned this into a major problem, since the preloading of source-location entries would now end up opening those files. Worse, those files wouldn't be closed until the file manager was destroyed, so just opening a Cocoa.h PCH file would hold on to ~860 file descriptors, and it was easy to blow through the process's limit on the number of open file descriptors.

By eliminating the preloading of these files, we neither open nor stat the headers stored in the PCH/AST file until they're actually needed for something. Concretely, with a trivial program that uses a chained PCH including a Cocoa PCH, we went from

    *** HeaderSearch Stats:
      835 files tracked.
      364 #import/#pragma once files.
      823 included exactly once.
      6 max times a file is included.
      3 #include/#include_next/#import.
      0 #includes skipped due to the multi-include optimization.
      1 framework lookups.
      0 subframework lookups.
    *** Source Manager Stats:
      835 files mapped, 3 mem buffers mapped.
      37460 SLocEntry's allocated, 11215575B of Sloc address space used.
      62 bytes of files mapped, 0 files with line #'s computed.

to

    *** HeaderSearch Stats:
      4 files tracked.
      1 #import/#pragma once files.
      3 included exactly once.
      2 max times a file is included.
      3 #include/#include_next/#import.
      0 #includes skipped due to the multi-include optimization.
      1 framework lookups.
      0 subframework lookups.
    *** Source Manager Stats:
      3 files mapped, 3 mem buffers mapped.
      37460 SLocEntry's allocated, 11215575B of Sloc address space used.
      62 bytes of files mapped, 0 files with line #'s computed.

for the same program.

llvm-svn: 125286
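A minimal sketch of the lazy-open pattern described above, in generic C++: paths are recorded cheaply at load time, and a descriptor is opened only on first use. The LazyFileTable name and layout here are illustrative assumptions, not Clang's actual FileManager/SourceManager code.

    #include <cstdio>
    #include <string>
    #include <unordered_map>
    #include <vector>

    class LazyFileTable {
      std::vector<std::string> Paths;          // recorded at AST-load time
      std::unordered_map<size_t, FILE *> Open; // materialized on demand
    public:
      explicit LazyFileTable(std::vector<std::string> P)
          : Paths(std::move(P)) {}

      // Old behavior, conceptually: open every file up front, holding one
      // descriptor per header (~860 for a Cocoa.h PCH). New behavior: open
      // a file only when some client actually asks for it.
      FILE *get(size_t Index) {
        auto It = Open.find(Index);
        if (It != Open.end())
          return It->second;
        FILE *F = std::fopen(Paths[Index].c_str(), "rb");
        Open.emplace(Index, F);
        return F;
      }

      ~LazyFileTable() {
        for (auto &KV : Open)
          if (KV.second)
            std::fclose(KV.second);
      }
    };

With this shape, the descriptor count tracks what was actually used rather than the number of headers stored in the PCH.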
-
Roman Divacky authored
is specified in the FreeBSD linker driver. llvm-svn: 125285
-
David Greene authored
[AVX] Implement 256-bit vector lowering for EXTRACT_VECTOR_ELT. llvm-svn: 125284
-
Roman Divacky authored
llvm-svn: 125283
-
Roman Divacky authored
llvm-svn: 125282
-
Ken Dyck authored
character units. llvm-svn: 125281
-
Ken Dyck authored
r125156. llvm-svn: 125280
-
Che-Liang Chiou authored
llvm-svn: 125279
-
NAKAMURA Takumi authored
Unixen and Cygwin do not need it. llvm-svn: 125277
-
NAKAMURA Takumi authored
llvm-svn: 125275
-
NAKAMURA Takumi authored
llvm-svn: 125274
-
NAKAMURA Takumi authored
llvm-svn: 125273
-
NAKAMURA Takumi authored
llvm-svn: 125272
-
Chris Lattner authored
gep to explicit addressing, we know that none of the intermediate computation overflows.

This could use review: it seems that the shifts certainly wouldn't overflow, but could the intermediate adds overflow if there is a negative index?

Previously the testcase would instcombine to:

    define i1 @test(i64 %i) {
      %p1.idx.mask = and i64 %i, 4611686018427387903
      %cmp = icmp eq i64 %p1.idx.mask, 1000
      ret i1 %cmp
    }

now we get:

    define i1 @test(i64 %i) {
      %cmp = icmp eq i64 %i, 1000
      ret i1 %cmp
    }

llvm-svn: 125271
-
Chris Lattner authored
for NSW/NUW binops to follow the pattern of exact binops. This allows someone to use Builder.CreateAdd(x, y, "tmp", MaybeNUW); llvm-svn: 125270
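A hedged usage sketch of the API shape described above; the emitAdd wrapper and the MaybeNUW flag are hypothetical, and the include path is the one used in current LLVM trees. The trailing HasNUW/HasNSW booleans on CreateAdd mirror the isExact parameter on the exact binops.

    #include "llvm/IR/IRBuilder.h"
    using namespace llvm;

    // Emit an add, optionally tagging it no-unsigned-wrap, without choosing
    // between CreateAdd and CreateNUWAdd at the call site.
    Value *emitAdd(IRBuilder<> &Builder, Value *X, Value *Y, bool MaybeNUW) {
      return Builder.CreateAdd(X, Y, "tmp", /*HasNUW=*/MaybeNUW,
                               /*HasNSW=*/false);
    }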
-
Greg Clayton authored
llvm-svn: 125269
-
John McCall authored
linkage into Decl.cpp. Disable this logic for extern "C" functions, because the operative rule there is weaker. Fixes rdar://problem/8898466 llvm-svn: 125268
-
Chris Lattner authored
exact/nsw/nuw shifts and have instcombine infer them when it can prove that the relevant properties are true for a given shift without them. Also, a variety of refactoring to use the new patternmatch logic thrown in for good luck. I believe that this takes care of a bunch of related code quality issues attached to PR8862. llvm-svn: 125267
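A minimal sketch of the kind of inference described, under stated assumptions: it is illustrative rather than the actual InstCombine code, and maybeMarkExact with its KnownZero parameter is hypothetical. An lshr is exact precisely when the bits it shifts out are known zero, so a pass that can prove that may set the flag.

    #include "llvm/ADT/APInt.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/IR/PatternMatch.h"
    using namespace llvm;
    using namespace PatternMatch;

    void maybeMarkExact(BinaryOperator &Shr, const APInt &KnownZero) {
      ConstantInt *ShAmtC;
      if (!match(&Shr, m_LShr(m_Value(), m_ConstantInt(ShAmtC))))
        return;
      uint64_t ShAmt = ShAmtC->getZExtValue();
      // The low ShAmt bits are discarded by the shift; if they are all
      // known zero, no information is lost and the shift is exact.
      APInt LowBits = APInt::getLowBitsSet(KnownZero.getBitWidth(), ShAmt);
      if ((KnownZero & LowBits) == LowBits)
        Shr.setIsExact(true);
    }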
-
Chris Lattner authored
optimizations to be much more aggressive in the face of exact/nsw/nuw div and shifts. For example, these (which are the same except that the first is an 'exact' sdiv):

    define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
      %A = sdiv exact i64 %X, -5  ; X/-5 == 0 --> x == 0
      %B = icmp eq i64 %A, 0
      ret i1 %B
    }

    define i1 @sdiv_icmp4(i64 %X) nounwind {
      %A = sdiv i64 %X, -5  ; X/-5 == 0 --> x == 0
      %B = icmp eq i64 %A, 0
      ret i1 %B
    }

compile down to:

    define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
      %1 = icmp eq i64 %X, 0
      ret i1 %1
    }

    define i1 @sdiv_icmp4(i64 %X) nounwind {
      %X.off = add i64 %X, 4
      %1 = icmp ult i64 %X.off, 9
      ret i1 %1
    }

(In the non-exact case, sdiv truncates toward zero, so X/-5 == 0 exactly when X is in the range (-5, 5); adding 4 maps that range onto [0, 9), hence the unsigned compare.) This happens when you do something like:

    (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266
-
Chris Lattner authored
conversions". :) llvm-svn: 125265
-
Chris Lattner authored
and generally tidying things up. Only very trivial functionality changes, like now doing (-1 - A) -> (~A) for vectors too.

    InstCombineAddSub.cpp | 296 +++++++++++++++++++++-----------------------
    1 file changed, 126 insertions(+), 170 deletions(-)

llvm-svn: 125264
-
Chris Lattner authored
are shifting out since they do require them to be zeros. Similarly for the NUW/NSW bits of shl. llvm-svn: 125263
-
Ted Kremenek authored
llvm-svn: 125262
-
Ted Kremenek authored
This reduces memory usage of the analyzer on sqlite by another 5%. llvm-svn: 125260
-
Evan Cheng authored
After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508. llvm-svn: 125259
-
Johnny Chen authored
and a helper method UnalignedSupport(). llvm-svn: 125258
-
Eric Christopher authored
llvm-svn: 125257
-
Bill Wendling authored
llvm-svn: 125256
-
Caroline Tice authored
input reader. Always make sure the input reader stack is not empty before trying to get the top element from the stack. llvm-svn: 125255
-
Cameron Zwarich authored
Natural Loop Information
Loop Pass Manager
  Canonicalize natural loops
Scalar Evolution Analysis
Loop Pass Manager
  Induction Variable Users
  Canonicalize natural loops
  Induction Variable Users
  Loop Strength Reduction

into this:

Scalar Evolution Analysis
Loop Pass Manager
  Canonicalize natural loops
  Induction Variable Users
  Loop Strength Reduction

This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of thing automatically, but it seems easier to just change the ordering of the passes if this is the only case.

llvm-svn: 125254
-
Ted Kremenek authored
This is a hack because we really should only search in the 'include/clang/StaticAnalyzer' directory if we are in 'lib/StaticAnalyzer'. My CMake knowledge is limited, so I appeal to anyone with more expertise. llvm-svn: 125252
-
Ted Kremenek authored
Split 'include/clang/StaticAnalyzer' into 'include/clang/StaticAnalyzer/Core' and 'include/clang/StaticAnalyzer/Checkers'. This layout matches lib/StaticAnalyzer, which corresponds to two StaticAnalyzer libraries. llvm-svn: 125251
-
Devang Patel authored
llvm-svn: 125250
-
Devang Patel authored
llvm-svn: 125249
-
Jim Grosbach authored
When matching operands for a candidate opcode match in the auto-generated AsmMatcher, check each operand against the expected operand match class. Previously, operands were classified independently of the opcode being handled, which led to difficulties when operand match classes were more complicated than simple subclass relationships. llvm-svn: 125245
-
Johnny Chen authored
of the CPSR during the course of executing an opcode, and modified SelectInstrSet() to update this variable instead of the original m_inst_cpsr, which should be the cached copy of the CPSR at the beginning of executing the opcode. llvm-svn: 125244
-
Jakob Stoklund Olesen authored
Loop splitting is better handled by the more generic global region splitting based on the edge bundle graph. llvm-svn: 125243
-
Johnny Chen authored
and a helper method ALUWritePC(Context&, uint32_t). llvm-svn: 125241
-
Greg Clayton authored
indirect forms, deals with empty DW_AT_comp_dir attributes, and fixups for handling other signed integer types. llvm-svn: 125240
-
Douglas Gregor authored
I have another way to achieve the same goal. llvm-svn: 125239
-