Commit 3f40c690, authored 16 years ago by Evan Cheng
On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16.
llvm-svn: 51019
parent d78c400b
Showing 3 changed files with 43 additions and 27 deletions:
llvm/lib/Target/X86/README-SSE.txt: 0 additions, 25 deletions
llvm/lib/Target/X86/X86InstrInfo.td: 28 additions, 2 deletions
llvm/test/CodeGen/X86/vec_set-H.ll: 15 additions, 0 deletions
llvm/lib/Target/X86/README-SSE.txt (+0, -25)
@@ -757,31 +757,6 @@ or iseling it.
 //===---------------------------------------------------------------------===//
-Take the following code:
-#include <xmmintrin.h>
-__m128i doload64(short x) {return _mm_set_epi16(x,x,x,x,x,x,x,x);}
-LLVM currently generates the following on x86:
-doload64:
-        movzwl 4(%esp), %eax
-        movd %eax, %xmm0
-        punpcklwd %xmm0, %xmm0
-        pshufd $0, %xmm0, %xmm0
-        ret
-gcc's generated code:
-doload64:
-        movd 4(%esp), %xmm0
-        punpcklwd %xmm0, %xmm0
-        pshufd $0, %xmm0, %xmm0
-        ret
-LLVM should be able to generate the same thing as gcc. This looks like it is
-just a matter of matching (scalar_to_vector (load x)) to movd.
-//===---------------------------------------------------------------------===//
 LLVM currently generates stack realignment code, when it is not necessary
 needed. The problem is that we need to know about stack alignment too early,
 before RA runs.
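The README note removed by this commit compares two instruction sequences that differ only in whether the 16-bit argument is zero-extended before `movd`. A minimal sketch of why the extension is unnecessary, assuming a simplified lane model of the three SSE2 instructions (the `Xmm` alias and the helper names `movd`, `punpcklwd`, `pshufd0`, and `splat_low_word` are hypothetical stand-ins written for this illustration, not LLVM or intrinsic APIs):

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of an XMM register as eight 16-bit lanes, lane 0 lowest.
using Xmm = std::array<std::uint16_t, 8>;

// Models movd: the 32-bit value lands in the low dword, the rest is zeroed.
Xmm movd(std::uint32_t v) {
    Xmm r{};
    r[0] = static_cast<std::uint16_t>(v & 0xFFFFu);
    r[1] = static_cast<std::uint16_t>(v >> 16);
    return r;
}

// Models punpcklwd: interleave the low four words of dst and src.
Xmm punpcklwd(const Xmm& dst, const Xmm& src) {
    Xmm r{};
    for (int i = 0; i < 4; ++i) {
        r[2 * i] = dst[i];
        r[2 * i + 1] = src[i];
    }
    return r;
}

// Models pshufd $0: copy dword 0 into all four dword positions.
Xmm pshufd0(const Xmm& src) {
    Xmm r{};
    for (int i = 0; i < 4; ++i) {
        r[2 * i] = src[0];
        r[2 * i + 1] = src[1];
    }
    return r;
}

// Splat the low 16 bits of a 32-bit stack slot across all eight lanes.
// The upper 16 bits of the slot may hold anything (an "any-extended" value).
Xmm splat_low_word(std::uint32_t slot) {
    Xmm x = movd(slot);
    x = punpcklwd(x, x);
    return pshufd0(x);
}
```

In this model, `punpcklwd` duplicates word 0 into both halves of dword 0, so the word that came from the upper half of the slot never reaches the final result. Calling `splat_low_word(0x00001234u)` and `splat_low_word(0xDEAD1234u)` yields the same eight lanes, which is the property that makes the explicit `movzwl` redundant here.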
llvm/lib/Target/X86/X86InstrInfo.td (+28, -2)
@@ -229,9 +229,35 @@ def i32immSExt8 : PatLeaf<(i32 imm), [{
 }]>;
 // Helper fragments for loads.
+// It's always safe to treat a anyext i16 load as a i32 load. Ditto for
+// i8 to i16.
+def loadi16 : PatFrag<(ops node:$ptr), (i16 (ld node:$ptr)), [{
+  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N)) {
+    if (LD->getAddressingMode() != ISD::UNINDEXED)
+      return false;
+    ISD::LoadExtType ExtType = LD->getExtensionType();
+    if (ExtType == ISD::NON_EXTLOAD)
+      return true;
+    if (ExtType == ISD::EXTLOAD)
+      return LD->getAlignment() >= 16;
+  }
+  return false;
+}]>;
+def loadi32 : PatFrag<(ops node:$ptr), (i32 (ld node:$ptr)), [{
+  if (LoadSDNode *LD = dyn_cast<LoadSDNode>(N)) {
+    if (LD->getAddressingMode() != ISD::UNINDEXED)
+      return false;
+    ISD::LoadExtType ExtType = LD->getExtensionType();
+    if (ExtType == ISD::NON_EXTLOAD)
+      return true;
+    if (ExtType == ISD::EXTLOAD)
+      return LD->getAlignment() >= 16;
+  }
+  return false;
+}]>;
 def loadi8 : PatFrag<(ops node:$ptr), (i8 (load node:$ptr))>;
-def loadi16 : PatFrag<(ops node:$ptr), (i16 (load node:$ptr))>;
-def loadi32 : PatFrag<(ops node:$ptr), (i32 (load node:$ptr))>;
 def loadi64 : PatFrag<(ops node:$ptr), (i64 (load node:$ptr))>;
 def loadf32 : PatFrag<(ops node:$ptr), (f32 (load node:$ptr))>;
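The decision logic in the new `loadi16` and `loadi32` fragments can be read as a three-way filter: reject indexed loads, accept plain loads, and accept any-extending loads only above an alignment threshold. The following is a standalone sketch of that logic, assuming stand-in types (`AddrMode`, `ExtType`, `LoadInfo`, and `matchesWidenableLoad` are hypothetical names invented for this illustration; they only mirror the fields the predicate reads and are not LLVM's SelectionDAG API):

```cpp
// Hypothetical stand-ins for the properties the PatFrag predicate inspects.
enum class AddrMode { Unindexed, PreInc, PostInc };
enum class ExtType { NonExt, AnyExt, SignExt, ZeroExt };

struct LoadInfo {
    AddrMode mode;       // corresponds to getAddressingMode() in the diff
    ExtType ext;         // corresponds to getExtensionType() in the diff
    unsigned alignment;  // corresponds to getAlignment() in the diff
};

// Mirrors the control flow of the predicate bodies in the diff.
// minAlign is a parameter here; the committed code compares against 16.
bool matchesWidenableLoad(const LoadInfo& ld, unsigned minAlign) {
    if (ld.mode != AddrMode::Unindexed)
        return false;                     // indexed loads never match
    if (ld.ext == ExtType::NonExt)
        return true;                      // a plain load always matches
    if (ld.ext == ExtType::AnyExt)
        return ld.alignment >= minAlign;  // anyext load: only if aligned enough
    return false;                         // sign/zero extension is not handled
}
```

Under this model, `matchesWidenableLoad({AddrMode::Unindexed, ExtType::AnyExt, 16}, 16)` is accepted, while the same load with `ExtType::ZeroExt` is rejected because its upper bits carry meaning that a wider plain load would not reproduce.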
llvm/test/CodeGen/X86/vec_set-H.ll (new file, 0 → 100644, +15, -0)
+; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 | not grep movz
+define <2 x i64> @doload64(i16 signext %x) nounwind {
+entry:
+        %tmp36 = insertelement <8 x i16> undef, i16 %x, i32 0           ; <<8 x i16>> [#uses=1]
+        %tmp37 = insertelement <8 x i16> %tmp36, i16 %x, i32 1          ; <<8 x i16>> [#uses=1]
+        %tmp38 = insertelement <8 x i16> %tmp37, i16 %x, i32 2          ; <<8 x i16>> [#uses=1]
+        %tmp39 = insertelement <8 x i16> %tmp38, i16 %x, i32 3          ; <<8 x i16>> [#uses=1]
+        %tmp40 = insertelement <8 x i16> %tmp39, i16 %x, i32 4          ; <<8 x i16>> [#uses=1]
+        %tmp41 = insertelement <8 x i16> %tmp40, i16 %x, i32 5          ; <<8 x i16>> [#uses=1]
+        %tmp42 = insertelement <8 x i16> %tmp41, i16 %x, i32 6          ; <<8 x i16>> [#uses=1]
+        %tmp43 = insertelement <8 x i16> %tmp42, i16 %x, i32 7          ; <<8 x i16>> [#uses=1]
+        %tmp46 = bitcast <8 x i16> %tmp43 to <2 x i64>          ; <<2 x i64>> [#uses=1]
+        ret <2 x i64> %tmp46
+}