A Problem with LLVM's Undef
LLVM has a special value in its SSA value hierarchy called undef
that is used to model (amongst other things) reads from uninitialized
memory. Semantically, an undef
value has a potentially new bit
pattern of the compiler’s choosing at each use site, meaning that
values like xor i32 %a, %a
need not always evaluate to 0
when
%a
is undef
(even though they’re allowed to). This lack of
consistency lets LLVM get away without allocating registers to
remember a specific “version” of undef
.
Another way to look at this is that undef
isn’t a normal SSA value,
and uses of an undef
value are also its defs. This leads to
some interesting restrictions on data flow analysis via control flow,
and, in some cases, accounting for undef
inhibits optimization
instead of enabling it.
For instance, consider this:
declare i1 @predicate()
declare void @use(i32)
define void @f(i32 %d, i32 %n) {
entry:
%division_unsafe = icmp eq i32 %d, 0
br i1 %division_unsafe, label %leave, label %loop.ph
loop.ph:
br label %loop
loop:
%iv = phi i32 [ 0 , %loop.ph ], [ %iv.inc, %be ]
%iv.inc = add i32 %iv, 1
%p = call i1 @predicate()
br i1 %p, label %divide, label %be
divide:
%q = udiv i32 1, %d
call void @use(i32 %q)
br label %be
be:
%be.cond = icmp ult i32 %iv, %n
br i1 %be.cond, label %loop, label %leave
leave:
ret void
}
Is it legal to hoist the division %q
in divide
to the loop
preheader? At first, it does look like it is legal, since the loop
preheader is guarded on the %d
not being 0
. However, in the
transformed program
declare i1 @predicate()
declare void @use(i32)
define void @f(i32 %d, i32 %n) {
entry:
%division_unsafe = icmp eq i32 %d, 0
br i1 %division_unsafe, label %leave, label %loop.ph
loop.ph:
%q = udiv i32 1, %d
br label %loop
loop:
%iv = phi i32 [ 0 , %loop.ph ], [ %iv.inc, %be ]
%iv.inc = add i32 %iv, 1
%p = call i1 @predicate()
br i1 %p, label %divide, label %be
divide:
call void @use(i32 %q)
br label %be
be:
%be.cond = icmp ult i32 %iv, %n
br i1 %be.cond, label %loop, label %leave
leave:
ret void
}
we have a problem. If %d
is undef
, then %division_unsafe = icmp
eq i32 undef, 0
is allowed to be false
while %q = udiv i32 1,
undef
is allowed to have undefined behavior (by choosing 0
for
undef
). If @predicate
always returns false
then the original
program is perfectly well defined for %d
= undef
while the
transformed program isn’t, even though the transform we made looks
very reasonable.
Generally, with undef
in the play, control dependence in the control
flow graph cannot be used to derive facts about SSA values. If the
value we’re interested in happens to be undef
, then it can “pretend”
to satisfy the predicate the control dependence is on while
“pretending” to not satisfy the predicate on later control dependent
uses of the same value. In cases like the above, presence of undef
in LLVM IR actually inhibits optimization.
This problem isn’t unique to branches – many kinds of correlated
value or predicate analyses are problematic. Consider %expr =
smax(%a + 1, %a) - smin(%a + 1, %a)
, with smax
implemented using
select
and icmp
. Is %expr
always non-zero? In the absence of
undef
, %expr
is either -1
or 1
(so it is tempting to say
yes). However, if %a
is undef
, then %expr
is undef
as well
(and thus allowed to be 0
).