playing with pointers

Understanding Ladner's Theorem

Thu, 07 May 2020 00:00:00 -0700

As you probably know, whether \(P = NP\) is a major unsolved problem in computer science.

Even if you believe \(P \ne NP\), it is tempting to think that NP \(=\) P \(\cup\) NP-complete – that every problem in NP can either be solved in polynomial time or is expressive enough to encode SAT.

Seemingly Impossible Turing Machines

Sat, 29 Jun 2019 00:00:00 -0700

In this post I’ll explore a concept I discovered in a very interesting writeup by Martin Escardo on Andrej Bauer’s blog. The title of this post is an homage to the title used by Martin Escardo: “Seemingly impossible functional programs”.

I have made the following changes from the original post:

Resource Variable Operations in TensorFlow/XLA

Fri, 21 Jun 2019 00:00:00 -0700

In this post we will look at how we auto-cluster resource variable operations in TensorFlow graphs into XLA computations and why it isn’t entirely trivial.

A Word About Tensors

Tensors in TensorFlow are represented as instances (surprise!) of the Tensor class. Tensor instances contain information about the shape of a tensor and a pointer to a reference counted buffer, which is an instance of TensorBuffer. Multiple instances of Tensor can point to the same TensorBuffer.
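
To make the sharing concrete, here is a minimal sketch of a handle type that carries a shape plus a pointer to a reference counted buffer. The names SketchTensor and SketchBuffer and all of their members are hypothetical stand-ins invented for this illustration; they are not TensorFlow's actual Tensor or TensorBuffer API.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reference counted buffer (not TensorFlow's TensorBuffer).
class SketchBuffer {
 public:
  explicit SketchBuffer(std::size_t n) : data_(n, 0.0f) {}

  void Ref() { refs_.fetch_add(1, std::memory_order_relaxed); }

  // Drops one reference; deletes the buffer and returns true if it was the last one.
  bool Unref() {
    assert(refs_.load() > 0);
    if (refs_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      delete this;
      return true;
    }
    return false;
  }

  int RefCount() const { return refs_.load(); }
  float* data() { return data_.data(); }

 private:
  ~SketchBuffer() = default;  // only Unref() may destroy the buffer
  std::atomic<int> refs_{1};  // the creator holds the first reference
  std::vector<float> data_;
};

// Hypothetical stand-in for a tensor handle: shape metadata plus a buffer pointer.
class SketchTensor {
 public:
  SketchTensor(std::vector<int> shape, std::size_t num_elements)
      : shape_(std::move(shape)), buf_(new SketchBuffer(num_elements)) {}

  // Copying a handle copies the shape and shares the buffer; no element data is copied.
  SketchTensor(const SketchTensor& other)
      : shape_(other.shape_), buf_(other.buf_) {
    buf_->Ref();
  }
  SketchTensor& operator=(const SketchTensor&) = delete;

  ~SketchTensor() { buf_->Unref(); }

  bool SharesBufferWith(const SketchTensor& other) const {
    return buf_ == other.buf_;
  }
  int BufferRefCount() const { return buf_->RefCount(); }
  float* data() { return buf_->data(); }
  const std::vector<int>& shape() const { return shape_; }

 private:
  std::vector<int> shape_;
  SketchBuffer* buf_;
};
```

In this sketch, copy-constructing a SketchTensor yields a second handle whose writes through data() are visible through the first, because both point at the same buffer.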

Control Flow in TensorFlow & XLA's Auto-Clustering

Sun, 07 Oct 2018 00:00:00 -0700

In this post we’ll look at an interesting issue that crops up when auto-clustering TensorFlow graphs. I’ve deliberately focused more on the problem than on the solution – the possible solutions are, in my opinion, fairly obvious once the problem is clear.

Control flow in TensorFlow

First we need a high-level overview of how control flow is represented in TensorFlow graphs.

An Issue with Java's Final Fields

Tue, 21 Mar 2017 00:00:00 -0700

I believe the current specification of final fields in the Java Memory Model is broken in one of the following ways:

It prevents some basic CSE-type compiler optimizations
It requires the JVM to make every load an acquire load
It complicates compiler IR by forcing it to track syntactic dependencies
It requires weakening the JMM in a backward incompatible way

While this isn’t exactly news (I have been told that the wording around final fields in the JMM is known to be problematic), I could not easily find an explicit record of this issue anywhere else on the internet. Hence this blog post.

Integer overflow in LLVM's ScalarEvolution

Sun, 18 Sep 2016 00:00:00 -0700

This is a short note on how integer overflow fits in with LLVM’s ScalarEvolution. This post is specific to LLVM’s implementation of ScalarEvolution, and I’ve assumed some familiarity with LLVM internals and integer arithmetic.

ScalarEvolution and add recurrences

ScalarEvolution is an analysis in LLVM¹ that helps its clients reason about induction variables. It does this by mapping² SSA values into objects of a SCEV type, and implementing an algebra on top of it. Using this algebra, clients of ScalarEvolution can ask questions like “is A always signed-less-than B?” or “is the difference between A and B a constant integer?”, where A and B are objects of the SCEV data type.
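
As a rough illustration of why overflow matters here, the sketch below evaluates an 8-bit add recurrence, conventionally written {start,+,step}, whose value on iteration i is start + i * step computed modulo 2^8. The AddRec8 type and the EvaluateAt function are hypothetical names made up for this example and are not part of LLVM's ScalarEvolution API.

```cpp
#include <cstdint>

// Hypothetical model of an 8-bit add recurrence {start,+,step}.
struct AddRec8 {
  std::uint8_t start;
  std::uint8_t step;
};

// Value of the recurrence on iteration i, wrapping modulo 2^8.
// The arithmetic is done in unsigned 32 bits (well-defined wraparound)
// and then truncated to 8 bits, which gives the same result modulo 2^8.
inline std::uint8_t EvaluateAt(AddRec8 r, std::uint32_t i) {
  std::uint32_t wide = static_cast<std::uint32_t>(r.start) +
                       static_cast<std::uint32_t>(r.step) * i;
  return static_cast<std::uint8_t>(wide);
}
```

For {250,+,3} the values on iterations 0, 1, 2 are 250, 253 and 0: the recurrence wraps on the third iteration, so a question like "is the value on iteration i always greater than the value on iteration i - 1?" cannot be answered by looking only at the sign of the step.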

Inter-Procedural Optimization and Derefinement

Sat, 16 Jul 2016 00:00:00 -0700

This is a summary of an issue that was semi-recently fixed in LLVM and GCC. It merits a blog post because the issue is somewhat subtle, and a central place to refer to can be helpful.

Setting the Stage

In this post we will focus on C++ inline functions. The problem described here may apply to other cases as well, but we won’t focus on those.

Check Widening in LLVM

Fri, 17 Jun 2016 00:00:00 -0700

This post describes infrastructure that has gone into LLVM piecemeal over the last couple of months. All of the information in this post is scattered throughout commit messages on llvm-commits and email threads on llvm-dev. This post is intended to present a coherent story for people not actively involved in the original discussions and without the spare time to stitch together the big picture from individual commits and emails.

Motivation: Checks in Managed Languages

In “safe” languages like Java, it is the virtual machine’s job to ensure that illegal operations (like dereferencing bad memory or unsound type coercions) do not lead the program into arbitrarily bad states. This is typically enforced by adding runtime checks to certain operations to detect violations. Field accesses elicit a null check, array loads and stores have a range check (and in some cases a type check), type casts check that the cast is well-formed, etc. For this post we’ll focus on range checks only, but the general idea is applicable to any kind of runtime check.
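
The range check described above can be sketched in C++ as follows. CheckedLoad and SumFirstN are hypothetical helpers written for this illustration; a real virtual machine would emit the check inline in compiled code and raise its own exception type rather than std::out_of_range.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: conceptually what a managed runtime does for the load a[i].
inline int CheckedLoad(const std::vector<int>& a, std::ptrdiff_t i) {
  // Range check: the index must satisfy 0 <= i < a.size().
  if (i < 0 || static_cast<std::size_t>(i) >= a.size()) {
    throw std::out_of_range("array index out of bounds");
  }
  return a[static_cast<std::size_t>(i)];
}

// A loop over a checked array pays for one range check per iteration.
inline int SumFirstN(const std::vector<int>& a, std::ptrdiff_t n) {
  int sum = 0;
  for (std::ptrdiff_t i = 0; i < n; ++i) {
    sum += CheckedLoad(a, i);
  }
  return sum;
}
```

Calling SumFirstN with n larger than the array length throws on the first out-of-range index instead of reading past the end of the buffer, which is the "does not reach an arbitrarily bad state" guarantee in miniature.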

A Problem with LLVM's Undef

Tue, 29 Dec 2015 00:00:00 -0800

LLVM has a special value in its SSA value hierarchy called undef that is used to model (amongst other things) reads from uninitialized memory. Semantically, an undef value has a potentially new bit pattern of the compiler’s choosing at each use site, meaning that values like xor i32 %a, %a need not always evaluate to 0 when %a is undef (even though they’re allowed to). This lack of consistency lets LLVM get away without allocating registers to remember a specific “version” of undef.

Another way to look at this is that undef isn’t a normal SSA value, and uses of an undef value are also its defs. This leads to some interesting restrictions on data flow analysis via control flow, and, in some cases, accounting for undef inhibits optimization instead of enabling it.

Reference Counting: Harder than it Sounds

Wed, 14 Oct 2015 00:00:00 -0700

Naive reference counting is “easy” to implement on a system that does not share objects between threads. In systems that do share objects between threads, however, two problems (other than the standard “increments and decrements need to be atomic operations”) come to mind. So far the contents of this post (which are not novel) have lived in one-off tweets and emails, but I think it is time to write them down in an organized way.

Problem 0: Stores to the Heap need to be XCHGes

(Edit: I initially had a few mistakes here – I’d claimed that the stores need to be CAS’es when an XCHG would be sufficient. The order between the increment and the decrement was also incorrect. Thanks @barrkel for pointing these out!)
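
A minimal sketch of a store that follows the heading above, using an atomic exchange. The Obj type and the Retain, Release and StoreToSlot helpers are hypothetical names invented for this example, and the ordering shown (increment the new value first, exchange, then decrement the displaced value) is one reading of the edit note rather than code taken from the post.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference counted object.
struct Obj {
  std::atomic<int> refs{1};  // the creator holds the first reference
  int payload = 0;
};

inline void Retain(Obj* o) {
  if (o != nullptr) o->refs.fetch_add(1, std::memory_order_relaxed);
}

// Drops one reference; deletes the object and returns true if it was the last one.
inline bool Release(Obj* o) {
  if (o == nullptr) return false;
  assert(o->refs.load() > 0);
  if (o->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
    delete o;
    return true;
  }
  return false;
}

// Stores `value` into a heap slot that other threads may also write.
inline void StoreToSlot(std::atomic<Obj*>& slot, Obj* value) {
  // Take the slot's reference before the object becomes reachable through the slot.
  Retain(value);
  // The exchange hands each displaced pointer to exactly one writer.
  Obj* old = slot.exchange(value, std::memory_order_acq_rel);
  // Only now drop the reference the slot held on the displaced object.
  Release(old);
}
```

With a separate load followed by a plain store, two racing writers could both read the same old pointer and both release it, while the value written by the losing writer is overwritten without ever being released. The exchange rules that out because every displaced pointer is returned to exactly one caller.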