Zachtronics’ new assembly game EXAPUNKS was released this week. This post covers some optimization techniques I discovered during play. This post contains spoilers but keep reading because this game is pretty boring by virtue of being similar to real programming.

First, some general guidelines for new players.

There are 3 metrics: speed, code size and exas.
You can only optimize one metric at a time.
Most optimizations are boilerplates(covered below).
Solve all the puzzles first. Optimization is a different game in itself. I solved all the puzzles before starting optimization.
When you are optimizing a loop, use all 32 lines.

Technique 0: Omit DROP and HALT

You don’t need DROP or HALT on the last line. It’s kind of like how in Linux you don’t call free() right before program terminates.

Example: the first program.

Technique 1: ALU register output

Consider this program:

LINK 800
GRAB 200
COPY F X
ADDI X F X
MULI X F X
SUBI X F X
COPY X F
LINK 800

Instead of saving the output of the ALU instructions to X register, just output to the final register F.

LINK 800
GRAB 200
ADDI F F X
MULI X F X
SUBI X F F
LINK 800

Technique 2: Unroll Loops

This has nothing to do with SIMD. The goal is to minimize branch penalty. The idea is to make the loop body fatter so there are more action for every TEST instruction. It gets repetitive pretty fast but it’s unavoidable if you want to be on the leaderboard. Maybe one day Zachtronics will come up with a new scoring mechanism to circumvent this.

Note: before unwinding loops, try and eliminate as many instructions as possible.

Example 1:

This is the tutorial puzzle where you write n down to 0. Here is an optimized version.

LINK 800
GRAB 200
COPY F X
WIPE
LINK 800
MAKE

MARK LOOP
DIVI X 13 T
FJMP DONE
COPY X F
@rep 13
SUBI X @{1,1} F
@end
JUMP LOOP

MARK DONE
COPY X F
@rep 11
MODI -1 X X
COPY X F
@end

Note:

@rep 13
SUBI X @{1,1} F
@end

expands to

SUBI X 1 F
SUBI X 2 F
SUBI X 3 F
SUBI X 4 F
SUBI X 5 F
SUBI X 6 F
SUBI X 7 F
...
SUBI X 13 F

Basically, you pack the loop as tightly as you can(13 items) and then handle the remainder(12 items).

Example 2, highway sign:

Naive version:

sol1

GRAB 300
LINK 800
MARK LOOP
DIVI X 9 #DATA
MODI X 9 #DATA
COPY F #DATA
ADDI X 1 X
TEST X = 27
TJMP END
JUMP LOOP
MARK END
WIPE

Unrolled version:

sol1

Technique 3: Quick halt

MODI packs subtraction and halting into one. It either subtracts 1 or throws a division by 0 runtime error equivalent to HALT.

Here is the formula for modulo:

let a = -1 in
r = a % n
  = a - n * ⌊a / n⌋
  = n - 1

There are more in the runtime error section in the manual but MODI is the most useful one.

Technique 4: Use `T` as Counter

TEST sets T to 1 if positive. So a common practice is to use T as a counter. When T = 0 the loop will exit.

COPY 8 T
MARK LOOP

NOOP
SUBI T 1 T

TJMP LOOP
NOOP

Alternatively, do something N times then die.

COPY <N> T
MARK LOOP
NOOP
MOD -1 X X
JUMP LOOP

Technique 5: Eliminate idle nodes

When a node is idling, ie. blocking read from channel, queuing to cross a link, the cycle counter is still ticking. So try REPLing in a vacant host and fan out work load.

Example:

sol1

Technique 6: Reorder control flow to eliminate jumps

These are context specific. The hot path should be the one immediately after TEST to save a JUMP.

Conclusion

In this post, I covered some common EXAPUNKS optimization techniques. If you have more please let me know!

Technique 0: Omit DROP and HALT

Technique 1: ALU register output

Technique 2: Unroll Loops

Example 1:

Example 2, highway sign:

Technique 3: Quick halt

Technique 4: Use T as Counter

Technique 5: Eliminate idle nodes

Technique 6: Reorder control flow to eliminate jumps

Conclusion

Technique 4: Use `T` as Counter