Understanding Registers and Data Movement in x86-64 Assembly

A hands-on guide to general-purpose registers and data movement in x86-64

Jul 16, 2025


“In the beginning, there was a word. Then came the doubleword, and finally the quadword.”

This article is part of our series on x86-64 assembly. So far we have learned to write simple programs that can move some data around and invoke system calls. For the complete list of articles published so far in this series, check out the series overview.

Understanding Computer Organization from First Principles
Bits, memory, and the logic behind modern computing. A gentle dive into the foundations.
Binary Arithmetic and Bitwise Operations for Systems Programming
Signed numbers, two's complement, masking tricks, and bit-level manipulations that matter.
The System-Level Foundation of Assembly
How your code goes from main() to a running process, and where assembly fits in.
Building (and Breaking) Your First X86 Assembly Program
A minimal working program from scratch, with no runtime or C library. Learn by breaking it apart.
Debugging X86-64 Assembly with GDB
Hands-on debugging walkthroughs for inspecting registers, memory, and control flow.
Making System Calls in x86-64 Assembly
How to interact with the operating system directly using syscalls without a C runtime.

I’m also publishing this in the form of an ebook (PDF). If you don’t wish to upgrade to a subscription, you can purchase the PDF using the following link. If you are a paid subscriber, you can get it at a discount (monthly subs: 20% and annual subs: 50%); please email me for the discounted link.

Purchase Ebook

Introduction

Now that we've written and debugged a few x86-64 assembly programs, it's time to take a closer look at one of the most fundamental pieces of the architecture: the general-purpose registers.

Rather than throwing a table of names and sizes at you, we'll build up a mental model of how these registers evolved, starting from the 8086 and leading up to modern 64-bit hardware. That historical context makes it much easier to understand the naming conventions and relationships, so you're not constantly wondering where things like sil or r8d came from.

The article also includes hands-on exercises to help you understand how values move between registers of different sizes, and to develop an intuition for how partial registers behave. Along the way, we’ll also cover some of the edge cases and architectural quirks. These often overwhelm beginners, but I’ve tried to present them in the right context, so they’re easier to understand and less likely to trip you up.

Registers in the 16-bit Era

The x86 architecture formally began life with the 8086 processor, which was a 16-bit machine. This meant that it had 16-bit wide registers, and its instructions could operate on values up to 16 bits in size.

Four of its general-purpose registers were named after the first four letters of the alphabet: ax, bx, cx, and dx.

8-bit Register Halves

While these registers could work with 16-bit values, there was also a need to handle 8-bit data. Using bitwise masks to access just the higher or lower 8 bits would have been cumbersome and inefficient, requiring extra instructions. To solve this, the 8086 architecture introduced alternate names to refer directly to the upper and lower 8-bit halves of the 16-bit registers.

The naming was logical: replace the "x" in the 16-bit register name with "h" for the high byte or "l" for the low byte. For example, ah refers to the high 8 bits of ax, and al refers to the low 8 bits.

The following diagram shows the full set of general-purpose registers in the 8086, including how the 8-bit halves map onto the 16-bit registers:

The breakdown of 16-bit registers and their 8-bit halves in the 8086 processor

Word Size and Instruction Suffixes

Recall that our first x86-64 assembly program contained the following instruction:

movq $32, %rdi

Here, mov is the instruction, and the q suffix stands for "quadword", which in x86-64 means 64 bits.

In the AT&T syntax used in these examples, instruction suffixes indicate the operand size: 8-bit, 16-bit, 32-bit, or 64-bit. These suffixes evolved along with the architecture, and we'll explore them as we move from 16-bit to 64-bit.

If a quadword is 64 bits, you might guess that a word is 16 bits, and you'd be right. The 8086 was a 16-bit processor, and as a result its word size was also 16 bits. In computer architecture, the word size is the number of bits of data that the processor can handle in a single operation. So, the assembly instructions for the 8086 used the suffix "w" for 16-bit values.
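As a quick preview of how the suffix tracks the operand size, here is a minimal sketch in the same AT&T syntax. The l suffix and the 32-bit and 64-bit register names (ecx, rdx) have not been introduced at this point, so treat those two lines as a look ahead. The immediate values are arbitrary and chosen only to be easy to spot in a debugger.

```asm
.text
.globl _start

_start:
    # each mov writes a value whose width matches its suffix
    movb $0x11, %al            # b = byte, 8 bits
    movw $0x2222, %bx          # w = word, 16 bits
    movl $0x33333333, %ecx     # l = long (doubleword), 32 bits
    movq $0x44444444, %rdx     # q = quadword, 64 bits

    # exit with status code 0
    movq $60, %rax
    xorq %rdi, %rdi
    syscall
```

The suffix has to agree with the register size: pairing movw with an 8-bit register such as %al is rejected by the assembler.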

Hands-on Exercise: Working with 16-bit Registers

Here’s an example that writes two 16-bit values into ax and bx, computes their difference, and exits.

.text

.globl _start
_start:
    # write two 16-bit values into ax and bx
    movw $100, %ax
    movw $58, %bx

    # compute the difference: ax = ax - bx
    subw %bx, %ax

    # exit with status code 0 (60 is the exit syscall number)
    movq $60, %rax
    # xoring rdi with itself zeroes it
    xorq %rdi, %rdi
    syscall

Try running this inside GDB, and observe the values of the registers ax and bx after each instruction; after the subw, ax should hold 42 (100 - 58) while bx is unchanged. You can use the following commands to inspect them:

p (short) $ax
p (short) $bx
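If you would rather see the result without a debugger, one possible variation is to return the difference as the exit status. This is only a sketch under a couple of assumptions: it uses %di, the 16-bit low part of rdi, which the text has not introduced yet, and it relies on the shell reporting the low 8 bits of the exit status.

```asm
.text
.globl _start

_start:
    # same computation as above: ax = 100 - 58 = 42
    movw $100, %ax
    movw $58, %bx
    subw %bx, %ax

    # clear all 64 bits of rdi, then copy the result into its low 16 bits
    xorq %rdi, %rdi
    movw %ax, %di

    # load the exit syscall number last, because writing rax overwrites ax
    movq $60, %rax
    syscall
```

After assembling and linking this (for example with as and ld), running the program and then echo $? should print 42.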

Note About the xor Instruction: In the above program, xorq %rdi, %rdi zeroes out the rdi register. This is a common and efficient trick: XOR-ing a register with itself always results in zero.
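To convince yourself that this holds for any starting value, here is a small sketch that loads a non-zero value into cx and then XORs it with itself. The choice of cx and the value 0x1234 are arbitrary.

```asm
.text
.globl _start

_start:
    # load a non-zero 16-bit value
    movw $0x1234, %cx

    # every bit is XORed with itself: 0x1234 ^ 0x1234 = 0
    xorw %cx, %cx

    # exit with status code 0
    movq $60, %rax
    xorq %rdi, %rdi
    syscall
```

Stepping through this in GDB and printing $cx before and after the xorw should show the value going from 0x1234 to 0.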

Hands-on Exercise: Working with 8-bit Registers

Let’s run a small program that helps you visualize how the ah and al 8-bit halves relate to the full 16-bit ax register.

.text
.globl _start

_start:
    # write a 16-bit value 0x1234 into ax
    movw $0x1234, %ax

    # copy the high 8 bits of ax into bl
    movb %ah, %bl

    # copy the low 8 bits of ax into ch
    movb %al, %ch

    # exit
    movq $60, %rax
    xorq %rdi, %rdi
    syscall

Try this in GDB, and inspect the values of %ax, %bl, and %ch after each instruction. You should see:

%ax contains 0x1234
%ah (upper byte of ax) is 0x12 → copied to %bl
%al (lower byte of ax) is 0x34 → copied to %ch

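You can use the following commands to inspect the values of these registers. They print decimal values (for example, 0x1234 shows up as 4660); replace p with p/x to see them in hexadecimal: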

p (short) $ax
p (char) $bl
p (char) $ch
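The exercise above only reads from the 8-bit halves. As a complementary sketch, the following program writes to al and ah and shows that each write changes only its own byte of ax. The values are arbitrary.

```asm
.text
.globl _start

_start:
    # start with a known 16-bit value
    movw $0x1234, %ax      # ax = 0x1234

    # writing al replaces only the low byte
    movb $0xff, %al        # ax = 0x12ff

    # writing ah replaces only the high byte
    movb $0x00, %ah        # ax = 0x00ff

    # exit with status code 0
    movq $60, %rax
    xorq %rdi, %rdi
    syscall
```

Printing $ax with p/x after each movb in GDB should show the intermediate values noted in the comments.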