4 - Operand Size and Memory-to-Memory Restrictions
Every x86 instruction must be encoded with a definite operand size. If neither the operands nor an explicit size keyword make that size clear, the assembler cannot encode the instruction.
Additionally, the MOV instruction in x86 does not allow both operands to be memory locations, which means data must often pass through a register.
This note demonstrates both of these rules.
The Experiment
Consider the following program:
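section .bss
var1 resb 2              ; reserve 2 bytes with no initial value in the file
section .data
var2 db 6                ; one initialized byte, value 6
section .text
global _start
_start:
MOV eax, 0x1             ; syscall number 1 = sys_exit (32-bit Linux)
MOV ebx, 0x1             ; exit status; its low byte (bl) is overwritten below
MOV byte [var2], 0x45    ; store the constant as exactly one byte
MOV bl, [var2]           ; load that byte; the size is implied by bl
MOV [var1], bl           ; store bl into the first byte of var1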
INT 0x80
Problem 1: Writing a Constant to Memory
Suppose we try the following instruction:
MOV [var2], 0x45
This instruction fails to assemble.
The reason is that the operand size is ambiguous. The assembler must generate machine code that specifies the exact number of bytes being written to memory.
Possible interpretations include:
MOV byte [var2], 0x45
MOV word [var2], 0x45
MOV dword [var2], 0x45
Since the instruction itself does not indicate the size, the assembler rejects it.
Explicitly Specifying the Operand Size
To resolve the ambiguity, the operand size must be specified explicitly.
MOV byte [var2], 0x45
This tells the assembler:
Write one byte (0x45) to the address of var2
Once the size is known, the assembler can generate the correct machine instruction.
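As a minimal sketch of how the size keyword changes the effect of the same instruction, the program below writes 0x45 to one location with each of the three sizes. The label buf and its 0xFF fill bytes are hypothetical, chosen only so that the overwritten bytes are easy to see; the exit sequence assumes 32-bit Linux, as in the program above.

```nasm
section .data
buf db 0xFF, 0xFF, 0xFF, 0xFF   ; hypothetical 4-byte scratch buffer

section .text
global _start
_start:
    MOV byte  [buf], 0x45       ; writes 1 byte:  buf = 45 FF FF FF
    MOV word  [buf], 0x45       ; writes 2 bytes: buf = 45 00 FF FF
    MOV dword [buf], 0x45       ; writes 4 bytes: buf = 45 00 00 00

    MOV eax, 1                  ; sys_exit
    XOR ebx, ebx                ; exit status 0
    INT 0x80
```

x86 is little-endian, so the low byte 0x45 always lands at the lowest address; the wider forms additionally overwrite the following bytes with zeros.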
Why the Assembler Does Not Infer the Size
You might wonder:
“But var2 was declared using db, which defines a byte. Why can't the assembler infer the size from that?”
Example declaration:
var2 db 6
This directive:
- reserves one byte in memory
- assigns a label (var2) to that memory address
However, in assembly, labels are primarily treated as addresses, not typed variables like in high-level languages.
In NASM-style syntax, which this note uses, the assembler does not record the size used in a declaration: db only emits a byte and binds the label to its address. It therefore cannot infer the operand size from the symbol. When no register operand implies the size, the programmer must specify it.
This is a design choice that keeps assembly instructions explicit and predictable.
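By contrast, when one operand is a register, the register's width fixes the operand size and no keyword is needed. The sketch below illustrates this using the var2 label from this note; the extra padding byte and the choice of al and ax are assumptions made for illustration.

```nasm
section .data
var2 db 6, 0                    ; second byte added (assumption) so the 16-bit store stays in bounds

section .text
global _start
_start:
    MOV al, 0x45
    MOV [var2], al              ; 8-bit store: size implied by al
    MOV ax, 0x45
    MOV [var2], ax              ; 16-bit store: size implied by ax, even though var2 was declared with db
    MOV byte [var2], 0x45       ; immediate source: nothing implies a size, so it must be stated

    MOV eax, 1                  ; sys_exit (32-bit Linux)
    XOR ebx, ebx                ; exit status 0
    INT 0x80
```

The 16-bit store assembles without complaint, which shows that the label carries an address only and no type.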
Problem 2: Memory-to-Memory Transfers
Consider the instruction:
MOV word [var1], [var2]
This attempts to move data directly from one memory location to another.
memory → memory
The MOV instruction does not allow both operands to be memory operands.
Most x86 instructions follow the rule:
At most one operand can refer to memory
Correct Approach
Data must pass through a register.
memory → register → memory
Example:
MOV bl, [var2]
MOV [var1], bl
Step-by-step:
- Load the value at var2 into register bl
- Store the value from bl into memory at var1
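The same two-step pattern works for wider data if a wider register is used. A sketch, assuming hypothetical labels src and dst and a 16-bit value:

```nasm
section .bss
dst resb 2                      ; hypothetical 2-byte destination

section .data
src dw 0x1234                   ; hypothetical 16-bit source value

section .text
global _start
_start:
    MOV ax, [src]               ; memory -> register (16 bits, implied by ax)
    MOV [dst], ax               ; register -> memory: dst[0] = 0x34, dst[1] = 0x12

    MOV eax, 1                  ; sys_exit (32-bit Linux)
    XOR ebx, ebx                ; exit status 0
    INT 0x80
```

Because ax is 16 bits wide, both bytes of the destination are written in a single store.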
Memory Layout
var1
var1 resb 2
This reserves 2 bytes of uninitialized memory.
var2
var2 db 6
This stores a single byte with the value 6.
After the instruction:
MOV byte [var2], 0x45
Memory becomes:
var2 = 0x45
Then:
MOV bl, [var2]
MOV [var1], bl
Result:
var1[0] = 0x45
var1[1] = unchanged (not written; on Linux it is still 0, because .bss is zero-filled at load time)
Only one byte is written because the register bl is 8 bits wide.
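One way to check this result is to return the byte as the process exit status. The sketch below is the program from this note with a read-back added; the file name prog.asm and the build commands in the comments are assumptions for a Linux system with NASM and a linker that supports 32-bit ELF output.

```nasm
; Build and run (assumed toolchain and file name):
;   nasm -f elf32 prog.asm -o prog.o
;   ld -m elf_i386 prog.o -o prog
;   ./prog; echo $?             ; prints 69 (0x45)
section .bss
var1 resb 2

section .data
var2 db 6

section .text
global _start
_start:
    MOV byte [var2], 0x45       ; var2 = 0x45
    MOV bl, [var2]              ; bl = 0x45
    MOV [var1], bl              ; var1[0] = 0x45, var1[1] is not written

    MOVZX ebx, byte [var1]      ; read back var1[0], zero-extended, as the exit status
    MOV eax, 1                  ; sys_exit
    INT 0x80
```

Note that MOVZX with a memory source also needs the byte keyword, since the 32-bit destination does not determine the width of the source.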
Summary
- The assembler must know the exact operand size to encode an instruction.
- If the instruction does not make the size clear, it must be specified explicitly using byte, word, dword, or qword.
- Labels in assembly represent addresses, not typed variables.
- The MOV instruction does not support memory-to-memory operands.
- Most x86 instructions allow at most one memory operand.
- Data movement typically follows the pattern memory → register → memory.