PyCCA Assembler

The core of pycca is an x86 assembly compiler that allows the creation and execution of machine code for IA-32 and Intel-64 architectures. Its output is tested to be identical to that of the GNU assembler for all supported instructions.

Assembly language

PyCCA’s assembler uses a syntax and instruction mnemonics very similar to the Intel / NASM assembly syntax. Instructions consist of a mnemonic (instruction name) followed by whitespace and a number of comma-separated operands:

label:         # Comments follow a hash
   push ebp
   sub esp, 32
   mov eax, dword ptr [edx + ecx*8 + 12]
   jmp label

Note: Many assembler examples found on the internet use the AT&T syntax, which prefixes register names with ‘%’ and reverses the order of operands (the Intel syntax puts the destination operand first; AT&T puts the source operand first).

Operands may be one of four types:

  • The name of a register (see all registers).
  • An “immediate” integer data value. These may be signed or unsigned and are evaluated as Python expressions, so “0xFF” and “0b1101” are also accepted syntaxes.
  • The name of a label declared elsewhere in the code (these are ultimately compiled as immediate values pointing to the address of the label declaration).
  • A pointer to data in memory. Pointers provide both a memory address and the size of data they point to.
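
The four operand kinds above can be told apart with a few string checks. The sketch below is only an illustration and is not pycca's parser: the function `classify_operand`, the small `REGISTERS` set, and the use of `eval` with builtins disabled are all assumptions made for this example.

```python
import re

# A small, hypothetical subset of register names, for illustration only.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esp", "ebp", "rax", "rbx", "rcx", "rdx"}

def classify_operand(text):
    """Return (kind, value) for one operand string."""
    text = text.strip()
    if text in REGISTERS:
        return ("register", text)
    # Pointers are written in square brackets, optionally preceded by a size.
    match = re.fullmatch(r"(?:(byte|word|dword|qword)\s+ptr\s+)?\[(.+)\]", text)
    if match:
        return ("pointer", (match.group(1), match.group(2).strip()))
    try:
        # Immediates are treated as Python expressions, so 0xFF and 0b1101 work.
        value = eval(text, {"__builtins__": {}}, {})
    except Exception:
        value = None
    if isinstance(value, int):
        return ("immediate", value)
    # Anything that looks like an identifier is assumed to be a label.
    if re.fullmatch(r"[A-Za-z_]\w*", text):
        return ("label", text)
    raise ValueError("unrecognized operand: %r" % text)

print(classify_operand("dword ptr [edx + ecx*8 + 12]"))
```

A real assembler would resolve labels against the symbols declared in the code; here any bare identifier that is not a register is simply reported as a label.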

In x86, memory addresses are specified as the sum of a base register, a scaled offset register, and an integer displacement:

address = base + offset*scale + displacement

Where scale may be 1, 2, 4, or 8, and addresses may contain any combination of these three elements. Memory operands are written with square brackets surrounding the address expression. For example:

Memory operand           Description
[rax]                    Pointer to address stored in register rax
[eax + ebx*2]            Address calculated as eax + ebx*2
[0x1000]                 Pointer to address 0x1000
[rax + rbx + 8]          Address calculated as rax + rbx + 8
word ptr [rbp - 0x10]    Pointer to 2 byte data beginning at rbp - 0x10
qword ptr [eax]          Pointer to 8 byte data beginning at eax

On 64 bit architectures, it is only valid to use 64 or 32 bit registers for memory addresses. On 32 bit architectures, it is only valid to use 32 or 16 bit registers for memory addresses. Note: the allowed expression forms for 16 bit addresses are very limited and are not covered here.
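
The address formula can be checked by hand with a few lines of Python. This is a minimal sketch that works on plain integers standing in for register contents; the function name `effective_address` is made up for the example.

```python
def effective_address(base=0, offset=0, scale=1, displacement=0):
    """Compute base + offset*scale + displacement from register values."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return base + offset * scale + displacement

# [edx + ecx*8 + 12] with edx = 0x1000 and ecx = 3
addr = effective_address(base=0x1000, offset=3, scale=8, displacement=12)
print(hex(addr))  # 0x1024
```

Because every element is optional, a bare displacement such as [0x1000] corresponds to calling the function with only `displacement` set.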

Building assembly from Python objects

It is also possible to write assembly code as a list of Instruction instances. This has the advantage of avoiding the parsing stage and facilitating dynamically-generated assembly code:

from pycca.asm import *

# func_ptr and ret_bytes are placeholders that must be defined
# before this list is built.
code = [
    label('start'),
    push(ebp),
    mov(ebp, esp),
    push(dword([ebp+12])),
    push(dword([ebp+8])),
    mov(eax, func_ptr),
    call(eax),
    mov(esp, ebp),
    pop(ebp),
    ret(ret_bytes),
    jmp('start'),
]

Thanks to similarities in the NASM and Python syntaxes, there are only minor differences in this approach:

  • Instructions are classes and thus require parentheses to instantiate.
  • Use pointer size functions like dword(address) instead of dword ptr.
  • Labels are also objects and they are referenced by their string name (see label and jmp lines above).
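
To give a feel for dynamically generated code without depending on pycca itself, the sketch below builds assembly text with an ordinary loop. The helper `push_args` and its parameters are hypothetical; with pycca the same loop could append instruction objects such as push(dword([ebp+12])) to a list instead of strings.

```python
def push_args(n_args, word_size=4, first_offset=8):
    """Return assembly lines that push n_args stack arguments, last one first.

    Assumes a 32-bit frame in which arguments start at [ebp + first_offset].
    """
    lines = []
    for i in reversed(range(n_args)):
        lines.append("push dword ptr [ebp + %d]" % (first_offset + i * word_size))
    return lines

# Prologue followed by the generated pushes, joined into one assembly string.
asm = "\n".join(["push ebp", "mov ebp, esp"] + push_args(2))
print(asm)
```

For two arguments this reproduces the two push lines of the example above; changing `n_args` changes the amount of generated code without editing any assembly by hand.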

Compiling Python functions from assembly

Executing this code is only a matter of compiling it into a ctypes function and providing the return and argument types:

import ctypes
from pycca.asm import mkfunction

func = mkfunction(code)
func.restype = ctypes.c_double
func.argtypes = (ctypes.c_double,)

result = func(3.1415)

For more examples of building and calling functions, accessing array data, and more, see asm_examples.py. For lists of supported instructions and registers, see the Assembly API Reference.

Differences with GNU-AS

PyCCA’s assembly is closely modeled after the Intel assembler syntax and is tested to produce identical output to the GNU assembler using the “.intel-mnemonic” directive. There are a few differences, however:

  • GAS quietly ignores undefined symbols, treating them as null pointers; pycca will raise an exception.
  • GAS quietly truncates displacement values; pycca will raise an exception if the displacement is too large to be encoded.
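
The second difference can be illustrated with a small range check. This is a sketch of the general idea, not pycca's encoder: x86 displacements are stored as either one signed byte or four signed bytes, and the hypothetical helper `encode_displacement` refuses anything larger instead of truncating it.

```python
import struct

def encode_displacement(disp):
    """Pack a displacement as 1 signed byte if it fits, else 4 bytes (little-endian)."""
    if -0x80 <= disp <= 0x7F:
        return struct.pack("<b", disp)
    if -0x80000000 <= disp <= 0x7FFFFFFF:
        return struct.pack("<i", disp)
    # Raise rather than silently dropping the high bits.
    raise ValueError("displacement %d does not fit in 32 bits" % disp)

print(encode_displacement(12))      # one byte
print(encode_displacement(0x1000))  # four bytes
```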

Adding support for new instructions

Although pycca currently supports only a small subset of the x86 instruction set, it is relatively simple to add support for new instructions by transcribing the instruction encoding from the Intel reference (see volume 2) <http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html>. GitHub pull requests adding new instruction support are encouraged, but should be accompanied by adequate documentation and unit tests.

To add new instructions:

  • Look through pycca/asm/instructions.py for examples of already implemented instructions.

  • Create a new subclass of Instruction and set the name class attribute to the instruction mnemonic.

  • The class docstrings are generally copied from the first paragraph or two from the instruction description in the Intel reference, with minor modifications.

  • The modes attribute should be set to an OrderedDict containing data found in the instruction encoding table:

    • Keys must be tuples describing the operand types (r32, r/m64, imm8, etc.) accepted for each encoding.
    • Values must be a sequence of 4 items: [instruction encoding, operand encoding, 64-bit support, and 32-bit support]. These values are usually copied verbatim from the reference manual (but note that the manual is often inconsistent or contains errors; when in doubt refer to the already implemented instructions for examples).
  • The operand_enc attribute must be a dict containing information found in the operand encoding table for that instruction in the Intel reference. Keys are the same strings as found in modes[][1], and each value is a list of encoding strings for each operand. These strings are usually copied verbatim from the reference, but again this representation is not always consistent (in fact some instructions lack an operand encoding table altogether).

  • Add a new test function to pycca/asm/test_asm.py, using other instructions as examples. Each mode in the modes attribute should be tested at least once.

  • For debugging, make use of tools in pycca.asm.util, especially the compare, as_code, and phexbin functions.
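
The layout of modes and operand_enc described in the steps above might look roughly like the following. This is a standalone sketch that does not import pycca: the class `IncSketch` and the helpers `check_tables` and `modes_for` are hypothetical, and the exact spelling of the encoding strings and the use of booleans for the support flags are assumptions. The opcodes themselves follow the INC entry of the Intel reference.

```python
from collections import OrderedDict

class IncSketch:
    """Stand-in for an Instruction subclass; illustrative only."""
    name = "inc"
    # operand types -> [instruction encoding, operand encoding, 64-bit, 32-bit]
    modes = OrderedDict([
        (("r/m8",),  ["fe /0", "m", True, True]),
        (("r/m16",), ["ff /0", "m", True, True]),
        (("r/m32",), ["ff /0", "m", True, True]),
        (("r/m64",), ["rex.w + ff /0", "m", True, False]),
    ])
    # operand encoding key -> one encoding string per operand
    operand_enc = {"m": ["ModRM:r/m (r, w)"]}

def check_tables(cls):
    """Verify that every mode names a known operand encoding of matching length."""
    for operands, (opcode, enc_key, ok64, ok32) in cls.modes.items():
        assert enc_key in cls.operand_enc, enc_key
        assert len(cls.operand_enc[enc_key]) == len(operands)
    return True

def modes_for(cls, bits):
    """List the operand signatures usable on a 32- or 64-bit target."""
    index = 2 if bits == 64 else 3
    return [ops for ops, mode in cls.modes.items() if mode[index]]

check_tables(IncSketch)
print(modes_for(IncSketch, 32))
```

A consistency check of this kind is cheap to run and catches the most common transcription slip, a mode that refers to an operand encoding key that was never defined.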

Note that advanced CPU extensions such as SSE2 and AVX are not yet supported.

Assembly API Reference

Building executable code

pycca.asm.mkfunction(code, namespace=None)

Convenience function that creates a CodePage from the supplied code argument and returns a function pointing to the first byte of the compiled code.

See CodePage.get_function().

class pycca.asm.CodePage(asm, namespace=None)

Compiles assembly, loads machine code into executable memory, and generates python functions for accessing the code.

Initialize with either an assembly string or a list of Instruction instances. The namespace argument may be used to define extra symbols when compiling from an assembly string.

This class encapsulates a block of executable mapped memory to which a sequence of asm commands are compiled and written. The memory page(s) may contain multiple functions; use get_function(label) to create functions beginning at a specific location in the code.

dump()

Return a string representation of the machine code and assembly instructions contained in the code page.

get_function(label=None)

Create and return a python function that points to a specific label within the compiled code block, or the first byte if no label is given.

The return value is a ctypes function; it is recommended to set the restype and argtypes properties on the function before calling it.
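
The reason for this recommendation is that ctypes assumes an int return value unless told otherwise. The sketch below shows typed calling with nothing but the standard library: a Python callback wrapped by `ctypes.CFUNCTYPE` stands in for compiled machine code, so the names `PROTOTYPE` and `double_it` are purely illustrative.

```python
import ctypes

# Prototype: returns a C double and takes one C double argument.
PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

# A Python callback stands in for code that pycca would have compiled.
double_it = PROTOTYPE(lambda x: 2.0 * x)

# The argument is converted to a C double on the way in, and the
# C double result is converted back to a Python float on the way out.
result = double_it(3.1415)
print(result)
```

With a function returned by get_function() the same information is supplied by assigning restype and argtypes, as in the mkfunction example earlier.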

For more examples of building and calling functions, accessing array data, and more, see asm_examples.py.

Supported registers

All registers may be accessed as attributes of the pycca.asm or pycca.asm.register modules.

General purpose registers:

arch 32 / 64 64 only
size 8 16 32 64 8 16 32 64
  al ax eax rax r8b r8w r8d r8
  cl cx ecx rcx r9b r9w r9d r9
  dl dx edx rdx r10b r10w r10d r10
  bl bx ebx rbx r11b r11w r11d r11
  ah sp esp rsp r12b r12w r12d r12
  ch bp ebp rbp r13b r13w r13d r13
  dh si esi rsi r14b r14w r14d r14
  bh di edi rdi r15b r15w r15d r15

Floating-point registers:

arch 32 / 64
size 80 64 128
  st(0) mm0 xmm0
  st(1) mm1 xmm1
  st(2) mm2 xmm2
  st(3) mm3 xmm3
  st(4) mm4 xmm4
  st(5) mm5 xmm5
  st(6) mm6 xmm6
  st(7) mm7 xmm7

class pycca.asm.register.Register(val, name, bits)

General purpose register.

bits

Register size in bits

check_arch()

Raise an exception if this register is not supported for the current architecture.

name

Register name

rex

Bool indicating value of 4th bit of register code

val

3-bit integer code for this register.
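
The relationship between val and rex can be sketched as follows, assuming the usual x86-64 numbering in which rax is register 0 and r8 to r15 are registers 8 to 15. The names `RegisterCode` and `split_register_number` are made up for this example and are not part of pycca.

```python
from collections import namedtuple

RegisterCode = namedtuple("RegisterCode", ["val", "rex"])

def split_register_number(number):
    """Split a 4-bit register number into its low 3 bits and the extension bit."""
    if not 0 <= number <= 15:
        raise ValueError("register number must be in 0..15")
    return RegisterCode(val=number & 0b111, rex=bool(number & 0b1000))

print(split_register_number(0))   # rax: RegisterCode(val=0, rex=False)
print(split_register_number(12))  # r12: RegisterCode(val=4, rex=True)
```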

Supported instructions

All instructions currently supported by pycca are listed below. Most instructions accept a variety of operand types which are listed in a table for each instruction:

Operand code Operand type
r general purpose register
r/m register or memory
imm immediate value
st(i) x87 ST register
xmmI xmm register

Operand codes are followed by one or more values indicating the allowed size(s) for the operand. For example, the push instruction gives the following table:

src 32-bit 64-bit description
r/m8 X X Push src onto stack
r/m16 X X  
r/m32 X    
r/m64   X  
imm8/32 X X  

This table indicates that push accepts one operand src that may be a general purpose register, a memory address, or an immediate value. The allowed operand sizes depend on the target architecture (for this instruction, 64 bit memory/register operands are not allowed on 32 bit architectures and vice-versa).
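
Reading such a table programmatically amounts to a dictionary lookup. The sketch below copies the rows of the push table into a plain dict; `PUSH_MODES` and `push_accepts` are hypothetical names used only for illustration.

```python
# Rows copied from the push table above: operand code -> (32-bit, 64-bit).
PUSH_MODES = {
    "r/m8": (True, True),
    "r/m16": (True, True),
    "r/m32": (True, False),
    "r/m64": (False, True),
    "imm8/32": (True, True),
}

def push_accepts(operand_code, bits):
    """Return True if the table marks operand_code as valid for the target size."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    ok32, ok64 = PUSH_MODES[operand_code]
    return ok32 if bits == 32 else ok64

print(push_accepts("r/m64", 32))  # False
print(push_accepts("r/m64", 64))  # True
```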

class pycca.asm.instructions.add(dst, src)

Adds the destination operand (first operand) and the source operand (second operand) and then stores the result in the destination operand.

The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.
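
Sign extension of an immediate can be reproduced with integer arithmetic. The helper `sign_extend` below is a minimal sketch written for this page; it returns the extended value as an unsigned bit pattern of the destination width.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to an unsigned to_bits pattern."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        # The sign bit is set, so the value is negative in two's complement.
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

print(hex(sign_extend(0xFF, 8, 32)))  # 0xffffffff
print(hex(sign_extend(0x7F, 8, 32)))  # 0x7f
```

This is why adding the 8-bit immediate 0xFF to a 32-bit register subtracts 1 rather than adding 255.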

dst src 32-bit 64-bit description
r/m8 r/m8, imm8 X X dst += src
r/m16 r/m16, imm8/16 X X  
r/m32 r/m32, imm8/32 X X  
r/m64 r/m64, imm8/32   X  

class pycca.asm.instructions.call(addr)

Saves procedure linking information on the stack and branches to the called procedure specified using the target operand.

The target operand specifies the address of the first instruction in the called procedure. The operand can be an immediate value, a general-purpose register, or a memory location.

dst 32-bit 64-bit description
rel32 X X Call address relative to this instruction
r/m16 X   Call absolute address stored at r/m16/32/64
r/m32 X    
r/m64   X  

class pycca.asm.instructions.cmp(src1, src2)

Compares the first source operand with the second source operand and sets the status flags in the EFLAGS register according to the results.

The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as an operand, it is sign-extended to the length of the first operand.

src1 src2 32-bit 64-bit description
r/m8 r/m8, imm8 X X  
r/m16 r/m16, imm8/16 X X  
r/m32 r/m32, imm8/32 X X  
r/m64 r/m64, imm8/32   X  

class pycca.asm.instructions.dec(dst)

Subtracts 1 from the destination operand, while preserving the state of the CF flag.

The destination operand can be a register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag. (To perform a decrement operation that updates the CF flag, use a SUB instruction with an immediate operand of 1.)

dst 32-bit 64-bit description
r/m8 X X dst -= 1
r/m16 X X  
r/m32 X X  
r/m64   X  

class pycca.asm.instructions.fabs

Clears the sign bit of ST(0) to create the absolute value of the operand. Accepts no operands.

class pycca.asm.instructions.fadd(*args)

Adds the destination and source operands and stores the sum in the destination location.

The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format.

dst src 32-bit 64-bit description
st(j) st(i) X X dst += src (at least one operand must be st(0))

class pycca.asm.instructions.faddp(*args)

Adds the destination and source operands and stores the sum in the destination location.

The FADDP instructions perform the additional operation of popping the FPU register stack after storing the result. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.

dst src 32-bit 64-bit description
st(i) st(0) X X dst += st(0), pop st(0) from FP stack
(no operands) X X st(1) += st(0), pop st(0) from FP stack

class pycca.asm.instructions.fcomi(src1, src2)

Performs an unordered comparison of the contents of registers ST(0) and ST(i) and sets the status flags ZF, PF, and CF in the EFLAGS register according to the results (see the table below). The sign of zero is ignored for comparisons, so that -0.0 is equal to +0.0.

Comparison ZF PF CF
st(0) > st(i) 0 0 0
st(0) < st(i) 0 0 1
st(0) = st(i) 1 0 0
unordered 1 1 1

src1 src2 32-bit 64-bit description
st(0) st(i) X X  
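
The flag table above can be expressed as a small function. This sketch models the comparison with Python floats; `fcomi_flags` is a hypothetical helper, and NaN is used to stand for the unordered case.

```python
import math

def fcomi_flags(st0, sti):
    """Return (ZF, PF, CF) following the comparison table above."""
    if math.isnan(st0) or math.isnan(sti):
        return (1, 1, 1)  # unordered
    if st0 > sti:
        return (0, 0, 0)
    if st0 < sti:
        return (0, 0, 1)
    return (1, 0, 0)      # equal; -0.0 compares equal to +0.0

print(fcomi_flags(1.0, 2.0))           # (0, 0, 1)
print(fcomi_flags(float("nan"), 2.0))  # (1, 1, 1)
```

These are the same flags that an unsigned integer comparison sets, which is why the ja, jb, and je family of jumps is used after fcomi.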
class pycca.asm.instructions.fcomip(src1, src2)

The FCOMIP instruction is similar to FCOMI but also pops the register stack following the comparison operation. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.

src1 src2 32-bit 64-bit description
st(0) st(i) X X  

class pycca.asm.instructions.fdiv(*args)

Divides the destination operand by the source operand and stores the result in the destination location. The destination operand (dividend) is always in an FPU register; the source operand (divisor) can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format.

dst src 32-bit 64-bit description
st(j) st(i) X X dst /= src (at least one operand must be st(0))

class pycca.asm.instructions.fdivp(*args)

The FDIVP instructions are similar to FDIV but perform the additional operation of popping the FPU register stack after storing the result. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.

dst src 32-bit 64-bit description
st(i) st(0) X X dst /= st(0), pop st(0) from FP stack
(no operands) X X st(1) /= st(0), pop st(0) from FP stack

class pycca.asm.instructions.fiadd(src)

The FIADD instructions are similar to FADD, but convert an integer source operand to double extended-precision floating-point format before performing the addition.

src 32-bit 64-bit description
m32 X X ST(0) += src
m64 X X  

class pycca.asm.instructions.fidiv(src)

Divides the destination operand by the source operand and stores the result in the destination location. The destination operand (dividend) is always in an FPU register; the source operand (divisor) can be a register or a memory location. Source operands in memory can be in word or doubleword integer format.

src 32-bit 64-bit description
m32 X X ST(0) /= src
m64 X X  

class pycca.asm.instructions.fild(src)

Converts the signed-integer source operand into double extended-precision floating-point format and pushes the value onto the FPU register stack.

The source operand can be a word, doubleword, or quadword integer. It is loaded without rounding errors. The sign of the source operand is preserved.

src 32-bit 64-bit description
m16 X X Push 16 bit int at src onto FPU stack
m32 X X Push 32 bit int at src onto FPU stack
m64 X X Push 64 bit int at src onto FPU stack

class pycca.asm.instructions.fimul(src)

Multiplies the destination and source operands and stores the product in the destination location. The destination operand is always an FPU data register; the source operand can be an FPU data register or a memory location. Source operands in memory can be in word or doubleword integer format.

src 32-bit 64-bit description
m32 X X ST(0) *= src
m64 X X  

class pycca.asm.instructions.fist(dst)

The FIST instruction converts the value in the ST(0) register to a signed integer and stores the result in the destination operand.

Values can be stored in word or doubleword integer format. The destination operand specifies the address where the first byte of the destination value is to be stored.

dst 32-bit 64-bit description
m16 X X Store ST(0) to 16 bit signed int at dst
m32 X X Store ST(0) to 32 bit signed int at dst

class pycca.asm.instructions.fistp(src)

The FISTP instruction performs the same operation as the FIST instruction and then pops the register stack.

To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The FISTP instruction also stores values in quadword integer format.

dst 32-bit 64-bit description
m16 X X Store ST(0) to 16 bit signed int at dst
m32 X X Store ST(0) to 32 bit signed int at dst
m64 X X Store ST(0) to 64 bit signed int at dst

class pycca.asm.instructions.fisub(src)

Subtracts the source operand from the destination operand and stores the difference in the destination location. The destination operand is always an FPU data register; the source operand can be a register or a memory location. Source operands in memory can be in word or doubleword integer format.

src 32-bit 64-bit description
m32 X X ST(0) -= src
m64 X X  

class pycca.asm.instructions.fld(src)

Pushes the source operand onto the FPU register stack.

The source operand can be in single-precision, double-precision, or double extended-precision floating-point format. If the source operand is in single-precision or double-precision floating-point format, it is automatically converted to the double extended-precision floating-point format before being pushed on the stack.

src 32-bit 64-bit description
m32 X X Push 32 bit float at src onto FPU stack
m64 X X Push 64 bit float at src onto FPU stack
m80 X X Push 80 bit float at src onto FPU stack
ST(i) X X Push float at ST(i) onto FPU stack

class pycca.asm.instructions.fmul(*args)

Multiplies the destination and source operands and stores the product in the destination location. The destination operand is always an FPU data register; the source operand can be an FPU data register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format.

dst src 32-bit 64-bit description
st(j) st(i) X X dst *= src (at least one operand must be st(0))

class pycca.asm.instructions.fmulp(*args)

The FMULP instructions are similar to FMUL but perform the additional operation of popping the FPU register stack after storing the product. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.

dst src 32-bit 64-bit description
st(i) st(0) X X dst *= st(0), pop st(0) from FP stack
(no operands) X X st(1) *= st(0), pop st(0) from FP stack

class pycca.asm.instructions.fst(dst)

The FST instruction copies the value in the ST(0) register to the destination operand, which can be a memory location or another register in the FPU register stack.

When storing the value in memory, the value is converted to single-precision or double-precision floating-point format.

dst 32-bit 64-bit description
m32 X X Store ST(0) to 32 bit float at dst
m64 X X Store ST(0) to 64 bit float at dst
m80 X X Store ST(0) to 80 bit float at dst
ST(i)   X Store ST(0) to ST(i)

class pycca.asm.instructions.fstp(dst)

The FSTP instruction performs the same operation as the FST instruction and then pops the register stack.

To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The FSTP instruction can also store values in memory in double extended-precision floating-point format.

dst 32-bit 64-bit description
m32 X X Pop ST(0) to 32 bit float at dst
m64 X X Pop ST(0) to 64 bit float at dst
m80 X X Pop ST(0) to 80 bit float at dst
ST(i)   X Pop ST(0) to ST(i)

class pycca.asm.instructions.fsub(*args)

Subtracts the source operand from the destination operand and stores the difference in the destination location. The destination operand is always an FPU data register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format.

dst src 32-bit 64-bit description
st(j) st(i) X X dst -= src (at least one operand must be st(0))

class pycca.asm.instructions.fsubp(*args)

The FSUBP instructions are similar to FSUB but perform the additional operation of popping the FPU register stack following the subtraction. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.

dst src 32-bit 64-bit description
st(i) st(0) X X dst -= st(0), pop st(0) from FP stack
(no operands) X X st(1) -= st(0), pop st(0) from FP stack

class pycca.asm.instructions.fucomi(src1, src2)

The FUCOMI instruction performs the same operation as the FCOMI instruction. The only difference is that the FUCOMI instruction raises the invalid-arithmetic-operand exception (#IA) only when either or both operands are an SNaN or are in an unsupported format; QNaNs cause the condition code flags to be set to unordered, but do not cause an exception to be generated.

src1 src2 32-bit 64-bit description
st(0) st(i) X X  

class pycca.asm.instructions.fucomip(src1, src2)

The FUCOMIP instruction is similar to FUCOMI but also pops the register stack following the comparison operation. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.

src1 src2 32-bit 64-bit description
st(0) st(i) X X  

class pycca.asm.instructions.idiv(src)

Divides the (signed) value in the AX, DX:AX, or EDX:EAX (dividend) by the source operand (divisor) and stores the result in the AX (AH:AL), DX:AX, or EDX:EAX registers. The source operand can be a general-purpose register or a memory location. The action of this instruction depends on the operand size (dividend/divisor).

src 32-bit 64-bit description
r/m8 X X Divide AX by src, set AL=quotient, AH=remainder
r/m16 X X Divide DX:AX by src, set AX=quotient, DX=remainder
r/m32 X X Divide EDX:EAX by src, set EAX=quotient, EDX=remainder
r/m64   X Divide RDX:RAX by src, set RAX=quotient, RDX=remainder
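
The quotient and remainder that idiv produces differ from Python's `//` and `%` for negative operands, because the processor truncates toward zero while Python rounds toward negative infinity. The function below is a sketch of the arithmetic only; its name and the `bits` parameter (the width of the quotient register) are assumptions for the example.

```python
def idiv(dividend, divisor, bits):
    """Signed division truncating toward zero; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    # The remainder takes the sign of the dividend.
    remainder = dividend - quotient * divisor
    # The quotient must fit in a signed value of the destination size.
    if not -(1 << (bits - 1)) <= quotient < (1 << (bits - 1)):
        raise OverflowError("quotient does not fit in %d bits" % bits)
    return quotient, remainder

print(idiv(-7, 2, 8))  # (-3, -1)
print(-7 // 2, -7 % 2)  # Python's own operators give -4 and 1
```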
class pycca.asm.instructions.imul(*args)

Performs a signed multiplication of two operands. This instruction has three forms, depending on the number of operands.

  • One-operand form — [Not currently supported by pycca] This form is identical to that used by the MUL instruction. Here, the source operand (in a general-purpose register or memory location) is multiplied by the value in the AL, AX, EAX, or RAX register (depending on the operand size) and the product (twice the size of the input operand) is stored in the AX, DX:AX, EDX:EAX, or RDX:RAX registers, respectively.

  • Two-operand form — With this form the destination operand (the first operand) is multiplied by the source operand (second operand). The destination operand is a general-purpose register and the source operand is an immediate value, a general-purpose register, or a memory location. The intermediate product (twice the size of the input operand) is truncated and stored in the destination operand location.

    dst src 32-bit 64-bit description
    r16 r/m16 X X dst *= src
    r32 r/m32 X X  
    r64 r/m64   X  
  • Three-operand form — This form requires a destination operand (the first operand) and two source operands (the second and the third operands). Here, the first source operand (which can be a general-purpose register or a memory location) is multiplied by the second source operand (an immediate value). The intermediate product (twice the size of the first source operand) is truncated and stored in the destination operand (a general-purpose register).

    dst src1 src2 32-bit 64-bit description
    r16 r/m16 imm8/16 X X dst = src1 * src2
    r32 r/m32 imm8/32 X X  
    r64 r/m64 imm8/64   X  

class pycca.asm.instructions.inc(dst)

Adds 1 to the destination operand, while preserving the state of the CF flag.

The destination operand can be a register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag. (Use an ADD instruction with an immediate operand of 1 to perform an increment operation that does update the CF flag.)

dst 32-bit 64-bit description
r/m8 X X dst += 1
r/m16 X X  
r/m32 X X  
r/m64   X  

class pycca.asm.instructions.int_(code)

The INT n instruction generates a call to the interrupt or exception handler specified with the destination operand. The destination operand specifies a vector from 0 to 255, encoded as an 8-bit unsigned immediate value. Each vector provides an index to a gate descriptor in the IDT. The first 32 vectors are reserved by Intel for system use. Some of these vectors are used for internally generated exceptions.

dst 32-bit 64-bit description
imm8 X X  

class pycca.asm.instructions.ja(addr)

Jump near if above (CF=0 and ZF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jae(addr)

Jump near if above or equal (CF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jb(addr)

Jump near if below (CF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jbe(addr)

Jump near if below or equal (CF=1 or ZF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jc(addr)

Jump near if carry (CF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.je(addr)

Jump near if equal (ZF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jg(addr)

Jump near if greater (ZF=0 and SF=OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jge(addr)

Jump near if greater or equal (SF=OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jl(addr)

Jump near if less (SF≠OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jle(addr)

Jump near if less or equal (ZF=1 or SF≠OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jmp(addr)

Transfers program control to a different point in the instruction stream without recording return information. The destination (target) operand specifies the address of the instruction being jumped to. This operand can be an immediate value, a general-purpose register, or a memory location.

dst 32-bit 64-bit description
imm8 X X Jump to address relative to current instruction
imm16 X    
imm32 X X  
r/m16 X   Jump to absolute address stored in r/m
r/m32 X    
r/m64   X  

class pycca.asm.instructions.jna(addr)

Jump near if not above (CF=1 or ZF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnae(addr)

Jump near if not above or equal (CF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnb(addr)

Jump near if not below (CF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnbe(addr)

Jump near if not below or equal (CF=0 and ZF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnc(addr)

Jump near if not carry (CF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jne(addr)

Jump near if not equal (ZF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jng(addr)

Jump near if not greater (ZF=1 or SF≠OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnge(addr)

Jump near if not greater or equal (SF≠OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnl(addr)

Jump near if not less (SF=OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnle(addr)

Jump near if not less or equal (ZF=0 and SF=OF). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jno(addr)

Jump near if not overflow (OF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnp(addr)

Jump near if not parity (PF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jns(addr)

Jump near if not sign (SF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jnz(addr)

Jump near if not zero (ZF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jo(addr)

Jump near if overflow (OF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jp(addr)

Jump near if parity (PF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jpe(addr)

Jump near if parity even (PF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jpo(addr)

Jump near if parity odd (PF=0). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.js(addr)

Jump near if sign (SF=1). Accepts an immediate address relative to the current instruction.

class pycca.asm.instructions.jz(addr)

Jump near if 0 (ZF=1). Accepts an immediate address relative to the current instruction.
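
Many of the conditional jumps listed here are synonyms that test the same flags. The sketch below collects the carry, zero, sign, and overflow conditions in one lookup table; `CONDITIONS` and `jump_taken` are hypothetical names, flags are passed as a plain dict, and the parity and single-flag sign/overflow jumps are left out for brevity.

```python
# Each condition takes a dict with integer values for CF, ZF, SF, and OF.
CONDITIONS = {
    "ja":  lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "jae": lambda f: f["CF"] == 0,
    "jb":  lambda f: f["CF"] == 1,
    "jbe": lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "je":  lambda f: f["ZF"] == 1,
    "jne": lambda f: f["ZF"] == 0,
    "jg":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "jge": lambda f: f["SF"] == f["OF"],
    "jl":  lambda f: f["SF"] != f["OF"],
    "jle": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
}

# Synonyms share the condition of the mnemonic they are paired with.
ALIASES = [
    ("jnbe", "ja"), ("jnb", "jae"), ("jnc", "jae"), ("jc", "jb"),
    ("jnae", "jb"), ("jna", "jbe"), ("jz", "je"), ("jnz", "jne"),
    ("jnle", "jg"), ("jnl", "jge"), ("jnge", "jl"), ("jng", "jle"),
]
for alias, name in ALIASES:
    CONDITIONS[alias] = CONDITIONS[name]

def jump_taken(mnemonic, flags):
    """Return True if the named conditional jump would be taken."""
    return CONDITIONS[mnemonic](flags)

# Flags left by comparing 3 with 5 (3 - 5 borrows and gives a negative result).
flags = {"CF": 1, "ZF": 0, "SF": 1, "OF": 0}
print(jump_taken("jb", flags), jump_taken("jl", flags), jump_taken("ja", flags))
```

The above/below family reads CF and therefore suits unsigned comparisons, while the greater/less family reads SF and OF and suits signed ones.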

class pycca.asm.instructions.lea(dst, src)

Computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand).

The source operand is a memory address (offset part) specified with one of the processor’s addressing modes; the destination operand is a general-purpose register.

dst src 32-bit 64-bit description
r16 m X X Store src address in dst.
r32 m X X  
r64 m   X  

class pycca.asm.instructions.leave

LEAVE

High-level procedure exit. Accepts no operands. Equivalent to:

mov(esp, ebp)
pop(ebp)

class pycca.asm.instructions.mov(dst, src)

Copies the second operand (source operand) to the first operand (destination operand).

The source operand can be an immediate value, general-purpose register, segment register, or memory location; the destination register can be a general-purpose register, segment register, or memory location. Both operands must be the same size, which can be a byte, a word, a doubleword, or a quadword.

dst src 32-bit 64-bit description
r/m8 r/m8, imm8 X X Copy src value to dst
r/m16 r/m16, imm16 X X  
r/m32 r/m32, imm32 X X  
r/m64 r/m64, imm32   X  
r64 imm64   X  

class pycca.asm.instructions.movsd(dst, src)

MOVSD moves a scalar double-precision floating-point value from the source operand (second operand) to the destination operand (first operand).

The source and destination operands can be XMM registers or 64-bit memory locations. This instruction can be used to move a double-precision floating-point value to and from the low quadword of an XMM register and a 64-bit memory location, or to move a double-precision floating-point value between the low quadwords of two XMM registers. The instruction cannot be used to transfer data between memory locations.

dst   src        32-bit   64-bit   description
xmm   xmm, m64   X        X        Copy xmm or m64 to xmm
m64   xmm        X        X        Copy xmm to m64
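
To make the m64 operand concrete, the following sketch uses Python's struct module to show the 8 bytes that hold a double-precision value in memory. It only illustrates the data layout that movsd copies; the variable names are chosen for this example.

```python
import struct

value = 1.5
# '<d' packs a little-endian IEEE 754 double: the 8 bytes movsd would copy.
m64 = struct.pack("<d", value)
print(len(m64), m64.hex())

# Reading the same 8 bytes back recovers the original value.
(restored,) = struct.unpack("<d", m64)
```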

class pycca.asm.instructions.pop(dst)

Loads the value from the top of the stack to the location specified with the destination operand (or explicit opcode) and then increments the stack pointer.

The destination operand can be a general-purpose register, memory location, or segment register.

dst     32-bit   64-bit   description
r/m8    X        X        Pop value from stack into dst
r/m16   X        X
r/m32   X
r/m64            X

class pycca.asm.instructions.push(src)

Decrements the stack pointer and then stores the source operand on the top of the stack.

src       32-bit   64-bit   description
r/m8      X        X        Push src onto stack
r/m16     X        X
r/m32     X
r/m64              X
imm8/32   X        X
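
The order in which push and pop update the stack pointer and access memory can be modelled with a toy stack. TinyStack below is a hypothetical class written for this illustration, assuming the 8-byte slots of 64-bit mode; it is not part of pycca.

```python
class TinyStack:
    """Toy model of a downward-growing stack with 8-byte slots (64-bit mode)."""

    def __init__(self, top=0x8000):
        self.rsp = top      # stack pointer
        self.memory = {}    # address -> stored value

    def push(self, value):
        self.rsp -= 8                   # decrement the stack pointer first...
        self.memory[self.rsp] = value   # ...then store at the new top

    def pop(self):
        value = self.memory[self.rsp]   # load from the top first...
        self.rsp += 8                   # ...then increment the stack pointer
        return value


stack = TinyStack()
stack.push(0x11)
stack.push(0x22)
first = stack.pop()
print(hex(first), hex(stack.rsp))
```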

class pycca.asm.instructions.ret(*args)

RET

Return; pop a value from the stack and branch to that address. Optionally, extra values may be popped from the stack after the return address.

size            32-bit   64-bit   description
(no operands)   X        X        Return without popping any extra bytes
imm16           X        X        Return, then pop size extra bytes from the stack
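
The two forms correspond to the standard x86 near-return opcodes, C3 and C2 followed by a 16-bit immediate. The sketch below builds those bytes by hand; encode_ret is a hypothetical helper for illustration and not a pycca function.

```python
import struct


def encode_ret(extra_bytes=None):
    """Near return: C3, or C2 followed by a 16-bit little-endian byte count."""
    if extra_bytes is None:
        return b"\xc3"
    if not 0 <= extra_bytes <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    return b"\xc2" + struct.pack("<H", extra_bytes)


print(encode_ret().hex(), encode_ret(8).hex())
```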

class pycca.asm.instructions.sub(dst, src)

Subtracts the second operand (source operand) from the first operand (destination operand) and stores the result in the destination operand.

The destination operand can be a register or a memory location; the source operand can be an immediate, register, or memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.

dst     src               32-bit   64-bit   description
r/m8    r/m8, imm8        X        X        dst -= src
r/m16   r/m16, imm8/16    X        X
r/m32   r/m32, imm8/32    X        X
r/m64   r/m64, imm8/32             X
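
Sign extension of a short immediate can be sketched as follows. The function sign_extend is a hypothetical helper that models the widening in plain Python, and the example assumes a 32-bit destination.

```python
def sign_extend(value, from_bits, to_bits):
    """Interpret the low from_bits of value as signed and widen to to_bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):      # sign bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)     # two's-complement bit pattern


# sub eax, -1 with an imm8 of 0xFF: the immediate widens to 0xFFFFFFFF.
imm = sign_extend(0xFF, 8, 32)
eax = (5 - imm) & 0xFFFFFFFF
print(hex(imm), eax)
```

Subtracting the widened immediate from 5 gives 6, which matches subtracting -1.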

class pycca.asm.instructions.syscall

SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contains a canonical address.)

SYSCALL also saves RFLAGS into R11 and then masks RFLAGS using the IA32_FMASK MSR (MSR address C0000084H); specifically, the processor clears in RFLAGS every bit corresponding to a bit that is set in the IA32_FMASK MSR.

Accepts no operands.
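
The flag handling described above amounts to a save followed by a bit-clear. The sketch below models just that step; syscall_flags and the example register values are assumptions for illustration.

```python
def syscall_flags(rflags, fmask):
    """Return (r11, new_rflags) as SYSCALL would leave them."""
    r11 = rflags                  # RFLAGS is saved into R11 first
    new_rflags = rflags & ~fmask  # then every bit set in IA32_FMASK is cleared
    return r11, new_rflags


# Example: mask the interrupt flag (bit 9) and direction flag (bit 10).
r11, rflags = syscall_flags(rflags=0x0000_0646, fmask=0x0000_0600)
print(hex(r11), hex(rflags))
```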

class pycca.asm.instructions.test(a, b)

Computes the bit-wise logical AND of the first operand (source 1 operand) and the second operand (source 2 operand) and sets the SF, ZF, and PF status flags according to the result. The result is then discarded.

src1    src2            32-bit   64-bit   description
r/m8    r8, imm8        X        X
r/m16   r16, imm8/16    X        X
r/m32   r32, imm8/32    X        X
r/m64   r64, imm8/32             X
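
The three status flags can be derived from the AND result as sketched below. flags_after_test is a hypothetical helper; it assumes a 32-bit operand size by default and uses the x86 definition of PF (set when the low byte of the result has an even number of 1 bits).

```python
def flags_after_test(a, b, bits=32):
    """Return (SF, ZF, PF) for the bitwise AND of a and b; the result is dropped."""
    result = a & b & ((1 << bits) - 1)
    sf = (result >> (bits - 1)) & 1     # sign flag: top bit of the result
    zf = int(result == 0)               # zero flag: result is zero
    ones = bin(result & 0xFF).count("1")
    pf = int(ones % 2 == 0)             # parity flag: even count in the low byte
    return sf, zf, pf


print(flags_after_test(0b1010, 0b0101))
print(flags_after_test(0x80000000, 0xFFFFFFFF))
```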

The Instruction class

class pycca.asm.Instruction(*args)
asm

An intel-syntax assembler string matching this instruction.

check_mode(sig, mode)

Return True if an argument of type sig may be used to satisfy operand type mode.

The method may instead return an integer to indicate that the mode is encodable but not preferred.

sig may look like ‘r16’, ‘m32’, ‘imm8’, ‘rel32’, ‘xmm1’, etc. mode may look like ‘r8’, ‘m32/64’, ‘r/m32’, ‘xmm1/m64’, ‘xmm2’, etc.
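
A simplified way to picture this matching is to expand a mode string into the exact types it covers and test membership. The sketch below is an assumption-laden model, not pycca's implementation: expand_mode and sig_satisfies are hypothetical names, only a boolean is returned (the integer "not preferred" result is not modelled), and strings are compared literally.

```python
import re


def expand_mode(mode):
    """Expand a mode such as 'r/m32' or 'm32/64' into the exact types it covers."""
    parts = [re.fullmatch(r"([a-z]*)(\d*)", p).groups() for p in mode.split("/")]
    kinds = [kind for kind, _ in parts if kind]
    sizes = [size for _, size in parts if size]
    expanded = set()
    for kind, size in parts:
        # A part may omit its kind ('64' in 'm32/64') or its size ('r' in 'r/m32');
        # borrow the missing piece from the first kind or the last size.
        expanded.add((kind or kinds[0]) + (size or sizes[-1]))
    return expanded


def sig_satisfies(sig, mode):
    """Return True if an argument of type sig fits operand type mode."""
    return sig in expand_mode(mode)


print(sorted(expand_mode("r/m32")), sig_satisfies("r16", "r/m16"))
```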

clean_args

Filtered arguments.

These are derived from the arguments supplied when instantiating the instruction, with possible changes:

  • int values are converted to a packed string
  • lists are converted to Pointer
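
Packing an integer into bytes might look like the sketch below. pack_int is a hypothetical helper, and choosing the smallest signed little-endian field that fits is an assumption about the conversion, not a statement of how pycca selects the size.

```python
def pack_int(value):
    """Pack an int into the smallest signed little-endian field that holds it."""
    for size in (1, 2, 4, 8):
        bound = 1 << (8 * size - 1)
        if -bound <= value < bound:
            return value.to_bytes(size, "little", signed=True)
    raise ValueError("value does not fit in 64 bits")


print(pack_int(12).hex(), pack_int(-1).hex(), pack_int(0x12345678).hex())
```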

code

The compiled machine code for this instruction.

If the instruction uses an unresolved symbol (such as a label) then a Code instance is returned which can be used to compile the final machine code after symbols are resolved.

generate_code()

Generate complete bytecode for this instruction.

Sets self._code.

generate_instruction_parts()

Generate bytecode strings for each piece of the instruction.

Sets self._prefixes, self._rex_byte, self._opcode, and self._operands
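
As an illustration of how such parts combine, the sketch below concatenates hand-written pieces for mov rax, rbx in the usual x86 order (prefixes, REX byte, opcode, operands). The byte values follow the standard encoding REX.W + 89 /r; the variable names mirror the attributes above but are plain local variables here, not pycca objects.

```python
prefixes = b""          # no legacy prefixes for this instruction
rex_byte = b"\x48"      # REX.W: 64-bit operand size
opcode = b"\x89"        # mov r/m64, r64
operands = b"\xd8"      # ModR/M byte: mod=11, reg=rbx (3), rm=rax (0)

code = prefixes + rex_byte + opcode + operands
print(code.hex())
```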

mode

The selected encoding mode to use for this instruction.

opcode

Opcode string to use in the compiled instruction.

operands

List of compiled operands to use in the compiled instruction.

parse_operands()

Use supplied arguments and selected operand encodings to determine how to encode operands.

Returns a tuple of 6 items:

  1. prefixes: a list of prefix strings
  2. rex_byte: an integer REX byte (0 for no REX byte)
  3. opcode_reg: a register to encode as the last 3 bits of the opcode (or None)
  4. reg: register to use in the reg field of a ModR/M byte
  5. rm: register or pointer to use in the r/m field of a ModR/M byte
  6. imm: immediate string
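
The REX byte and the reg and r/m fields are packed into bytes using fixed bit layouts defined by the x86 encoding. The helpers below are hypothetical and show only that bit arithmetic, using mov r8, rbx as the example.

```python
def make_rex(w=0, r=0, x=0, b=0):
    """REX prefix: bit pattern 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def make_modrm(mod, reg, rm):
    """ModR/M byte: mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)


# mov r8, rbx: r8 is register 8, so its high bit goes into REX.B and its
# low three bits (0) go into the r/m field; rbx (3) goes into the reg field.
rex = make_rex(w=1, b=1)
modrm = make_modrm(0b11, reg=3, rm=8)
print(bytes([rex, 0x89, modrm]).hex())
```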

prefixes

List of string prefixes to use in the compiled instruction.

read_signature()

Determine signature of argument types.

This method may be overridden by subclasses.

Sets self._sig to a tuple of strings like ‘r32’, ‘r/m64’, and ‘imm8’.

Sets self._clean_args to a tuple of arguments that have been processed:

  • lists are converted to Pointer
  • ints are converted to packed string

rex_byte

REX byte string to use in the compiled instruction.

select_instruction_mode()

Select a compatible instruction mode from self.modes based on the signature of arguments provided.

Sets self.use_sig to the compatible signature selected. Sets self.mode to the instruction mode selected.

sig

The signature of arguments provided for this instruction.

This is a tuple with strings like ‘r32’, ‘r/m64’, and ‘imm8’.

use_sig

The argument signature supported by this instruction that is compatible with the supplied arguments.

The format is the same as the sig property.

Debugging tools

pycca.asm.util.compare(instr)

Print instruction’s code beside the output of GNU-as.

Accepts a single Instruction argument. This is used to determine the machine code differences (hopefully there are none!) between the output of an Instruction and the equivalent output from the GNU assembler.

pycca.asm.util.as_code(asm, quiet=False, check_invalid_reg=False, cache=False)

Use GNU assembler to compile the asm string argument.

This prepends the given code with .intel_syntax noprefix before compiling and returns the machine code output converted to a bytearray. If the compile fails, then an exception is raised.

If check_invalid_reg is True, then an exception will be raised if the instruction makes use of a register that is not supported on the current architecture (by default, GNU-as silently ignores such symbols).

If cache is True, then the result will be cached in pycca/asm/gnu_as_cache.pk to speed up subsequent requests for the same instruction.

pycca.asm.util.phexbin(code)

Print hexadecimal and binary representations of machine code.

The argument may be a string, bytes, or bytearray.
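
A stand-alone approximation of this kind of output is sketched below. hex_and_bin is a hypothetical function, and its space-separated layout is an assumption; the exact format printed by phexbin may differ.

```python
def hex_and_bin(code):
    """Return (hex, binary) strings for machine code, one group per byte."""
    if isinstance(code, str):
        code = code.encode("latin-1")   # treat each character as one byte
    data = bytes(code)
    hex_str = " ".join(f"{byte:02x}" for byte in data)
    bin_str = " ".join(f"{byte:08b}" for byte in data)
    return hex_str, bin_str


hex_str, bin_str = hex_and_bin(b"\x48\x89\xd8")
print(hex_str)
print(bin_str)
```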