Pigeon Computer 0.1 documentation

Pigeon Assembler Internals

The internal operation of the assembler is fairly straightforward.

In the first pass, an assembly source file (which is just a Python script containing calls to the directives and assembly instructions) is executed within a special context. This generates an internal data structure that maps addresses to bit-pattern generators and their associated arguments.

During the second pass these bit-pattern generators are called to generate the binary op codes. By this point the addresses of all labels are known, so they can be used for things like computing the target offsets of relative instructions.
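
The two passes can be illustrated with a toy model. This is a minimal sketch and not the Pigeon implementation: Label, ToyAssembler, nop, jump_relative and the byte encodings are all hypothetical stand-ins, and labels are created by hand here instead of being bound automatically.

```python
class Label:
    """Hypothetical stand-in: a mutable holder for an address not yet known."""
    def __init__(self):
        self.address = None


def gen_nop(here):
    # Made-up encoding: a single zero byte.
    return b"\x00"


def gen_jump_relative(here, target):
    # Made-up encoding: opcode 0x01 followed by a signed 8-bit offset.
    offset = target.address - here
    return b"\x01" + offset.to_bytes(1, "big", signed=True)


class ToyAssembler:
    def __init__(self):
        self.here = 0    # current output address
        self.data = {}   # address -> (generator, args)

    def nop(self):
        self.data[self.here] = (gen_nop, ())
        self.here += 1

    def jump_relative(self, target):
        self.data[self.here] = (gen_jump_relative, (target,))
        self.here += 2

    def label(self, lbl):
        lbl.address = self.here

    def assemble(self, text, labels):
        # First pass: run the source as ordinary Python in a prepared namespace.
        context = {"nop": self.nop,
                   "jump_relative": self.jump_relative,
                   "label": self.label}
        context.update(labels)
        exec(text, context)

    def pass2(self):
        # Second pass: every label now has an address, so generators can run.
        return {addr: gen(addr, *args)
                for addr, (gen, args) in sorted(self.data.items())}


SOURCE = """
jump_relative(end)   # forward reference: 'end' has no address yet
nop()
label(end)
nop()
"""

asm = ToyAssembler()
asm.assemble(SOURCE, {"end": Label()})
image = asm.pass2()
```

The forward jump can only be encoded in pass2(), because the address of end is not known when jump_relative() is first executed.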

First Pass

The pigeon.assembler.pyavrasm.AVRAssembly.assemble() and pigeon.assembler.pyavrasm.AVRAssembly.assemble_file() methods read an assembly-in-Python source file (for example, the Pigeon Firmware).

Each call in the source file to an instruction method defined in the pigeon.assembler.instructions.InstructionsMixin creates an entry in the assembler’s pigeon.assembler.pyavrasm.AVRAssembly.data dictionary. Each entry consists of the name of the op (instruction) and the arguments that were passed to the instruction method.
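
As a rough illustration of that bookkeeping, here is a minimal sketch of a mixin whose instruction methods only record what was asked for. ToyInstructionsMixin, ToyAssembly, the _record helper and the fixed two-byte instruction width are assumptions made for the example; the real method signatures may differ.

```python
class ToyInstructionsMixin:
    """Hypothetical mixin: each instruction method records an entry."""

    def _record(self, op_name, *args):
        # Store the op name and its arguments at the current address.
        self.data[self.here] = (op_name, args)
        self.here += 2   # assumption: every toy instruction is 2 bytes wide

    def ldi(self, register, value):
        self._record("ldi", register, value)

    def mov(self, dest, src):
        self._record("mov", dest, src)


class ToyAssembly(ToyInstructionsMixin):
    def __init__(self):
        self.here = 0    # current output address
        self.data = {}   # address -> (op name, args)


asm = ToyAssembly()
asm.ldi(16, 0xFF)
asm.mov(17, 16)
```

No bit patterns are produced at this stage; the dictionary only remembers which op to generate and with which arguments.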

Intermediate Representations

Second Pass

When you call pigeon.assembler.pyavrasm.AVRAssembly.pass2(), the assembler goes through the pigeon.assembler.pyavrasm.AVRAssembly.data dictionary and calls the bit-pattern-generating op functions with their associated arguments. The returned binary strings are collected into another dictionary, keyed by their target addresses.

This dictionary is used by the pigeon.assembler.pyavrasm.AVRAssembly.to_hex() method to generate the HEX format output.
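
The second pass might look roughly like the following sketch. The OPS table, op_ldi, op_mov and the shape of the data entries are assumptions for illustration; the bit layouts follow the AVR instruction set's LDI and MOV formats, with each 16-bit word stored low byte first.

```python
import struct


def op_ldi(register, value):
    # AVR LDI Rd, K: 1110 KKKK dddd KKKK, valid for r16..r31.
    d = register - 16
    word = 0xE000 | ((value & 0xF0) << 4) | (d << 4) | (value & 0x0F)
    return struct.pack("<H", word)


def op_mov(dest, src):
    # AVR MOV Rd, Rr: 0010 11rd dddd rrrr.
    word = 0x2C00 | ((src & 0x10) << 5) | ((dest & 0x1F) << 4) | (src & 0x0F)
    return struct.pack("<H", word)


# Hypothetical lookup table from op name to bit-pattern generator.
OPS = {"ldi": op_ldi, "mov": op_mov}


def pass2(data):
    """Turn address -> (op name, args) into address -> byte string."""
    return {address: OPS[name](*args)
            for address, (name, args) in sorted(data.items())}


data = {0: ("ldi", (16, 0xFF)), 2: ("mov", (17, 16))}
accumulator = pass2(data)
```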

Python AVR Assembler

This is the Pigeon Assembler.

class pigeon.assembler.pyavrasm.AVRAssembly(initial_context=None)

Bases: pigeon.assembler.instructions.InstructionsMixin, pigeon.assembler.pyavrasm.DirectivesMixin, object

This is the primary assembler object. It combines the InstructionsMixin and DirectivesMixin, which together define the functions that you use in your assembly code, with the methods of this class, which run the assembly proper.

Assembling a file is a two-pass process.

First, the text of the asm code is passed to the assemble() method (or you can pass a file name to assemble_file()) which builds up an internal model of the op codes to be assembled.

Second, you call pass2() which converts the internal model of the op codes to be assembled into a dictionary that maps addresses in the output machine code to the byte strings of the data that should reside at those addresses.

Then you’re probably going to want to call the to_hex() method to get that binary data out as (the contents of) a hex file, suitable for writing to your ATmega328P.

When you create an AVRAssembly object you can pass an initial_context object, a dict or anything that can be passed to dict.update(), and it will be added to the execution context for your asm code. Typically you would pass m328P_def.defs to include those definitions for your code to use.

Parameters: initial_context (dict or anything that can be passed to dict.update()) – Context to make available to your assembler code (for example, m328P_def.defs or other useful functions).

accumulator = None

Internal output data structure. This holds the byte strings created in pass2().

assemble(text)

Assemble asm source code given as a string.

Parameters: text (str) – Assembly source code.
Return type: None

assemble_file(filename)

Assemble asm source code from a named file.

Parameters: filename (str) – File name of an assembly source code file.
Return type: None

context = None

This is the execution context for your assembly file. It is used as the namespace for exec or execfile for parsing and running your code.

Because it’s a defaultdict whose default factory function returns intbv objects, any name (identifier) that you use in your code without defining it first is automatically bound to a new intbv object.

That is how labels work: you simply use a label and it gets its own intbv object. Then when you use the label() directive on that label the intbv gets updated with the actual current output address, and that value will be used to assemble the proper bit patterns in pass2().
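
The label mechanism can be sketched with the standard library alone. In this example Thunk is a hypothetical stand-in for intbv, and label, jump and the state dictionary are made up for the illustration; it assumes Python 3, where a defaultdict passed as the local namespace of exec() supplies a value for any unknown name.

```python
from collections import defaultdict


class Thunk:
    """Hypothetical stand-in for intbv: a mutable box holding an address."""
    def __init__(self, value=0):
        self.value = value


state = {"here": 0}   # current output address
jumps = []            # thunks captured by the toy 'jump' instruction


def label(thunk):
    # Fill in the box with the current output address.
    thunk.value = state["here"]


def jump(thunk):
    # Remember the box itself, not its (still unknown) value.
    jumps.append(thunk)
    state["here"] += 2


# Unknown names looked up in this namespace get a fresh Thunk.
context = defaultdict(Thunk)
context.update(label=label, jump=jump)

SOURCE = """
jump(loop_end)
jump(loop_end)
label(loop_end)
"""

exec(SOURCE, {}, context)
```

Both calls to jump() receive the same Thunk, so when label() fills it in afterwards, every earlier use of the name sees the final address. In this sketch a misspelt name would quietly become a new label in the same way.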

data = None

Internal intermediate data structure. This holds the output of the methods in the InstructionsMixin, which pass2() uses to create the byte strings.

here = None

Current output address of the assembly process.

pass2()

Second pass of the assembly process.

Once the asm source code has been assembled into the intermediate form (by assemble() or assemble_file()) this method converts it into binary strings.

Return type: Mapping of addresses to strings (binary data). This data structure is the end result of the assembly process, just before the strings are emitted in, e.g., Intel HEX format for burning to a chip.

to_hex(f)

Convert the assembled machine code to Intel HEX file format.

Parameters: f (filename or file-like object) – The HEX data will be written to this destination.
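
For readers unfamiliar with the output format, here is a minimal sketch of how an address-to-bytes mapping can be rendered as Intel HEX data records. The function to_hex_lines is hypothetical and only handles 16-bit addresses; how to_hex() itself produces its output is not shown here.

```python
def to_hex_lines(image):
    """Render an address -> bytes mapping as Intel HEX records."""
    lines = []
    for address, chunk in sorted(image.items()):
        # Split each chunk into records of at most 16 data bytes.
        for offset in range(0, len(chunk), 16):
            part = chunk[offset:offset + 16]
            addr = address + offset
            # Byte count, 16-bit address, record type 00 (data), then data.
            record = bytes([len(part), (addr >> 8) & 0xFF, addr & 0xFF, 0x00]) + part
            # Checksum: two's complement of the sum of all record bytes.
            checksum = (-sum(record)) & 0xFF
            lines.append(":" + (record + bytes([checksum])).hex().upper())
    lines.append(":00000001FF")   # end-of-file record
    return lines


lines = to_hex_lines({0: b"\x0f\xef\x10\x2f"})
```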

class pigeon.assembler.pyavrasm.DirectivesMixin

Bases: object

These are directives, assembler functions that don’t correspond to op codes but instead do some sort of other function.

db(*values)

Lay down bytes in the program image. Integers 0 <= n <= 255 and strings are accepted.

Parameters: values (iterable of int and/or string values) – Values to assemble.
Return type: None
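
The packing that db() describes can be sketched as a plain function. db_bytes is a hypothetical helper, and treating strings as ASCII is an assumption; the real directive writes into the program image, which is left out here.

```python
def db_bytes(*values):
    """Pack integers 0..255 and strings into one byte string."""
    out = bytearray()
    for value in values:
        if isinstance(value, str):
            out += value.encode("ascii")   # assumption: strings are ASCII
        elif 0 <= value <= 255:
            out.append(value)
        else:
            raise ValueError("byte out of range: %r" % (value,))
    return bytes(out)


packed = db_bytes(0x48, "i!", 0)
```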

define(**defs)

Update one or more names in the execution namespace. The main difference between using this function and simply setting a variable in your asm code is that this function automatically converts integer value(s) into intbv object(s).

Parameters: defs – <name>=<value> pairs
Return type: None
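
The difference from a plain assignment can be sketched as follows. Thunk is a hypothetical stand-in for intbv, and this define takes the namespace as an explicit argument, which the real method does not.

```python
class Thunk:
    """Hypothetical stand-in for intbv: a mutable box holding an integer."""
    def __init__(self, value=0):
        self.value = value


def define(context, **defs):
    """Bind names in the namespace, wrapping integer values in a Thunk."""
    for name, value in defs.items():
        if isinstance(value, int):
            value = Thunk(value)
        context[name] = value


context = {}
define(context, LED_PIN=5, GREETING="hi")
```

After this call LED_PIN is bound to a Thunk holding 5, while GREETING stays a plain string.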

dw(*values)

Lay down unsigned 16-bit integer values in the program image.

Parameters: values (iterable of int) – Integer values to assemble.
Return type: None
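
A minimal sketch of packing 16-bit values, using a hypothetical helper dw_bytes. The little-endian byte order is an assumption (it matches the AVR's own byte order, but the byte order used by dw() is not stated here).

```python
import struct


def dw_bytes(*values):
    """Pack unsigned 16-bit integers, low byte first (assumed byte order)."""
    # struct.pack raises struct.error for values outside 0..65535.
    return b"".join(struct.pack("<H", value) for value in values)


packed = dw_bytes(0x1234, 0xFFFF)
```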

label(label_thunk, reserves=0)

Create a symbolic label at the current output address of the assembly process. If reserves is given (and greater than zero) that many bytes are reserved by adding the value to the current output address.

Parameters:
  • label_thunk (intbv) –

    An intbv object serving as a container for a pointer value to an address in your assembly program.

    When you mention a previously unused name in your asm code the execution context will automatically provide that name with a new intbv object initialized to zero.

    Because this intbv object will be (re-)used in other parts of your program wherever the name is used it becomes a container for the eventual value (a thunk) of the address.

    When you use this directive with a given named address thunk (label) it fills in the value of the current output address of the assembly process.

  • reserves (int) – Reserve this many bytes by increasing the current output address of the assembly process.
Return type: None
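
The effect of reserves can be seen in a small sketch. Thunk (standing in for intbv) and ToyDirectives are hypothetical, and the output address is kept as a plain int for simplicity.

```python
class Thunk:
    """Hypothetical stand-in for intbv: a mutable box holding an address."""
    def __init__(self, value=0):
        self.value = value


class ToyDirectives:
    def __init__(self):
        self.here = 0   # current output address

    def label(self, label_thunk, reserves=0):
        # Fill in the thunk with the current output address...
        label_thunk.value = self.here
        # ...then optionally skip ahead to reserve space.
        if reserves > 0:
            self.here += reserves


asm = ToyDirectives()
buffer, counter = Thunk(), Thunk()
asm.label(buffer, reserves=16)   # 16 bytes starting at address 0
asm.label(counter, reserves=1)   # 1 byte directly after the buffer
```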

org(address)

Set the current output address of the assembly process to address. If address isn’t an intbv it is converted to one.

Parameters: address (intbv, int, or symbolic label) – Location in program.
Return type: None
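
A minimal sketch of how org() and label() might interact, again with hypothetical Thunk and ToyDirectives classes standing in for intbv and the real mixin. Copying the value in org() is a choice made for this example only.

```python
class Thunk:
    """Hypothetical stand-in for intbv: a mutable box holding an address."""
    def __init__(self, value=0):
        self.value = int(value)


class ToyDirectives:
    def __init__(self):
        self.here = Thunk(0)   # current output address

    def org(self, address):
        # Accept a plain int or an existing thunk (such as a label).
        if not isinstance(address, Thunk):
            address = Thunk(address)
        # Copy the value so later changes to a label do not move 'here'.
        self.here = Thunk(address.value)

    def label(self, label_thunk):
        label_thunk.value = self.here.value


asm = ToyDirectives()
asm.org(0x100)
main = Thunk()
asm.label(main)   # main now points at 0x100
asm.org(0)
asm.org(main)     # a label can be used as the address too
```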
