This page describes the VSP's JavaScript assembler and the language conventions necessary for writing software.
The VSP assembler is split into two parts :
The VSP instructions are very simple but not always practical, so some opcodes introduce a few modifications. These are usually harmless and don't impact the architecture, but they make the instructions handier, which helps the instruction stream be more efficient.
The assembly language's goal is to hide these architectural details from the software developer. Thus comes the first rule : the destination register is the last in the line.
and that's all there is to say about the subject of "input syntax".
The numbers are always output in hexadecimal and are accepted in 3 formats :
Numbers are mostly used in contexts where the number of significant bits is bounded by the size of the container (usually, the optional immediate field). When more bits are input, the assembler keeps only the required number of least significant bits and discards the most significant ones. The following example shows how the number is truncated : db 1234h.
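The truncation rule can be sketched in a few lines of JavaScript. This is only an illustration: the helper names parseHexLiteral and truncateToBits are hypothetical and not taken from the assembler's source, only the h-suffixed hexadecimal format of the example is handled, and the 8-bit container for db is an assumption.

```javascript
// Hypothetical helpers, not taken from the VSP assembler's source.

// Parse an "h"-suffixed hexadecimal literal such as "1234h".
function parseHexLiteral(text) {
  const match = /^([0-9A-Fa-f]+)h$/.exec(text);
  if (match === null) {
    throw new Error("not an h-suffixed hexadecimal literal: " + text);
  }
  return parseInt(match[1], 16);
}

// Keep only the `bits` least significant bits (1 <= bits <= 32).
function truncateToBits(value, bits) {
  const mask = bits >= 32 ? 0xFFFFFFFF : (1 << bits) - 1;
  return (value & mask) >>> 0; // >>> 0 keeps the result unsigned
}

// Assumption: db stores each number in an 8-bit container.
const truncated = truncateToBits(parseHexLiteral("1234h"), 8);
console.log(truncated.toString(16).toUpperCase() + "h"); // prints "34h"
```

Under these assumptions, the upper byte 12h is discarded and only 34h is kept.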
The ability to output arbitrary numbers is critical for many uses so the assembler has the following three pseudo-instructions :
An unbounded number of literal numbers is accepted (since 2007-04-04; the limit depends on the JavaScript engine, not on the assembler's design). The assembler's floating window will not display all the digits when more than 32 bits are given, but they are available through software (by defining the emit_bin() function).
Example : db 12h 34h 56h 78h 9Ah BCh DEFh
emit_bin()'s output :
The instruction-level assembler recognizes the following symbols, and rejects anything else :
There are two types of aliases : form aliases (see FLAG_ALIAS_RR) and instruction aliases (they are listed under the opcode map). This section deals with instruction aliases.
Internally, they can be used like normal instructions, but they provide different forms and/or different semantics. However, they use real opcodes of other instructions. The substitution is handled at the assembly level and the disassembly probably won't infer the originally assembled alias. So don't be surprised if instructions like NOT or NEG assemble correctly, but the disassembly returns a different opcode.
Despite the very simple instruction format, the assembly language instructions appear in several different forms. This is usually due to reasons like :
The forms can be decomposed into three main groups :
All these forms may seem complex (at first sight) but they make the VSP's assembly language source code easy to read and write, by sticking closer to the semantics of the instruction than to its binary structure.
The availability of all these forms is also a compromise between flexibility of implementation and completeness of description (of both the assembler and disassembler). The current system allows new forms and syntaxes to be added or removed without changing the whole structure, thus making development and experimentation faster.
This is the most usual form, with only two registers used, for example add d2, r3.
Some instructions (CLR and JMP) need only one operand (which is either a source or destination register).
This is also a syntax shortcut of RR for some instructions, when one wants both source and destination to be the same register. For example, bswap r1 will encode R1 in both the src1/dest and src2 fields of the instruction. In this case, the opcode has the FLAG_ALIAS_RR attribute.
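The shortcut can be pictured as a small operand-expansion step in the assembler. The sketch below is hypothetical: the function name expandOperands, the way flags are passed, and the bit value given to FLAG_ALIAS_RR are all assumptions made for illustration.

```javascript
// Hypothetical sketch; names and flag value are assumptions.
const FLAG_ALIAS_RR = 1; // assumed bit value, for illustration only

// Return the register list that fills the src2 and src1/dest fields.
function expandOperands(opcodeFlags, registers) {
  if (registers.length === 1 && (opcodeFlags & FLAG_ALIAS_RR) !== 0) {
    // R form used as a shortcut for RR: same register in both fields.
    return [registers[0], registers[0]];
  }
  return registers.slice(); // other cases are left unchanged
}

console.log(expandOperands(FLAG_ALIAS_RR, ["r1"]).join(" ")); // prints "r1 r1"
```

An opcode without the flag keeps its single operand, so plain R-form instructions such as CLR are not affected in this sketch.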
Some instructions don't need any parameter, for example nop or inv. This form encodes to a short instruction because there is no immediate data. Use the I or X forms to force a long instruction (depending on the opcode and on whether the value of the immediate field is ignored).
This form is used when immediate data are included in the instruction stream like this : add d2, 234, r3.
As noted above, some instructions do not make sense with more than one register operand
or field. The register might be used as both source (depending...)
and (always) destination (the immediate field must be stored somewhere).
For example, mov 2, d3
This IR form also means "Immediate if Register" for the conditional instructions.
This is technically the same thing as the previous IR form.
It is only needed by the PUT instruction, because the "destination" is the Special Register whose number is
given as an immediate 16-bit number (so it must come last in the instruction).
For example, put d3 2754
Some instructions could need only one long immediate parameter, or must be writable in "long" form. However, there is no such instruction yet; most of them simply ignore the Imm16 field (see the X form).
The "I" form is also used by the unconditional SKIP and Q instructions. This is the equivalent of the above IR form, but without register field. The 2-bit immediate number represents
This is the Imm16 extension of the above IR form, used by some conditional skip instructions (SZ/SNZ) and conditional jump instructions (QZ/QNZ). The other conditional instructions (SO/SNE, SNO/SE, SS, SNS, QO/QNE, QNO/QE, QS and QNS) make no sense with the immediate 16-bit field because the condition is already XORed by the negation field.
For example, the instruction qz 2, a4 21
will switch to queue #2 if register A4 is equal to 21.
Several instructions don't make sense with the optional Imm16 field. However, a long form can still be useful, for example for padding. The "-X" forms use the question mark ("?") to indicate that the long version of the instruction is desired, and the assembler fills the remaining bits with suitable values (see below).
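Detecting the "?" marker could look like the following sketch. It is hypothetical: the function name splitLongMarker, the returned object layout, and the choice to split tokens on whitespace and commas are assumptions, not the assembler's actual parser.

```javascript
// Hypothetical sketch; tokenizing on whitespace and commas is an assumption.
function splitLongMarker(line) {
  const tokens = line.trim().split(/[\s,]+/);
  // A trailing "?" requests the long form with an ignored Imm16 field.
  const long = tokens.length > 1 && tokens[tokens.length - 1] === "?";
  if (long) {
    tokens.pop(); // the marker is not an operand
  }
  return { mnemonic: tokens[0], operands: tokens.slice(1), long: long };
}

const parsed = splitLongMarker("skip 1 ?");
console.log(parsed.mnemonic, parsed.operands.length, parsed.long); // prints "skip 1 true"
```

Once the marker is stripped, the remaining operands can be matched against the short forms as usual, and the long flag tells the encoder to append a filler Imm16.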
This form is the extension of the FORM_ALONE form.
For example, nop ?
will fill two half-words, and an appropriate value of the Imm16 field is generated
by the assembler.
This form is only used by the SKIP
and Q instructions
and is the "long" equivalent to the above I form.
For example, skip 1 ?
This form is only used by the BSWAP instruction
and is the "Ignore" version of the R form
(so the assembler's forms combinations are exhaustive).
For example, BSWAP R1 ?
This form is used by many instructions that only make sense
with the RR form. This form allows them
to be extended to a long instruction where the Imm16 field is ignored.
For example, LZB R1 R3 ?
Just like the above IR form, this form is used by the conditional skip instructions (SO/SNE/SNO/SE and SS/SNS, but not SZ/SNZ) and the conditional jump instructions (QO/QNE/QNO/QE and QS/QNS, but not QZ/QNZ) because a long instruction form makes no sense (the opcode already contains a negation field).
For example, SNS 2 R3 ?
Many opcodes have "flags", which modify the instruction's behaviour, or make it more precise. They are used by all kinds of software, particularly the assembler, the disassembler and the (future) instruction decoder.
This is a modifier of the instruction form, not a form itself,
and it is needed for hardware simplicity.
This flag indicates that the operands are swapped internally in order
to keep the assembly-level instructions easy to read and implement.
Behind the scenes, the destination register becomes the first operand,
so in the instruction shr d2, r3 :
This flag is used by the BSWAP instruction, where a register operand can be omitted when the source register is the same as the destination register. The SRC1 and SRC2 fields are then written with the same value, but you only need to write the register once.
Normally, all uninitialized data, fields or values are cleared (zero).
However, when certain instruction fields are not used (by FORM_ALONE, FORM_R, FORM_I, ...), these fields are set to values chosen by the assembler, for the purpose of power reduction.
Toggle minimization and toggle spacing could help reduce the power consumption and EMI. The VSP assembler will be able to compute proper values rather easily. Padding Imm16s and NOPs can be enhanced this way, too. The possible reduction is quite low, but it could perhaps reach 5% when fetching instructions from external SDRAM. This has to be tried with and without, and even with EMI/toggles maximized, just for testing.
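One way to picture toggle minimization is to count the bits that change between consecutive half-words and to pick the filler value that changes the fewest. The sketch below is hypothetical and is not the assembler's actual algorithm: the function names, the 16-bit bus width, and the idea of choosing among a caller-supplied list of candidates are all assumptions.

```javascript
// Hypothetical sketch, not the assembler's actual algorithm.

// Count the bits that differ between two 16-bit half-words.
function countToggles(a, b) {
  let diff = (a ^ b) & 0xFFFF;
  let count = 0;
  while (diff !== 0) {
    diff &= diff - 1; // clear the lowest set bit
    count++;
  }
  return count;
}

// Among candidate fillers, pick the one that toggles the fewest bits
// between the previous and the next half-word.
function pickFiller(previous, next, candidates) {
  let best = candidates[0];
  let bestCost = Infinity;
  for (const candidate of candidates) {
    const cost = countToggles(previous, candidate) + countToggles(candidate, next);
    if (cost < bestCost) {
      best = candidate;
      bestCost = cost;
    }
  }
  return best;
}

console.log(countToggles(0x0000, 0xFFFF)); // prints 16
```

Maximizing instead of minimizing the cost would give the worst-case pattern mentioned above for comparison measurements.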