yasep/changes.txt (was : vspsim/updates.txt)
The latest changes are at the bottom of the file.
__________________________________________________________________________________

2006-09-02 :
- changed the last SWH to SHH (Store Word High to Store Half-word High)
- moved/split the JavaScript files into JScore and JSgui, so the JScore files can be used without the GUI or without the Mozilla environment (with a CLI for example)
- removed "spacing:" in CSS because Firefox seems to dislike it...

2006-09-04 :
- added JScore/vsp_disasm.js and updated ISM/asm.html, added a few range checks as well.
- started prototype code that saves data to files (under the user's authority).

2006-09-06 :
- merged ISM and doc into doc
- enhanced the save and file functions, they need to be better developed and merged.
- CSS cleanup
- moved text from doc/opcode_map.html to doc/assembly.html and started the description of the assembler

2006-09-06
- renamed doc/vsp_instruction2.gif to vsp_class_alu.gif, added vsp_class_jmp.gif
- added the first CTL class instructions (nop, mov...) and vspsim/doc/ISM/NOP.html

2006-11-10
- starting ploped2

2006-11-18
- ploped now has a dedicated directory under /test/ploped_proto and drawsegment is moved there.
- jmp instruction class still ... not finished. Some issues must be solved
- trying to merge load and save functions in a single module.

2007-03-28
- the JMP instruction class is still in limbo
- assembly is bork, I don't remember how I did this.
- /doc/opcode_map.html#NOP didn't work, fixed by changing a stupid test.
- initiated the dynamic instruction tree in /doc/opcode_map.html but kept it inside comments.
- the floating asm thing did not work outside some directories (in a fixed list), I removed the directory parsing thing in JSgui/floating_asm.js and moved the prefix to the calling page. Doing otherwise would have been too difficult in the long term.
  ==> In fact, now, every DHTML file should include in the header, and the footer is the usual which does the rest. Without the "prfx" variable, the directories are unreachable.

2007-04-01
- simplified the /ISM/*html files a bit more (thank you, sh/grep/sed)
- PUT needs the creation of the "FORM_RI" form so I added it.
- some clarification for the default values of unused register fields in JScore/vsp_asm.js (see doc/assembly.html#def )
- opcodes could use a short form, where the src2 field contains a 4-bit integer instead of a register number. Useful for EU and GET/PUT but not very... handy yet. It's something to keep in mind but the VSP's opcode space is too small, only a few opcodes could use it (CTL group ?) The next architecture should have a better Imm/reg flag (reg/imm5/imm16).
- initiated test/test_opcodes.html
- the assembler can put the encoded instructions somewhere through the "emit_bin(data, size)" function (if defined)
- DB, DH & DW pseudo-instructions are bork, while trying to extend the format.
- the JMP class is still bork

2007-04-02
- test/test_opcodes.html works

2007-04-04
- DB/DH/DW pseudo-instructions now accept an unbounded list of numbers.

2007-04-05
- working on the first jmp/skip instruction group
  adding FORM_Q, FORM_CQR, FORM_CQRI, FORM_CIR, FORM_CIRI
- adding ALIAS_IR ?
- assembly.html#CIR/CIRI/Q/CQR/CQRI missing, vsp_asm/disasm.js incomplete, /ISM/J-JN-S-SN.html missing, aliases missing

2007-04-06
- added the minlength and opt_h fields to JScore/txt_int.js int2hex() because the skip length argument doesn't need the trailing 'h'
- added FLAG_SKIP, FORM_QR, FORM_QRI, FORM_IRI

2007-04-07
- the jump instructions are still ... not satisfying. The fields must be moved around, again...
- table_opcode[] is gaining a wider definition, to support the aliases (this didn't work because of safety measures). It should allow the aliases to have a different form than the base opcode.
- JSgui/vsp_messages.js is even better (there was some race condition somewhere)
- added opcode_aliases[] in vsp_opcodes.js (and populated in vsp_jmp.js)
- added : ignore field (used later for EMI reductions)
- test/test_opcodes.html working 100% (but with uneven syntax)
  --> vsp_asm.js and vsp_disasm.js are in sync
  --> gotta make the .js files cleaner by making all temp. vars local (to prevent name collisions and clogs)

Probable future instructions :
* min and max instructions are ... quite easy to make. Route the 2 operands to ASU, perform the subtraction, XOR the Carry bit with the instruction's MIN/MAX field. The result (1/0) will enable/disable the write of the src2 into src1/dest
  ---> The value of src2 must be routed through another data path than ASU.
* CMOV : conditions : S, Z, O, negation, of src2 value=0/imm16 written to src1/dest
  --> unconditional version : clear
* also missing : Jmp/Skip R,R (Z/C) --> FORM_(Q/I)RR | FORM_(Q/I)RI
    Z = Zero
    C = Carry / Above
- missing : ISM/(J/S)(N)(S/O/E/Z).html

2007-04-09
- opcode_map.html displays all the flags
- JSgui/decoration.js removed, contents moved elsewhere.
- separation of Forms and Flags, because the room is getting tight.
  --> parts of vsp_forms.js moved to vsp_flags.js
  --> most files changed to take the new attribute into account
  --> constants ALL_FORMS and ALL_FLAGS are useless now, so they're removed.
- adding the -X forms (to indicate a long instruction form where the imm16 value is ignored)
  --> candidates are : NOP, HALT, RETI, INV, JMP, SKIP, JO, JS, JNO, JNS, SO, SS, SNO, SNS
  --> new forms are X, QX, IX, QRX, IRX, RX, RRX
  --> new flag: FLAG_IGNORE_IMM16
- int2hex is a bit... annoying (always needing -1 in the min/max length arguments)
- removed all commas in the example and disassembled codes. This makes life easier and the examples are less confusing.
- vsp_disasm is too ... spaghetti. The nested if/else structure is too fragile. A better method should be used.
- doc/opcode_map.html#flags and doc/opcode_map.html#forms are ok
? the "instruction classes" should be... less strict ( there's quite a waste of encoding space. )
? the JS loading time could be reduced, when clicking on links, by using some AJAX tricks. Later, as the JS will become larger, it might save some time.

2007-04-11
- cleanup of the flags' documentation (doc/assembly.html etc.)
- the core files are now listed in JSgui/js_list.js which is now loaded by most files
  --> now it's possible to work on the instruction group/class issue without breaking everything.
- vsp_eu.js split to group_eu.js group_asu.js, group_rop2.js, group_shl.js, group_ie.js and group_misc.js
- creating vsp_cmov.js
- removing ADD1/SUB1/ADDC1/SUBB1
? min/max for the unconditional negative movs ?
? add/create a "FIELD" attribute ? (like the FORM_ and FLAG_ attributes)
? drop the use of the "instruction classes" and create more flexible "groups" of variable lengths ?
? how to fill the unused conditions in the JMP group ?
? extend the skip field to 4 bits ?
? get rid of the Q form ?
? get rid of FLAG_JMP, replaced by FLAG_SKIP (???)
? MULTSTEP/DIVSTEP ?
? FLAG_QUEUE for Qxx

2007-04-13
- JSgui/js_list.js now accepts an argument and detects an already defined add_message()
- found a stooooopeed bug in vsp_disasm.js / unknown_opcode()
- adding vsp_groups.js
- prefix is now auto-incremented inside JScore/vsp_opcodes.js:NewOpcode() and the value parameter is removed.
- stroke of genius ! ADDSx/SUBSx : skip 0 to 3 half-words if carry is generated.
  --> no need of useless ADDC/SUBB instructions, or of carry flag !! but could create other problems.
- updated doc/vsp.html completely, asm/disasm and tests are BORK
- need to be updated : ISM/*.html (touch'ed) and dump_opcode_table.js

2007-04-14
- doc/assembly.html updated (Qxx forms removed)
? Skip field is 2 bits but could be 4 bits. I prefer to keep it short for now, for HW reasons, it can still be extended later.
? instead of "Ignore Imm16"-like flags, create other flags
    READ_SRC1
    WRITE_SRC1
    READ_SRC2
    WRITE_SRC2
    USE_IMM16
    USE_IMM0 <-- when Imm/reg=0
    USE_Q
    USE_SKIP_CARRY
    USE_SKIP_COND
    USE_CONDITION
  That would simplify many, many things...
  Then the "ignore" mask would be ~(USE_..|USE_...|USE_...)
? instruction CIP : Current Instruction Pointer (R, IR)
    CIP (+Imm16) => R
? CQ -> R
? SR : MSB = 1 -> public space
       MSB = 0 -> private space

2007-04-15
- I tried to validate it but it breaks many things... I'll retry later. One day.
  http://validator.w3.org/check?uri=http://f-cpu.seul.org/whygee/vspsim/doc/opcode_map.html
- added the color code to index.html
- before rebuilding asm/disasm, I want to incorporate the USE_... flags.
- should the tested register be SRC2 or SRC1 ?? many changes are still going to happen !
- removing FLAG_IGNORE_IMM16
- doc/vsp.html and doc/assembly.html must be reviewed because the operand order has been altered (misc, shl, ie).
- /ISM/ incomplete
? reverse the order of the shift operands, to build masks with 1 bit / N bits set to 1 ?
- the computation of encoded_ignored_fields in vsp_asm is changed

2007-04-16
- "NOT" aliased to NAND
- change the opcode_map.html code to include stats from the aliases (i.e. NOT does not have the same forms as the NAND instruction)
? new instructions : "CQ", CIP, SCQ
    CQ : sets the destination register to the value of CQ
    SCQ : skips instructions according to the value of CQ (CQ=1 -> skip 1 half-word ?)
    CIP : reg dest = CIP(+imm16)
? new groups :
    PFQ flags (post-increment etc.)
    SMT (create/kill processes, semaphores)
    BITSTREAM ?
? There is no certain way to know whether an instruction is long or short just by knowing its form, for FORM_IR and FORM_I
? PFQ flag : burst/word-by-word (for I/O)
    0 nothing (default)
    1 nothing, no burst
    2 read postinc
    3 write postinc
    4 read postdec
    5 write postdec
    6 pseudostack (read postdec, write postinc) \ read+write in the same instruction
    8 stack (read postinc, write PREdec)        / does not change the pointer

2007-04-17 : 10000 lines of code ! 3772 lines of Javascript, 4605 of HTML
- asm/disasm/test_opcode : OK
- replacing FLAG_QUEUE and FLAG_SKIP with USE_Q and USE_SKIPCOND
- doc/assembly.html : flags not complete
- change the MIN/MAX instructions to allow Imm/Reg (the Imm becomes the value to compare, not the value set)
- opcode MOV moved to group_ie (where it looks most similar because both SWP and SRC1orIMM)
- NEG alias : works, not nicely but just enough not to care.
! code examples for the conditional instructions are missing

2007-04-18
? Skip Length : should be incremented more, counting the skip instruction's length
  a compromise would be to increment by 2 instead of 1.
- ISM/* ok, ADDSx/SUBSx modified to account for the size of the skipping instruction
? generation of the ISM/ pages should be even more automated !
- unused bits in Imm16 are not taken into account in vsp_asm :-/
  * SHL group uses only 5 bits
  * SB uses only 8 bits

2007-04-26
? conditional instructions for semaphores ?
? Skip range reduced to either 1 or 2 HalfWords ?
? FIFO for the decoder, to remove instruction alignment issues ?
- I've read several "old" microprocessor manuals (70s era) and found the RCA1802 (think : Voyager probe) very interesting. You could think of the VSP's CQ register (2 bits) as a sort of P register as in the RCA1802 (4 bits). The use of skips (long and short) looks similar too :-) In fact it provides a lot of yummy coding techniques and hardware tricks ! So the VSP architecture does not only look like a mixture of the CDC6600's CPU and PPUs. Oh, and I didn't know that the Cray 2 had a specific "foreground" 32-bit CPU with 8 32-bit registers.
? shift operand and add ? (think ARM or x86's LEA)
? store the calling CQ somewhere ? in the calling CQ's pointer MSB ?
? serial boot : shift register in the instruction register ?

2007-05-13
- cleanup in the JS syntax (using some stuff from http://javascript.crockford.com/ )
? the X form could use numbers too ?
? creation of the tools/ directory ? (then move asm.html there)

###############################################################################################

2007-08-29 : from vspsim to yasep
- vspsim/vspsim.css -> yasep/yasep.css
- yasep/test/mksources/mksources.html created

2007-09-12 : "rebranding" VSP to YASEP
- /vspsim/index.html reviewed
- yasep/doc copied but not reviewed.
? I need a compatibility check with JS compilers
- more work on mksources, I'll have to write a tokenizer.

2007-11-05 : "tonton edition"
- doc, JScore, JSgui, ISM review : ok, some tests too
- adding some flags to pass test_opcodes
  ROL, ROR, SAR, SB, SHL and SHR fail because the assembler truncates the 16-bit immediate constant.
  * either the asm does not truncate (but still issues a warning)
  * or the test uses a new vector for these specific instructions [this is much better and safer for later]
  -> creation and support of the flags IMM16_5LSB and IMM16_8LSB
- test_opcodes.html : ok \o/

2007-11-06 :
- test/test_eu.html update, with the new instruction structure
? MIN/MAX with aliased form aRI ?
- implementation of MIN/MAX.action did not exist, it's ok now (ISM/MIN-MAX updated too)
- MIN/MAX renamed to SMIN/SMAX, which are signed comparisons. Creation of unsigned versions UMIN/UMAX
- the doc is getting a bit outdated (missing forms and flags) and too large, yasep.html and assembler.html must be split into files.

2007-11-07 :
- doc/ must be checked, changed file names must be propagated to other files
? missing : ASCII and string constants

2007-12-25 :
- some cleanup in the doc, ISM etc.
- added substitution of "MOV 0, Rx" with "CLR Rx"
- trying to modify the SKIP instructions, to allow -7 to +8 jumps
- SCQ modified, PFQ group populated
- the Q field could be better used, allowing rotation of the queues
- upgraded test_eu.html and dump_opcode_table.js to allow certain stuff to work if the code associated to an opcode is a real function.
- other inconsistencies in the asm and disasm

2008-01-05
- moved test/test_eu.html and test/test_opcodes.html to benches/

2008-07-30
- FF3 update (the floating asm window now uses negative coordinates when hidden because FF3 renders the frames differently)

2008-08-02
- renamed /doc to /docs to bypass a default Apache directory alias

2008-11-17
- added optional, 32b-only and 16b-only flags
- added YASEP/YASEP16/YASEP32 pseudo-instructions to the assembler
- updated docs/ and ISM/ a bit

2008-11-29
- added docs/yasep16-32.html and docs/SR.html
- docs/yasep.html needs a lot of (re)work !
- Wikipedia just made me discover the M32R architecture that has similar instructions, but it was designed before 1997...
- relaxed instruction alignment rules : "padding NOPs" are no longer necessary as 32-bit instructions are now 16-bit alignable.
- The 2-bit Q register is dropped.
  R0 = next instruction's address (NIP)
  R1 = "control register" (auto-inc/dec etc.)
  Jump to register, skip (-8/+8), abs. memory address are kept but changed a bit...
    0h: NIP ("Next IP", replaces A0)
    1h: ST ("Status", replaces D0)
    2h: A1 \ Q1
    3h: D1 /
    4h: A2 \ Q2
    5h: D2 /
    6h: A3 \ Q3
    7h: D3 /
    8h: A4 \ Q4
    9h: D4 /
    Ah: A5 \ Q5
    Bh: D5 /
    Ch: R0
    Dh: R1
    Eh: R2
    Fh: R3
- The ASU group contains these instructions :
    ADD ADDS1 ADDS2 ?
    SUB SUBS1 SUBS2 ?
  8 new instructions : compare and skip 1 or 2 half-words if (or if not) carry
    CMPS1 CMPS2 CMPNS1 CMPNS2
    CMPU1 CMPU2 CMPNU1 CMPNU2
    1 bit : distance
    1 bit : signed/unsigned
    1 bit : condition negation
- removed : RETI ADDS3 SUBS3
    group_queue (Q QZ QE QNZ QNE QO QNO QS QNS)
    group_pfq (CQ SCQ CIP)
    group_mov ==> SMIN, SMAX, UMIN, UMAX moved to MISC
- group_misc (BSWAP, EXPND, MATCH, BMASK) is deactivated
- the 16-bit code is not completely rewritten (group_ie.js, group_misc.js...)
- conditions "always" : comparison ?
- reduction of the number of conditional instruction types

2008-11-30
- missing : Jump&Link, trap
- all references to the D0 register must be changed to something else in the code examples.
- NIP is preferred to CIP because it makes "loopentry" as easy as MOV NIP dest and Jump&Link uses the same mechanism : it copies NIP to the desired register. It's a bit like an "exchange" instruction...
- issue with thread switch/restoration :
  * the "cached" value of the memory contents registers (D1 to D5) is not updated automatically when the new thread starts.
  * several reads must be performed to get the (N)IP and ST registers (well, with 2 read ports, NIP and ST are read in one cycle) and then the memory values (5 registers, 2 accesses can be done in 1 cycle, so 3 cycles are necessary)
- corrected the Opera-related "bug" of the floating asm window

2008-12-18
- created the FPGA/Actel directory, but it is not populated yet.
- benches/test_eu16.html ok ( group_ie.js completed in 16-bit version)
- update of docs/yasep.html
- "load linked" ? (à la MIPS, for atomic load/store)
- missing : ABS instruction !
- trouble with ST register in 16-bit configuration : not enough bits for all the pairs, so update is only for Read AND write combined ?
- could ST be put somewhere else ? and updated with a specific instruction ?
- dropping pointer auto-update temporarily.
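
Note on the compare-and-skip family from the 2008-11-29 entry (CMPS1 ... CMPNU2) : the three bits (distance, signed/unsigned, condition negation) can be modelled with the small JavaScript sketch below. This is only an illustration, not code from JScore. The names cmpSkip and toSigned16 are hypothetical, and it assumes that "carry" means the borrow of src1 - src2 (i.e. src1 < src2) on 16-bit values, which the notes above do not spell out.

```javascript
// Hypothetical helpers, not part of JScore.

// Reinterpret the low 16 bits of x as a signed value.
function toSigned16(x) {
  x &= 0xFFFF;
  return (x & 0x8000) ? x - 0x10000 : x;
}

// Returns the number of half-words to skip (0, 1 or 2).
//   opts.signed   : true for CMPSx/CMPNSx, false for CMPUx/CMPNUx
//   opts.negate   : true for the CMPNxx variants
//   opts.distance : 1 or 2 (the digit at the end of the mnemonic)
// ASSUMPTION: "carry" is the borrow of (src1 - src2), i.e. src1 < src2.
function cmpSkip(src1, src2, opts) {
  var a = opts.signed ? toSigned16(src1) : (src1 & 0xFFFF);
  var b = opts.signed ? toSigned16(src2) : (src2 & 0xFFFF);
  var carry = a < b;
  var taken = opts.negate ? !carry : carry;
  return taken ? opts.distance : 0;
}
```

With this model, 0xFFFF compared to 1 skips nothing in the unsigned variants but does skip in the signed ones, since 0xFFFF is then read as -1.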
2008-12-22
- todo : merge most JS files so there are fewer transactions when a page is loaded

2008-12-27
- another OPERA "bug" found, after_load() must contain the JS code that must be executed after the floating assembler has finished initialisation.
  ==> for the opcode map or the testbenches, the Floating Asm Window is not initialised so the page continues initialisation.
  ==> for the others, the FAW is used and finishes the page load, but OPERA does things in a different order. Files are correctly loaded but the variables are not initialised in the same order -> the opcode table is not declared... The solution is the use of after_load().

2008-12-28
- Just got this idea : renumber the registers so they are easier to group
  ==> "holes" (the memory windows) are gathered, they are easier to address and save and the register banks can be grouped as 3 arrays of 4 entries...
    0h: R0
    1h: R1
    2h: R2
    3h: R3
    4h: NIP ("Next IP")
    5h: MD
    6h: A1
    7h: A2
    8h: A3
    9h: A4
    Ah: A5
    Bh: D1
    Ch: D2
    Dh: D3
    Eh: D4
    Fh: D5
- Status should be renamed to "MODE" so it is less misleading.
- the 5 queues are a bit annoying, maybe 4 are enough ?
- some SRs are required for extended capabilities on Q2 & Q3 :
  - stride (value added to the A register when D is accessed)
  - limit register
  - base register
  ==> modulo addressing for circular buffers etc.
- also stack overflow and underflow registers are welcome for Q4 & Q5, with a pointer to a handling routine

2009-01-03
The 16-bit ALU is nearing completion in VHDL and many modifications will become more effective. Already, a 132MHz clock rate indicates that my conservative margins have worked beyond expectations. Also, 1/3 of the gates are pipeline gates, I'll try to do less aggressive separations of the datapaths. Yet the ALU takes around 760 tiles, I'll shave 15 or 30 at most.

Also, 4 units are ready : ASU, ROP2, IE and SHL. The other functions are less critical : multiply, optional operations, eventually a multicycle divide unit... through the SRs.
So the 4 groups of ALU operations deserve special treatment and a new form of instruction is created, with a 4-bit immediate value. Encoding becomes a bit more complex than the single bit that indicates a long or short instruction. Many instructions have 2 forms but the ALU instructions now have 3. With 2 bits, 1/4th of the opcodes are useless :

    bit 7 5
        0 0   OP-reg-reg=>reg
        0 1   OP-reg-im4=>reg
        1 0   OP-reg-reg-im16=>reg
        1 1   ???

The currently chosen solution is to create a reg-reg-reg form using 4 of the additional 16 bits but 12 bits remain unused :-/ This remains quite simple (the 3rd register operand becomes the destination register so the CDP in the beginning of decoding is not changed) and code density can be boosted, fewer temporary registers are needed.

Note : the 3 LSB of the MODE register become critical because they can be modified with a short (16-bit) instruction with imm4. The 4th bit is sign extended so bits 3 to 15 are less frequently used. The intended uses are synchronisation, critical sections, memory spinlocks (a la MIPS) ?

The new opcode map is now split into 2 parts :
* 4 groups of 8 ALU instructions that support RR, Rim4, RRim16 and RRR
  1 bit of the opcode is used for the 2 additional forms.
* 8 groups of 8 instructions, including 4 groups for the conditional jumps and the rest is reserved. They support only RR and RRimm16.

ROP2: AND/OR/XOR/ANDN/ORN/XNOR/NAND/NOR
ADD, SUB, ADDS1, SUBS1, ADDS2, SUBS2, MIN, MAX
SHL : SHR, SHL, ROR, ROL, SAR ==> 3 opcodes left : MUL
IE : MOV, SB, LSB, LZB (16 bits) and the 32 bits versions : SH, SHH, LSH, LZH

SMIN/SMAX (signed) are not there anymore. Comparison is left for the conditional groups, mirrored at some suitable address.

The conditional instructions are stripped down too : the queue-related group is removed. Condition also comprises both comparisons (through ASU) and equality (through ROP2 and OR-combine).
conditions : Zero, LSB, MSB, none, above, below, above|equal, below|equal
all conditions can be negated.
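
The eight conditions and the negation bit listed above could be evaluated along these lines. This is a hedged JavaScript sketch, not the JScore implementation : the names CONDITIONS and evalCondition are hypothetical, "none" is assumed to mean "always true", the compared values are assumed to be unsigned 16-bit, and which operand is actually tested is still an open question in these notes.

```javascript
// Hypothetical sketch, not taken from JScore.
// ASSUMPTIONS: "none" is always true; a and b are unsigned 16-bit values;
// the single-operand tests (zero, lsb, msb) look at the first operand.
var CONDITIONS = {
  zero:       function (a, b) { return a === 0; },
  lsb:        function (a, b) { return (a & 1) !== 0; },
  msb:        function (a, b) { return (a & 0x8000) !== 0; },
  none:       function (a, b) { return true; },
  above:      function (a, b) { return a > b; },
  below:      function (a, b) { return a < b; },
  aboveEqual: function (a, b) { return a >= b; },
  belowEqual: function (a, b) { return a <= b; }
};

// Evaluates one condition, then applies the negation bit.
function evalCondition(name, negate, a, b) {
  var test = CONDITIONS[name];
  if (!test) {
    throw new Error("unknown condition: " + name);
  }
  var result = test(a & 0xFFFF, b & 0xFFFF);
  return negate ? !result : result;
}
```

Keeping the conditions in a table (instead of nested if/else) means the negation is applied in exactly one place, whatever the condition.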
Call => jump & link

Memory mapping : add a flag in the translation table to differentiate between code and data => as a 17th address bit. RWX attributes can more or less be emulated this way.

Make sure that the SR-mapped multiply operations can be "restarted". Current solution : an SR register holding the result. Both operands are given by a xMULx instruction. Reading and writing back the SR_MUL_RESULT should preserve the state of an interrupted routine. A "modified" bit could be useful ?

note : AVR32 has ... 58 instruction forms :-///

2009-01-06
As the ALU stabilizes, other questions arise. The multiply unit is not yet certain but I have decided, at least for YASEP16, to make one exception to the simplicity rule, in order
  1) to keep the gate count low
  2) to keep the SW simple.
The cost is a more complex datapath, as an additional cycle is dynamically inserted in front of the ASU. Another cost is the reservation of 2 SRAM blocks. The cost in logic tiles is then reduced to two 8-bit adders.

In fact I use the 2 SRAM blocks to break down the early stages of the computations, it takes roughly 3ns to output 4 8-bit partial results. Two pairs are then combined by the 8-bit adders, and the 2 12-bit results are combined by the existing ASU16 logic in the normal pipeline.

In terms of instructions, at least 2 are necessary :
- 1 instruction initialises the SRAM blocks (a special iterative routine must be executed when the CPU starts)
- 1 instruction for multiply itself.
There are only 3 instructions left in the ALU sub-group, along with the SHL instructions. A 3rd instruction could be created, which multiplies another part of the input word, though this would add another MUX in the critical datapath (it's still OK).

Later, when YASEP matures, these instructions will be replaced with MUL8, MUL16 and MUL32, or something like that. I'm not there right now so I just try to get to the point and have something done.
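
One possible reading of the partial-product scheme above, as a JavaScript model : the SRAM blocks are assumed to hold a 256-entry table of 4-bit x 4-bit products, the four lookups give the 4 8-bit partial results, each pair is merged into a 12-bit value, and the last addition stands in for the ASU16. The table layout and the names NIBBLE_PRODUCTS, lut and mul8x8 are assumptions made for this sketch, they are not taken from the VHDL or from JScore.

```javascript
// Hypothetical model, not from JScore or the VHDL.
// ASSUMPTION: the SRAM blocks act as a lookup table of 4x4-bit products,
// filled once by an iterative routine when the CPU starts.
var NIBBLE_PRODUCTS = [];
for (var i = 0; i < 256; i++) {
  NIBBLE_PRODUCTS[i] = (i >> 4) * (i & 15); // 8-bit result, max 15*15 = 225
}

// One table lookup: product of two 4-bit values.
function lut(x, y) {
  return NIBBLE_PRODUCTS[(x << 4) | y];
}

// 8-bit x 8-bit => 16-bit multiply built from four table lookups.
function mul8x8(a, b) {
  var al = a & 15, ah = (a >> 4) & 15;
  var bl = b & 15, bh = (b >> 4) & 15;
  // the 4 8-bit partial results
  var p0 = lut(al, bl), p1 = lut(ah, bl);
  var p2 = lut(al, bh), p3 = lut(ah, bh);
  // two pairs combined into 12-bit values: a*bl and a*bh
  var low  = p0 + (p1 << 4);
  var high = p2 + (p3 << 4);
  // final combination (the job of the ASU16 in the notes above)
  return (low + (high << 4)) & 0xFFFF;
}
```

Under these assumptions mul8x8(a, b) equals a*b for every pair of 8-bit inputs, which a simulator testbench can check exhaustively.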
Remember that YASEP is a moving target :-)

ABS Rx : 4 bytes, 1 cycle
  Rx = 0-Rx if MSB(Rx)=1 (don't forget to detect ABS(0x8000) !)

Note : the "RRR" instruction format could be extended with a conditional code ? (using "shadow properties")
==> bit 16 of the 18-bit register set can store Zero ? and what about bit 17 ?

Remaining bits of the RRR form should decode to NOP
NOP = MOV R0 R0
Hence
    0h: R0
    1h: R1
    2h: R2
    3h: R3
    ...
==> R0 = 0

Also, why not 6 or 8 "normal" registers ?
    4 "queues" (incl. stacks) => 8 regs,
    IP/NIP => 1 reg
    Mode : can be removed ? (1 LSB remains in the IP)
    ==> 7 registers are free

To do @ JScore :
- change the registers
- add/update the forms and options
- add support for the new extended forms
- add MUL8
-- the skip length variable and the skip enable flag must be added for the simulator (see JScore/group_asu.js)
-- LSB of the (N)IP == skip/accept next instruction
-- jump & link...
-- ROP2's opcode order will change
-- assignment of SRAM blocks :
   * 2 (or 3?) for the register set
   * 2 for multiply
   * 2 (or 3) for the address remapping lookup
   ==> 1 or 2 left on A3P250, not enough for cache :-/
-- Max nr of thread context cache in SRAM : 8 ?
-- after reset, and during trap/exception mode : address translation is off
-- address remap LUT has only 1 port so need to read back the value --> added MUX in the datapath
- opcode_map.html : add an option for Y16/Y32 only
- How to fill the condition field in the RRR form :
    reg nr + bit nr : 4 + 5 bits => 9 bits
    + negation => 10 bits
    + zero => 11 bits
    skip long/short : 1 bit
    condition on/off : ?
  ==> CMOV is removed
-- option for imm4 with the RRR form so conditional code can use a small displacement
   / Since NIP is register-accessible, JMP can be removed !
   \ SKIP can be removed too if RI4R with condition is possible !
   but jmp and skip have short versions \o/
The 4 forms could be extended to the rest of the opcode map ?
difference between JMP and SKIP : SKIP is a short relative (forward) jump, JMP is an absolute jump
JMP and SKIP can compare 2 values and use the full 2 operands to enable/avoid the program control change. However a field is still needed for the register source/skip amount, and comparison with ... immediate ? And what about Jump And Link ? There are only 32 opcodes for these :-/
JMP unconditional with negative condition == Jump & Link \o/

Conditional form : 2 register banks, the 2nd is fed with the cond. address a cycle later
=> condition computation takes 1 sub-cycle (2 max) while normal computation proceeds in parallel in 3 subcycles

* Register map :
    R0
    R1
    R2
    R3
    R4
    R5
    R6 (mode ?)
    IP/NIP
    D0 \           A0
    A0 /           A1
    D1 \           A2
    A1 /           A3
    D2 \ stack2    D0
    A2 /           D1
    D3 \ stack     D2
    A3 /           D3

"MODE" and other registers : as "shadow" of Dx during context swap
* Dx is written both to the register set and memory
* Dx is read back when context is restored, from Ax
* Mode is written to memory when context is switched
* Mode is read back but sent to a "shadow" register

2009-01-07
Conditional word of RRR form :
    4 : Rdest \ same places as other register fields
    4 : tdest / (as it is piped one cycle later)
    1 : imm4/reg on Rsrc of the previous half-word
    2 : Condition
          Always
          Zero
          LSB
          MSB
    1 : negate condition (not always : ?)
  ==> same place as with other conditional instructions

Multiply : needs a scratch register / accumulator.
8x16 multiply : R0 x R1 => R2-R3 (R4 = scratch)
YASEP16 : 6 instructions
    ; 2nd half
    MUL8H R0 R1 R2
    SHR 8 R3        ; adjust between R2 and R3
    SHL 8 R2
    ; first half
    MUL8L R0 R1 R4
    ADDS1 R4 R2
    ADD 1 R4

16x16 Multiply : R0 x R1 => R2-R3 (R4=scratch)
YASEP16 : 14 instructions
    MOV 0 R3
    ; middle bytes
    MUL8H R0 R1 R2
    MUL8H R1 R0 R4  ; Notice the exchange of operands
    ADDS2 R4 R2
    MOV 100h R3     ; pre-carry
    SHR R2 8 R4     ; adjust between R2 and R4,
    SHL 8 R2
    OR R4 R3        ; and put the carry back into R3
    ; lower byte
    MUL8L R1 R0 R4
    ADDS1 R4 R2
    ADD 1 R3
    ; higher byte
    ROL 8 R1
    MUL8H R1 R0 R4
    ADD R4 R3

With YASEP32, the larger registers make the carry etc. less troublesome. But YASEP32 will probably include a fused 16x16 multiplier...... and YASEP16 definitely needs MUL8H, which only adds a MUX to one LUT input.

2009-01-18
- Enhancing the inline assembly : the contents of the URL are regenerated from the contents of the text (to prevent copy/paste inconsistencies), and assembly and disassembly are separated a bit (different code and color)
- A new function is created in JSgui/link_opcodes.js, which further reduces writing efforts and size.
- name of the 4 forms :
  - short register (normal)
  - short immediate (new)
  - long immediate (normal)
  - long register, long register conditional (new)

2009-01-23
- A new interactive assembler interface is appearing in yasep/test/listed.
- a new directory appears : tools
  /docs/asm.html and /docs/opcode_map.html moved there.
  ==> all links to the assembler should be of class "asm", and to the opcode map : "opcode" (this reduces the risk of inconsistencies and there is less work to do if something moves or changes again)
- from now on, every tool or aspect should be designed in such a way that they can be integrated in a