EHSM2012
 
 
Overview of the YASEP
Toward the YASEP2013 milestone
 

 
 

Yann Guidon
2012/12/29


Who ?

Yann Guidon (F-CPU, YGDES...)

and friends : Pierre Pronchery (DEFORA),
Laura (YGLLO), Troy (q3u.be) ...

And you ?


 

What?

 


 

When?

 

Why?

 


Threats

 

« There can be no Free Software without Free Hardware »

In 2012, can you buy any new computer, install F/LOSS on it and/or keep it safe and under total control ?

  • "secure boot", TCPA, UEFI : Mandated by MS to run Win8
    => MS owns your computer
     
  • "Intel Active Management Technology (AMT) is hardware-based technology for remotely managing and securing PCs out-of-band."
    Hackers and governments will 0wn your PC too, even when it's "off".

  • Patents and «innovation»

    MIPS : 82 "essential" patents
    (plus 498 sold to Allied Security Trust, "non practicing entity")

    Intel, IBM, SUN/Oracle...

    ARM : http://whitequark.org/blog/2012/09/25/why-raspberry-pi-is-unsuitable-for-education/

    Smartphone : "There are over 250,000 patents and 5 million claims at play inside your pocket."

     

    Can a microprocessor be successful and not encumbered by patents ? SPARC : LEON & OpenSPARC


    How ?

     

    ==> Snowball effect : each new little tool validates the whole system
    so it's « quite coherent »

     

    Yet Another Small Embedded Processor

     

  • Original
  • Libre (Affero GPL)
  • Somewhat in contradiction with F-CPU
  • Highly configurable
  • Domain: real-time control, 5 to 50 MIPS
  • Lightweight Web2.5 Framework
      JavaScript tools integrated in a window manager
    YASEP2011

     

    As of 2012/12 :

  • 10 years: the ISA is now quite stable
  • Official celebration at JMLL
  • The microYASEP is coded in both 16-bit and 32-bit versions
  • The microYASEP simulates in JavaScript and VHDL
  • First run in an FPGA : March 2012
  • The framebuffer (graphic output) works in JavaScript and VHDL too (but VHDL only on 32-bit Linux)
  • YGWM runs almost everywhere
  • Raspberry Pi : the JavaScript simulator almost runs under the webkit-based Midori browser

    YASEP2013

     

    What's expected :

  • Reach the potential users
      (end of "submarine mode": presentations and workshops)
  • Yearly freezes (instead of 2-year cycles)
  • better FPGA implementations
  • GNL, compiler
  • more tutorials, translations and features
  • Browser-based In Circuit Programming and Debug (ICP/D)

    And a working task tracker...


    DeforaOS's assembler supports:
  • amd64
  • arm (LE, BE)
  • dalvik (Android)
  • i386 and compatible
  • java (bytecode)
  • mips (LE, BE)
  • sparc
  • sparc64
  • yasep (16-bit and 32-bit since 2011-11-28)

     
    Microkernel based on BSD?



    The YASEP's programming model :
     


    Most of the 46 opcodes can use conditions or an immediate field
    Opcode map (rows: Group, columns: Function 00h 20h 40h 60h 80h A0h C0h E0h)

    Group     Function  Opcode      Forms                 Flags
    00h CTL   00h       NOP   00h   ALONE                 !Cnd !WR !snd !si4 !I16
              20h       CRIT  20h   i                     !Cnd Opt !WR !snd !si4 !I16
              40h       GET   40h   RR,iR,IR              sIR !snd I20
              60h       PUT   60h   RR,Ri,RI              !Cnd !WR
              80h       IN    80h   RR,Ri,RI              PRE
              A0h       OUT   A0h   RR,Ri,RI              !Cnd !WR PRE
              C0h       CALL  C0h   RR,iR,IR              sIR !snd I20
              E0h       CALL2 E0h   IRR,RRR,iRR           PRE
    04h SMT   00h       HALT  04h   ALONE                 !Cnd !WR !snd
              20h       IPC   24h   RR,Ri,RI              !Cnd !WR PRE
              40h       IPE   44h   RR,Ri,RI              !Cnd !WR PRE
              60h       IPR   64h   RR,Ri,RI              !Cnd !WR PRE
    08h IE    00h       MOV   08h   RR,iR,IR              sIR !snd I20
              20h       IB    28h   RRR,iRR               rD3
              40h       EZB   48h   RRR,iRR
              60h       ESB   68h   RRR,iRR
              80h       MOVH  88h   RR,iR,IR              sIR Y32 !snd
              A0h       IH    A8h   RRR,iRR               Y32 C rD3
              C0h       EZH   C8h   RRR,iRR               Y32 C
              E0h       ESH   E8h   RRR,iRR               Y32 C
    0Ch SHL   00h       SHR   0Ch   RR,iR,IRR,IR,RRR,iRR
              20h       SAR   2Ch   RR,iR,IRR,IR,RRR,iRR
              40h       SHL   4Ch   RR,iR,IRR,IR,RRR,iRR
              60h       ROR   6Ch   RR,iR,IRR,IR,RRR,iRR
              80h       ROL   8Ch   RR,iR,IRR,IR,RRR,iRR
    10h ROP2  00h       AND   10h   RR,iR,IRR,IR,RRR,iRR  aIRR
              20h       ANDN  30h   RR,iR,IRR,IR,RRR,iRR  aIRR
              40h       NAND  50h   RR,iR,IRR,IR,RRR,iRR  aIRR
              60h       OR    70h   RR,iR,IRR,IR,RRR,iRR  aIRR
              80h       ORN   90h   RR,iR,IRR,IR,RRR,iRR  aIRR
              A0h       NOR   B0h   RR,iR,IRR,IR,RRR,iRR  aIRR
              C0h       XOR   D0h   RR,iR,IRR,IR,RRR,iRR  aIRR
              E0h       XORN  F0h   RR,iR,IRR,IR,RRR,iRR  aIRR
    14h ASU   00h       ADD   14h   RR,iR,IRR,IR,RRR,iRR  aIRR C
              20h       SUB   34h   RR,iR,IRR,IR,RRR,iRR  aIRR C
              40h       CMPU  54h   RR,iR,IR              !WR C Z
              60h       CMPS  74h   RR,iR,IR              !WR C Z
              80h       UMIN  94h   RR,iR,IRR,IR,RRR,iRR  aIRR aWR
              A0h       UMAX  B4h   RR,iR,IRR,IR,RRR,iRR  aIRR aWR
              C0h       SMIN  D4h   RR,iR,IRR,IR,RRR,iRR  aIRR aWR
              E0h       SMAX  F4h   RR,iR,IRR,IR,RRR,iRR  aIRR aWR
    18h MUL   00h       MUL8L 18h   RR,iR,IRR,IR,RRR,iRR  aIRR Opt
              20h       MUL8H 38h   RR,iR,IRR,IR,RRR,iRR  aIRR Opt
              40h       MULI  58h   RR,iR,IR              !Cnd Opt !WR
              60h       LUT8  78h   RR,iR,IRR,IR,RRR,iRR  aIRR PRE
    1Ch RSV   E0h       INV   FCh   ALONE                 !Cnd !WR !snd !si4 !I16
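
    The opcode map can be read as "opcode = group + function". Here is a minimal Python sketch of that reading, assuming the group is carried by the bits masked with 1Ch and the function by the bits masked with E0h (this is inferred from the values listed in the map, not taken from the YASEP sources). The names GROUPS, MNEMONICS, OPCODES, split_opcode and describe are made up for the illustration.

```python
# Sketch only: the masks 0x1C and 0xE0 are inferred from the opcode map.
GROUPS = {0x00: "CTL", 0x04: "SMT", 0x08: "IE", 0x0C: "SHL",
          0x10: "ROP2", 0x14: "ASU", 0x18: "MUL", 0x1C: "RSV"}

# Mnemonics of each group, in function order 00h, 20h, 40h...
# None marks a slot that the map leaves empty.
MNEMONICS = {
    0x00: ["NOP", "CRIT", "GET", "PUT", "IN", "OUT", "CALL", "CALL2"],
    0x04: ["HALT", "IPC", "IPE", "IPR"],
    0x08: ["MOV", "IB", "EZB", "ESB", "MOVH", "IH", "EZH", "ESH"],
    0x0C: ["SHR", "SAR", "SHL", "ROR", "ROL"],
    0x10: ["AND", "ANDN", "NAND", "OR", "ORN", "NOR", "XOR", "XORN"],
    0x14: ["ADD", "SUB", "CMPU", "CMPS", "UMIN", "UMAX", "SMIN", "SMAX"],
    0x18: ["MUL8L", "MUL8H", "MULI", "LUT8"],
    0x1C: [None] * 7 + ["INV"],
}

# Flat lookup: opcode value -> mnemonic.
OPCODES = {
    group + 0x20 * index: name
    for group, names in MNEMONICS.items()
    for index, name in enumerate(names)
    if name is not None
}


def split_opcode(opcode):
    """Return (group, function) for an opcode value taken from the map."""
    return opcode & 0x1C, opcode & 0xE0


def describe(opcode):
    """Human-readable location of an opcode in the map."""
    group, function = split_opcode(opcode)
    return (f"{OPCODES[opcode]}: group {group:02X}h ({GROUPS[group]}), "
            f"function {function:02X}h")
```

    For example, describe(0x34) places SUB in group 14h (ASU) at function 20h, and the lookup holds the 46 opcodes of the map.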

    Instruction encoding :

     


    Short instructions : 2 forms

    (The destination register is always written last.)

    ADD R2,R1
    R1 <= R1 + R2
    Bits 15..0, fields listed from MSB to LSB:
      field:  src/Imm4   src/dest   opcode   Reg    short
      value:  R2         R1         ADD      0      0

    ADD [-8..7],R1
    R1 <= R1 + imm4
    Bits 15..0, fields listed from MSB to LSB:
      field:  src/Imm4   src/dest   opcode   Imm4   short
      value:  2          R1         ADD      1      0

    Legend:
  • SND : Source (Negated) and Destination
  • SI4 : Source or Immediate (4 bits)
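
    The two short forms can be packed into a 16-bit word as sketched below in Python. The bit positions are an assumption inferred from the field order and sizes (src/Imm4 in bits 15..12, src/dest in bits 11..8, the opcode value ORed into bits 7..2, the Reg/Imm4 flag in bit 1, the short/long flag in bit 0); check them against the YASEP reference before relying on them. The function names are hypothetical. Register numbers follow the register set (R1 = #1, R2 = #2).

```python
# Sketch only: field positions are assumed, see the lead-in.
OPCODE_ADD = 0x14  # value listed for ADD in the opcode map


def encode_short_reg(opcode, src, dest):
    """Short form 'OP Rsrc,Rdest': both flag bits are 0."""
    if not (0 <= src <= 15 and 0 <= dest <= 15):
        raise ValueError("register numbers must be in 0..15")
    return (src << 12) | (dest << 8) | opcode | 0b00


def encode_short_imm4(opcode, imm4, dest):
    """Short form 'OP imm4,Rdest': bit 1 (Imm4) is set, bit 0 stays 0."""
    if not -8 <= imm4 <= 7:
        raise ValueError("imm4 must be in -8..7")
    if not 0 <= dest <= 15:
        raise ValueError("register number must be in 0..15")
    # The immediate is stored as a 4-bit two's complement value.
    return ((imm4 & 0xF) << 12) | (dest << 8) | opcode | 0b10
```

    Under these assumptions, ADD R2,R1 gives 2114h and ADD 2,R1 gives 2116h.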

    Long instructions

    ADD R2,[-32768..32767],R1
    R1 <= R2 + Imm16
    Bits 31..0, fields listed from MSB to LSB:
      field:  Imm16   src/Imm4   src/dest   opcode   Imm16   long
      value:  12345   R2         R1         ADD      0       1
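
    A possible packing of the long Imm16 form, sketched in Python. It assumes that the low half-word keeps the short-form layout (src in bits 15..12, dest in bits 11..8, opcode value ORed into bits 7..2), that bit 1 = 0 and bit 0 = 1 select the Imm16 form, and that the immediate sits in bits 31..16. These positions are inferred from the field order, and encode_long_imm16 is a hypothetical name.

```python
# Sketch only: field positions are assumed, see the lead-in.
OPCODE_ADD = 0x14  # value listed for ADD in the opcode map


def encode_long_imm16(opcode, src, imm16, dest):
    """Long form 'OP Rsrc,imm16,Rdest' packed into one 32-bit word."""
    if not -32768 <= imm16 <= 32767:
        raise ValueError("imm16 must be in -32768..32767")
    if not (0 <= src <= 15 and 0 <= dest <= 15):
        raise ValueError("register numbers must be in 0..15")
    low = (src << 12) | (dest << 8) | opcode | 0b01  # Imm16 flag = 0, long = 1
    # The immediate is stored as a 16-bit two's complement value.
    return ((imm16 & 0xFFFF) << 16) | low
```

    With these assumptions, ADD R2,12345,R1 becomes 30392115h (12345 = 3039h in the upper half-word).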

     

    Extended instructions

    ADD R1,R2,R3   ;   R3 <= R1 + R2
    Bits 31..0, fields listed from MSB to LSB:
      field:  dest3   Reg   src/Imm4   src/dest   opcode   Ext   long
      value:  R3      0     R2         R1         ADD      1     1

    ADD R1,[-8..7],R3   ;   R3 <= R1 + Imm4
    Bits 31..0, fields listed from MSB to LSB:
      field:  dest3   Reg   src/Imm4   src/dest   opcode   Ext   long
      value:  R3      1     5          R1         ADD      1     1

    Predicates (conditional execution) :
    ADD R1,R2,R3 LSB0 R4
    If R4 is even then R3 <= R1 + R2
    Bits 31..0, fields listed from MSB to LSB:
      field:  src cond   dest3   cond   Reg   src/Imm4   src/dest   opcode   Ext   long
      value:  R4         R3      LSB0   0     R2         R1         ADD      1     1

    Conditions :
      NZ (register is Not Zero)    Z (register is Zero)
      Bit set R1                   Bit clear R1
      LSB1 (odd)                   LSB0 (even)
      MSB1 (negative)              MSB0 (positive)
    If SRC4=PC :
      Always                       Never
      Carry                        NoCarry
      Zero                         Not zero
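
    The register-based conditions can be modelled in a few lines of Python. This is only a behavioural sketch: the 16-bit default width, the function names and the way the predicate gates the write are assumptions for the illustration, and the bit set/clear and SRC4=PC cases are left out.

```python
# Sketch only: models the NZ/Z, LSB and MSB conditions on one register value.
def check_condition(name, value, width=16):
    """Evaluate a register condition on a value of the given width."""
    value &= (1 << width) - 1
    msb = (value >> (width - 1)) & 1
    tests = {
        "NZ": value != 0,
        "Z": value == 0,
        "LSB1": value & 1 == 1,  # odd
        "LSB0": value & 1 == 0,  # even
        "MSB1": msb == 1,        # negative in two's complement
        "MSB0": msb == 0,
    }
    return tests[name]


def predicated_add(r1, r2, r3, cond, cond_reg, width=16):
    """Model 'ADD R1,R2,R3 cond Rc': return the new value of R3."""
    if check_condition(cond, cond_reg, width):
        return (r1 + r2) & ((1 << width) - 1)
    return r3  # condition false: R3 is left unchanged
```

    For instance, predicated_add(1, 2, 0, "LSB0", 4) performs the addition because 4 is even, while the same call with a condition register of 5 leaves R3 untouched.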

    Register Auto-update

     
    ADD R1+ R2,R3
    R3 <= R1 + R2, R1 <= R1 + 1
    Bits 31..0, fields listed from MSB to LSB:
      field:  dest3   update   Reg   u   src/Imm4   src/dest   opcode   Ext   long
      value:  R3               0         R2         R1         ADD      1     1

    00 no update
    01 post-increment
    10 pre-decrement
    11 post-increment
     
    4 bits are available to
    modify src/dest and/or dest3

     
    To be defined in 2013

    The memory is accessed in 2 phases through 5 register pairs: A1-D1, A2-D2, A3-D3, A4-D4, A5-D5 (« Point and shoot »)

    Better performance, scalability and security than Load/store CPUs

    Read :
      mov 1234h A1 ; point to address 1234h
      mov D1 R1    ; copy the memory contents into R1 
    

    Write :

      mov 1234h A1 ; point to address 1234h
      mov R1 D1    ; write R1 to memory
    
  • IB and IH emulate "STORE BYTE" and "STORE HALF-WORD"
  • EZB, ESB, EZH and ESH emulate "LOAD BYTE" and "LOAD HALF-WORD"

  • Out-of-word half-word accesses set the carry flag.
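
    The « Point and shoot » model can be mimicked with a small Python class. This is a behavioural sketch under simplifying assumptions: a flat word-addressed memory of 65536 words, 16-bit data, no buffering and no byte or half-word handling. The class and method names are made up for the illustration.

```python
# Sketch only: each Ax register points, the matching Dx register reads or writes.
class PointAndShootMemory:
    def __init__(self, size=65536):
        self.mem = [0] * size
        # Address registers A1..A5, all pointing at address 0 initially.
        self.address = {n: 0 for n in range(1, 6)}

    def set_a(self, n, addr):
        """mov addr An : point pair n to an address."""
        self.address[n] = addr % len(self.mem)

    def get_d(self, n):
        """mov Dn Rx : read the word that An points to."""
        return self.mem[self.address[n]]

    def set_d(self, n, value):
        """mov Rx Dn : write a word where An points to."""
        self.mem[self.address[n]] = value & 0xFFFF


# Write through pair 1, read back through pair 2.
memory = PointAndShootMemory()
memory.set_a(1, 0x1234)   # mov 1234h A1
memory.set_d(1, 42)       # mov R1 D1, with R1 = 42
memory.set_a(2, 0x1234)   # mov 1234h A2
value_read = memory.get_d(2)
```

    Two pairs pointing to the same address see the same word, which is the aliasing case that a buffered implementation has to detect.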

     

    The microYASEP :
     


    Some cheap commercially available platforms:

    General diagram of the microYASEP :

    In the beginning, the YASEP was "VSP", a basic RISC core.
     

     
    The L0 cache (SDRAM line buffer) was merged with the register set
    => "no load/store" architecture.

    The register set :

    #0:  PC    1 PC
    #1:  R1    5 normal registers
    #2:  R2
    #3:  R3
    #4:  R4
    #5:  R5
    #6:  D5    5 data registers (D1 to D5): directly access the buffers
    #7:  A5    5 address registers (A1 to A5): control the buffers
    #8:  D4
    #9:  A4
    #10: D3
    #11: A3
    #12: D2
    #13: A2
    #14: D1
    #15: A1

    From the VSP to the YASEP :

    Uses SRAM instead of SDRAM,
    no buffer or aliasing detection,
    much smaller to fit in cheap FPGAs

    Destination Register


    microYASEP
     

    The single register set is read during the two phases :