A short and practical introduction to FPGA, Verilog and Verilator, and a few words about SystemVerilog

→ French version here

Warning: I have been learning Verilog, Verilator and FPGAs for less than a month, as a hobby. If there are mistakes, I will correct them. You can contact me on the Fediverse.

Table of Contents


* Introduction
** Real world full process
** Things to know and understand
** How to code a FPGA
* Verilog
** Values
** Types
** Gates
** Modules
** Simple example, writing an “and” gate
** Initial and always blocks
* Simple example with Verilator
** Make the test with Verilator
** Tracing example and GTKWave
** About Verilator examples
** Basic practical example with Verilator
* Further reading

Text in bold is there to help with skim reading.

Introduction

We continue going down to the lower layers, this time with the world of FPGAs (Field-Programmable Gate Arrays). An FPGA is a reprogrammable electronic development tool used to build and test processors and other digital circuits. Once a design is validated on an FPGA, you can start to build an ASIC (application-specific integrated circuit), the actual hard-wired chips that we use every day in our computing devices. FPGAs are also used as is in several industrial applications (avionics, audio or video processing, etc.) for their parallelism, which makes them faster in these cases than software running on a general-purpose processor, and for the ability to update them easily if a problem is found. This post is a short introduction to FPGAs, to the popular IEEE-standardised HDL (hardware description language) Verilog, and to testing it with the free and open source software (FOSS) simulator Verilator. If you prefer VHDL, GHDL is a FOSS simulator for VHDL.

Real world full process

The steps to implement a circuit on an FPGA, once it is designed, are (open source software is given as a reference for each step):
* Implementation of the logic in an HDL (hardware description language), Verilog here. Any text or code editor can be used (I use Vim and Neovim, and sometimes Geany, a light but powerful GUI IDE).
* Simulation: Verilator here. Writing a testbench is already a good step in verification too.
* Formal verification: Yosys is an open source verification and synthesis system, so it goes further into formal verification than Verilator. We will not go through this point here.
* Synthesis, placement and routing: building the logic blocks from the HDL script, placing them and routing between them under physical and timing constraints (see the basic principle and today's implementations), then outputting a bitstream in the FPGA format. Yosys again.
* Flashing: openFPGALoader does the job on most FPGAs.
* Real world testing on FPGA.

We will only test on a simulator here, to understand the basics. When you are comfortable with this, you can go further, choose an FPGA, and start working with it.

Things to know and understand

The main concepts to understand and be able to manipulate to build a basic integrated circuit are:
* 0, 1: electricity flowing through a wire or not, which can be interpreted as the states 1 and 0, true and false, etc.
* Boolean operators (logic gates), the main tool to manipulate them. You don’t need other notions of electronics. The most basic ones are OR, AND, NOT, and XOR (exclusive OR). You can build all these operators from NAND gates (AND+NOT), but this is not the most efficient way to do it. If you master the algebraic properties of booleans and know how to simplify expressions, you will be able to produce simpler and more efficient circuits, but this is not required to design circuits.
* ALU (arithmetic logic unit), allowing simple operations such as additions.
* Flip-flop: a special circuit used to memorize information and change it depending on its input and a clock edge.
* Latch: a sequential logic circuit that memorizes information too, but does not depend on a clock edge.
* Clock: operations are time dependent and need to be synchronized to behave coherently. The clock is like the drummer in a band: it gives the rhythm that every time-related construction relies on, and the other instruments, singers and dancers follow its time signals, called clock signals here.
* Multiplexers (MUX), which select one of several input lines and forward it to a single output, and demultiplexers, which do the opposite.
These functions are all integrated in each logic block of an FPGA.
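
To illustrate the remark about NAND gates, here is a small sketch using the Verilog syntax introduced later in this post. The module and signal names (nand_gates, nab, etc.) are my own choices for illustration; every output is built from NAND operations only.

```verilog
// Hypothetical module: NOT, AND, OR and XOR built only from NAND.
module nand_gates (input  wire a, input  wire b,
                   output wire not_a, output wire and_ab,
                   output wire or_ab, output wire xor_ab);
  wire nab = ~(a & b);                         // the basic NAND of a and b

  assign not_a  = ~(a & a);                    // NOT a = a NAND a
  assign and_ab = ~(nab & nab);                // AND   = NOT (a NAND b)
  assign or_ab  = ~(~(a & a) & ~(b & b));      // OR    = (NOT a) NAND (NOT b)
  assign xor_ab = ~(~(a & nab) & ~(b & nab));  // XOR from four NANDs
endmodule
```

As said above, this is not the most efficient way to describe these gates; in practice you simply write the operator and let the synthesis tool choose the implementation.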

Other important parts of an FPGA are:
* One or a few PLLs (phase-locked loops), which provide secondary clocks ticking at multiples of the main clock frequency, or synchronized to external clocks.
* A flash memory that keeps the bitstream on the board after shutdown; it is loaded automatically at start.
* Lots of peripheral I/O blocks to communicate with the outside world.

How to code a FPGA

There are three main methods for developing FPGA circuits, and they can be mixed:
* Drawing them with a graphical tool representing the logic gates and the other basic components mentioned above (flip-flop, latch, MUX, etc.).
* Using an HDL (hardware description language); Verilog and VHDL are the most famous. Compilers build everything behind these languages, just as assemblers transform human-readable assembly code into machine code, or as compilers and interpreters convert higher level languages (C/C++, Python, Lua, JavaScript, FORTRAN, BASIC, Pascal, Caml, LISP, etc.) into machine code. There are also higher level hardware description languages, like Chisel.
* Using OpenCL (Open Computing Language), designed mainly for intensive computing that can be spread simultaneously over GPGPUs and CPUs.

When using an HDL, tools automatically compute the routing for you. Tools like VTR (Verilog To Routing) and NextPNR optimize the placement for you.

SymbiYosys is a verification tool.

openFPGALoader is FOSS used to program/flash/fuse the FPGA itself.

You can do very interesting things after only a few hours of studying an HDL. It lets you design simple circuits with code that is close to some higher level languages, with variables, constants, registers, conditional tests, basic boolean transformations, additions and bit manipulations. A register is itself already a circuit made of several basic logic gates. Registers are generally hard-wired (as in an ASIC) inside the FPGA, to speed up the circuit and reduce power usage.

Verilog

So we choose Verilog here (official IEEE Std 1364-2001 specification). It is widespread, with lots of tools. VHDL is also very famous, and it is always interesting to look at alternatives, which may have interesting aspects. I chose Verilog as a starting language because it is used in a few projects and examples I am interested in, and because all the FOSS tools needed to write, simulate and verify are available. There are also several light RISC-V implementations for embedded use: SERV and PicoRV32 (GitHub), including an implementation for the Sipeed Lichee Tang (a 20€ FPGA board). I actually chose a 12€ Sipeed Tang Nano 4K as my first board; the Tang Nano (without 4K) costs 3€ but only has 1000 logic blocks. There are also the very convenient and efficient Verilator simulator and the VTR (Verilog To Routing) route optimizer.

Values

By default, numbers are integers. It is possible to force the number of bits (written before the single quote ' character) and to choose signed (s character, signed meaning able to carry a minus sign) or unsigned (the default). Values can be written in binary (b), octal (o), hexadecimal (h), or explicitly in decimal (d, the default base). Data can also be given as character strings.

32'hFFFF00FF  // 32 bits unsigned hexadecimal opaque yellow (full red+green+alpha) RGBA
8'b00101010   // 42 (8bit unsigned binary)
8'b0010_1010  // 42 (8bit unsigned binary) easier to read version
8'sb1010_1010 // -86 (8bit signed binary)
-8'sd42       // -42 (8bit signed decimal)
12'o0754      // -rwxr-xr-- rights (12 bits unsigned octal) on Unix compatible filesystems.
"Hello"       // a character string

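Constants can be defined as macros with the `define directive, which starts with a backtick (`), not a single quote. The definition takes no trailing semicolon, and the macro is used later as `myvalue:

`define myvalue -8'sd42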
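
As a minimal sketch of how such literals and a macro can be tried out in a simulator, assuming the backtick form of the directive: the macro name ANSWER and the module name literals_demo are only illustrative.

```verilog
`define ANSWER 8'd42   // no trailing semicolon: the text is substituted as is

// Hypothetical test module printing a few literals.
module literals_demo;
  initial begin
    $display("%d", `ANSWER);        // using the macro needs the backtick too
    $display("%b", 8'h2A);          // a hexadecimal literal shown in binary
    $display("%d", 8'sb1010_1010);  // a signed 8 bit binary literal shown in decimal
    $finish;
  end
endmodule
```

Changing the format (%b, %d, %h) is a quick way to check how a given literal is interpreted.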

Types

There are two different classes of types, nets and variables. We cover here:
* wire, a net type, which simply implements a wire between components.
* reg, a variable type (register), which can be set or unset to memorize values.

wire w; // a simple wire between 2 components called w
reg r0; // a 1 bit memory called r0 (Register 0)

The value of a wire is set dynamically by the components driving it, but it is also possible to drive it permanently with a continuous assignment, using the assign keyword:

assign w = 1;
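
The difference between the two types can be sketched as follows. This is only an illustration with made-up names (wire_reg_demo, clk, a, b): the wire follows its driver at all times, while the reg keeps the value it was last given.

```verilog
// Hypothetical module contrasting a wire and a reg.
module wire_reg_demo (input  wire clk, input wire a, input wire b,
                      output wire w,   output reg r0);
  assign w = a & b;        // w continuously follows a AND b

  always @(posedge clk)
    r0 <= w;               // r0 memorizes w at each rising clock edge
endmodule
```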

There are also higher level types that will use several of previous items, as for example:

integer i,j; // Two signed 32bits integers called i and j
real num;    // one 64bits float called num
time t;      // one 64bits unsigned integer representing time (in simulation time units)
realtime rt; // one 64bits float representing time

About time, it is interesting to know that #n can be used to wait n simulation time units (not clock cycles). Delays are meant for simulation and testbenches. No semicolon (;) is needed after the value; the instruction that follows the delay is generally written on the same line. For example:

#20 out = a & b;// wait 20 time units then compute out value from a AND b
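
Here is a small testbench sketch around this delay. The `timescale value and the module name delay_demo are assumptions of mine. Delay support also depends on the simulator: Icarus Verilog handles it directly, while recent Verilator versions need the --timing option as far as I know.

```verilog
`timescale 1ns/1ps   // assumption: one time unit is 1 ns, with 1 ps precision

// Hypothetical testbench showing a delayed assignment.
module delay_demo;
  reg a, b, out;

  initial begin
    a = 1;
    b = 1;
    #20 out = a & b;                        // executed 20 time units later
    $display("t=%0t out=%b", $time, out);   // $time gives the current simulation time
    $finish;
  end
endmodule
```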

Single-bit elements, wire and reg, can be organized in vectors; their bits are numbered from the MSB (most significant bit) to the LSB (least significant bit).

wire [7:0] Bus0; // a 8 bits bus called Bus0 connected to the system
reg [31:0] R0;   // a 32 bits register called R0

It is possible to use arithmetic expressions to define them. In these cases, using an array (see below) is better:

reg[8*256:1] str;      // 256 bytes string (named str, as string is a SystemVerilog keyword), * to simplify usage, start from 1, as will not be accessed by bit
reg[16*16-1:0] sprite; // a 16×16 monochrome sprite, better to use -1 and start by 0

After these declarations, we can access single bits or groups of bits the following way:

Bus0[0];   // Access to LSB of Bus0
Bus0[7:5]; // Access to 3 MSB of Bus0 (defined as 8 bits from 0 to 7)

Verilog also allows concatenating several bit ranges into another one. For example, here, to convert from the 16 bit Little Endian to the 16 bit Big Endian data format:

BigEndian[15:0] = {LittleEndian[7:0], LittleEndian[15:8]};

If the destination bus is 16 bits wide here, then [15:0] is not mandatory on the left side (as we concatenate two 8 bits values).
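
Wrapped in a module, the byte swap above could look like this sketch (the module name byte_swap is hypothetical; the signal names are those of the example):

```verilog
// Hypothetical module: swaps the two bytes of a 16 bit value.
module byte_swap (input  wire [15:0] LittleEndian,
                  output wire [15:0] BigEndian);
  // The low byte becomes the high byte and vice versa.
  assign BigEndian = {LittleEndian[7:0], LittleEndian[15:8]};
endmodule
```

Since it is a plain continuous assignment, this costs no logic at all on an FPGA: it is only a different wiring of the same 16 lines.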

All these types, both the higher level types (complex in the general meaning, not mathematical complex numbers) and vectors, can be organized in arrays.

reg[7:0] cursor[0:7];            // 8x8 pixels monochrome cursor bitmap, Here, cursor is the name
reg[15:0] sprite[0:15];          // 16*16 pixels monochrome sprite bitmap
reg[31:0] palette[0:15];         // 16 RGBA8 colours palette
reg[31:0] FrameBuffer[0:307199]; // 640×480 RGBA8 screen frame-buffer

Array elements are declared from their lowest address to the highest one.
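
A minimal sketch of filling and reading such an array, assuming a simulation-only test module (array_demo is a made-up name, and the colour values are arbitrary):

```verilog
// Hypothetical test module: fills a 16 entry palette and reads it back.
module array_demo;
  reg [31:0] palette [0:15];   // 16 words of 32 bits
  integer i;

  initial begin
    for (i = 0; i < 16; i = i + 1)
      palette[i] = {i[7:0], 8'h00, 8'h00, 8'hFF};    // first byte grows with the index
    $display("palette[3] = %h", palette[3]);         // one whole 32 bit word
    $display("byte of 3  = %h", palette[3][31:24]);  // a bit range inside that word
    $finish;
  end
endmodule
```

The first index selects the array element, and an optional second one selects bits inside that element.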

Gates

Here are the logical gates (or operators) available in the language. They are bitwise: applied to bit vectors, they operate bit by bit on all bits:

~a     // NOT
a & b  // AND
a | b  // OR
a ^ b  // XOR
a ~^ b // XNOR, can also be written ^~

(Figure: gates look-up tables)

There are also two bitwise shift operators available:

>> // Right shift
<< // Left shift
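
A quick way to see what the shifts do is to print the results in binary. This is only a sketch with a made-up module name (shift_demo) and test value:

```verilog
// Hypothetical test module for the shift operators.
module shift_demo;
  reg [7:0] v;

  initial begin
    v = 8'b0001_0110;
    $display("%b", v << 2);   // bits move towards the MSB, zeros enter at the LSB
    $display("%b", v >> 1);   // bits move towards the LSB, zeros enter at the MSB
    $finish;
  end
endmodule
```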

Modules

Verilog is organized in modules (like classes in object-oriented programming, or an integrated circuit on a PCB). A module has:
* A module name.
* An interface between its inside and the outside world (think of public variables or methods in object-oriented programming, or IC pins in electronics). It is a list of ports given between parentheses, like function parameters in a programming language: module module_name (input wire a, input wire b, output wire out); ... endmodule.
* By default, the declarations in its body are logic blocks, all operating in parallel.
* To execute code sequentially and conditionally, always sub-blocks (and initial ones in testbenches) must be used (see below for more details).

Simple example, writing an “and” gate

(Figure: AND)
Basic operators are already present in Verilog, but let us see how one could be written. In Verilog gate primitives, the first argument is the output. Since and is a reserved word (it names the built-in gate), the module is called and_gate here. An AND gate could be reimplemented this way:

module and_gate(f,a,b); // The ; is not an error here
  output f;
  input a,b;

  assign f = a & b; // & is an AND gate
endmodule

This example should be put in an "and_gate.v" file.
* The default type is wire
* Since Verilog 2001, input and output can be put in the module header like this:

module and_gate(output f, input a,b);

This example is a permanent, unconditional gate. You can build full circuits this way, but Verilog is a high level language that allows coding more complex things in a simple way. The always blocks are there for more complex gateware.
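
To check such a gate, a small testbench can instantiate it and go through all input combinations. The sketch below is self-contained: it redefines the gate under the hypothetical name and_gate_demo, and the testbench name and_gate_tb is also mine. It uses a #1 delay to let the output settle before printing, which some simulators only accept with a specific option (--timing for recent Verilator versions, as far as I know).

```verilog
// Hypothetical gate under test.
module and_gate_demo (output wire f, input wire a, input wire b);
  assign f = a & b;
endmodule

// Hypothetical testbench: prints the truth table of the gate.
module and_gate_tb;
  reg a, b;
  wire f;
  integer i;

  // Connect the testbench signals to the ports by name.
  and_gate_demo dut (.f(f), .a(a), .b(b));

  initial begin
    for (i = 0; i < 4; i = i + 1) begin
      {a, b} = i[1:0];                          // 00, 01, 10, 11
      #1 $display("a=%b b=%b f=%b", a, b, f);   // wait, then show the result
    end
    $finish;
  end
endmodule
```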

Initial and always blocks

There are two types of blocks to know to start with Verilog:
* always blocks can be synthesized, so they are used in circuits. They are executed each time their condition is met, that is, triggered by an event given between parentheses: always @(event).
* initial blocks generally cannot be synthesized, so they are used mostly in testbenches, and are executed only once, unconditionally. They allow setting the initial values of a test. It is possible to have several initial blocks. They also allow quickly testing or prototyping a simple function with text output, before putting it in an always block of a circuit.

(Figure: posedge and negedge on clock signal)

They both allow you to:
* execute code sequentially
* if else conditionals
* while, for, loops

A typical event to trigger an always block is, for example, a positive clock edge (posedge) or a negative clock edge (negedge):

always @(posedge clk)
begin
  ...
end

but it can also be a change of value of one of the listed signals:

always @(a or b)
begin
  ...
end
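
Both kinds of triggers can be put side by side in one small module. This is a sketch with hypothetical names (always_demo, q, y), showing a clocked block next to a block sensitive to its inputs:

```verilog
// Hypothetical module with one clocked and one combinational always block.
module always_demo (input  wire clk, input wire a, input wire b,
                    output reg  q,   output reg y);
  // Sequential logic: q changes only on a rising clock edge (a flip-flop).
  always @(posedge clk) begin
    q <= a;
  end

  // Combinational logic: y is recomputed whenever a or b changes.
  always @(a or b) begin
    if (a)
      y = b;
    else
      y = 1'b0;
  end
endmodule
```

The usual convention is to use non-blocking assignments (<=) in clocked blocks and blocking ones (=) in combinational blocks.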

Simple example with Verilator

A simple example given with Verilator:

module our (clk);
  input clk;  // Clock is required to get initial activation
  always @(posedge clk)
    begin $display("Hello World"); $finish;
  end
endmodule

Here, the always block is triggered at the first positive clock edge (posedge). It prints ($display) a “Hello World” string on the standard output, and finishes ($finish) the simulation.

It is possible to test and validate lots of things with Verilator before going further in the FPGA process (formal verification, conversion to a bitstream, which can be very slow because of the routing process, and flashing the FPGA). A good start is to make a copy of the example given above, and to modify its top.v file.

Make the test with Verilator

The examples are installed in:
* Arch Linux or Manjaro: /usr/share/verilator/examples/
* Debian or Ubuntu: /usr/share/doc/verilator/examples/

So to test it, simply copy the directory to your home directory, where you have write access, and then build and execute it with a simple make (Verilator transpiles the Verilog to C++ and then compiles it, which gives a very fast simulation):

On Arch Linux based distributions:

cp -a /usr/share/verilator/examples/make_hello_c ~/

On Debian based distributions:

cp -a /usr/share/doc/verilator/examples/make_hello_c ~/

Then on any distribution:

cd ~/make_hello_c
make

make rebuilds from the sources if needed, and then executes the testbench immediately.

The output of the bench itself:

-- RUN ---------------------
obj_dir/Vtop
Hello World!
- top.v:11: Verilog $finish
-- DONE --------------------

The text that follows in the output (not pasted here) is about the compilation and a suggestion to try the next tutorial.

Tracing example and GTKWave

GTKWave is a companion tool to Verilator. It lets you view the timing diagram (chronogram) resulting from the simulation, to understand how the circuit behaves over time, and to see what could be the source of a possible problem. It reads the .vcd (Value Change Dump) trace file as the source to draw the timing diagram (see the make_tracing_c example; the output can be found in the logs/ subdirectory. The file is logs/vlt_dump.vcd, created after running the make command).

To use the tracing example with GTKWave, you need to install the gtkwave package, then, as for the first example, copy the directory to a place you can write into:
* On Arch Linux based distributions:

cp -a /usr/share/verilator/examples/make_tracing_c ~/

* On Debian based distributions:

cp -a /usr/share/doc/verilator/examples/make_tracing_c ~/

Then on any distribution:

cd ~/make_tracing_c
make
gtkwave logs/vlt_dump.vcd

(Figure: GTKWave)

The text output during the make phase gives interesting information too:

-- RUN ---------------------
obj_dir/Vtop +trace
[1] Tracing to logs/vlt_dump.vcd...

[1] Model running...

[1] clk=1 rstl=1 iquad=1234 -> oquad=1235 owide=3_22222222_11111112
[2] clk=0 rstl=0 iquad=1246 -> oquad=0 owide=0_00000000_00000000
[3] clk=1 rstl=0 iquad=1246 -> oquad=0 owide=0_00000000_00000000
[4] clk=0 rstl=0 iquad=1258 -> oquad=0 owide=0_00000000_00000000
[5] clk=1 rstl=0 iquad=1258 -> oquad=0 owide=0_00000000_00000000
[6] clk=0 rstl=0 iquad=126a -> oquad=0 owide=0_00000000_00000000
[7] clk=1 rstl=0 iquad=126a -> oquad=0 owide=0_00000000_00000000
[8] clk=0 rstl=0 iquad=127c -> oquad=0 owide=0_00000000_00000000
[9] clk=1 rstl=0 iquad=127c -> oquad=0 owide=0_00000000_00000000
[10] clk=0 rstl=1 iquad=128e -> oquad=128f owide=3_22222222_11111112
[11] clk=1 rstl=1 iquad=128e -> oquad=128f owide=3_22222222_11111112
[12] clk=0 rstl=1 iquad=12a0 -> oquad=12a1 owide=3_22222222_11111112
[13] clk=1 rstl=1 iquad=12a0 -> oquad=12a1 owide=3_22222222_11111112
[14] clk=0 rstl=1 iquad=12b2 -> oquad=12b3 owide=3_22222222_11111112
[15] clk=1 rstl=1 iquad=12b2 -> oquad=12b3 owide=3_22222222_11111112
[16] clk=0 rstl=1 iquad=12c4 -> oquad=12c5 owide=3_22222222_11111112
*-* All Finished *-*
- sub.v:29: Verilog $finish
[17] clk=1 rstl=1 iquad=12c4 -> oquad=12c5 owide=3_22222222_11111112

You can see here each step of the execution, with the clock (clk) switching between the 0 and 1 states.

Time is counted in ticks. In a testbench, the tick can be defined with the `timescale directive, which sets the time unit and its precision. We don’t need to use it for now.

The .vcd (Value Change Dump) file is defined in top.v:

$dumpfile("logs/vlt_dump.vcd");
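
For context, here is a minimal sketch of how $dumpfile is typically used together with $dumpvars. It is not the content of the example's top.v: the module name dump_demo and the counter are made up, and the logs/ directory is assumed to exist. With Verilator, tracing also has to be enabled when building (the --trace option) and in the C++ wrapper, which the make_tracing_c example already does.

```verilog
// Hypothetical module: records a small counter in a .vcd file.
module dump_demo (input wire clk);
  reg [3:0] count = 4'd0;

  initial begin
    $dumpfile("logs/vlt_dump.vcd");   // file name taken from the example above
    $dumpvars;                        // record all the signals of the design
  end

  always @(posedge clk) begin
    count <= count + 4'd1;
    if (count == 4'd10)
      $finish;                        // stop after a few clock cycles
  end
endmodule
```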

And you will have all the steps in the logs/annotated/ directory and a coverage file in logs/coverage.dat:

-- COVERAGE ----------------
verilator_coverage --annotate logs/annotated logs/coverage.dat
Total coverage (2/31) 6.00%
See lines with '%00' in logs/annotated

-- DONE --------------------

The last one is defined in sim_main.cpp.

About Verilator examples

In each example, there is a small C++ interface code. Modifying the C++ code isn’t needed for basic testing. Once the basic concepts are understood, the C++ part can be tuned to interface with system libraries, to simulate device communications, audio, graphics, etc.

Among these simple starting examples, the most interesting ones to begin with are all available in Verilog:
* make_hello_c Verilog Hello World with Makefile
* make_hello_sc Verilog Hello World using SystemC with Makefile
* cmake_hello_c Verilog Hello World with cmake.
* cmake_hello_sc Verilog Hello World using SystemC with cmake.

SystemC is a set of C++ classes and macros which provide an event-driven simulation interface, to simulate concurrent processes in a real-time environment.

The other examples are (with their variants cmake|make and c|sc):
* make_tracing_c for a better understanding at start and for debugging purpose.
* make_protect_lib for creating a DPI protected library. DPI (Direct Programming Interface) is an interface between SystemVerilog and functions in a language like C or C++.

SystemVerilog is an advanced evolution of Verilog, so it is also an HDL, and is sometimes viewed as an HVL (Hardware Verification Language). It adds several types, including block types, has some C aspects, and supports object-oriented programming.

Some common file extensions with Verilog, SystemVerilog and related simulators are:
* .v as Verilog.
* .vc (verilog ???) or .f as File, used for large projects, containing the arguments to give to Verilator (or another simulator), including flags, include directories and linked libraries for the simulation.
* .vcd as Value Change Dump file, containing the tracing output of a simulation for analysis (see above).
* .vo as Verilog Output file.
* .sv as SystemVerilog.

Basic practical example with Verilator


So we will reuse the simple make example (you can choose the cmake example instead; in this case you will have to call cmake instead of make).

I copied the make_hello_c example into my work directory:

cp -a /usr/share/verilator/examples/make_hello_c verilator_test
cd verilator_test

and then just edited top.v (vim, emacs, gedit, or any text editor of your choice can be used) and replaced its content.

Here is an example of a simple module resulting from my tests. I wanted to understand how to access simple (Verilog) registers and different kinds of datatypes, including a sprite (the one displayed beside the title of this section). You can download this version of top.v here:

module top;
 reg [8*11:1] str1;
 reg [8*25:1] str2;  // filled by spaces at left
 reg a,b,c;
 reg [7:0] sprite[0:7];  // use one byte by line
 reg [15:0] sprite2;  // sprite as bitfield
 integer i;

 initial begin
   str1 = "Hello World"; // string initialisation tests
   str2 = "Hello World";

   a = 1'b1; // bit initialisation tests
   b = a^1;
   c = a^0;

   sprite[0] = 8'b10011000; // sprites initialisation tests
   sprite[1] = 8'b00100100;
   sprite[2] = 8'b01000010;
   sprite[3] = 8'b10011001;
   sprite[4] = 8'b10011001;
   sprite[5] = 8'b01000010;
   sprite[6] = 8'b00100100;
   sprite[7] = 8'b00011000;

   sprite2[ 7:0] = 8'b10011000;
   sprite2[15:8] = 8'b00100100;

   $display ("str1 = %s", str1); // strings display
   $display ("str2 = %s", str2);

   $display ("a = %d", a);       // bits display
   $display ("b = a^1 = %d", b);
   $display ("c = a^0 = %d", c);

   for ( i=0; i<8; i=i+1) begin  // sprites display
    $display ("sprite[%2d] = %b",i,sprite[i]);
   end
   for ( i=0; i<2; i=i+1) begin  // try to read bitranges, including out of bounds
    $display ("sprite2[8*i(%1d) +: 8] = %8b",i,sprite2[8*i +:8]);
    $display ("sprite2[8*i(%1d) -: 8] = %8b",i,sprite2[8*i -:8]);
    $display ("sprite2[4*i(%1d) -: 8] = %8b",i,sprite2[4*i -:8]);
   end
 end 
endmodule

To test it, just run make. The comments to the right of the output lines were added for this blog post only:

make
-- VERILATE & BUILD -------- 
[...]  This part is the compilation log
-- RUN ---------------------   This part is interesting one
obj_dir/Vtop
str1 = Hello World
str2 =               Hello World    str2 with filled 25 chars
a = 1
b = a^1 = 0
c = a^0 = 1
sprite[ 0] = 10011000    Eight lines of the sprite
sprite[ 1] = 00100100
sprite[ 2] = 01000010
sprite[ 3] = 10011001
sprite[ 4] = 10011001
sprite[ 5] = 01000010
sprite[ 6] = 00100100
sprite[ 7] = 00011000
sprite2[8*i(0) +: 8] = 10011000      some bitrange access tests
sprite2[8*i(0) -: 8] = x0010010      We go out of registers here with x displayed
sprite2[4*i(0) -: 8] = x0010010
sprite2[8*i(1) +: 8] = 00100100
sprite2[8*i(1) -: 8] = 01001100
sprite2[4*i(1) -: 8] = xxxxx001
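
The +: and -: notations used in this test are indexed part-selects: v[base +: width] takes width bits going up from base, and v[base -: width] takes width bits going down from base. That is why 8*i -: 8 with i = 0 asks for bits 0 down to -7 and leaves the register. A sketch that stays inside the bounds (part_select_demo and v are made-up names, the value is the one stored in sprite2):

```verilog
// Hypothetical test module for indexed part-selects.
module part_select_demo;
  reg [15:0] v;

  initial begin
    v = 16'b00100100_10011000;
    $display("%b", v[0 +: 8]);    // same bits as v[7:0]
    $display("%b", v[15 -: 8]);   // same bits as v[15:8]
    $display("%b", v[11 -: 8]);   // same bits as v[11:4]
    $finish;
  end
endmodule
```

The width must be a constant, but the base can be a variable, which makes this notation convenient in loops like the one above.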

Further reading

Relatively complete documentation online:

About Verilog:
* A complete Verilog manual at Chip Verify
* Verilog TUTORIAL for beginners on ReferenceDesigner.com
* Verilog overview in few presentation screens (PDF) at euler.ecs.umass.edu
* Verilog Quick Reference Card.pdf
* Verilog HDL Quick reference Guide
* French: Verilog syntax at hdl.telecom-paristech.fr

Testbenches, SystemC and SystemVerilog:
* How to write a basic verilog TestBench, using a C++ template. The site FPGA tutorial has 3 main subjects, VHDL, Verilog & System Verilog.
* Tutorials about Systemverilog and SystemC.
* Some tutorials about System Verilog.
* French: System Verilog in 13 minutes on sen.enst.fr
* Verilog Simulation with Verilator and SDL using SDL library to simulate VGA graphics output, based on Verilator and some part of SystemVerilog.

Some interesting FPGA projects:
* From Nand To Tetris: all the steps, in a few exercises, to build a computer based on a 16 bit CPU using only NAND gates (an interesting game), then to build a Tetris on this small system. They use their own HDL, but there are already some ports as examples: this one in Verilog using the Icarus Verilog simulator (an older free software simulator), one in Verilog for the open-sourced Lattice ICE40 FPGA (tested on an open hardware Olimex board) with open source tools (Yosys as synthesizer), and one in Verilog for the DE0-Nano (Altera Cyclone IV FPGA).
* Tang nano MIDI Sounder, a MIDI synthesizer, with a nice 8 bit like sound.
* Speech256, speech synthesizer.
* (Japanese) SERV RISC-V processor for Tang Nano
* (Japanese) Tetris for Tang Nano in Verilog
* ZipCPU, a blog about this CPU with nice Verilator testbench examples, using C++ templates for a generic testbench.
* UART, Serial Port, RS-232 Interface FPGA implementation (both in Verilog and VHDL).
* Consolite, a light console in Verilog for FPGA, assembly programmable. An emulator is also available.
* FloPoCo, FPU for FPGA developed by INRIA.

For Free software loving hackers:
* Apicula project to open Gowin FPGA array bitstreams