Simple direct-mapped cache simulation on FPGA




This article is part of a coursework project for first-year bachelor students of Innopolis University. All work was done in a team. The purpose of this article is to demonstrate our understanding of the topic and to help others understand it through simulation.




Git repository link




From the user's side, the principle of operation looks like this:


  • To write data to memory, you access the RAM with the data and the address to write to.
  • To read data, you address the cache. If the cache does not hold the requested data, it accesses the RAM and copies the data from there.

When working with Verilog, keep in mind that each individual block of the design is represented as a module. The cache is not an independent piece of fast memory: to operate properly it has to take data from another memory block, the RAM. Therefore, to simulate the cache on an FPGA we have to simulate a whole memory module that includes both the RAM and the cache, although the main focus is the cache.


The implementation consists of the following modules:


  • ram.v: RAM memory module
  • cache.v: cache memory module
  • cache_and_ram.v: module that performs reads and writes using the RAM and the cache
  • testbench.v and testbench2.v: testbench modules that exercise the main modules in simulation

RAM module:


Code
module ram();

parameter size = 4096; // number of 32-bit words in the RAM

reg [31:0] ram [0:size-1]; // storage array: size words of 32 bits each

endmodule

Description

This module represents the memory used as RAM. It has 4096 addressable cells, each 32 bits wide.





Cache module:


Code
module cache();

parameter size = 64;        // number of cache lines (one 32-bit word per line)
parameter index_size = 6;   // index width in bits, log2(size)

reg [31:0] cache [0:size - 1]; // cached data words
reg [11 - index_size:0] tag_array [0:size - 1]; // tag of each line: the 12 address bits minus the index bits
reg valid_array [0:size - 1]; // valid bit of each line: 0 - no data, 1 - data present

initial
    begin: initialization
        integer i;
        for (i = 0; i < size; i = i + 1)
        begin
            valid_array[i] = 1'b0; // every line starts out invalid
            tag_array[i] = 0;
        end
    end

endmodule 

Description

The cache contains more than just copies of the data in memory: each line also stores a tag, which helps us find data within the cache, and a valid bit, which tells us whether the line actually holds data.
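
To make the role of the tag and index concrete, here is a minimal sketch of how a 12-bit RAM address could be split for a 64-line cache. The module `addr_split`, its parameters and the demo module are hypothetical names introduced for illustration; they are not part of the repository. The sketch assumes the sizes used in this article (4096 RAM words, 64 cache lines), both powers of two, so the split reduces to plain bit-selects.

```verilog
// Hypothetical helper, not part of the original code: splits an address
// into the cache index (low bits) and the tag (remaining address bits).
module addr_split #(
    parameter ADDR_BITS  = 12, // log2(4096 RAM words), assumed from the article
    parameter INDEX_BITS = 6   // log2(64 cache lines), assumed from the article
) (
    input  [31:0]                     address,
    output [INDEX_BITS-1:0]           index,
    output [ADDR_BITS-INDEX_BITS-1:0] tag
);
    // With power-of-two sizes, "address % 64" is the low 6 bits and
    // "(address % 4096) >> 6" is the next 6 bits.
    assign index = address[INDEX_BITS-1:0];
    assign tag   = address[ADDR_BITS-1:INDEX_BITS];
endmodule

module addr_split_demo;
    reg  [31:0] address;
    wire [5:0]  index;
    wire [5:0]  tag;

    addr_split dut(.address(address), .index(index), .tag(tag));

    initial begin
        address = 32'd3036; // 12'b101111_011100
        #1 $display("index = %0d tag = %0d", index, tag); // index = 28 tag = 47
        address = 32'd2001; // 12'b011111_010001
        #1 $display("index = %0d tag = %0d", index, tag); // index = 17 tag = 31
    end
endmodule
```

Two addresses with the same index but different tags compete for the same cache line, which is why the tag has to be stored next to the data.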





Cache and RAM module:


Code
module cache_and_ram(
    input [31:0] address,
    input [31:0] data,
    input clk,
    input mode, //mode equal to 1 when we write and equal to 0 when we read
    output [31:0] out
);

//previous values
reg [31:0] prev_address, prev_data;
reg prev_mode;
reg [31:0] temp_out;

reg [cache.index_size - 1:0] index; // index of the current address (selects the cache line)
reg [11 - cache.index_size:0] tag;  // tag of the current address

ram ram();
cache cache();

initial
    begin
        index = 0;
        tag = 0;
        prev_address = 0;
        prev_data = 0;
        prev_mode = 0;
    end

always @(posedge clk)
begin
    // run the operation only when an input differs from the previous request
    // (the address is compared after wrapping it to the RAM size)
    if (prev_address != address % ram.size || prev_data != data || prev_mode != mode)
        begin
            prev_address = address % ram.size;
            prev_data = data;
            prev_mode = mode;

            tag = prev_address >> cache.index_size; // tag = upper bits of the wrapped address, above the index (6 bits in our case)
            index = address % cache.size;       // index = lowest index_size bits of the address (6 bits for 64 lines)

            if (mode == 1)
                begin
                    ram.ram[prev_address] = data;
                    // write-through: also update the cache line if it currently holds this address
                    if (cache.valid_array[index] == 1 && cache.tag_array[index] == tag)
                        cache.cache[index] = data;
                end
            else
                begin
                    // on a miss, copy the word from RAM into the cache line, because the same address is likely to be read again soon
                    if (cache.valid_array[index] != 1 || cache.tag_array[index] != tag)
                        begin
                            cache.valid_array[index] = 1;
                            cache.tag_array[index] = tag;
                            cache.cache[index] = ram.ram[prev_address];
                        end
                    temp_out = cache.cache[index];
                end 
        end
end

assign out = temp_out;

endmodule 

Description

This module implements the operations on the data in the memory modules. It samples its inputs on each positive clock edge and, if any input has changed, performs the operation selected by the mode (1 for write, 0 for read).
If mode is 1 (write):
  • Write the data to the RAM at the given address, then check whether that address is present in the cache. If it is, replace the cached data; otherwise leave the cache unchanged.
If mode is 0 (read):
  • Check whether the address is present in the cache. If it is, return the cached data; otherwise get the data from the RAM, store it in the cache line for that address, and return it.
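
The same read/write policy can also be written in a more hardware-oriented style, with nonblocking assignments in the clocked block and bit-selects in place of `%`. The sketch below is one possible variant, not the code from the repository: the module `cache_sketch`, the `hit` output and all internal names are hypothetical, it drops the "inputs changed" check and simply acts on every clock edge, and it has not been synthesized or tried on a board.

```verilog
// Sketch only: single-module, write-through, direct-mapped cache.
// All names are hypothetical and chosen for this illustration.
module cache_sketch #(
    parameter ADDR_BITS  = 12, // 4096 RAM words, as in the article
    parameter INDEX_BITS = 6   // 64 cache lines, as in the article
) (
    input             clk,
    input             mode,    // 1 = write, 0 = read
    input      [31:0] address,
    input      [31:0] data,
    output reg [31:0] out,
    output reg        hit      // 1 when the last read was served from the cache
);
    localparam TAG_BITS = ADDR_BITS - INDEX_BITS;
    localparam LINES    = 1 << INDEX_BITS;

    reg [31:0]         ram        [0:(1 << ADDR_BITS) - 1];
    reg [31:0]         cache_data [0:LINES - 1];
    reg [TAG_BITS-1:0] tags       [0:LINES - 1];
    reg [LINES-1:0]    valid = 0; // all lines start out invalid

    wire [ADDR_BITS-1:0]  addr  = address[ADDR_BITS-1:0];
    wire [INDEX_BITS-1:0] index = address[INDEX_BITS-1:0];
    wire [TAG_BITS-1:0]   tag   = address[ADDR_BITS-1:INDEX_BITS];
    wire match = valid[index] && (tags[index] == tag);

    always @(posedge clk) begin
        if (mode) begin
            ram[addr] <= data;                 // always write to RAM
            if (match) cache_data[index] <= data; // keep a cached copy in sync
        end else begin
            hit <= match;
            if (match)
                out <= cache_data[index];      // read hit
            else begin                         // read miss: fetch and fill the line
                out               <= ram[addr];
                cache_data[index] <= ram[addr];
                tags[index]       <= tag;
                valid[index]      <= 1'b1;
            end
        end
    end
endmodule

module cache_sketch_demo;
    reg clk = 0;
    reg mode;
    reg [31:0] address, data;
    wire [31:0] out;
    wire hit;

    cache_sketch dut(.clk(clk), .mode(mode), .address(address),
                     .data(data), .out(out), .hit(hit));

    always #5 clk = ~clk;

    initial begin
        mode = 1; address = 32'd3036; data = 32'd526421; // write
        @(posedge clk); #1;
        mode = 0;                                        // switch to reading
        @(posedge clk); #1 $display("out = %0d hit = %b", out, hit); // out = 526421 hit = 0
        @(posedge clk); #1 $display("out = %0d hit = %b", out, hit); // out = 526421 hit = 1
        $finish;
    end
endmodule
```

The first read misses and fills the line, so the second read of the same address is reported as a hit with the same data.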


Testbenches:


Code1
module testbench;

reg [31:0] address, data;
reg mode, clk;
wire [31:0] out;

cache_and_ram tb(
    .address(address),
    .data(data),
    .mode(mode),
    .clk(clk),
    .out(out)
);

initial
begin
    clk = 1'b1;

    address = 32'b00000000000000000000000000000000;         // 0
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b1;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000010000000100001010101;         // 526421
    mode = 1'b1;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000001100000110001101100010110;         // 25369366
    mode = 1'b1;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b1;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b1;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000000000000000000000000000000;         // 0
    mode = 1'b0;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000000000000000000000000;         // 0
    mode = 1'b0;

    #200
    address = 32'b00000000000000000000000000000000;         // 0
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b0;
end

initial
$monitor("address = %d data = %d mode = %d out = %d", address % 4096, data, mode, out);

always #25 clk = ~clk;

endmodule 

Code2
module testbench2;

reg [31:0] address, data;
reg mode, clk;
wire [31:0] out;

cache_and_ram tb(
    .address(address),
    .data(data),
    .mode(mode),
    .clk(clk),
    .out(out)
);

initial
begin
    clk = 1'b1;

    address = 32'b00000000000000000000000000000000;         // 0
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b1;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000010000000100001010101;         // 526421
    mode = 1'b1;

    #200
    address = 32'b00000000000000000000000000000000;         // 0
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b0;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000010000000100001010101;         // 526421
    mode = 1'b0;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000001100000110001101100010110;         // 25369366
    mode = 1'b1;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000001100000110001101100010110;         // 25369366
    mode = 1'b0;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b1;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000000000000000011100011000000;         // 14528
    mode = 1'b1;

    #200
    address = 32'b00000000000011110100011111010001;         // 1001425 % size = 2001
    data =    32'b00000000000000000000000000000000;         // 0
    mode = 1'b0;

    #200
    address = 32'b10100111111001011111101111011100;         // 2816867292 % size = 3036
    data =    32'b00000000000000000000000000000000;         // 0
    mode = 1'b0;
end

initial
$monitor("address = %d data = %d mode = %d out = %d", address % 4096, data, mode, out);

always #25 clk = ~clk;

endmodule

Description

To run a testbench, add all the files to a ModelSim project and run a simulation of one of the testbench modules.
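
The testbenches above print values with `$monitor` and leave the checking to the reader. As a possible complement, here is a sketch of a self-checking testbench. The module `testbench_check` and its tasks are hypothetical additions, and the sketch assumes it is compiled together with ram.v, cache.v and cache_and_ram.v from this article. Inputs are changed on the falling clock edge so that they never change at the same instant as the rising edge that samples them.

```verilog
// Hypothetical self-checking testbench; compile it together with
// ram.v, cache.v and cache_and_ram.v from the article.
module testbench_check;

reg [31:0] address, data;
reg mode, clk;
wire [31:0] out;
integer errors;

cache_and_ram dut(
    .address(address),
    .data(data),
    .mode(mode),
    .clk(clk),
    .out(out)
);

always #25 clk = ~clk;

// Apply one request on a falling edge, then wait just past the next
// rising edge so the module has processed it.
task request(input m, input [31:0] a, input [31:0] d);
    begin
        @(negedge clk);
        mode = m; address = a; data = d;
        @(posedge clk);
        #1;
    end
endtask

// Compare the current output with the expected value and count mismatches.
task expect_out(input [31:0] expected);
    begin
        if (out !== expected) begin
            errors = errors + 1;
            $display("FAIL: address %0d: out = %0d, expected %0d",
                     address % 4096, out, expected);
        end
    end
endtask

initial begin
    clk = 1'b0;
    errors = 0;

    request(1'b1, 32'd3036, 32'd526421); // write
    request(1'b0, 32'd3036, 32'd0);      // read miss, fills the cache line
    expect_out(32'd526421);

    request(1'b1, 32'd3036, 32'd14528);  // write to a cached address
    request(1'b0, 32'd3036, 32'd0);      // read hit, must see the new value
    expect_out(32'd14528);

    if (errors == 0) $display("PASS");
    $finish;
end

endmodule
```

The second pair of requests checks the write path described earlier: a write to an address that is already cached has to update the cached copy, otherwise the following read would return stale data.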


Comments (13)


  1. old_bear
    06.12.2018 17:37
    +3

    Why hide the text under spoilers? There is, to put it mildly, not much of it here as it is, which devalues the initially decent idea of writing English-language articles.
    P.S. The code looks like a typical case of "let's rewrite this C in HDL syntax". Frankly, I would be embarrassed to publish something like this.


  1. Mogwaika
    06.12.2018 17:43

    Is this code for fpga or for simulation only? It isn't useful for fpga implementation.


    1. 32bit_me
      06.12.2018 18:27

      Obviously, a code with the initial keyword can't be implemented in hardware.


      1. nerudo
        06.12.2018 20:26

        Of course that is not true. Initial is a mechanism for setting initial values, and it can be used on platforms that support it.


        1. 32bit_me
          06.12.2018 20:53

          Which ones, specifically?


          1. nerudo
            06.12.2018 20:56
            +1

            Well, all modern FPGAs can initialize flip-flops and most kinds of block memory. I don't know about ASICs, that is a separate discussion.


            1. 32bit_me
              06.12.2018 21:03

              So you can provide a project for any modern FPGA, for example Cyclone 10 (that is a modern FPGA, isn't it), in which flip-flops are initialized in an initial block, and this project will synthesize and work on a dev board.
              OK then, post a GitHub link and we'll take a look.


              1. nerudo
                06.12.2018 21:37

                Of course I can. The standard rate is $100/hour. However, I can offer a lighter option: it will be enough if you eat your hat.

                module tffmy
                (
                	input clk,
                	output [31:0] sr_out
                );
                	reg [31:0] sr;
                
                	initial
                	begin
                		sr = 32'hdeada550;
                	end
                	
                	always @ (posedge clk)
                	begin
                		sr <= {sr[0], sr[31:1]} ;
                	end
                
                	assign sr_out = sr;
                
                endmodule
                

                set_global_assignment -name FAMILY "Cyclone IV E"
                set_global_assignment -name DEVICE EP4CE6F17C8
                set_global_assignment -name TOP_LEVEL_ENTITY tffmy
                set_global_assignment -name LAST_QUARTUS_VERSION "16.1.0 Lite Edition"
                


                Post-route simulation [image]


                1. 32bit_me
                  06.12.2018 22:05
                  -1

                  Well, look at that. How far progress has come.


                1. Mogwaika
                  06.12.2018 22:10

                  And how much do you manage to write in an hour? And where, and for what, do they pay that much, if it's not a secret?


      1. Mogwaika
        06.12.2018 20:59

        I was referring to the remainder operation, which in the general case will not work, and to the silly assignment in the always block.


  1. proton17
    06.12.2018 23:49
    +1

    My advice: read more professional literature in English.


  1. justhabrauser
    07.12.2018 01:15

    SLY_G and marks, with their dodgy translations, must be tearing their hair out: "wait, that was allowed?"