Characterizing a standard cell library for large scale design of memristive based signal processing
2021; Institution of Engineering and Technology; Volume: 16; Issue: 1; Language: English
10.1049/cds2.12076
ISSN 1751-8598
Authors: Abubaker Sasi, Arash Ahmadi, Majid Ahmadi
Topic(s): CCD and CMOS Imaging Sensors
IET Circuits, Devices & Systems, Early View
ORIGINAL RESEARCH PAPER, Open Access

Abubaker Sasi (sasi@uwindsor.ca), Department of Electrical and Computer Engineering, University of Windsor, Windsor, Ontario, Canada
Arash Ahmadi (corresponding author; aahmadi70@gmail.com; orcid.org/0000-0001-5094-5967), Department of Electrical and Computer Engineering, University of Windsor, Windsor, Ontario, Canada
Majid Ahmadi (ahmadi@uwindsor.ca), Department of Electrical and Computer Engineering, University of Windsor, Windsor, Ontario, Canada

Correspondence: Arash Ahmadi, Department of Electrical and Computer Engineering, University of Windsor, Windsor, Ontario, N9B 3P4, Canada. Email: aahmadi70@gmail.com

First published: 07 May 2021. https://doi.org/10.1049/cds2.12076

Abstract

In recent years, the use of memristors in circuit design has rapidly increased and attracted research interest. Advances have been made in both the size and the complexity of memristor designs. Computer aided design tools are therefore required to handle large-scale memristor-based designs. A comprehensive automatic framework for the design and synthesis of large-scale memristor-complementary metal-oxide-semiconductor (CMOS) circuits is described herein. This framework provides a synthesis approach that can be applied to all memristor-based digital logic designs. In particular, a characterization methodology for memristor-based logic cells is proposed to generate a standard cell library file for large-scale simulation. The proposed architecture is based on RRAM and ReRAM redox-based devices and the memristor ratioed logic design approach. The proposed framework is implemented in the Cadence Virtuoso schematic-level environment and was verified with Verilog-XL, MATLAB, and the Synopsys electronic design automation compiler after being translated to the behavioral level. The proposed method can be applied to implement any digital logic design.
In particular, it is well suited to signal processing applications that require MATLAB functions to produce text files with hex values in order to overcome the limitations of the simulation environment. The framework is deployed herein to design a memristor-based parallel 8-bit adder/subtractor and a 2D memristive median filter. Both proposed designs achieve significant power reductions, 66% for the adder/subtractor and 16% for the median filter, compared with the same designs in CMOS technology.

1 INTRODUCTION

Although the FinFET architecture extended the scaling limit of conventional complementary metal-oxide-semiconductor (CMOS) technology, FinFET faces significant challenges, including doping damage, a restricted logic chip design space, electrostatic limitations, and integration difficulties [1, 2]. Substitutes for CMOS technology are therefore in high demand. There are several alternative technologies, such as the Double-Gate Tunnel FET [3], nanotube programmable devices [4], graphene transistors [5], and memristor devices [6]. Among these, memristor devices are the most promising because of their great scaling ability, long-term data storage, low power consumption, and CMOS compatibility [7, 8]. It is believed that these two-terminal devices will play an essential role in the future fabrication of memory and information processing systems [9, 10]. Indeed, the number of memristor-based applications in today's circuit designs has been increasing exponentially. However, the design and mapping of large-scale memristor-based applications is a challenging task due to the lack of comprehensive high-level design tools and simulation platforms. Currently, circuit design tools such as SPICE, (H)SPICE, ICAPS, and Cadence Virtuoso are not capable of providing designers with comprehensive design and simulation methodologies for memristors [11].

Xie et al.
[12] presented a method for the automatic mapping of large-scale crossbar memristive-based Boolean logic circuits. This method involved the use of CMOS to control and drive the design. A programmable architecture for large-scale neuromorphic systems based on memristive crossbars is proposed in [13]. The authors proposed a framework for deep learning networks based on the programming of spin electronics (spintronic devices). The framework mapping blocks consist of memristors and transistors to mimic spindle behaviour. In [14], the authors introduced a design methodology for image compression based on a memristor crossbar architecture. The authors' primary objective was to perform computational operations in a memristive crossbar and store the row-transformed image data in the same crossbar memory array. As a result, the overall area, timing, and power of this architecture were reduced.

The aforementioned methods were implemented using memristive crossbars. Such design techniques present real challenges, including sneak path currents and signal degradation. Moreover, memristive crossbar circuits require separate circuits to control the input signals. Material implication logic has also been used to map memristor-based Boolean logic [15, 16]. In these works, implication logic was employed to reduce the number of memristor devices and operating cycles. However, such methods are limited to Boolean function implementation. Moreover, memristor-based crossbar and implication logic design methods are not synthesisable using computer aided design (CAD) synthesis tools [17]. In addition, these design methods require sequential computational steps to achieve a logic gate operation, so the execution of one logic computation takes more than one clock cycle.
Considering these challenges, a hybrid memristor/CMOS logic design is the most applicable method because it is CMOS compatible, offers an optimal solution for eliminating signal degradation, and can be synthesised and mapped using CAD tools. However, it is impractical to manually design large-scale memristor-based circuits using currently available methods, owing to design complexity and the limited number of memristors and transistors that CAD tools support [18].

Herein, a comprehensive automatic framework for the design and synthesis of large-scale memristor-CMOS circuits is proposed. This framework provides a synthesis approach that can be applied to all memristor-based Boolean logic designs. In particular, MATLAB, a hardware description language (HDL) simulator, the Cadence Virtuoso environment, and Synopsys software were utilised to implement a parallel 8-bit adder/subtractor and a 2D memristive median filter. The filter was previously implemented manually at the Cadence Virtuoso schematic level and published in [19].

Section 2 gives brief details on choosing a proper memristor model. Section 3 describes the cell library characterisation. Section 3.5 discusses the CAD tools used for the automatic implementation of the proposal. Section 4 provides the case studies and a discussion of the simulation results, and finally, Section 5 concludes this paper.

2 MEMRISTOR BASED-LOGIC DESIGN

2.1 Memristor modeling

All designs, simulations and cell characterisations for the memristor-based standard cell library in this proposal were implemented using a metal-oxide-based resistive random access memory (RRAM) device model [20] and the redox-based resistive switching memory (ReRAM) model presented in [21]. The accuracy of both memristive models provides the realistic switching behaviour required. Both models are simulated as Verilog-A models in the Cadence Virtuoso environment.
The ReRAM model was the first model used in this proposal. However, during logic cell characterisation for delay, power dissipation, and input capacitance, the results were not as good as expected, owing to the use of CMOS transistors and to the following factors, which were considered when choosing between the RRAM and ReRAM devices:

1. Device size and resistive layer: Both devices are based on metal oxides that consume relatively little power. The RRAM device is preferred due to its small size, which is <10 nm, while the size of the ReRAM device is 11 nm. In addition, the size of the metal has a direct effect on the capacitance of the device, which has a significant impact on the power dissipation and delay performance of the circuit.

2. The amplitude of the input voltage: It is important to utilise an appropriate supply voltage to obtain low power consumption and ensure high performance. Although a low voltage significantly increases the propagation delay, it significantly decreases the power consumption. The RRAM device requires only 2 V of input supply voltage, which is low compared with the 4 V required by the ReRAM device.

For circuit testing and simulation, the Verilog-A model of the RRAM device presented in [20] and that of the ReRAM device presented in [21] were both utilised to obtain the desired logic behaviour for the proposed design. Due to the lack of layout tools for real physical memristor devices, it is important to choose an accurate memristor model [20] that simplifies the implementation of memristor-based applications and case studies and yields reliable simulations. The RRAM and ReRAM devices were simulated using the parameters shown in Table 1.
In this proposal, two factors were considered when choosing between the RRAM and ReRAM (Pt/TaOx/Ta) devices to implement memristor-based logic gates at the behavioural level.

TABLE 1. The simulation parameters for ReRAM and RRAM devices

ReRAM Pt/TaOx/Ta device [21]
Parameter   Nmin (m⁻²)   Nmax (m⁻²)    Ninit (m⁻²)   C31 (pAm/V)    A (nm²)   LDisc (nm)
Value       0.308        5             5             6              3.14      4

RRAM device [20]
Parameter   I0 (A)       g0 (nm)       gmax (nm)     gmin (nm)      L (nm²)   v0 (m/s)
Value       6.14 (e−5)   2.75 (e−10)   6 (e−12)      3.14 (e−14)    5         150

Abbreviations: ReRAM, redox-based resistive switching memories; RRAM, resistive random access memory.

The first factor is device size and resistive layer. Both devices are based on small metal-oxide structures, and small devices consume less power than large ones [22]. The RRAM device is therefore preferred due to its small size, which is <10 nm, while the size of the ReRAM device is 11 nm. In addition, as seen in Equation (1), the size of the metal has a direct effect on the capacitance of the device, which has a significant impact on the power dissipation and delay performance of the circuit:

$$P = C_L V^2 f \qquad (1)$$

where $C_L$, $V$, and $f$ are the load capacitance, voltage amplitude and frequency, respectively.

The second factor is the amplitude of the input voltage. It is important to utilise an appropriate supply voltage to obtain low power consumption and ensure high performance. Although a low voltage leads to a significant increase in propagation delay, it significantly decreases power consumption. The RRAM device has an input supply voltage of only 2 V, which is low compared with the ReRAM (Pt/TaOx/Ta) device, which has 4 V, as shown in its I–V curve in Figure 1.
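As a rough numerical illustration of Equation (1), the following Python sketch compares the dynamic power at the two supply voltages quoted above (2 V for RRAM, 4 V for ReRAM). The load capacitance and frequency are placeholders chosen only for illustration and are not values from the paper; with equal load capacitance and frequency, the ratio depends only on the square of the voltage.

```python
def dynamic_power(c_load, voltage, frequency):
    """Equation (1): P = C_L * V^2 * f, in watts."""
    return c_load * voltage ** 2 * frequency


# Hypothetical operating point (assumed values, not taken from the paper).
C_LOAD = 1e-15   # load capacitance in farads
FREQ = 100e6     # switching frequency in hertz

p_rram = dynamic_power(C_LOAD, 2.0, FREQ)    # 2 V supply (RRAM)
p_reram = dynamic_power(C_LOAD, 4.0, FREQ)   # 4 V supply (ReRAM)
ratio = p_reram / p_rram

print(f"P at 2 V: {p_rram:.3e} W")
print(f"P at 4 V: {p_reram:.3e} W")
print(f"ratio   : {ratio:.1f}")
```

Under these assumptions, doubling the supply voltage quadruples the dynamic power, which is the motivation given in the text for preferring the lower-voltage device.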
FIGURE 1. Memristor device I–V curve for the redox-based resistive switching memory (ReRAM) Pt/TaOx/Ta valence change memory device with a bipolar triangular input voltage of 5 V [21]

2.2 Logic design approach

Memristor ratioed logic (MRL) is a hybrid CMOS-memristor logic [23]. It is a voltage-based design approach, unlike the MAD [24] and Mirrored [25] logics, which are memristive-based. The compatibility of memristor devices with CMOS increases circuit density and offers the best way to eliminate signal degradation in memristor-based AND and OR gates. A CMOS inverter is added to the output of the memristor-based OR and AND gates to obtain the desired NOR and NAND logic [26]. In MRL, voltages are interpreted as logical states: high and low voltages indicate logic '1' and '0', respectively, as shown in the MRL AND and OR gate structures of Figure 2a,b. The input voltages Vin1 and Vin2 are applied to the terminals of the two memristors, which are connected in parallel, and each memristor's set end is attached to the output terminal. In the test of the AND gate circuit, if a high voltage '1' and a low voltage '0' are applied to terminals Vin1 and Vin2, respectively, then Vout can be determined as:

$$V_{out} = \frac{R_{off}}{R_{off} + R_{on}} V_{high} \cong V_{high} \qquad (2)$$

and when a low voltage '0' is applied to both input terminals, Vout can be calculated as:

$$V_{out} = \frac{R_{on}}{R_{on} + R_{off}} V_{high} \cong 0 \qquad (3)$$

FIGURE 2. (a) MRL-based AND gate and its resistance progression. (b) MRL-based OR gate and its resistance progression. MRL, memristor ratioed logic

The MRL logic design approach was exploited to implement the proposed circuit designs.

3 SYNTHESIS METHODOLOGY AND IMPLEMENTATION

Creating a memristor-based standard cell library is essential to exploring the potential of memristors in digital design using available CMOS synthesis tools.
Using such tools requires an accurate cell characterisation method for memristor-based logic gates. Synthesis tools use characterised gate library files to facilitate logic optimisation, enhance design speed, and determine the area, timing, and power consumption. The characterisation process for any memristor-CMOS cell can be described as follows.

3.1 Input/output capacitance

The capacitance measured at each cell pin is the main factor that synthesis tools use to estimate dynamic power and delay. Input capacitance is calculated by measuring the charge that flows into or out of each cell pin and dividing it by the magnitude of the power supply. It can be formulated mathematically as follows:

$$C_{pin} = \frac{1}{V_{dd}} \int_{t_1}^{t_2} i(t)\, dt \qquad (4)$$

where $i(t)$ is the current flowing into the pin and $C_{pin}$ is the pin capacitance, measured as the amount of charge passing through the pin during the input voltage swing (rising from 0 to VDD and falling from VDD to 0) divided by the supply voltage. In the memristor-based logic cell characterisation method, the characterising simulation utilises a net of inverters as a standard capacitive load, which is serially connected to the output pin of the cell under characterisation.

3.2 Power measurement

Logic transitions at the cell input pins deployed in the proposed method consume energy. The energy consumed by the proposed circuit was measured from the current passing through the zero-DC source connected to VDD. The consumed current was then integrated over each transition time using the Cadence Virtuoso calculator. The library table of each cell in the proposed design contains only energy values measured in joules, and the rest of the power consumption calculation was carried out by the Synopsys synthesis compiler.
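The charge-integration measurements described in Sections 3.1 and 3.2 can be sketched numerically. The Python fragment below is a minimal illustration, not the Cadence Virtuoso calculator flow used in the paper: it integrates a sampled current waveform with the trapezoidal rule, applies Equation (4) to obtain a pin capacitance, and multiplies the same integral by the supply voltage to obtain a transition energy. The waveform (a triangular 2 µA pulse lasting 1 ns) and the 2 V supply are assumed values, and the function names are hypothetical.

```python
def trapezoid(times, values):
    """Trapezoidal-rule integral of sampled values over sampled times."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (values[k] + values[k - 1]) * (times[k] - times[k - 1])
    return total


def pin_capacitance(times, currents, vdd):
    """Equation (4): C_pin = (1 / V_dd) * integral of i(t) dt."""
    return trapezoid(times, currents) / vdd


def switching_energy(times, supply_currents, vdd):
    """Energy drawn from a constant supply: E = V_dd * integral of i(t) dt."""
    return vdd * trapezoid(times, supply_currents)


# Synthetic waveform (assumed): triangular current pulse, 2 uA peak, 1 ns wide.
VDD = 2.0
times = [k * 1e-11 for k in range(101)]                       # 0 to 1 ns
currents = [2e-6 * (1 - abs(2 * k / 100 - 1)) for k in range(101)]

c_pin = pin_capacitance(times, currents, VDD)
energy = switching_energy(times, currents, VDD)
print(f"C_pin = {c_pin:.3e} F, E = {energy:.3e} J")
```

For this synthetic pulse the integrated charge is 1 fC, giving about 0.5 fF and 2 fJ. In a library flow, energy values of this kind would be the per-transition entries stored in the cell's table.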
The only power consumption measured in this method is dynamic power, which is described mathematically as follows:

$$P_{Dynamic} = \alpha C V_{DD}^2 f \qquad (5)$$

where $\alpha$, $C$, $V_{DD}$ and $f$ are the switching activity factor, capacitance, supply voltage, and operating frequency, respectively.

3.3 Delay measurement

The non-linear delay simulation method was utilised to measure the propagation delay. With fan-out taken into consideration, the delay measurement depends on the transition time at the cell input pin and the capacitance of the output pin. The slew thresholds for the cell were set at 30% and 70% of the power supply magnitude; that is, the transition time was defined as the time the signal takes to rise from 30% to 70%, or to fall from 70% to 30%, of VDD.

3.4 Area estimation

As illustrated in [27, 28], the memristor device can be fabricated on top of the CMOS transistors. Therefore, the area was estimated from the size of the inverters utilised in each cell.

3.5 CAD tools for automatic implementation

To prove the functionality of the framework and test the feasibility of the automatic implementation of a memristor-based digital design approach, several steps were taken, as illustrated in the design flow in Figure 3.

FIGURE 3. Flow chart displaying the design flow based on the Synopsys EDA tool for the proposed framework. CMOS, complementary metal-oxide-semiconductor; EDA, electronic design automation; HDL, hardware description language

4 MEMRISTOR BASED-LOGIC DESIGN

4.1 Memristor modelling

In the first step, the behavioural functions of the cells implemented at the schematic level were described using Verilog HDL and simulated using the Cadence NC-Verilog-XL simulator. The Verilog language can read and write files in a storage environment. This feature makes it possible to design a test bench that reads data from a storage device, generates stimulus signals for the Verilog test module, and writes the results to a storage device.
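Because a test bench of this kind works from plain-text files, signal or image samples first have to be written out as text. The source describes doing this with MATLAB functions; the Python sketch below is only a stand-in that shows the idea, writing 8-bit samples as two-digit hex values, one per line, and reading them back. The file name and helper names are hypothetical. A one-value-per-line hex layout of this sort is the kind of file that Verilog's `$readmemh` system task can load into a memory array.

```python
import os
import tempfile


def write_hex_file(path, samples):
    """Write 8-bit samples as two-digit hex text, one value per line."""
    with open(path, "w", encoding="ascii") as handle:
        for value in samples:
            if not 0 <= value <= 255:
                raise ValueError(f"sample {value} does not fit in 8 bits")
            handle.write(f"{value:02x}\n")


def read_hex_file(path):
    """Read a file written by write_hex_file back into a list of integers."""
    with open(path, "r", encoding="ascii") as handle:
        return [int(line, 16) for line in handle if line.strip()]


# Round trip on a small made-up row of greyscale samples.
pixels = [0, 17, 128, 255]
hex_path = os.path.join(tempfile.mkdtemp(), "stimulus.hex")  # hypothetical name
write_hex_file(hex_path, pixels)
with open(hex_path, encoding="ascii") as handle:
    hex_text = handle.read()
decoded = read_hex_file(hex_path)
print(hex_text)
```

The decoder plays the role of the second conversion step, turning the text written by the simulator back into numeric samples.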
In the proposed framework, the signal processing applications require a MATLAB encoder and decoder. A MATLAB function converts the input signals into hex arrays, because Verilog only reads and writes ASCII character files; a second MATLAB function then imports the processed data written by the Verilog test bench and reconstructs the signal.

In the second step, as shown in Figure 3a, after the design was tested at the behavioural level, the implemented register-transfer level (RTL) description was synthesised to the gate netlist level with the aid of the Synopsys Design Vision compiler. The design compiler uses a standard library that contains all information about the characteristics of the logic cells to generate the final CMOS-based gate netlist file.

In the third step, the generated CMOS gate netlist was carefully inspected to identify the logic cells used to build the CMOS-based design. After the logic cells are produced by the Synopsys synthesis compiler, equivalent memristor-based logic gates are implemented at the schematic level, tested, and characterised using the MRL design method. Hence, at this stage of the design, the characterisation of the memristor-based logic cells was carried out to build a standard memristor-based library for the Synopsys synthesis compiler, as presented in Figure 3b. The most important characterised cells involved in the proposal are AND, NAND, OR, NOR, multiplexer, and other defined Boolean function circuits, as shown in Table 2. The resulting library provides the synthesis tool with information about each cell's logic function, area, input/output capacitance, delay, and power consumption.

TABLE 2. List of logic cells involved in the design of the adder/subtractor

Cell      Description                                                  Equation                                                MOSFET  Memristors
AND2X1    Logic AND for two inputs                                     $Z = (A \cdot B)$                                       0       2
OR2X1     Logic OR for two inputs                                      $Z = A + B$                                             0       2
NOR2X1    Logical NOR                                                  $Z = \overline{(A + B)}$                                1       2
MXI2X1    Two inputs multiplexer with inverted output                  $Z = \overline{(\bar{S} \cdot B) + (S \cdot B)}$        2       6
XOR2X1    Logic exclusive OR for two inputs                            $Z = (A \oplus \bar{B}) + (\bar{A} \oplus B)$           5       8
XNOR2X1   Logic exclusive NOR for two inputs                           $Z = (A \oplus \bar{B}) + (\bar{A} \oplus B)$           4       8
AOI21X1   Logical inverted OR of one AND gate and an additional input  $Z = \overline{(A_0 \cdot A_1) + B_0}$                  1       4
NAND2X1   Logical NAND of two inputs                                   $Z = \overline{(A \cdot B)}$                            1       2
NOR3X1    Logical NOR of three inputs                                  $Z = \overline{(A + B + C)}$                            2       4
INVX      Logical inversion of single input                            $Z = \bar{A}$                                           1       0

Abbreviation: MOSFET, Metal-Oxide-Semiconductor Field Effect Transistor.

4.2 Case study 1

In this section, a memristor-based parallel 8-bit adder/subtractor is designed and analysed using the proposed framework. It was implemented at both the schematic and behavioural levels. Specifically, the design of the adder/subtractor was established and validated at the schematic level using the Cadence SPICE Spectre simulator, at the behavioural level using NC-Verilog, and at the synthesis level using the Synopsys synthesis tools with Design Vision. The adder/subtractor was chosen to clarify the proposed framework, design, and simulation. The following is a brief description of the framework.

The memristor-based 8-bit adder/subtractor was implemented with seven cascaded combinations of 1-bit memristor-based full adders. The schematic design of the adder/subtractor circuit is exhibited in Figure 4a. The 1-bit memristor-based full adder logic circuit consists of two memristor-based AND gates, one memristor-based OR gate, and two memristor-based XOR gates, as shown in Figure 4b.

FIGURE 4. The memristor-based adder/subtractor: (a) 8-bit adder/subtractor schematic circuit; (b) implemented memristor-based 1-bit adder/subtractor

The functionality of the designed adder/subtractor was proven by the simulation results in Figure 5. As illustrated in Figure 4a, the Sel line acts as a control signal that decides whether the circuit operates in adder or subtractor mode. When Sel = 1, the Sel line acts as the carry-in (Cin).
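The role of the Sel line can be illustrated with a bit-level Python sketch. This is a generic ripple-carry arrangement written for illustration, not a transcription of the circuit in Figure 4: it assumes eight full-adder stages, each built from two XORs, two ANDs and one OR as listed above, with each B bit XORed with Sel and Sel fed in as the first carry. The function names are hypothetical.

```python
def full_adder(a, b, cin):
    """1-bit full adder built from two XORs, two ANDs and one OR."""
    p = a ^ b                        # first XOR
    s = p ^ cin                      # second XOR gives the sum bit
    cout = (a & b) | (p & cin)       # two ANDs feeding one OR give the carry
    return s, cout


def add_sub_8bit(a, b, sel):
    """Ripple-carry 8-bit adder/subtractor: sel=0 adds, sel=1 computes a - b."""
    carry = sel                      # Sel doubles as the carry-in
    result = 0
    for i in range(8):
        a_bit = (a >> i) & 1
        b_bit = ((b >> i) & 1) ^ sel  # B XOR Sel: inverted when subtracting
        s, carry = full_adder(a_bit, b_bit, carry)
        result |= s << i
    return result, carry


print(add_sub_8bit(100, 27, 0))   # addition mode
print(add_sub_8bit(100, 27, 1))   # subtraction mode
```

The result is the 8-bit value modulo 256, so a subtraction with a negative outcome appears in 2's complement form.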
Thus, all bits of B are inverted and 1 is added at the LSB to form the 2's complement. When Sel = 0, B XOR 0 always produces B, so A and B are added.

FIGURE 5. Simulation results of the memristor-based 1-bit adder and 1-bit adder/subtractor: (a) MRL-based 1-bit adder simulation results; (b) simulations of the MRL-based 1-bit adder/subtractor. MRL, memristor ratioed logic

The Verilog description of the adder/subtractor was verified in the Cadence NC-Verilog simulator, and the Synopsys compiler synthesised the RTL description and converted it to an optimised gate-level description. The resulting gate file consists of several logic cells, some of which are listed in Table 2. The characterisation procedure for this design was carried out at the schematic level using Cadence SPICE Spectre. This characterisation process provides the information required for the memristor-based library that the Synopsys synthesis compiler uses to estimate the design area, delay, and power consumption. This information covers the logic function and area of the design, together with measurements of its input/output capacitance, delay, and power consumption. All of it was generated from the simulation of the memristor-based cells at the schematic level.

4.3 Case study 2

In this case study, the proposed framework was applied to implement a memristor-based median filter, which had previously been implemented manually and tested only at the Cadence Virtuoso schematic level, as published in [19]. Image processing is very useful and has been used extensively in medicine, film and video production, photography, remote sensing, military target analysis, and manufacturing automation and control [29, 30]. These applications usually require bright and clear images.
Hence, corrupted or degraded images need to be processed to improve human interpretation, enhance visual pictorial information, and modify the data structure used for image representation so that it is optimised for data storage, transmission, or other representations for autonomous machine perception. The main goal of any enhancement method is to obtain a result more suitable than the original. Digital images are represented as 2D arrays of numbers, where the value of each entry corresponds to the greyscale value of a pixel, ranging between 0 and 255 (255 being white). Image enhancement techniques therefore become 2D filtering operations. The 2D median filter replaces the value of each element with the median value of its neighbourhood. Sxy denotes the neighbourhood, with eight elements immediately surrounding the centre element. The mathematical representation of an image g(x, y) in the median filtering process is thus described as follows:

$$S_{xy} = \underset{(s,t) \in S_{xy}}{\mathrm{median}} \{ g(s,t) \} \qquad (6)$$

The implementation of the memristor-based median filter has two phases. The first phase is the schematic-level implementation. At this stage, the median filter is designed manually and a 3 × 3 window is applied to verify the functionality of the proposal. In the second phase, because of the high design complexity, automated synthesis tools are required to produce reliable and accurate simulations. Using a standard memristor cell library is therefore essential to improving the accuracy of synthesis tools when they estimate power, area, and delay. Thus, the memristor cells involved in the schematic implementation are characterised to create a memristor-based standard cell library.

4.3.1 Schematic level implementation

The sorting mechanism in this technique finds the median pixel among the surrounding neighbourhood pixels.
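The 3 × 3 median operation of Equation (6) can be sketched in software as a reference for what the hardware computes. The Python fragment below is an illustrative model only: it takes the median of the nine values in each window (the centre pixel and its eight neighbours) and, as an assumed border policy not stated in the source, copies border pixels unchanged. The test image is made up.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Apply a 3 x 3 median filter to a 2D list of greyscale values (0-255).

    Border pixels are copied unchanged (an assumed border policy).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median_low(window)   # nine values: the 5th smallest
    return out


# Made-up 4 x 4 image with one very bright and one very dark outlier.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
filtered = median_filter_3x3(noisy)
for row in filtered:
    print(row)
```

Both outliers are replaced by the surrounding value of 10, since neither can be the median of a window dominated by 10s.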
The memristive median circuit detects the median pixel in a 3 × 3 window, and the simulation results for the circuit are shown in Figure 6. This design was implemented using seven three-input 8-bit memristor-based comparators. Each of these comparators consists of three memristor-based two-input 8-bit magnitude comparators.

FIGURE 6. Proposed memristive median simulation detection in a 3 × 3 window

The two-input 8-bit comparators were implemented as illustrated in Figure 7, with two 4-bit memristor-based magnitude comparators to compare two pixels (eight bits for each input). The schematic of this comparator is displayed in Figure 8, and it was implemented using a memristor-based MRL logic structure, as shown in Figure 9. The outputs of the two 4-bit comparators were compared again with those of the 2-bit comparator to find the larger pixel value of the two inputs. Then one output value from the 2-bit comparator was split between two multiplexers, and the other output was connected to the selector of the first multiplexer to decide which pixel had the maximum value; the same output was inverted and connected to the second multiplexer to select the pixel with the minimum value.

FIGURE 7. Proposed memristor-based median filter sorting circuit

FIGURE 8. Schematic view of the implemented 4-bit memristor-based magnitude comparator

FIGURE 9. Implemented memristor-based 4-bit magnitude comparator

The proposed filter takes nine inputs and determines the median value among them. This architecture of the memristor-based median filter was implemented and tested in the Cadence Virtuoso environment at the schematic level using the Verilog-A memristor model presented in [21], with the model parameters shown in Table 1.
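The comparator counts given above (seven three-input comparators, each made of three two-input comparators) match a well-known way of finding the median of nine values: sort three groups of three, then take the maximum of the minima, the median of the medians and the minimum of the maxima, and finally the median of those three. The Python sketch below models that arrangement. It is an assumption that the schematic is wired this way; the source does not spell out the interconnection, and all names here are hypothetical.

```python
def compare_swap(a, b):
    """Two-input comparator with two multiplexers: returns (min, max)."""
    return (a, b) if a <= b else (b, a)


def sort3(a, b, c):
    """Three-input sorter built from three two-input comparators."""
    a, b = compare_swap(a, b)
    b, c = compare_swap(b, c)
    a, b = compare_swap(a, b)
    return a, b, c                       # (min, median, max)


def median9(window):
    """Median of nine values using seven three-input sorters."""
    lo0, mid0, hi0 = sort3(*window[0:3])          # sorters 1-3: one per group
    lo1, mid1, hi1 = sort3(*window[3:6])
    lo2, mid2, hi2 = sort3(*window[6:9])
    max_of_lows = sort3(lo0, lo1, lo2)[2]         # sorter 4
    med_of_mids = sort3(mid0, mid1, mid2)[1]      # sorter 5
    min_of_highs = sort3(hi0, hi1, hi2)[0]        # sorter 6
    return sort3(max_of_lows, med_of_mids, min_of_highs)[1]   # sorter 7


print(median9([10, 10, 10, 10, 255, 10, 10, 10, 0]))
```

Each `compare_swap` call corresponds to one magnitude comparison followed by a maximum and a minimum selection, which mirrors the comparator-plus-two-multiplexer structure described in the text.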
4.3.2 Automatic implementation

To prove the functionality of the proposed filter, the first step was to describe in Verilog HDL the behaviour of the median filter algorithm that had been implemented at the schematic level, and to simulate it using the Cadence NC-Verilog-XL simulator. Verilog, however, only reads and writes ASCII character files. It is therefore not capable of reading images in standard formats, such as BITMAP or JPEG, directly from disk [30]. To resolve this problem, it is necessary to define a new image format for use with the design test bench. The new image must be a HEX file that contains only the RGB/greyscale vectors for each pixel of the input image. The data from the hex files are applied as stimuli to the point operation blocks described in Verilog. The HEX characters are then converted to binary format by the Verilog HDL simulator. In this part, the median filter implemented in Verilog was a behavioural model that removes the 'salt and pepper' noise of an input image and outputs the filtered image. The filtered image is then compared
Reference(s)