A near-threshold 7T SRAM cell with high write and read margins and low write time for sub-20 nm FinFET technologies

Mohammad Ansari a, Hassan Afzali-Kusha b, Behzad Ebrahimi a, Zainalabedin Navabi a, Ali Afzali-Kusha a,b, Massoud Pedram b

a Nanoelectronics Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran 
b Department of EE-systems, University of Southern California, Los Angeles, USA

1. Introduction

SRAM arrays occupy a large portion of state-of-the-art digital circuits such as microprocessors and systems on chip (SoCs) [1]. The power consumption of these large memory arrays constitutes a considerable portion of the power consumption of the entire chip [2]. The main component of the power consumption in SRAM cells is the static power, which is proportional to the sub-threshold current. This current reduces exponentially with the supply voltage, so scaling the supply voltage is an effective way to reduce the leakage power. Although scaling the supply voltage to below the threshold voltage decreases the power dissipation, it also degrades the speed severely. Working in the near-threshold regime, in contrast, results in a sizeable reduction in the power consumption without experiencing the severe speed degradation of the sub-threshold operation regime [9].

In this paper, a 7T SRAM cell with differential write and single ended read operations working in the near-threshold region is proposed. The structure is based on modifying a recently proposed 5T cell which uses high and low VTH transistors to improve the read and write stability. To enhance the read static noise margin (RSNM) while keeping the high write margin and low write time, an extra access transistor is used and the threshold voltages of the SRAM transistors are appropriately set. In addition, to maintain the low leakage power of the cell and increase the Ion/Ioff ratio of its access transistors, a high VTH transistor is used in the pull down path of the cell. To assess the efficacy of the proposed cell, its characteristics are compared with those of 5T, 6T, 8T, and 9T SRAM cells. The characteristics are obtained from HSPICE simulations using 20 nm, 16 nm, 14 nm, 10 nm, and 7 nm FinFET technologies assuming a supply voltage of 500 mV. The results reveal high write and read margins, the highest Ion/Ioff ratio, a fast write, and ultra-low leakage power in the hold “0” state for the cell. Therefore, the suggested 7T cell may be considered as one of the better design choices for both high performance and low power applications. Also, the changes of cell parameters when the temperature rises from −40 °C to 100 °C are investigated. Finally, the write margin as well as the read and hold SNMs of the cell in the presence of the process variations are studied at two supply voltages of 400 mV and 500 mV. The study shows that the proposed cell meets the required cell sigma value (6σ) under all conditions.

Corresponding author. Tel.: +98 2182084920; fax: +98 2188778690. E-mail address: afzali@ut.ac.ir (A. Afzali-Kusha).

Section 2 reviews the related works, while Section 3 describes the proposed 7T SRAM cell. In Section 4, we assess the efficiency of the cell by comparing its characteristics with those of the other cells. Finally, Section 5 concludes the paper.

2. Related works

Fig. 1(a) shows the schematic of a conventional 6T SRAM cell in which two back to back inverters are used to provide a positive feedback loop to hold the data. Two access transistors are also used to enable read and write operations. The 6T cell is widely used and accepted as the standard SRAM structure. This cell does not, however, work properly in sub-threshold regime while operating in the near-threshold regime degrades read and write stabilities in addition to reducing the cell speed (if the cell functions properly at all) [10]. Thus, modifying this conventional structure is required to support near- and sub-threshold operation regimes.

In order to provide the stability in the sub-threshold region, an 8T structure, shown in Fig. 1(b), has been proposed in [13]. The cell uses a read buffer to separate the read and write operations. The 8T cell employs a single-ended read operation scheme through RBL. During the read operation, similar to the hold state, the WWL voltage level remains at zero. In addition, the read buffer has almost no effect on the voltages of the Q and QB nodes of the cell (which are referred to as the internal nodes in the remainder of this paper), and hence, the 8T SRAM has equal read and hold stabilities. This circuit still suffers from high write time and low write stability [15]. To improve the write stability, a 9T SRAM cell which is based on the 8T cell has been proposed in [15]. The cell, shown in Fig. 1(c), uses a gating method called supply feedback (achieved by using transistor M9), which improves the write characteristics of the cell. The structure also decreases the leakage power consumption with respect to that of the 8T cell.

A 5T cell, which is obtained by eliminating one transistor of the conventional 6T cell, has been proposed in [14] (cf. Fig. 1(d)). Three different threshold voltages are used in transistors of the cell: a low \( V_{TH} \) (LVT), a high \( V_{TH} \) (HVT), and a standard \( V_{TH} \) (SVT). In the cell, the right pull down of the conventional 6T is eliminated to prevent the read disturbance as the read operation is performed (through the left bitline in a single-ended manner) and decrease the static power consumption in the hold “1” state [14]. The elimination improves the write characteristics of the proposed 5T cell compared to that of the 6T cell. This is achieved by removing the competition between the positive feedback and the access transistor (M4) at node QB when writing “0” at node Q. To hold the logic “0” at node QB, the bitline (BLB) is discharged to ground. In addition, the access transistor (M4), which has low \( V_{TH} \), is double-sized while the minimum-sized pull up transistor (M5) has high \( V_{TH} \).

For the sake of space, we limited our study to the aforementioned cells which are more relevant to our work. There are some other proposed SRAM cells in the literature to work in the near/subthreshold region (e.g., [16,17]). The 7T structure in [16] needs additional auxiliary circuitry for proper operation. The 7T cell in [17] works based on single-ended read and write operations. The structure also needs an extra control signal FCS in addition to RWL and WWL. The feedback control signal (FCS) is data and operation state dependent. Hence, the implementation of this approach requires some circuitry (which has some overhead) for the generation and distribution of this control signal as well as extra dynamic power due to the dynamic line voltage change.

3. The proposed 7T cell

As mentioned in the previous section, both 6T and 8T cells are slow in writing and also their write margins are on the border line. In the case of the 9T structure, the write time and margin have been improved at the expense of some area increase. On the other hand, the 5T cell has the lowest write time and highest write margin while its read SNM (based on commonly used definition) is the lowest. In this work, we propose a 7T cell to improve the write time and margin as well as the read SNM without increasing the area considerably.

The proposed 7T SRAM cell, which is based on the 5T cell described before, is depicted in Fig. 2. Compared to the 5T structure, we have swapped Q and QB as well as BL and BLB. Also, the stored data of the cell is read through the BLB line. To increase the read
4. Results and discussion

In this section, we study the efficacy of the proposed SRAM cell in terms of its key characteristics. For the realization of the cell, the FinFET technology, which exhibits better short-channel behavior, is utilized. It is predicted to be the device of choice for sub-14 nm nodes [1]. Fig. 4 shows the 3D schematic of the FinFET structure whose parameters including the gate length \( L_g \), fin width \( H_w \), and fin height \( H_h \) are given in the table in the inset of the figure. The study is performed by comparing the characteristics of the cell with those of the other cells, all obtained from HSPICE simulations using the sub-20 nm technologies (20, 16, 14, 10, and 7 nm) provided in the Predictive Technology Model [19]. The available technology models included low power and high performance transistor models. Since in the cases of the 5T and the 7T cell structures (see Fig. 1(d) and Fig. 2), the right pull down transistor has been eliminated, for the proper operation of the cell, both low and high \( V_{TH} \) devices from high performance and low power technology models of [19] have been used. In addition, as listed in Table 1, except for the access transistor \( M_4 \) in both the 5T and 7T structures, which is double-sized (two fins are used), all other transistors in all the structures have one fin. For the 5T structure, the threshold voltages of M3–M5 were assigned based on Fig. 1(d). The transistors M1 and M2 use standard \( V_{TH} \) (SVT) in the original proposal of the cell [14]. However, as SVT is not available for the considered technology, for M1 and M2, we considered two cases of high performance (HP) and low power (LP) cells where low and high \( V_{TH} \) devices from high performance and low power technology models have been used. For a better comparison, in the cases of the 6T, 8T, and 9T structures, we used only low (high) \( V_{TH} \) devices for HP (LP) cells.
In the technologies considered here, the nominal supply voltages ranged from 0.7 V to 0.9 V (for super-threshold operation regime). Given the threshold voltage of about 0.4 V for the low power technologies, we set the supply voltage of 0.5 V for all the simulations targeting near-threshold operation regime. The study includes both nominal and under process variation cases.

4.1. Nominal study

4.1.1. Write state

First, we consider the write stability metric (i.e., the write margin) for which different definitions have been suggested in the literature.

![HSPICE simulation parameters](image)

![Fig. 2. Schematic of our proposed 7T SRAM cell.](image)

![Fig. 3. Effect of low and high threshold voltage combinations of M1 and M2 on the RSNM of the 7T cell.](image)

![Table 1: Transistor sizing for different SRAM cells.](image)

In this work, we use the difference between $V_{DD}$ and the minimum wordline voltage that can cause a successful write operation as the metric. This is called the combined wordline margin (CWLM) [20]. For the asymmetric cell structures considered in this work, there are different write “0” and write “1” margins. Similar to [21], we consider the minimum of these two margins as the write margin. For measuring the write margin of the 6T, 8T, 9T (5T and 7T) cells, based on their write operations, we sweep the WWL (RWL and WWL) with the same voltage from zero up to the voltage where the write operation occurs. The values of this metric for different structures at different technology nodes are shown in Fig. 5 for both LP and HP cells. The write stability metric for the proposed cell (which has no separate HP or LP implementation) is given in both LP and HP cell plots such that its performance can be compared to those of both types of the cells. It is evident that the proposed 7T cell has the highest write margin among all the cells. The write “0” (“1”) margin for the 5T (7T) cell is $V_{DD}$ since there is no active feedback in this case. The 7T cell has a higher write “0” margin (approximately equal to $V_{DD}$) compared to the write “1” margin of the 5T cell due to use of the write assisting transistor (M7 in Fig. 2). Recall that this transistor turns on only in the write mode improving the write operation. The 6T and 8T cells, which perform the write “0” and “1” operations symmetrically, have the lowest write margins because of the race between the access transistor and the transistors forming the feedback loop. The use of the supply feedback transistor M9 in the 9T cell (see Fig. 1(c)) makes the feedback loop weaker when the WWL signal is asserted. Hence, the 9T cell has a higher write margin than those of the 6T and 8T cells. Fig. 5 also shows that, there is no considerable change in the write margins as a function of physical scaling. 
This is justified by noting that the threshold voltages and, more importantly, the strength ratios (relative strengths) of transistors in the SRAM cells remain more or less the same in these technologies. The write margins of the HP cells are higher than those for the LP ones. This is due to the lower threshold voltages of (some of) the transistors involved in the write operation of the HP cells. In other words, for a given $V_{DD}$ level, the required voltage on WWL is lower (due to the use of lower threshold access transistors), making the CWLM larger. It is worth noting that because the write margin of the proposed cell is almost equal to $V_{DD}$, one may be concerned about unwanted writes to the half-selected cells in the same column as the cell that is being written to. This is unlikely to happen because the half-selected cells require a substantially longer write time than the selected cell. Our simulation results show that, e.g., in the case of the 20 nm technology, the write “1” time of a half-selected cell is about 38 μs, which is about 7 orders of magnitude larger than the 6.7 ps required for the same operation in the case of the selected cell. This time is even substantially larger in the case of the write “0”. Additionally, to avoid the half-selected issue for the cells in the selected row (which is a common problem for near- and sub-threshold structures), it has been suggested that the whole row be written simultaneously [22]. Alternatively, one may use a column-decoupled SRAM array in which the unselected cells in the same row have an inactive wordline [23].
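
As an illustration of how the write-margin metric above can be post-processed, the following is a minimal sketch. The sweep format (a list of wordline voltage and pass/fail pairs) and the sample values are assumptions made for this example and are not simulation results; only the 38 μs and 6.7 ps figures are taken from the text.

```python
import math

def min_successful_wordline(sweep):
    """Lowest wordline voltage for which the write succeeded.

    `sweep` is a list of (wordline_voltage, write_succeeded) pairs
    (hypothetical format, e.g. extracted from a DC sweep).
    """
    passing = [v for v, ok in sweep if ok]
    if not passing:
        raise ValueError("no wordline voltage produced a successful write")
    return min(passing)

def cwlm(vdd, sweep):
    """Combined wordline margin: V_DD minus the minimum flipping wordline voltage."""
    return vdd - min_successful_wordline(sweep)

def write_margin(vdd, sweep_write0, sweep_write1):
    """For asymmetric cells, use the smaller of the write "0" and write "1" margins."""
    return min(cwlm(vdd, sweep_write0), cwlm(vdd, sweep_write1))

def orders_of_magnitude(slow, fast):
    """Base-10 logarithm of the ratio slow / fast."""
    return math.log10(slow / fast)

# Illustrative sweeps only (not data from the paper).
VDD = 0.5
sweep0 = [(0.00, False), (0.05, False), (0.10, True), (0.15, True)]
sweep1 = [(0.00, False), (0.05, True), (0.10, True), (0.15, True)]
wm = write_margin(VDD, sweep0, sweep1)
print(f"write margin = {wm:.2f} V")

# Half-selected versus selected write "1" time at 20 nm, as quoted in the text.
ratio_orders = orders_of_magnitude(38e-6, 6.7e-12)
print(f"half-selected / selected write time = 10^{ratio_orders:.2f}")
```

The second calculation gives a ratio between six and seven decades, consistent with the "about 7 orders of magnitude" quoted for the half-selected cell.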

Another important metric in the write state is the write time. The write time is measured from the point that the WL (WWL) signal is asserted to the point that the cell flips (Q and QB voltage values cross each other on their way to assume the opposite logic levels). Since the asymmetric cells have different write “0” and “1” times, we consider the maximum of the write “0” and “1” times as the write time for these cells. This metric is shown in Fig. 6 for different cells realized using different technologies. Again, both LP and HP cells are included in the study. Evidently, as the technology scales down, the write time decreases due to the reduction of the capacitances. Furthermore, as expected, the HP cells have lower write times compared to those of the LP cells. In our proposed cell, the write operation is performed differentially through three access transistors, which makes the write operation very fast. Write “1” is performed very easily in our structure because it has no pull down transistor connected to node Q, which has stored “0”. The elimination of the pull down transistor reduces the node capacitance and increases the current that charges this node. For our proposed cell, write “0” is accomplished through two parallel access transistors racing two series pull down transistors. Thus, node QB is rapidly charged to “1”, resulting in a short write time. Among these two write times, the write “0” takes longer for our proposed cell, and hence, its values are the ones that are reported in Fig. 6. Compared to the 6T cell, the 8T structure has a larger write time due to the higher parasitic capacitance at node QB (see Fig. 1(b)). In the 9T cell (see Fig. 1(c)), M9 helps achieve a faster write operation. During the write “1” operation, the node Q, which is connected to the gate of M9, becomes charged, weakening M9 (by reducing the absolute value of its overdrive voltage). Therefore, the source voltage of M6, which is the virtual $V_{DD}$, decreases while its gate voltage increases. This weakening of M6 helps obtain a shorter time for M5 to discharge QB, which, in turn, initiates the positive feedback. At the start of the write “0”, M2 and M3 compete intensively. Because M9 is initially off, the virtual $V_{DD}$ is low, making M3 weak, which in turn yields a faster write operation. Therefore, the addition of M9 decreases the write time compared to that of the 8T cell. It is, however, more than that of the 6T cell due to the higher parasitic capacitances.
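
The write-time definition used above (wordline assertion to the crossing of Q and QB, with the maximum of the two write directions reported for asymmetric cells) can be sketched as follows. This is only an illustrative post-processing routine operating on sampled waveforms; the function names, the linear interpolation between samples, and the sample data are assumptions of this sketch and not part of the simulation flow described in the paper.

```python
def crossing_time(t, q, qb):
    """First time at which the sampled waveforms q and qb cross.

    Uses linear interpolation between adjacent samples.
    Returns None if the waveforms never cross (the cell did not flip).
    """
    for i in range(1, len(t)):
        d_prev = q[i - 1] - qb[i - 1]
        d_curr = q[i] - qb[i]
        if d_prev == 0:
            return t[i - 1]
        if d_prev * d_curr <= 0:  # sign change, or exact equality at sample i
            frac = d_prev / (d_prev - d_curr)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    return None

def write_time(t, q, qb, t_wl_assert):
    """Time from wordline assertion to the Q/QB crossing."""
    tc = crossing_time(t, q, qb)
    if tc is None:
        raise ValueError("write failed: Q and QB never cross")
    return tc - t_wl_assert

def cell_write_time(t_write0, t_write1):
    """Asymmetric cells: report the slower of the two write directions."""
    return max(t_write0, t_write1)

# Illustrative waveform (time in ps, voltages in V); not simulation data.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
q = [0.0, 0.0, 0.1, 0.4, 0.5]
qb = [0.5, 0.5, 0.3, 0.1, 0.0]
wt = write_time(t, q, qb, t_wl_assert=1.0)
print(f"write time = {wt:.2f} ps")
```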
4.1.2. Read state

In the conventional 6T cell, during the read operation when $Q$ = “0”, the voltage of node $Q$ rises. If this voltage becomes larger than the trip point of the right inverter (M2–M4), the stored value in the cell flips. We have plotted the read SNM (RSNM) of the cells in Fig. 7. The 8T cell, which works based on a single-ended read operation, has the largest RSNM originating from the use of the read buffer. On the other hand, the 5T structure has the lowest RSNM (for the case of reading a stored “0”) in spite of omitting the pull down transistor on the $QB$ node, which was intended to make cell flipping less likely. This is due to the race between M2 and M1 during the read operation giving rise to a higher voltage for the $Q$ node. Therefore, for this cell, a smaller noise on the node $QB$ can flip the cell state, as evident in Fig. 8(a). In the case of our proposed cell, as explained in Section 3, the use of a high $V_{TH}$ access transistor and a low $V_{TH}$ pull down transistor increases the RSNM with respect to the 5T cell.

Fig. 6. Write time (maximum of the write “0” and “1” times) at different technology nodes for different SRAM structures in the cases of (a) LP and (b) HP cells.

Fig. 7. RSNM at different technology nodes for different SRAM structures in the cases of (a) LP and (b) HP cells.

Fig. 8. Butterfly curves for 7T, LP and HP 5T SRAM cells at the 20 nm technology node in the (a) read and (b) hold states.
A comparison of the butterfly curves of the 5T and 7T cells in the read state is shown in Fig. 8(a). The figure demonstrates that the proposed 7T cell has a much better read stability than both the HP and LP 5T cells. The RSNM of the proposed structure is about the same as that of the 9T cell (which has the same buffer as the one used in the 8T cell while its RSNM is lower than that of the 8T cell due to M9). Fig. 7 also shows that there is no considerable change in the RSNMs as a function of physical scaling, for the same reason as mentioned for the write margin.

Another important metric for the read state is the read access time for which the read current may be used as a measure [24]. The read current is the current of the corresponding transistor that discharges the bitline during the read state. The difference between the voltages of the two bitlines enables the differential sense amplifiers. The differential sense amplifiers can be used in a single-ended read configuration using a reference voltage at the other input [25] for structures other than 6T. In all of the 5T, 6T, 8T, and 9T structures, the read operation is done through two series nMOS transistors. Therefore, all of these structures have the same read current as shown in Fig. 9. In the proposed 7T cell, the read current is lower due to the inclusion of M6, which causes a stacking effect [26]. While the stacking effect decreases the read current, it also reduces the leakage current of the access transistors of the 7T cell substantially. This causes a higher $I_{on}/I_{off}$ ratio for the 7T cell as shown in Fig. 10. The higher $I_{on}/I_{off}$ ratio increases the sensing margin, the maximum number of cells per bitline, and the sensing timing window [27]. We also investigated the read access time (defined as the time needed to discharge the bitline by 10% of $V_{DD}$ [24,28]) of the cells assuming 64 cells per column. For these calculations, since the line (interconnect) capacitance obtained from [1] is the dominant component of the total bitline capacitance, the (parasitic) capacitance of the cells has been ignored. The results are demonstrated in Fig. 11. Note that the 9T cell has the largest read access time (except for the 7T cell in the HP case) due to its larger cell height. In the case of our proposed 7T structure, as was the case for the read current, the read access time is the highest (except for the 9T cell in the LP case).
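
A first-order feel for the read access time defined above can be obtained from the charge balance $t_{read} \approx C_{BL}\,\Delta V / I_{read}$ with $\Delta V = 0.1\,V_{DD}$. The sketch below assumes a constant read current during the discharge and neglects both the cell parasitic capacitance (as in the text) and the leakage of unselected cells; the capacitance and current values are hypothetical placeholders, not values from the paper.

```python
def read_access_time(c_bitline, i_read, vdd, swing_fraction=0.10):
    """Time to discharge the bitline by swing_fraction * vdd.

    Assumes a constant discharge current i_read (a simplification of this sketch).
    c_bitline in farads, i_read in amperes, vdd in volts; result in seconds.
    """
    if i_read <= 0:
        raise ValueError("read current must be positive")
    return c_bitline * swing_fraction * vdd / i_read

# Hypothetical values for illustration only.
C_BL = 10e-15   # bitline (interconnect) capacitance for one column, in F
I_READ = 5e-6   # read current of the accessed cell, in A
t_read = read_access_time(C_BL, I_READ, vdd=0.5)
print(f"t_read = {t_read * 1e12:.0f} ps")
```

Under this approximation a lower read current, such as the one caused by the stacking effect of M6, translates directly into a proportionally longer read access time.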

It should be noted that BLB should be precharged for the read operation while it is grounded during the hold state. This increases the read dynamic power similar to the write power. Since in SRAMs the leakage power is far more critical than the dynamic power, the increase in the dynamic power is not a major concern for this cell [29,30].

4.1.3. Hold state
In large SRAM arrays, most of the cells are very often in the hold state, where static (leakage) power dissipation is the only cause of the power loss [29]. Hence, the static power in the hold state should be minimized to decrease the total power consumption of the memory. The static power consumptions of the cells are drawn in Figs. 12 and 13 considering all of the leakage components (i.e. through both bitlines and the cell supply voltage). For the LP cells in the hold “0” state, the 6T structure has the lowest static power dissipation because all of the transistors are high threshold voltage devices, significantly reducing the leakage of the cell. The extra leakage from the read buffers causes the 8T and 9T cells to
have a little more leakage than the 6T cell. It is noteworthy that in the hold “0” state, the 8T and 9T cells dissipate approximately the same amount of static power. In our proposed 7T cell, the OFF pull down transistor is leaky due to being a low $V_{TH}$ device. Therefore, we added the series pull down (M6) to decrease the leakage through the pull down path. As shown in Fig. 12(a), this way the static power consumption resides between those of the 6T and 8T LP structures (except for the 10 nm technology) despite using a low $V_{TH}$ pull down transistor. For the 5T cell, in the hold “0” state, the main leakage path is the path from $V_{DD}$ to ground through M5 and M4 (see Fig. 1(d)). In this state, M5 is ON and the low $V_{TH}$ transistor M4 has a large off current, causing the 5T cells to have a large leakage in the hold “0” state. For HP cells, the leakage currents of the transistors increase by about three orders of magnitude, enlarging the leakage power of the cells. As shown in Fig. 12(b), the 8T and 9T cells consume the largest static power because of having three leakage paths. The next highest leakage current belongs to the 5T cell structure. The slightly higher leakage current of the 5T cell compared to that of the 6T cell is attributed to the use of two fins for the access transistor. The presence of high $V_{TH}$ devices in the leakage paths of our proposed cell makes it less leaky than the other HP cells. In the hold “1” state, as shown in

Table 2
Cell signaling scheme in different states for the proposed structure.

<table>
<thead>
<tr>
<th>State</th>
<th>BL</th>
<th>BLB</th>
<th>WWL</th>
<th>RWL</th>
</tr>
</thead>
<tbody>
<tr>
<td>Hold</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Write “0”</td>
<td>0</td>
<td>$V_{DD}$</td>
<td>$V_{DD}$</td>
<td>$V_{DD}$</td>
</tr>
<tr>
<td>Write “1”</td>
<td>$V_{DD}$</td>
<td>0</td>
<td>$V_{DD}$</td>
<td>$V_{DD}$</td>
</tr>
<tr>
<td>Read “0”</td>
<td>0</td>
<td>$V_{DD}$</td>
<td>0</td>
<td>$V_{DD}$</td>
</tr>
<tr>
<td>Read “1”</td>
<td>0</td>
<td>$V_{DD}$</td>
<td>0</td>
<td>$V_{DD}$</td>
</tr>
</tbody>
</table>

Fig. 13(a), for the LP cells, the 5T structure, which has one OFF HVT transistor in each of the two leakage paths (path 1: M1 and path 2: M5), has the lowest leakage. The structure with the next lowest power consumption is the 9T cell, which has three leakage paths with one OFF HVT transistor on each path (i.e., M1 on path 1, M5 on path 2, and M6 on path 3). Additionally, in the hold “1” state, the HVT pMOS on top makes the virtual supply voltage lower than $V_{DD}$. This virtual $V_{DD}$ decreases the leakage of the 9T structure below that of the 6T cell. Although the 8T structure has the same leakage paths as the 9T cell, its leakage is higher than that of the 9T cell due to the lack of the power gating transistor (M9). In our proposed cell, in the hold “1” state, the main leakage path is the path from $V_{DD}$ to ground through M5 and M4 (see Fig. 2). In this state, M5 is ON and the low $V_{TH}$ transistor M4 has a large off current, causing a large leakage in the hold “1” state for the 7T cell. It should be noted that although the proposed cell has a high hold “1” leakage power, the overall static power of a 7T memory array will be approximately 57% lower than that of a low power 5T array because about 70% of the cells in a cache memory array store “0” [31].
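
The array-level argument above is a weighted average of the hold “0” and hold “1” powers. The sketch below shows the arithmetic with the 70% share of stored zeros quoted from [31]; the per-cell power values are hypothetical placeholders chosen only to illustrate the calculation and do not reproduce the 57% figure reported in the text.

```python
def array_static_power_per_cell(p_hold0, p_hold1, fraction_zeros=0.7):
    """Average static power per cell for an array storing a mix of "0" and "1"."""
    if not 0.0 <= fraction_zeros <= 1.0:
        raise ValueError("fraction_zeros must lie in [0, 1]")
    return fraction_zeros * p_hold0 + (1.0 - fraction_zeros) * p_hold1

def relative_saving(p_new, p_ref):
    """Fractional power reduction of p_new with respect to p_ref."""
    return 1.0 - p_new / p_ref

# Hypothetical per-cell hold powers in pW (not values from the paper).
p_7t = array_static_power_per_cell(p_hold0=10.0, p_hold1=100.0)
p_5t = array_static_power_per_cell(p_hold0=100.0, p_hold1=5.0)
saving = relative_saving(p_7t, p_5t)
print(f"7T: {p_7t:.1f} pW, 5T: {p_5t:.1f} pW, saving: {saving:.1%}")
```

The example makes the mechanism explicit: a cell with a low hold “0” power can win at the array level even when its hold “1” power is high, as long as most cells store “0”.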

Yet another important parameter in the hold state is the hold stability for which the hold SNM (HSNM) is used as a metric. Fig. 14 shows the HSNM values for different structures implemented using these technology nodes. The 6T and 8T cells have nearly the same HSNM since they virtually have the same topologies in the hold state. They consist of two back to back inverters forming a positive feedback loop. In the cases of these cells, the HSNM for both $Q=“1”$ and “0” are the same. In the 9T structure, when $Q=“1”$, M9 is OFF taking the transistors of the inverters to the sub-threshold regime. This makes the voltage transfer characteristics (VTC) of the inverters smoother in the hold “1” state [15], considerably degrading the HSNM. For this structure, the hold “1” limits the HSNM. Fig. 8(b) provides the butterfly curves of the 5T and 7T cells in the hold state at the 20 nm technology. In the 5T structure, the HSNM is limited by holding the $Q=“1”$ state because of the elimination of the right-side pull down transistor which reduces the hold “0” strength of the node $Q_B$. In our proposed cell, the SNM is limited by holding $Q=“1”$ where the OFF M6 weakens the pull down path, and hence, the positive feedback does not provide a strong $Q_B=“0”$ state. As mentioned before, for improving the HSNM, in the hold state, we ground both BL and BLB (as shown in Table 2) to provide another pull down path for the 7T cell. As shown in Fig. 8(b), the lower threshold voltages cause a lower left trip point ($V_{TLP}$) for the VTC in the HP 5T cell. This is translated to a smaller HSNM. The same trend is also observed in 6T and 8T structures when comparing the LP and HP counterparts. In the case of the 9T cell, the lower threshold voltage of M9 improves the feedback strength, and hence, the HSNM for both types of the cell are about the same. Fig. 14 also shows that, there is no large change in the HSNMs as a function of physical scaling. 
Again, this may be justified by the same reasons stated for the read and write margins.

4.1.4. Area
The layouts of the cells, drawn based on the design rules for the FinFET technology reported in [32], are shown in Fig. 15. The area of each structure is reported in the table in the inset of Fig. 15, where $\lambda$ is the minimum feature size, assumed to be 1/2 of the gate length. The non-minimum sized transistor M4 (two fins) in the 5T and 7T cells, as well as the asymmetrical structures of the 5T, 7T, and 9T cells, cause larger areas than expected (i.e., proportional to the transistor count) for these cells. The figures in the table show a smaller area per transistor for the proposed cell than for the other two asymmetric cell structures. It should be noted that LP and HP structures have the same layouts as the difference between the LVT and HVT transistors is only due to their gate work functions.

4.1.5. SRAM figure of merit
The diversity of SRAM parameters makes it difficult to designate a clear winner among all the ones that we have considered. A possible approach is to use a composite figure of merit for the SRAM cells. For instance, an SRAM electrical quality factor (SEQF)
which considers only stability and power has been suggested in [32]. This expression, however, does not consider the effect of the speed and area. Also, an SRAM quality factor ($Q$) has been proposed in [33]. Although this factor considers the power, stability, speed, and area metrics, it suffers from very large variations of the $Q$ values due to temperature changes. Here, we suggest an SRAM figure of merit (denoted by SFOM) which considers read access time ($t_{\text{read}}$) as the read speed metric, RSNM as the read stability metric, write time ($t_{\text{write}}$) as the write speed metric, and write margin ($WM$) as the write stability metric. The standby power and area ($A$) are also taken into account.

SFOM is defined as:

$$SFOM = \frac{(WM + RSNM)}{(t_{\text{write}} + t_{\text{read}}) \times (\log(P_{\text{avg}})) \times A}$$

where $P_{\text{avg}}$ is the average of the hold “0” and “1” power in pW.
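
For readers who wish to evaluate the figure of merit numerically, a minimal sketch is given below. The function and argument names are illustrative, and the input values are hypothetical. The base of the logarithm is assumed here to be 10; note that this choice does not affect an SFOM normalized to a reference cell, because a ratio of logarithms is independent of the base.

```python
import math

def sfom(wm, rsnm, t_write, t_read, p_hold0_pw, p_hold1_pw, area):
    """SRAM figure of merit following the definition above.

    p_hold0_pw and p_hold1_pw are the hold "0" and hold "1" powers in pW.
    The logarithm base (10) is an assumption of this sketch.
    """
    p_avg = 0.5 * (p_hold0_pw + p_hold1_pw)
    if p_avg <= 1.0:
        raise ValueError("P_avg must exceed 1 pW for log(P_avg) to be positive")
    return (wm + rsnm) / ((t_write + t_read) * math.log10(p_avg) * area)

def normalized_sfom(cell, reference):
    """SFOM of `cell` divided by the SFOM of `reference` (e.g. the 6T cell)."""
    return sfom(**cell) / sfom(**reference)

# Hypothetical inputs for illustration only (not values from the paper).
ref_6t = dict(wm=0.2, rsnm=0.1, t_write=10.0, t_read=20.0,
              p_hold0_pw=100.0, p_hold1_pw=100.0, area=1.0)
cell_x = dict(wm=0.5, rsnm=0.1, t_write=5.0, t_read=25.0,
              p_hold0_pw=10.0, p_hold1_pw=190.0, area=1.0)
ratio = normalized_sfom(cell_x, ref_6t)
print(f"normalized SFOM = {ratio:.2f}")
```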

Table 3 lists the SFOM values of the 5T, 7T, 8T, and 9T cells normalized to the SFOM of the 6T cell for different technologies. It should be noted that two sets of results have been presented for the 7T cell due to two different normalization factors of the 6T cell (LP and HP).

As shown in the table, in the case of the HP cells, our structure is the second best cell, while for the LP cells, the 7T cell outperforms all others. Therefore, the suggested 7T cell could be considered as one of the better options for both high performance and low power technologies. We also included the results for a temperature of 100 °C. As the results reveal, unlike the $Q$ factor, the SFOM does not change substantially with the temperature increase.

4.2. Study of temperature effect

In addition to the environmental parameters, due to the workloads of different parts of a digital chip, there can be large temperature gradients between different parts of the chip which may be close to or far from the hot spots of the chip [34,35]. This gradient may increase the temperature of the SRAM blocks. To look at the temperature effect, first, we have plotted the ON current (when $V_{\text{GS}}=V_{\text{DS}}=V_{\text{DD}}=0.5 \text{ V}$) and threshold voltage of the FinFET transistors for sub-20 nm technologies at −40 °C and 100 °C obtained from HSPICE simulations. The threshold voltage was taken as the $V_{\text{GS}}$ at which $I_{\text{DS}}=300 \text{ nA} \times (W/L)$ with $V_{\text{DS}}=50 \text{ mV}$ [12]. The ON current plotted in Fig. 16(a) increases with the temperature, while the absolute value of the threshold voltage shown in Fig. 16(b) decreases when the temperature increases (see, e.g., [5]). In fact, the decrease in the threshold voltage causes the ON current to increase with the temperature (see, e.g., [36]). Also, note that the reduction in the threshold voltage of the pMOS devices is more than that of the nMOS transistors.
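
The constant-current threshold voltage definition used above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the log-domain interpolation between sweep points, and the synthetic exponential I-V data are hypothetical and are not taken from the HSPICE decks used in this work.

```python
import math


def constant_current_vth(vgs, ids, w_over_l, i_crit=300e-9):
    """Return the |V_GS| at which |I_DS| first reaches i_crit * (W/L).

    vgs: gate-source voltage magnitudes in ascending order (V)
    ids: drain currents from a sweep at V_DS = 50 mV (A)
    The interpolation between sweep points is done in log(I), which is an
    assumption suited to the near-exponential turn-on region.
    """
    target = i_crit * w_over_l
    for k in range(1, len(vgs)):
        i0, i1 = abs(ids[k - 1]), abs(ids[k])
        if i0 < target <= i1:
            if i0 > 0.0:
                frac = (math.log(target) - math.log(i0)) / (math.log(i1) - math.log(i0))
            else:
                frac = target / i1  # fall back to linear interpolation from zero current
            return vgs[k - 1] + frac * (vgs[k] - vgs[k - 1])
    raise ValueError("the sweep does not cross the target current")


# Synthetic sweep (hypothetical device): 1 nA at 0 V, one decade per 100 mV.
sweep_vgs = [0.05 * k for k in range(11)]
sweep_ids = [1e-9 * 10 ** (v / 0.1) for v in sweep_vgs]
vth = constant_current_vth(sweep_vgs, sweep_ids, w_over_l=1.0)
```

For pMOS devices the same routine applies to the magnitudes of the gate voltage and drain current, which matches the use of the absolute threshold voltage in Fig. 16(b).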

Based on these observations, we now study the metric variations caused by the temperature increase from $-40\,^\circ\text{C}$ to $100\,^\circ\text{C}$. The results were obtained from

$$\Delta \text{Param} = \frac{\text{Param}_{@100\,^\circ\text{C}} - \text{Param}_{@-40\,^\circ\text{C}}}{\text{Param}_{@-40\,^\circ\text{C}}} \times 100$$

where $\text{Param}$ is an SRAM parameter.
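
For illustration, the relative change defined above can be computed with a one-line helper. The function name and the sample values are hypothetical.

```python
def percent_change(param_at_100c, param_at_minus40c):
    """Relative change of a parameter from -40 degC to 100 degC, in percent."""
    if param_at_minus40c == 0:
        raise ValueError("the -40 degC value must be non-zero")
    return (param_at_100c - param_at_minus40c) / param_at_minus40c * 100.0


# Hypothetical example: a parameter growing from 1.0 to 1.5 (arbitrary units).
delta = percent_change(1.5, 1.0)
```

A positive value indicates that the parameter grows with temperature, and a negative value indicates that it shrinks.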

Fig. 17 provides the write time and write margin variations caused by the temperature increase from $-40\,^\circ\text{C}$ to $100\,^\circ\text{C}$.

![Image](image1.png)

Fig. 16. (a) ON current and (b) absolute threshold voltage of FinFET transistors for sub-20 nm technologies at $-40\,^\circ\text{C}$ and $100\,^\circ\text{C}$.

Fig. 19(a) and (b) show the effect of the temperature increase on the hold stability of the cells. As previously mentioned, the threshold voltage reduction decreases the HSNM of the 5T, 6T, 7T, and 8T cells. In the case of the 9T structure, the reduction increases the hold stability. The variation of the RSNM of the cells due to the temperature increase is illustrated in Fig. 19(c) and (d). The larger reduction in the pMOS threshold voltage increases $V_{\text{trip}}$ of the cell. The decrease in the absolute value of the threshold voltage itself reduces the RSNM [18], while the increase in $V_{\text{trip}}$ increases the RSNM [24]. As the technology scales, because of the increase in the absolute value of the threshold voltage (see Fig. 16(b)), the effect of $V_{\text{trip}}$ becomes more dominant, increasing the RSNM.

As shown in Fig. 20, the temperature rise increases the read current, which is due to the increase of the ON current with the temperature (see also Fig. 9). Since the high $V_{\text{TH}}$ transistors operate in the near-threshold region, the temperature effect is more dominant for them (the ON current has a relation to the threshold voltage which is similar to the sub-threshold exponential relation) and hence a large read current increment is seen. For low $V_{\text{TH}}$ transistors, however, the ON current mainly follows the linear relation with the threshold voltage (for velocity saturated devices) and hence the increase is smaller.

### 4.3. Study of process variation effect

In this Section, we study the impact of process variations on the SRAM cell characteristics discussed in this work. To account for the global process variation, we consider Gaussian distributions for $L_g$, $t_{si}$, and $H_{fin}$ with $3\sigma = 10\%$ of their nominal values and for the gate oxide thickness with $3\sigma = 5\%$ of its nominal value [1,37,38]. Additionally, the local variability is only assumed for $t_{si}$ and $L_g$ due to line edge roughness (LER) [37].

Table 3. SFOM values of the 5T, 7T, 8T, and 9T cells normalized to the SFOM of the 6T cell for different technologies.

<table>
<thead>
<tr><th>Technology</th><th>Temp. (°C)</th><th>5T</th><th>7T</th><th>8T</th><th>9T</th></tr>
</thead>
<tbody>
<tr><td>20 nm HP</td><td>25</td><td>2.16</td><td>1.80</td><td>1.37</td><td>0.64</td></tr>
<tr><td></td><td>100</td><td>3.35</td><td>2.58</td><td>1.73</td><td>0.94</td></tr>
<tr><td>20 nm LP</td><td>25</td><td>2.43</td><td>7.68</td><td>1.28</td><td>0.54</td></tr>
<tr><td></td><td>100</td><td>3.32</td><td>7.34</td><td>1.50</td><td>0.71</td></tr>
<tr><td>16 nm HP</td><td>25</td><td>2.07</td><td>1.76</td><td>1.37</td><td>0.62</td></tr>
<tr><td></td><td>100</td><td>3.04</td><td>2.50</td><td>1.69</td><td>0.90</td></tr>
<tr><td>16 nm LP</td><td>25</td><td>2.43</td><td>8.09</td><td>1.28</td><td>0.54</td></tr>
<tr><td></td><td>100</td><td>3.24</td><td>7.90</td><td>1.51</td><td>0.71</td></tr>
<tr><td>14 nm HP</td><td>25</td><td>1.98</td><td>1.67</td><td>1.36</td><td>0.60</td></tr>
<tr><td></td><td>100</td><td>2.82</td><td>2.40</td><td>1.65</td><td>0.84</td></tr>
<tr><td>14 nm LP</td><td>25</td><td>2.40</td><td>8.62</td><td>1.28</td><td>0.54</td></tr>
<tr><td></td><td>100</td><td>3.13</td><td>8.41</td><td>1.52</td><td>0.71</td></tr>
<tr><td>10 nm HP</td><td>25</td><td>1.72</td><td>1.69</td><td>1.36</td><td>0.61</td></tr>
<tr><td></td><td>100</td><td>2.67</td><td>2.42</td><td>1.63</td><td>0.85</td></tr>
<tr><td>10 nm LP</td><td>25</td><td>2.34</td><td>9.08</td><td>1.29</td><td>0.53</td></tr>
<tr><td></td><td>100</td><td>3.06</td><td>8.77</td><td>1.54</td><td>0.72</td></tr>
<tr><td>7 nm HP</td><td>25</td><td>1.85</td><td>1.67</td><td>1.37</td><td>0.61</td></tr>
<tr><td></td><td>100</td><td>2.56</td><td>2.36</td><td>1.63</td><td>0.84</td></tr>
<tr><td>7 nm LP</td><td>25</td><td>2.45</td><td>10.73</td><td>1.30</td><td>0.55</td></tr>
<tr><td></td><td>100</td><td>3.12</td><td>9.91</td><td>1.54</td><td>0.73</td></tr>
</tbody>
</table>

The required values for LER are taken from ITRS [1,37]. It should be noted that we assumed the gate last process in which the variability of the gate work function is negligible [39]. Also, the experimental results presented in [38,40] show that random dopant fluctuations can be ignored for FinFETs. To compare the variability of the cells, we used the cell sigma (CS) which is defined by dividing the mean of a parameter by its standard deviation [38]. Its value determines the minimum variation (e.g., $1\sigma$ or $2\sigma$ from the mean) needed to cause a write, read, or hold failure. Nowadays, six-sigma ($6\sigma$) yield or larger is required for large SRAM arrays [38]. The study was performed for both LP and HP cells at 25 °C. The results were obtained after 5000 Monte Carlo simulations using HSPICE. The write margin CS values for the worst case, which corresponded to LP cells, are shown in Fig. 21 at $V_{DD} = 0.5$ and 0.4 V. As the write margin of the proposed 7T structure is equal to $V_{DD}$ for most of the samples (which causes a very high CS value), we did not include its CS values in this figure. The results of our simulations reveal that all the cells have sigma values larger than six at $V_{DD} = 0.5$ V. The 6T and 8T structures have the least write margin CS due to the existence of a strong feedback loop. In the 9T cell, because of M9, the write operation is performed more easily than the 6T cell leading to a higher write margin CS. In the 5T structure, the weak feedback causes a higher write margin CS than that of the 6T cell. In our proposed cell, the existence of three access transistors makes writing even easier than the case of the 5T cell. Therefore, the write margin CS value for the 7T cell is the largest among all the cells. To study the impact of the supply voltage scaling on the process variation characteristics, we repeated the simulations for all the cases assuming a $V_{DD}$ of 0.4 V which revealed that CS values decrease in this case. 
The results demonstrate that the 6T and 8T (9T) cells do not meet the $6\sigma$ stability criterion except at the 20 nm (20 nm and 16 nm) technology node. Again, the CS values for the 7T cell are too large to be shown in the figure.

Fig. 17. The effect of temperature increase from −40 °C to 100 °C on the write time and margin for (a, c) LP and (b, d) HP cells. For the case of the 7T cell, except for the 7 nm technology, the write margin increase is zero.

Fig. 18. The effect of temperature increase from −40 °C to 100 °C on the static power for (a) LP and (b) HP cells. The upper (lower) bar of the stacks corresponds to the hold “1” (“0”) static power.
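
The cell sigma used in this Section (the mean of a parameter divided by its standard deviation) can be illustrated with a short sketch. The linear toy margin model, the nominal gate length, the fixed random seed, and the function names below are hypothetical stand-ins for the HSPICE Monte Carlo runs; the use of the sample standard deviation is also an assumption.

```python
import random
import statistics


def sigma_from_3sigma_percent(nominal, three_sigma_percent):
    """Convert a '3 sigma = x% of nominal' specification to one standard deviation."""
    return nominal * three_sigma_percent / 100.0 / 3.0


def cell_sigma(samples):
    """Cell sigma (CS): mean of the metric divided by its sample standard deviation."""
    return statistics.mean(samples) / statistics.stdev(samples)


def meets_yield(samples, required_cs=6.0):
    """True if the metric satisfies the required cell sigma (six by default)."""
    return cell_sigma(samples) >= required_cs


# Toy Monte Carlo run: Gaussian gate length with 3 sigma = 10% of nominal.
rng = random.Random(0)                      # fixed seed for repeatability
lg_nominal = 20e-9                          # hypothetical nominal gate length (m)
lg_sigma = sigma_from_3sigma_percent(lg_nominal, 10.0)


def toy_margin(lg):
    """Hypothetical margin model (V), linear in the gate length."""
    return 0.2 * lg / lg_nominal


margins = [toy_margin(rng.gauss(lg_nominal, lg_sigma)) for _ in range(5000)]
cs = cell_sigma(margins)
```

In the actual flow each sample would come from a circuit simulation of the stability metric (write margin, RSNM, or HSNM) rather than from a closed-form model.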

Next, the HSNM CS values of HP cells (as the worst case) at $V_{DD}=0.5$ and 0.4 V are shown in Fig. 22. One may expect that the minimum CS values occur at the 7 nm technology, which is the most highly scaled node. The CS values for the 10 nm technology, however, are lower than the corresponding ones for the 7 nm technology node. This may be attributed to different parameters of the technologies such as $L_g$ to $L_x$ ratios (see Fig. 4), which reduce the effect of process variation in the 7 nm technology node (see, e.g., [41,42]). Since the feedback in the 9T structure is weaker than in the 6T and 8T cells, its CS is lower than those of these cells. The 5T and 7T cells have the lowest HSNM CS values, although their minimum values are still above six even at $V_{DD}=0.4$ V.

In addition, we performed Monte Carlo simulations to assess the dynamic stability of the half-selected cells in the same column as the cell in which the opposite data is being written. The histogram of the $(t_{write-disturb}/t_{write})$ ratio for our proposed 7T cell for the worst case condition (the 10 nm technology node) is shown in Fig. 23. As shown in the figure, the time for the unwanted data to be written in the cell in the hold state ($t_{write-disturb}$) is at least six orders of magnitude larger than the write time ($t_{write}$) of the cell in the write state. Hence, by choosing an appropriate signal time during write, the write disturb may be eliminated completely.

In the case of RSNM, the CS values for the worst case (HP cells) at $V_{DD}=0.5$ and 0.4 V are shown in Fig. 24. In the case of $V_{DD}=0.5$ V, the 8T cell has the highest CS value due to using the read buffer. In the 9T cell, because of the weaker feedback, the RSNM CS (which is equal to its HSNM CS) is less than that of the 8T structure. On the other hand, in the 6T cell, due to the lack of a read buffer, the RSNM CS is considerably less than that of the 8T cell. The 5T structure has the lowest RSNM CS values because it uses neither the pull down transistor nor a read buffer. Its RSNM CS becomes lower than six at the 10 nm and 7 nm technologies. The CS of our proposed cell is higher than those of the 5T and 6T cells due to its higher average RSNM value. For the case of $V_{DD}=0.4$ V, the same explanations are valid. For this case, however, the 5T and 6T cells do not meet the stability criterion.

Finally, we present the results for the minimum operating voltage ($V_{\text{min}}$) for each cell in each technology in Table 4. This voltage corresponds to the minimum supply voltage at which the CS values of the stability metrics including HSNM, RSNM, and write margin are above six [41]. The superscript number shows the parameter which violates the 6-sigma minimum stability requirement. As the figures in the table indicate, except for the 20 nm technology, the 7T cell has the lowest $V_{\text{min}}$ among the LP cells. For the 20 nm technology, the 9T LP cell has the minimum $V_{\text{min}}$ among the LP cells at the price of a 59% larger area. For this technology, the 7T cell has the next lowest $V_{\text{min}}$. It should be noted that while the 8T and 9T HP cells have lower $V_{\text{min}}$ values compared to that of the 7T cell, their power consumptions are still about $2\times$ higher than that of the 7T cell.
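
The definition of the minimum operating voltage given above can be sketched as a downward scan over a supply voltage sweep. The function name, the data layout, and the CS numbers are hypothetical and only illustrate the selection rule; they are not values from Table 4.

```python
def minimum_operating_voltage(cs_by_vdd, required_cs=6.0):
    """Scan the swept supply voltages from high to low.

    cs_by_vdd maps a supply voltage (V) to a dict of CS values per metric,
    e.g. {"HSNM": 9.0, "RSNM": 8.0, "WM": 12.0}.
    Returns (v_min, limiting_metrics): the lowest voltage at which every
    metric still meets required_cs, and the metrics that fail at the next
    lower swept voltage (an empty list if nothing fails within the sweep).
    """
    v_min = None
    limiting = []
    for vdd in sorted(cs_by_vdd, reverse=True):
        failing = sorted(m for m, cs in cs_by_vdd[vdd].items() if cs < required_cs)
        if failing:
            limiting = failing
            break
        v_min = vdd
    return v_min, limiting


# Hypothetical sweep for one cell (values are illustrative only).
sweep = {
    0.50: {"HSNM": 9.0, "RSNM": 8.0, "WM": 12.0},
    0.45: {"HSNM": 8.0, "RSNM": 6.5, "WM": 10.0},
    0.40: {"HSNM": 7.0, "RSNM": 5.2, "WM": 8.0},
}
v_min, limiting = minimum_operating_voltage(sweep)
```

The returned list of limiting metrics plays the role of the superscript in the table, which marks the parameter that violates the requirement.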

### 5. Conclusions

In this paper, we suggested a 7T SRAM cell structure for high stability and write speed. The cell, which was based on a recently proposed 5T cell, had an extra access transistor as well as a footer transistor to reduce the static power. To assess the efficiency of the cell, its characteristics were compared to those of the 5T, 6T, 8T, and 9T structures. The comparative study was performed using HSPICE simulations at sub-20 nm FinFET technologies at temperatures of −40 °C, 25 °C, and 100 °C. The simulation results showed that the cell had superior write speed and stability while decreasing (increasing) the average static power consumption (RSNM) by at least 57% (22%) as compared to that of the 5T cell. Also, our structure had a moderate area among the structures while having read SNM values around that of the 9T cell and higher than those of the 6T cells. We also compared the characteristics of the cells in the presence of the process variations at two supply voltages of 0.4 V and 0.5 V. While the 5T and 6T (6T, 8T, and 9T) cells did not meet the read (write) stability yield requirement in some cases, the suggested 7T structure met the $6\sigma$ yield requirement in all the cases. Finally, our proposed structure had the lowest minimum operating voltage among the low power cells for the 16, 14, 10, and 7 nm technologies.

### Acknowledgements

MA, HA, BE, and AA acknowledge the financial support by the Iranian National Science Foundation (INSF).

### References

Q. Li, T.T. Kim, A 9T subthreshold SRAM bitcell with data-independent bitline
S. Narendra, S. Borkar, V. De, D. Antoniadis, A. Chandrakasan, Scaling of stack
B. Ebrahimi, M. Rostami, A. Afzali-Kusha, M. Pedram, Statistical design
T. Matsukawa, Y. Lue, W. Mizubayashi, et al., Suppressing Vt and Gm variability
Y.L. Yeoh, B. Wang, X. Yu, et al., A 0.4 V 7T SRAM with write through virtual ground and ultra-fine grain power gating switches, in: Proceedings of IEEE
edu/–/PTM/, 2012.
A. Makino, S. Nakata, H. Suzuki, S. Mutoh, M. Miyama, T. Yoshimura, S. Iwade,
H. Makino, S. Nakata, H. Suzuki, S. Mutoh, M. Miyama, T. Yoshimura, S. Iwade,
A. Teman, L. Pergament, O. Cohen, A. Fish, A 250 mV 8 kb 40 nm ultra-low power 9T supply feedback SRAM (SF-SRAM), IEEE J. Solid-State Circuits 46 (11)
C.B. Kushwah, S.K. Vishvakarma, D. Dwivedi, Single-ended sub-threshold finfet 7T SRAM cell without boosted supply, in: Proceedings of IEEE ICCDIT, 2014, pp. 28–30.
E. Seevinck, F.J. List, J. Lohstroh, Static-noise margin analysis of MOS SRAM
A. Teman, A. Mordkhabai, J. Mezhibovski, A. Fish, A 40-nm sub-threshold 5T

Ali Afzali-Kusha received his B.Sc., M.Sc., and Ph.D. degrees, all in Electrical Engineering, from Sharif University of Technology, University of Pittsburgh, and University of Michigan in 1988, 1991, and 1994, respectively. From 1994 to 1995, he was a Post-Doctoral Fellow at The University of Michigan. In 1995, he joined The University of Tehran, where he is currently a Professor of the School of Electrical and Computer Engineering and the Director of Low-Power High-Performance Nanosystems Laboratory. Also, on a research leave from the University of Tehran, he was a Research Fellow at University of Toronto and University of Waterloo in 1998 and 1999, respectively. He is a senior member of IEEE, and his current research interests include low-power high-performance design methodologies from the physical design level to the system level for the nanoelectronics era.

Massoud Pedram, who is the Stephen and Etta Varra Professor in the Ming Hsieh Department of Electrical Engineering at University of Southern California, received a Ph.D. in Electrical Engineering and Computer Sciences from the University of California, Berkeley in 1991. He holds 10 U.S. patents and has published four books, 12 book chapters, and more than 140 archival and 350 conference papers. His research ranges from low power electronics, energy-efficient processing, and cloud computing to photovoltaic cell power generation, energy storage, and power conversion, and from RT level optimization of VLSI circuits to synthesis and physical design of quantum circuits. For this research, he and his students have received six conference and two IEEE Transactions Best Paper Awards. Dr. Pedram is a recipient of the 1996 Presidential Early Career Award for Scientists and Engineers, a Fellow of the IEEE, an ACM Distinguished Scientist, and currently serves as the Editor-in-Chief of the ACM Transactions on Design Automation of Electronic Systems. He has also served on the technical program committee of a number of premier conferences in his field and was the founding Technical Program Co-chair of the 1996 International Symposium on Low Power Electronics and Design and the Technical Program Chair of the 2002 International Symposium on Physical Design.