# On Designing Mixed-Signal Programmable Fuzzy Logic Controllers as Embedded Subsystems in Standard CMOS Technologies

Carlos Dualibe<sup>1,2</sup>

<sup>1</sup>Laboratorio de Microelectrónica, Universidad
Católica de Córdoba, Ob. Trejo 323,
5000-Córdoba, Argentina.
E-mail: dualibe@dice.ucl.ac.be

Paul Jespers<sup>2</sup> and Michel Verleysen<sup>2</sup>

<sup>2</sup>Microelectronics Laboratory, DICE, Université catholique de Louvain, Place du Levant 3,

B-1348 Louvain-la-Neuve, Belgium.

E-mail: jespers, verleysen@dice.ucl.ac.be

#### **Abstract**

A digitally programmable analogue Fuzzy Logic Controller (FLC) is presented. Input and output signals are processed in the analog domain, whereas the parameters of the controller are stored in a built-in digital memory. Some new functional blocks have been designed, and others have been improved, to optimise power consumption, speed and modularity while keeping the reasonable accuracy needed in several analogue signal processing applications. A nine-rule, two-input, one-output prototype was fabricated in a standard 2.4 µm CMOS technology and successfully tested, showing good agreement with the expected performance, namely: from 2.22 to 5.26 Mflips (mega fuzzy logic inferences per second) at the pin terminals (@ CL = 13 pF), 933 µW power consumption per rule (@ Vdd = 5 V) and 5 bits of resolution. Since the circuit is intended as a subsystem embedded in an application chip (@ $CL \le 5pF$), up to 8 Mflips may be expected.

## 1. Introduction

In recent years the application of Fuzzy Logic has extended beyond the classical Process Control area where it was employed from the beginning. Signal Processing, Image Processing and Power Electronics seem to be other niches where this soft-computing technique can meet a broad range of applications. As real-time processing needs ever faster, more autonomous and less power-consuming circuits, on-chip controllers become an interesting option. Digital Fuzzy Logic chips provide enough performance for general applications, but their speed is limited compared with their analogue counterparts. Furthermore, in real-time applications digital fuzzy processors need A/D and D/A converters to interface sensors and actuators, respectively. On the other hand, pure analog processors lack flexibility, since full analog programmability is only feasible in special technologies allowing analog storage devices (e.g. floating-gate transistors). However, within standard CMOS technologies, a trade-off between accuracy and flexibility is achieved when a finite discrete set of analogue parameters is provided. For instance, a voltage parameter can be set by using a binary-scaled set of current sources yielding a discrete set of voltage drops across a linear resistor. In such a case, it is possible to use a digital memory to store a given binary combination of the set of currents. This technique gives rise to the so-called Mixed-Signal analogue computation circuits [2].
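The binary-scaled parameter setting described above can be illustrated with a minimal behavioural sketch. The unit current `i_unit` and the resistance `r_lin` are hypothetical values chosen only so that a 5-bit code spans roughly 3 V; they are not taken from the circuit.

```python
def programmable_voltage(code: int, n_bits: int = 5,
                         i_unit: float = 1e-6, r_lin: float = 100e3) -> float:
    """Voltage drop across a linear resistor fed by binary-scaled current sources.

    i_unit (A) and r_lin (ohm) are assumed illustrative values.
    """
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range for the given number of bits")
    # Bit k of the stored code switches in a current source of weight 2**k.
    current = sum(((code >> k) & 1) * i_unit * 2 ** k for k in range(n_bits))
    return current * r_lin


# A 5-bit memory word selects one of 32 equally spaced voltage levels.
levels = [programmable_voltage(code) for code in range(32)]
```

The digital memory only has to hold `code`; the analogue value is regenerated continuously by the current sources.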

It has been shown that analogue current-mode FLCs [2] lend themselves to simple rule-evaluation and aggregation circuits that can work at a reasonable speed. If some of the unwanted intermediate current-to-voltage and/or voltage-to-current converters can be avoided, the delay through cascaded operators may be shortened further and higher speeds achieved. This is relevant when fuzzifier [2, 3] and defuzzifier [4] circuits are being designed, because these circuits normally interact with a voltage-mode controlled environment. On the other hand, to reduce silicon die area and power consumption, some building blocks can be shared without altering functionality. As a result, a relatively low-complexity layout can be obtained, which leads to an additional gain in speed.

In this work, a low-power digitally programmable analogue Fuzzy Logic Controller (Mixed-Signal FLC) is introduced. It is intended for embedded subsystems, as required by medium-accuracy analogue signal processing applications (e.g. non-linear filtering [1], power electronics [9], etc.). Keeping in mind the issues set out above, new operators were designed while others were optimized, achieving a flexible and high-performance controller notwithstanding the limits imposed by the technology that was used for the demonstrator.

## 2. Architecture of the controller

Because it offers a good trade-off between simplicity and accuracy, a zero-order Sugeno architecture (consequents are singletons) was chosen. Figure 1 shows the block diagram of a two-input, one-output controller, highlighting the three well-known basic fuzzy operations (fuzzification, rule evaluation and defuzzification), which are performed concurrently. The MIN inference method may also be stated as: MIN (A, B, ...) = 1 - MAX (1-A, 1-B, ...). Therefore, for the general case of a q-input, m-rule controller, the first two operations are performed by a set of Complementary Fuzzy Membership Functions (CFMF) per input, shared by several rules, followed by m q-input MAX operators. After complementing the outputs of the MAX operators, the firing degree of each rule is provided in the form of a current signal Ii.

At the last stage, each Ii current is replicated (n+1) times via unity-gain mirrors, where n stands for the resolution of the discrete singleton value $\alpha i$ of the consequents of the rules. This value is encoded by the state of the switches Cn-1, ..., C0.

Finally, a common current-mode Digital-to-Analogue converter (D/A), used as a weighting operator, together with an analogue divider, computes the Averaged-Weighted Sum (AWS), rendering the defuzzified output value Vo equal to:

$$V_o = k \frac{\sum_{i=1}^{m} \alpha_i I_i}{\sum_{i=1}^{m} I_i}, \qquad (1)$$

where k is a voltage-dimension constant defined by the transfer function of the divider itself.

### 2.1. Complementary fuzzy membership functions

Low-power fuzzy controllers need relatively low transconductance values for their membership function circuits. CMOS triode transconductors are well suited to meet that requirement. The circuit of the complementary fuzzifier is depicted in figure 2 a). It is composed of two almost linear regulated-cascode transconductors (ML1, ML3, DAL - MR1, MR3, DAR), each one controlling one edge of the CFMF, whose shape is nearly an inverted trapezoid. Transistors ML2, MR2 have fixed large sizes, so that their gate-voltage overdrive (Vgs-VTn) can be neglected. Reference voltages VKL, VKR define the knees where conduction begins falling towards zero or rising towards Io, respectively, in each transconductor. Slopes and knees are independently programmable.

The drain-to-source voltage drops Vds of transistors ML1, MR1 are kept constant over a wide range of the input voltage Vin, and their magnitudes are fixed by means of the artificially increased offset voltages of the differential amplifiers DAL, DAR. Since these offsets are smaller than the saturation drain-source voltage Vds<sub>sat</sub> of transistors ML1, MR1, these transistors are constrained to operate in the triode region. Thus, their transconductance gm, which defines the slopes of the trapezoid, is given by:

$$gm = \frac{\partial Id}{\partial Vgs} = \mu Cox \frac{W}{L} Vds. \qquad (2)$$

In figure 2 b), the schematic of the differential amplifiers is shown. By sizing $(W/L)_{M1} > (W/L)_{M2}$ a voltage offset between their inputs (V- - V+ = Vds) is established, and it is linearly controlled by the voltage source Vs as follows:

$$Vds = \left(\sqrt{\frac{(W/L)_{M5}}{2(W/L)_{M2}}} - \sqrt{\frac{(W/L)_{M5}}{2(W/L)_{M1}}}\right) (Vdd - |VTp| - Vs). \qquad (3)$$

In this way slopes can be electrically tuned, which is an advantage when analogue storage is available compared to the typical four-transistor CFMF operators [2, 3]. In the latter, input transistors are saturated and the tail current Io must be fixed (Io $\equiv$ logical '1'). Even if in both cases slopes are discretely programmed via a set of differently sized input transistors, calling N the ratio between the maximum and minimum desired slopes, the ratio between the maximum and minimum transistor sizes needed in our case, from (2), becomes N. For the second case this ratio is equal to N<sup>2</sup>; thus, for a given range of slopes, the whole set of saturated input transistors would demand a larger silicon area. Moreover, in this version of the controller we have combined a few discrete values of Vs with several input transistor sizes in order to optimize the range of slopes at a low cost in terms of silicon area. Finally, a piece-wise expression for the output current Iout of the CFMF is roughly given by (4), where gm<sub>L1</sub> and gm<sub>R1</sub> result from combining expressions (2) and (3) for each transconductor of the CFMF. In the actual implementation the knee voltages VKR and VKL are each obtained by means of a set of binary-scaled currents yielding 32 equally spaced voltage drops across a MOST-only grounded linear resistor. Figure 2 c) shows some measured curves. Note that we could easily get $N\approx9.5$, whereas the input range of Vin reaches 3 V.

Fig. 2 - a) Complementary Fuzzy Membership Function (CFMF) circuit. b) Differential amplifiers DAL, DAR. c) Some measured 'Iout vs. Vin' transfer functions of the CFMF, obtained with an HP4145B instrument.
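One plausible reading of the piece-wise behaviour described in this section is sketched below. It is an idealised model and not the authors' expression (4): it assumes perfectly linear edges, hard clipping at 0 and Io, and simple summation of the two transconductor currents. The tail current Io = 10 µA is the value reported in section 3; the knee voltages and transconductances in the example calls are hypothetical.

```python
def cfmf_current(vin: float, vkl: float, vkr: float,
                 gm_l: float, gm_r: float, io: float = 10e-6) -> float:
    """Idealised inverted-trapezoid CFMF output current (assumed model)."""
    # Left transconductor: conducts Io below the knee VKL, then falls to zero.
    left = min(io, max(0.0, io - gm_l * (vin - vkl)))
    # Right transconductor: off below the knee VKR, then rises towards Io.
    right = min(io, max(0.0, gm_r * (vin - vkr)))
    # Assumption: the two branch currents add and cannot exceed Io.
    return min(io, left + right)


def membership(vin: float, **params) -> float:
    """Direct (non-complemented) membership degree, 1 - Iout/Io."""
    io = params.get("io", 10e-6)
    return 1.0 - cfmf_current(vin, **params) / io


# Hypothetical label: knees at 2.0 V and 3.0 V, 20 uA/V slopes on both edges.
label = dict(vkl=2.0, vkr=3.0, gm_l=20e-6, gm_r=20e-6)
core = membership(2.75, **label)   # inside the flat bottom of the trapezoid
edge = membership(2.25, **label)   # halfway down the left edge
```

With these assumed numbers each edge is Io/gm = 0.5 V wide, so the complementary current is zero (full membership) between 2.5 V and 3.0 V.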

### 2.2. Multiple-input MAX operator

The Winner-Take-All circuit presented in [5] was adopted for the MAX operator, with some modifications. The circuit depicted in figure 3 a) is composed of current-controlled voltage sources (M1, M2, M3, M4 and M5) connected to a common node Vc, each competing to impose its own voltage, which is proportional to its controlling current. Transistors Mc1, Mc2, connected as a cascode diode and common to all cells, convey the highest current to the output. The gate voltages of transistors M1 belonging to the losers fall, and those transistors switch off. Diode-connected transistors M2, M3 guarantee a voltage level of at least 2VTn at the gate of the loser transistors M1. In this way the recovery time of these cells (i.e. when any of them passes from loser to winner) is improved. Since transistors M4, M5 are cascoded, an accurate replica of the winner current is ensured.
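At the behavioural level, the rule-evaluation path of section 2 (complemented memberships, a MAX operator, then a final complement) reduces to De Morgan's identity. The following sketch uses normalised degrees in [0, 1] instead of currents; the function name is illustrative only.

```python
def rule_firing_degree(memberships: list[float]) -> float:
    """MIN of the antecedent degrees, computed as 1 - MAX of their complements."""
    complements = [1.0 - mu for mu in memberships]   # what the CFMFs deliver
    winner = max(complements)                        # what the MAX circuit conveys
    return 1.0 - winner                              # complemented at the output


# Two-input rule with antecedent degrees 0.7 and 0.4.
firing = rule_firing_degree([0.7, 0.4])
```

In the circuit the same quantities are currents normalised to the tail current Io, so the final complement corresponds to subtracting the winner current from Io.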



Fig. 3 - a) Multiple-Input MAXIMUM circuit. b) Weighting D/A circuit

### 2.3. Consequents and Defuzzifier

*Singletons:* for the consequent of each rule, a discrete singleton $\alpha i$ smaller than 1 is given by:

$$\alpha_i = (C_{n-1})_i \, 2^{-1} + (C_{n-2})_i \, 2^{-2} + \dots + (C_0)_i \, 2^{-n}, \qquad (5)$$

where i ranges from 1 to m and the coefficients (Cn-1)<sub>i</sub>, ..., (C0)<sub>i</sub> take binary values. In figure 1, the outputs of the (n+1) current mirrors of the whole set of m consequents are summed column-wise to give the following (n+1) values:

$$\sum_{i} I_i; \quad \sum_{i} (C_{n-1})_i I_i; \quad \dots; \quad \sum_{i} (C_0)_i I_i. \qquad (6)$$

Except for the first term above, all the others are weighted and summed in the common D/A, whose circuit is shown in figure 3 b). Therefore, the output current Iout of the D/A is equal to:

$$\sum_{i} (C_{n-1})_i \, 2^{-1} I_i + \dots + \sum_{i} (C_0)_i \, 2^{-n} I_i = \sum_{i} \alpha_i I_i. \qquad (7)$$

First ideas on common weighting can be found in [11]. Most approaches found in the literature to perform this operation use one individual weighting operator per rule [2] [6]. With the common D/A used here, a large saving of silicon area can be obtained compared to the local D/A approach. In addition, the input capacitance of each consequent is reduced by a factor 2<sup>n</sup>/(n+1). Furthermore, since the layout of the whole defuzzifier becomes smaller, routing capacitances are also reduced. As a result, a considerable gain in speed can be achieved. Moreover, in order to improve the matching properties, and consequently the accuracy of the converter, it can be built using non-minimum-size transistors without spending too much silicon area.
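The equivalence stated in (7) between the shared D/A and a per-rule weighting can be checked numerically. The sketch below is purely arithmetic; the firing currents and singleton codes are hypothetical, and each code is listed most significant bit first, i.e. (Cn-1, ..., C0).

```python
def singleton_value(code: list[int]) -> float:
    """Singleton alpha_i from its binary code, as in expression (5)."""
    return sum(bit * 2.0 ** -(k + 1) for k, bit in enumerate(code))


def per_rule_weighting(currents: list[float], codes: list[list[int]]) -> float:
    """One weighting operator per rule: sum of alpha_i * I_i."""
    return sum(singleton_value(code) * i for code, i in zip(codes, currents))


def common_da_weighting(currents: list[float], codes: list[list[int]]) -> float:
    """Column-wise sums of switched unit replicas, weighted once in a shared D/A."""
    n = len(codes[0])
    column_sums = [sum(code[k] * i for code, i in zip(codes, currents))
                   for k in range(n)]
    return sum(s * 2.0 ** -(k + 1) for k, s in enumerate(column_sums))


# Three rules, 5-bit singletons (alpha = 0.5, 0.375, 0.96875); currents in amperes.
currents = [2e-6, 5e-6, 0.0]
codes = [[1, 0, 0, 0, 0], [0, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
numerator = common_da_weighting(currents, codes)   # 2.875e-6 A
denominator = sum(currents)                        # first term of (6)
```

Both functions return the same value because the binary weights factor out of the sum over rules, which is what allows a single D/A to serve all consequents.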

*Analogue Divider:* a novel current-input voltage-output divider [4] was specially designed to carry out the division operation in formula (1). The circuit is shown in figure 4 a). With equal-sized transistors in each row of the circuit, the division is actually performed by transistors M1, M2, M3 at the bottom layer, all of them being constrained to operate in the triode region. The drain-to-source voltage drops Vds of those transistors are matched thanks to the common-gate connected transistors M4, M5, M6, which convey the same current. This is guaranteed by the upper PMOS cascode mirror (M7 to M12). While Vb1 and Vbo are fixed bias voltages, the gate voltage Vout of transistor M3 is self-adjusted so that the drain current of M6 matches the current imposed by the PMOS cascode-mirror branch M9, M12. In this way, the following relation holds [4]:

$$(Vout - Vbo) = Vo = (Vb1 - Vbo) \frac{IN}{ID}. \qquad (8)$$

Thus, if Vout is referred to Vbo, a two-quadrant divider is obtained. Since this divider behaves as a transresistor, there is no need for extra interface converter circuits either at the inputs [6] or at the output [7, 8]. Figure 4 b) displays some characteristics measured with an HP4145B instrument. Figure 4 c) illustrates the measured relative errors. The output offset (when IN=0) is lower than 1.6 mV.
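Relation (8) can be evaluated directly as an ideal transfer function. In the sketch below the default bias voltages are the ones used for the measurements in section 3 (Vbo = 1.7 V, Vb1 = 2.7 V); the currents in the example are hypothetical, and circuit non-idealities such as the measured offset are ignored.

```python
def divider_output(i_n: float, i_d: float,
                   vbo: float = 1.7, vb1: float = 2.7) -> float:
    """Ideal divider output voltage Vout from relation (8)."""
    if i_d <= 0.0:
        raise ValueError("the denominator current ID must be positive")
    return vbo + (vb1 - vbo) * i_n / i_d


# Numerator current from the D/A and denominator current from the rule sum.
vout = divider_output(i_n=5e-6, i_d=20e-6)   # 1.7 V + 1 V * 0.25
```

Comparing with (1), IN is the weighted sum of the rule currents, ID is their plain sum, and the constant k corresponds to (Vb1 - Vbo).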

## 3. Experimental results

In the fabricated two-input, one-output controller with nine fixed rules, three fuzzy labels are available per input, each one a programmable four-parameter CFMF (2x5 bits for knees and 2x4 bits for slopes). The singletons of the consequents are programmable with 5 bits. The tail current Io was set to 10 µA. Input voltages range from 1.5 V to 4.5 V. With Vbo=1.7V and Vb1=2.7V the output voltage ranges between those two values. Figures 5 a) and b) show the simulated and measured output surfaces, respectively, for a particular setting of the controller. The RMSE between these surfaces is 27 mV (2.7% of the 1 V output range). Figures 5 c) and d) illustrate the relative error surface between measured and simulated output values and the distribution of these errors, respectively. Notice from the last figure that most relative errors are concentrated inside a band of $\pm 3\%$. The dispersion between samples due to process fluctuations was also characterized, and the result from 6 measured prototypes is shown in figure 7 a). Varying from point to point over the output surface, the standard deviation features a peak of 62.5 mV (6.25%) and a mean value of 35 mV (3.5%).
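The programmability figures above can be cross-checked against the total storage capacity listed in Table 1. The short calculation below assumes that the three labels are available for each of the two inputs, which is the reading that reproduces the 153 bits reported there.

```python
# Antecedent parameters: each CFMF label stores two knees and two slopes.
inputs = 2
labels_per_input = 3                  # assumed to apply to each input
bits_per_label = 2 * 5 + 2 * 4        # 2 knees x 5 bits + 2 slopes x 4 bits

# Consequent parameters: one 5-bit singleton per rule.
rules = 9
bits_per_singleton = 5

antecedent_bits = inputs * labels_per_input * bits_per_label   # 108
consequent_bits = rules * bits_per_singleton                   # 45
total_bits = antecedent_bits + consequent_bits                 # 153
```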



Fig. 4 - a) Transresistance Divider circuit. b) Measured (x) and calculated (-) curves of the divider for  $0 < IN < 10 \mu A$  while ID ranges from 10  $\mu A$  to 30  $\mu A$  by  $4\mu A$  steps. c) Measured relative error of the divider.



Fig. 5 - DC test results: a) Simulated output surface. b) Measured output surface. c) Measured relative error surface. d) Relative error distribution. RMSE=27mV (2.7%).

The transient behavior of the controller was characterized by measuring the total input/output delay for small and large step signals applied at one of the inputs while biasing the other with a constant voltage level. In figure 6 a) the amplitude of the input step was set to $\Delta V$in = 500 mVpp, and the output responds with a 100 mVpp pulse that settles in 190 ns (to 90% of the steady-state value). In figure 6 b) the experiment is repeated, but sweeping one input over the whole input voltage range ( $\Delta Vin = 3V$ ). In this case a 500 mVpp pulse settles at the output in 450 ns (90%). Therefore, the speed of the controller ranges from 2.22 to 5.26 Mflips for an estimated output load capacitance of CL=13pF. Extrapolating these results to CL $\leq$ 5pF, the delay should range from 125 ns to 235 ns. Consequently, up to 8 Mflips would be achieved inside the chip. This last feature can be taken as evidence of the effectiveness of the strategies adopted for the design, namely: the avoidance of intermediate voltage-to-current and/or current-to-voltage converters, the reduced complexity of the defuzzifier and the simplicity of the divider.
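The speed figures quoted above follow from taking one inference per input/output delay. The helper below is a trivial illustration of that conversion; the function name is hypothetical.

```python
def mflips_from_delay(delay_ns: float) -> float:
    """Inference rate in Mflips, assuming one inference per input/output delay."""
    return 1e3 / delay_ns   # 1 / (delay in ns) expressed in millions per second


small_signal = mflips_from_delay(190)   # about 5.26 Mflips at CL = 13 pF
large_signal = mflips_from_delay(450)   # about 2.22 Mflips at CL = 13 pF
on_chip_best = mflips_from_delay(125)   # 8.0 Mflips, extrapolated for CL <= 5 pF
```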

Figure 7 b) shows the microphotograph of the controller. Its core occupies 3040 x 1500  $\mu m^2$, including digital storage circuits that represent almost 50% of the total area. The measured power consumption amounts to 13.4 mW (core: 8.4 mW, buffer: 5 mW) for Vdd=5V. Table 1 summarizes the performance attained by this prototype.

Fig. 6 - Step response of the controller: a) Small input step amplitude ( $\Delta Vin = 500mV$ ). Vertical scales: 0.5V/div input - 0.2V/div output. Time scale: 100ns/div. b) Large input step amplitude ( $\Delta Vin = 3V$ ). Vertical scales: 1V/div input - 0.2V/div output. Time scale: 200ns/div.

## 4. Conclusions

It has been shown that Mixed-Signal techniques can trade accuracy for improved flexibility while retaining the advantages of analogue circuits for massively parallel and fast computation together with the convenience of digital circuits for storage. The use of widespread digital memory circuits to store the digital representation of the controller's parameters greatly simplifies the on-chip programming strategy in a standard CMOS technology. In this way, our prototype behaves as a static RAM for programming purposes, whereas signal processing is carried out in the analog domain with a reasonable resolution.

Sharing functional operators, particularly the fuzzifier labels and the consequent-weighting D/A, and optimising block interfacing by avoiding intermediate signal converters (i.e. current-to-voltage and/or voltage-to-current converters), have played an important role during the design step. Applied in depth, these general guidelines led to improved modularity, reflected in smaller silicon area, lower power consumption, reduced storage capacity needed and even shortened internal delays.

Experimental results confirm that this controller is suitable for low-power embedded subsystems in applications with bandwidths below 8 MHz. The use of fast controllers with a small number of rules has been reported in several real-time analog applications [1] [9] [10], and their requirements are fairly well fulfilled by this prototype.

## 5. Acknowledgements

M. Verleysen is a research associate of the Belgian National Fund for Scientific Research (FNRS). The authors wish to thank the Universidad Católica de Córdoba, Argentina, and the ARAMIS Belgian association for their financial support.

## 6. References

- [1] Mancuso M., D' Alto V., De Luca R., Poluzzi R. and Rizzotto G., "Fuzzy logic based image processing in IQTV environment", *IEEE Trans. on Consumer Electronics*, 1995, Vol. 41, N° 3, pp. 917-923.
- [2] Rodríguez-Vázquez A., Vidal F., Linares B. and Delgado M., "Analog CMOS Design of Singleton Fuzzy Controllers", in the 3<sup>rd</sup> International Conference on Industrial Fuzzy Control Intelligent Systems, (Houston, Texas), December 1993.
- [3] Guo S., Peters L. and Surmann H., "Design and application of an analog fuzzy logic controller", *IEEE Transactions on Fuzzy Systems*, 1996, N° 4, pp. 429-438.
- [4] Dualibe C., Verleysen M. and Jespers P., "Two-quadrant CMOS analogue divider", *Electronics Letters*, June 1998, pp. 1164-1165.

- [5] Lazzaro J., Ryckebusch S., Mahowald M. and Mead C., "Winner-take-all networks of order N complexity", *Proc. 1988 IEEE Conf. on Neural Information Processing Natural and Synthetic*, 1988, Denver, pp. 703-711.
- [6] Huertas J., Sanchez-Solano S., Baturone I. and Barriga A., "Integrated circuit implementation of fuzzy controllers", *IEEE JSSC*, 1996, Vol. 31, N° 7, pp. 1051-1058.
- [7] Liu D., Huang Y. and Wu Y., "Modular current-mode defuzzification circuit for fuzzy logic controllers", *Electronics Letters*, Vol. 30, August 1994, pp. 1287-1288.
- [8] Marshall G. and Collins S., "Fuzzy logic architecture using subthreshold analogue floating-gate devices", *IEEE Transactions on Fuzzy Systems*, Vol. 5, N° 1, pp. 32-43, February 1997.
- [9] Franchi E., Manaresi N., Rovatti R., Bellini A. and Baccarani G., "Analog synthesis of nonlinear functions based on fuzzy logic", *IEEE Journal of Solid State Circuits*, Vol. 33, N°6, pp. 885-895, June 1998.
- [10] Ramírez-Barajas J., Dieck-Assad G. and Soto R., "A Fuzzy Logic Based AGC Algorithm for a Radio Communication System", in *Proc. of the IEEE International Conference on Fuzzy Systems*, pp. 997-980, Alaska, 2000.
- [11] Landolt O., "Low Power Analog Fuzzy Rule Implementation Based on a Linear Transistor Network", *MicroNeuro'96*, pp. 86-93, 1996.






Fig. 7 - a) Standard Deviation distribution for 6 samples. b) Microphotograph of the chip: FLC and testing CFMF.

| Fuzzy Logic Controller |  |
|------------------------|--|
| Technology: | CMOS-2.4µ |
| Complexity: | 9-rules @ 2-input @ 1-output |
| Power Supply: | 5 V |
| Power Consumption: | Core: 8.4mW<br>Buffer: 5mW |
| Area: | Analog: 2.3mm<sup>2</sup><br>Digital: 2.2mm<sup>2</sup> |
| Accuracy: | RMSE: 27mV (2.7%) |
| Programmability: | M.F. Slopes: 2x4 bits<br>M.F. Knees: 2x5 bits<br>Singletons: 5 bits |
| Total storage capacity needed: | 153 bits |
| Input/Output delay (90% steady-state): | Small signal: 190ns<br>Large signal: 450ns |
| Standard deviation among samples (6 prototypes): | Max: 62.5mV (6.25%)<br>Mean: 35mV (3.5%) |

Table 1 - Summary of the performance of the proposed Mixed-Signal Fuzzy Logic Controller.