Specialized Arithmetic and Approximate Computing

Logarithmic-based Arithmetic for Convolutional Neural Networks

The recent breakthroughs in Machine Learning applications, especially in Deep Neural Networks (DNNs), have caused significant progress in image classification and speech recognition applications. From the milestone network LeNet for the MNIST handwritten digit recognition to ResNet achieving better than human classification, DNNs have been continually studied and improved to perform well even for the large-scale image classification. The networks developed in this progress, such as AlexNet, VGG and GoogLeNet, showed the trend where the amount of computations augmented as the number of layers increased for better accuracy. With such a large and growing number of computations, as well as the rising application of ML techniques to many areas, it is vital to develop efficient processing hardware units for DNNs. Particularly, these networks involve many Multiply-ACcumulate (MAC) operations, and it is important to implement efficient multipliers as they consume large amounts of power and area.

For this purpose, several works have leveraged the importance of the Leading Ones, as the most significant ‘1’s of the operands perform the highest contributions to the final result of operations. This is the basis for employing logarithmic arithmetic, unbiased and truncated arithmetic or substituting costly multiplications by additions and shifts, which are good for reducing energy consumption but at the expense of accuracy.

In this sense we are investigating the effect of truncating the mantissa within Mitchell’s-based logarithmic multipliers, as well as employing the one’s complement transform in order to approximate negative numbers without incurring delay penalties. The utilization of these techniques has a large impact on both power and area, but at the expense of losing some accuracy. It must be noted that as the operands are smaller in absolute terms, the error grows. In other words, the relative error is high for the smallest operands, but in absolute terms the difference is not high because the operands are small. In general, this will not have a large impact on the CNNs, as small results will not contribute considerably to the MAC output of each convolution and fully connected node.

This line is an ongoing collaboration with the ACAG group, lead by Prof. Nader Bagherzadeh at University of California at Irvine (UCI). Some of the resulting code from this collaboration can be found in the log-arithmetic repository.

Efficient Algebraic Operators for Cryptosystems

The extraordinary increase of electronic transactions and the necessity for secure communications that prevent unauthorized access to the information to be transmitted has made the research of cryptographic protocols and algorithms of fundamental interest. Since its introduction in 1976, public-key cryptography (PKC) has been an essential component for secure communications. Since then, several PKC cryptosystems have been proposed where their security criteria depend on the high computational complexity of some mathematical problems. Unfortunately, this complexity makes particularly difficult to implement PKC on embedded systems with constrained resources and on low-energy devices such as identification tags and sensor-network nodes. For these reasons, in recent years a new research field has been developed in order to find cryptographic mechanisms for devices with constrained capabilities.

Computational complexity of symmetric or asymmetric cryptographic algorithms greatly depends on the involved arithmetic operations. Modular finite field (prime or binary extension) arithmetic is used in cryptographic applications, where multiplication is considered the most important and complex operation. Different approaches and architectures can be considered for the implementation of these specialized prime or binary extension field arithmetic operations, each of them with different complexities (area, delay) and energy efficiencies. Efficient implementations of arithmetic operations in the underlying finite field thus greatly influence in the complexity and power consumption of a given cryptosystem implementation. Therefore, the study of different cryptographic mechanisms and the study of different specialized modular arithmetic operations to be used with such cryptosystems is necessary in order to optimize the trade-off among complexity and energy efficiency. The objective of this research is to obtain efficient modular arithmetic operations suitable for cryptosystems that could be implemented on devices with constrained capabilities.