Award Date


Degree Type


Degree Name

Doctor of Philosophy (PhD)


Electrical and Computer Engineering

First Committee Member

Sarah Harris

Second Committee Member

Shahram Latifi

Third Committee Member

R. Jacob Baker

Fourth Committee Member

Evangelos Yfantis

Fifth Committee Member

Kathryn Hausbeck Korgan

Number of Pages



This dissertation describes the implementation of several neural networks built on a field-programmable gate array (FPGA) and used to recognize a handwritten digit dataset, the Modified National Institute of Standards and Technology (MNIST) database. A novel hardware-friendly activation function, the dynamic ReLU (D-ReLU), is proposed. Compared to traditional activation functions, this activation function decreases the chip area and power of neural networks at no cost to prediction accuracy.

Implementations of three neural networks on FPGA are presented: a 2-layer online-training fully-connected neural network, a 3-layer offline-training fully-connected neural network, and two solutions of a Super-Skinny Convolutional Neural Network (SS-CNN). The 2-layer online-training fully-connected neural network was built on an FPGA with varying data widths. Reducing the data width from 8 to 4 bits reduces prediction accuracy by only 11%, while decreasing FPGA area by 41%. The 3-layer offline-training fully-connected neural network was built on an FPGA with both the sigmoid and the proposed D-ReLU activation functions. Compared to networks that use the sigmoid function, networks using the proposed D-ReLU function require 24-41% less area with no loss in prediction accuracy. Further reducing the data width of the 3-layer networks from 8 to 4 bits decreases prediction accuracy by only 3-5% while reducing area by 9-28%. The proposed sequential and parallel SS-CNN networks achieve state-of-the-art (99%) recognition accuracy with fewer layers and fewer neurons than prior works such as the LeNet-5 network. Using parameters with 8 bits of precision, the FPGA solutions of this SS-CNN show no loss in recognition accuracy compared to the 32-bit floating-point software solution. In addition to high recognition accuracy, both proposed FPGA solutions are low power and fit in a low-cost Cyclone IV E FPGA. Moreover, these FPGA solutions execute up to 145× faster than software solutions, despite running at 97× to 120× lower clock rates.
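The data-width reductions above amount to representing network parameters in reduced-precision fixed point rather than 32-bit floating point. A minimal sketch of this idea (a generic signed fixed-point quantizer, not the dissertation's exact number format or bit allocation) follows:

```python
import numpy as np

def quantize(x, bits, frac_bits):
    """Round x to signed fixed-point with `bits` total bits,
    `frac_bits` of which are fractional, saturating at the range limits."""
    scale = 2 ** frac_bits
    lo = -(2 ** (bits - 1)) / scale        # most negative representable value
    hi = (2 ** (bits - 1) - 1) / scale     # most positive representable value
    return np.clip(np.round(np.asarray(x) * scale) / scale, lo, hi)

# Illustrative weight values: halving the data width coarsens the grid,
# which is the source of the small accuracy loss the abstract reports.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, 1000)
for bits in (8, 4):
    wq = quantize(w, bits=bits, frac_bits=bits - 2)  # example Q2.(bits-2) split
    print(f"{bits}-bit max rounding error: {np.max(np.abs(w - wq)):.4f}")
```

In hardware, halving the data width shrinks multipliers, adders, and memory roughly proportionally, which is consistent with the area reductions reported above.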

Thus, FPGA implementations of neural networks offer a high-performance, low-power alternative to traditional software methods, and the proposed novel D-ReLU activation function offers additional improvements in performance and power savings. Furthermore, the hardware solutions of the proposed SS-CNN provide a high-performance, hardware-friendly, and power-efficient alternative to other, bulkier convolutional neural networks.


Activation function; FPGA; Hardware acceleration; Image recognition; Machine learning; Neural network


Electrical and Computer Engineering

File Format


File Size

1.0 MB

Degree Grantor

University of Nevada, Las Vegas




IN COPYRIGHT. For more information about this rights statement, please visit