Faculty Profile - Pinalkumar Engineer

Leave-One-Out Validation in Machine Cross-Learning

Radhika Sreedharan, Jigna Prajapati, Pinalkumar Engineer, Deep Prajapati

Book Chapters Ethical Issues in AI for Bioinformatics and Chemoinformatics, Pages: 56-71, 2023

Abstract

Automatically adapting productive approaches from previous efforts is one of the traditional ways of tackling upcoming challenges. Machine learning (ML) is popular for the same reason: it learns from past experience using past data. The important mechanism is automation in all procedures. On the basis of historical data, machine learning can predict related outcomes using various mathematical models in a data-driven approach. There are many types of algorithms to support the ML framework. Model efficiency plays an important role in obtaining better outcomes. Leave-one-out cross-validation is one of the techniques for checking this efficiency: the process is executed N times, once for each of the N samples held out in turn.
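The leave-one-out procedure mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the chapter: the 1-nearest-neighbor classifier, the function names, and the toy data are all assumptions chosen only to keep the example self-contained.

```python
from typing import Callable, List, Sequence, Tuple

# A sample is (feature vector, integer class label). Hypothetical toy format.
Sample = Tuple[Sequence[float], int]


def nearest_neighbor_predict(train: List[Sample], x: Sequence[float]) -> int:
    """Predict the label of x as the label of its closest training point."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(train, key=lambda s: sq_dist(s[0], x))
    return label


def leave_one_out_accuracy(
    data: List[Sample],
    predict: Callable[[List[Sample], Sequence[float]], int],
) -> float:
    """Run N rounds; in round i, sample i is held out and all others train."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every sample except the i-th
        if predict(train, x) == y:
            correct += 1
    return correct / len(data)


# Two well-separated clusters (illustrative data, not from the chapter).
toy_data: List[Sample] = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
    ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1),
]
accuracy = leave_one_out_accuracy(toy_data, nearest_neighbor_predict)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Because every sample serves as the test set exactly once, the cost grows linearly with N model fits, which is why the technique is usually reserved for small datasets.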

A Memory Efficient Run-Time Re-Configurable Convolution IP Core for Deep Neural Networks Inference on FPGA Devices

Swati, Ranajoy Sadhukhan, Mitul S Nagar, Pinalkumar Engineer

Conference Papers IEEE International Symposium on Smart Electronic Systems (IEEE – iSES), 2023

A review on Beamspace Channel Estimation Algorithms in Wireless Communication.

Smita Daware, Shweta Shah, Pinalkumar Engineer

Conference Papers 2023 IEEE 7th Conference on Information and Communication Technology (IEEE CICT 2023)

Quantization Effects on a Convolutional Layer of a Deep Neural Network

Swati, Dheeraj Verma, Pinalkumar Engineer

Conference Papers Congress on Control, Robotics, and Mechatronics (CRM), 2023

Abstract

Over the last few years, we have witnessed a relentless improvement in the field of computer vision and deep neural networks. In a deep neural network, the convolution operation is the load bearer, as it performs feature extraction and dimensionality reduction on a large scale. As the models continue to go deeper and bulkier for better efficiency and accuracy, there is a rapid increase in storage requirements too. The problem arises when performing computation with efficient numerical representations for embedded devices. Transitioning from floating-point representation to fixed-point could potentially reduce computation time, storage requirements, and latency with some accuracy loss. In this paper, an analysis of the effects of quantization of the first convolutional layer on the accuracy and memory storage requirement, with varying bit-width for fixed-point integer values of network parameters, has been carried out. The approach adopted is post-training quantization with a mixed-precision format to avoid model re-training and minimize accuracy loss, using root-mean-square error (RMSE) as a performance metric. Various combinations have been analyzed and compared to find the optimal precision to implement on a resource-constrained device. Based on the analysis, the suggested bit-width of I/O data for this implementation is <10,5> and that of mid-data is <20,10>, instead of <16,8> and <32,16> respectively. This combination of bit-widths has reduced resource consumption (BRAM by 10%, DSPs by 98.6%, and FFs by 40.27%) with some accuracy loss.
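To make the bit-width trade-off in this abstract concrete, the sketch below quantizes a few values to a signed fixed-point format and measures the RMSE against the floating-point originals. It is not the paper's code. The notation <W,I> is read here as W total bits with I integer bits (sign included), which is an assumption since the abstract does not define it; the function names and sample values are likewise hypothetical.

```python
import math
from typing import List, Sequence


def quantize(value: float, total_bits: int, int_bits: int) -> float:
    """Round value to signed fixed-point <total_bits, int_bits>, saturating.

    Assumed convention: int_bits includes the sign bit, so the number of
    fractional bits is total_bits - int_bits.
    """
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))      # most negative integer code
    highest = (1 << (total_bits - 1)) - 1  # most positive integer code
    code = max(lowest, min(highest, round(value * scale)))
    return code / scale


def rmse(reference: Sequence[float], approx: Sequence[float]) -> float:
    """Root-mean-square error between two equally long sequences."""
    total = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return math.sqrt(total / len(reference))


# Illustrative parameter values, not taken from the paper.
weights: List[float] = [0.7312, -1.2048, 0.0031, 3.1416, -0.5]
q10 = [quantize(w, 10, 5) for w in weights]  # <10,5>: 5 fractional bits
q16 = [quantize(w, 16, 8) for w in weights]  # <16,8>: 8 fractional bits

print("RMSE <10,5>:", rmse(weights, q10))
print("RMSE <16,8>:", rmse(weights, q16))
```

The narrower format stores each value in fewer bits but lands on a coarser grid, so its RMSE can only be equal to or larger than that of the wider format for in-range values.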

Parallelizing Non-Neural ML Algorithm for Edge-based Face Recognition on Parallel Ultra-Low Power (PULP) Cluster

Mitul Sudhirkumar Nagar, Rahul Kumar, Pinalkumar Engineer

Conference Papers 12th Mediterranean Conference on Embedded Computing (MECO), 2023

Abstract

The multi-core parallel ultra-low power (PULP) cluster architecture allows the IoT edge node to shift toward near-sensor computing. In this paper, non-neural Eigenfaces-based face recognition (FR) is examined on an octa-core PULP cluster. It is possible to achieve high accuracy in the Eigenfaces-based algorithm without using a large data model. It is observed that the Eigenfaces-based face recognition algorithm achieved 93% accuracy on the PULP platform with a 4.55× smaller model size compared to the state-of-the-art SqueezeNet1.1-based FR algorithm on the GAP8 platform. Parallelization of Eigenfaces-based face recognition is done to achieve maximum speed-up on multi-core, reducing recognition time. Furthermore, DMA-based communication between the fabric controller and multi-core cluster reduces the recognition time by 50× at the cost of a little degradation in speed-up on the multi-core. By adopting this technique, 165 faces per second are recognized with 93% accuracy on the octa-core PULP cluster, which is 7.85× faster than a single-core RISC-V with DMA. Compared to the ARM Cortex-M7 architecture, the multi-core PULP cluster reduces recognition time by 89.89%. These results make the multi-core PULP cluster an efficient choice for Eigenfaces-based face recognition on the edge.

Smart Farming Ingredients: IoT Sensors, Software, Connectivity, Data Analytics, Robots, Drones, GIS-GPS

Jigna Bhupendra Prajapati, Roshani Barad, Meghna B Patel, Kavita Saini, Dhvanil Prajapati, Pinalkumar Engineer

Book Chapters Applying Drone Technologies and Robotics for Agricultural Sustainability, Pages: 211-221, 2023

Abstract

Smart farming uses information and communication technologies in various fields of agriculture. It refers to the use of information and data management technologies in agriculture. Smart farming leads towards highly productive and sustainable agricultural production. Smart farming gives the farmer many advantages in decision making for better management. Smart farming technologies collect precise measurements of factors that determine farming outcomes. It makes agriculture more reliable, predictable, and sustainable. It also improves crop health, reduces the ecological footprint of farming, helps feed the increasing global population, provides food security in climate change scenarios, and achieves higher yields while reducing operating costs. There are many technological devices, such as IoT, software support, connection, data analytics, robots, drones, and GPS, which are useful for enhancing the quantity and quality of agricultural production while minimizing labor.

Novel approaches to design 32-bit MAC unit for edge computing devices

Gyaneshwar Rathore, Mitul Nagar, Pinalkumar Engineer

Conference Papers 4th National Conference on VLSI, Signal Processing and Communications NCVSCOMS-2020

Abstract

Modern-day applications like the Internet of Things (IoT), machine learning, and artificial intelligence collect a massive amount of data to process. Next-generation processors need to be more efficient in processing enormous amounts of data to extract features. IoT is an emerging field of technology, and edge computing is a potential research field nowadays. Customized architecture with digital signal processing (DSP) operations reduces data transmitted to a higher node for processing. The multiply and accumulate (MAC) unit is an essential part of many modern-day processors. Multiplier design is required to be optimized to reduce partial product terms and area. Addition is a frequently used operation in multiplication as well as in the MAC operation. In this paper, novel approaches are taken into account to optimize the MAC unit to improve performance and latency while extending on soft processor architecture for edge computing devices.

Image Communication Using Quasi-Cyclic Low-Density Parity-Check (QC-LDPC) Code

Dharmesh Patel, Pinalkumar Engineer, Ninad Bhatt

Conference Papers International Conference in Advances in VLSI and Embedded Systems (AVES)-2019 Pages: 211-221

Abstract

Proposed quasi-cyclic low-density parity-check code (QC-LDPC) efficiently communicates image at comparable peak signal-to-noise ratio (PSNR) with less signal-to-noise ratio (SNR). Encoding of the image is done using the Gauss–Jordan elimination method while decoding is done using the min-sum iterative message-passing algorithm using the code length 1152. Hardware design of decoder is based on fully parallel architecture for achieving more throughput.
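The min-sum decoding named in this abstract relies on a check-node update that can be sketched briefly. This is the textbook form of the update, not the paper's hardware design, and the function name and example values are assumptions. Each outgoing message takes the product of the signs and the minimum of the magnitudes of all the other incoming log-likelihood ratios (LLRs).

```python
from typing import List


def min_sum_check_node(llrs: List[float]) -> List[float]:
    """Return one outgoing message per edge of a single check node.

    The message on edge i excludes the incoming LLR on edge i itself:
    sign = product of the other signs, magnitude = smallest other magnitude.
    Requires at least two incoming messages.
    """
    messages = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for value in others:
            if value < 0:
                sign = -sign
        messages.append(sign * min(abs(value) for value in others))
    return messages


# Illustrative incoming LLRs on a degree-3 check node.
incoming = [2.0, -3.0, 0.5]
outgoing = min_sum_check_node(incoming)
print(outgoing)
```

Replacing the exact sum-product computation with sign and minimum operations avoids transcendental functions, which is one reason the min-sum variant is commonly chosen for hardware decoders.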

Scalable implementation of particle filter-based visual object tracking on network-on-chip (NoC)

Pinalkumar Engineer, Rajbabu Velmurugan and Sachin Patkar

Journal Paper Journal of Real-Time Image Processing: March 2019

Abstract

Particle filter algorithms have been successfully used in various visual object tracking applications. They handle non-linear models and non-Gaussian noise, but are computationally demanding. In this paper, we propose a scalable implementation of the particle filter algorithm for visual object tracking, using a scalable interconnect such as network-on-chip on an FPGA platform. Here, several processing elements execute in parallel to handle a large number of particles. We propose two designs and implementations, with one optimized for speed and the other optimized for area. These implementations can easily support different image sizes, object sizes, and numbers of particles, without modifying the complete architecture. Multi-target tracking is also demonstrated for four objects. We validated the particle filter-based visual tracking with video feed from a Petalinux-based system. With an image size of 320×240, frame rates of 348 fps and 310 fps were achieved for single-object tracking of size 17×17 and 33×33 pixels, respectively, with a reasonably low power consumption of 1.7 mW/fps on Zynq XC7Z020 (Zedboard) with an operating frequency of 69 MHz. This makes our implementation a good candidate for low-power visual object tracking using FPGA, especially in low-power, smart camera applications.

Multithreaded Image Processing Using ReconOS on Reconfigurable Computing System

Syam Sanal, Pinalkumar Engineer

Conference Papers International Conference on Emerging Trends and Innovations In Engineering And Technological Research (ICETIETR), 2018 Pages: 1-5

Abstract

This paper presents the implementation of particle filter based object tracking using ReconOS on a reconfigurable computing system. Multithreading can be used to improve the performance of complex image processing algorithms, but their sequential execution is a barrier which can be tackled by the effective use of FPGAs. In order to accomplish the desired performance, an operating system, ReconOS, is used on an ARM-based CPU-FPGA hybrid platform. ReconOS extends communication and synchronization primitives of operating systems, like mutexes, semaphores, condition variables and message boxes, to reconfigurable hardware. ReconOS provides the advantage of mapping the particle filter algorithm into reconfigurable hardware and accessing the data from software threads, thus providing improved performance, portability and unified appearance as well as transparency to the object tracking application.

Design and implementation of quasi cyclic low density parity check (QC-LDPC) code on FPGA

Dharmesh J. Patel, Pinalkumar Engineer

Conference Papers International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2017, Pages: 181-185

Abstract

Low density parity check (LDPC) code is a widely used error correcting code in various applications such as Wi-Fi, Wi-Max and Digital Video Broadcasting - Satellite - Second Generation (DVB-S2). The proposed work is focused on LDPC decoder design using soft decision iterative message passing for code lengths of 222 bits, 546 bits, 642 bits, 648 bits and 1152 bits, which gives a BER of 10^-5 with a coding gain of approximately 4 dB. The proposed work shows a fully parallel design on isolated shifted identity matrices based on structured construction of quasi cyclic low density parity check (QC-LDPC) code that gives a throughput of 2 Gbps.

Dual Screen Mobile device with Solar Charging

Pinalkumar Engineer, Lokesh Sharma

Patents Application No: 201621010692, Granted on 03/09/2021, Term: 20 years from 29/03/2016, Indian Patent Office

Parameterizable FPGA Framework for Particle Filter Based Object Tracking in Video

Pinalkumar Engineer, Rajbabu Velmurugan and Sachin B. Patkar

Conference Papers 28th International Conference on VLSI Design, VLSID 2015, Bangalore, India, January 3-7, 2015, Pages: 35-40

Abstract

Real-time particle filter based object tracking in videos on embedded platforms (FPGA) is challenging because of its resource usage and computational complexity. Furthermore, minor changes to the algorithm will need changes in the hardware. To address these issues, we propose a parametrizable FPGA framework for the particle filter based object tracking algorithm. This parametrizable implementation can be used for various image sequences, object sizes and numbers of particles. By changing a few parameters, this parametrization leads to appropriate changes in hardware resources, resulting in efficient real-time operation of the algorithm. Experimental results show better tracking from the implementation, and the proposed architecture can run the particle filter algorithm for a color video sequence at 650 fps on average.

Framework for Application Mapping over Packet-Switched Network of FPGAs: Case Studies

Vinay B. Y. Kumar, Pinalkumar Engineer, Mandar Datar, Yatish Turakhia, Saurabh Agarwal, Sanket Diwale and Sachin B. Patkar

Conference Papers Second International Workshop on FPGAs for Software Programmers (FSP 2015), September 1- 4, 2015

Abstract

The algorithm-to-hardware High-level synthesis (HLS) tools today are purported to produce hardware comparable in quality to handcrafted designs, particularly with user-directive-driven or domain-specific HLS. However, HLS tools are not readily equipped for when an application/algorithm needs to scale. We present a (work-in-progress) semi-automated framework to map applications over a packet-switched network of modules (single FPGA) and then to seamlessly partition such a network over multiple FPGAs over quasi-serial links. We illustrate the framework through three application case studies: LDPC Decoding, Particle Filter based Object Tracking, and Matrix Vector Multiplication over GF(2). Starting with high-level representations of each case application, we first express them in an intermediate message passing formulation, a model of communicating processing elements. Once the processing elements are identified, these are either handcrafted or realized using HLS. The rest of the flow is automated: the processing elements are plugged on to a configurable network-on-chip (CONNECT) topology of choice, followed by partitioning the 'on-chip' links to work seamlessly across chips/FPGAs.

Mean-Shift Algorithm: Verilog HDL Approach

Rahul V. Shah, Amit Jain, Rutul B Bhatt, Pinalkumar Engineer and Ekata Mehul

Conference Papers Proceedings of the Third International Conference on Trends in Information, Telecommunication and Computing, 2013. Pages 181-194

FPGA implementation of particle filter based object tracking in video

Sumeet Agrawal, Pinalkumar Engineer, Rajbabu Velmurugan, Sachin Patkar

Conference Papers International Symposium on Electronic System Design (ISED), 2012, Pages: 82-86

Abstract

There is a continuous need to increase computation speed with minimum resources to improve the performance of signal processing algorithms. This paper proposes an architecture and implementation of a modified color-histogram-based particle filter for object tracking in video. This architecture implements weight calculation and histogram calculation in a highly parallel form. The proposed architecture saves resources through effective memory utilization. The performance of the algorithm is demonstrated using a single-object scenario.

Modified architecture for real-time face detection using FPGA

Suraj Das, Atit Jariwala and Pinalkumar Engineer

Conference Papers 3rd Nirma University International Conference on Engineering (Nuicone-2012), Nirma University, Ahmedabad, Dec 6-8, 2012, Pages: 1-5

Abstract

In this paper, we introduce a modified hardware architecture with the key features of lessening the resource usage of the FPGA and elevating the face detection frame rate. The system is based on the well-known Viola-Jones framework, which consists of the AdaBoost algorithm integrated with Haar features. We also list the modifications in hardware design techniques made to achieve more parallel processing and higher detection speed of the system. The system implemented on a Xilinx Virtex-5 FPGA development board outputs a high face detection rate (91.3%) at 60 frames/second for a VGA (640 × 480) video source. The power consumption of the implementation is 2.1 W.

FPGA based stream processing of edge and skin detection algorithms

Suraj Das, Atit Jariwala and Pinalkumar Engineer

Conference Papers International Conference on Advanced Computing and Communication Technologies (ICACCT-2012) at Asia Pacific Institute of Information Technology SD India, Panipat (Haryana) on November 3, 2012

Field Programmable Gate Array Based Control Signal Generator for Pulsed Radar

Sanjay Trivedi, B. S. Raman, Pinalkumar Engineer and Dr. Mihir Shah

Journal Paper International Journal of Embedded Systems and Applications (IJESA), 2(3): September 2012

Abstract

The objective of this paper is to present the architecture design and implementation of a software defined hardware module called Control Signal Generator (CSG) for pulsed RADAR (Radio Detection and Ranging) application. It is a digital, programmable, application-specific control timing signal generator for Disaster Management Synthetic Aperture Radar (DM-SAR)[1]. This module is a slave controller which receives commands through an asynchronous serial interface and generates programmable timings. The architecture is evolved and the module is developed using Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) and successfully implemented on the Xilinx Field Programmable Gate Array (FPGA) XCV600-6HQ240.

Design of a Unified Timing Signal Generator (uTSG) for Pulsed Radar

Sanjay Trivedi, B. S. Raman, Pinalkumar Engineer and Dr. Mihir Shah

Journal Paper International Journal of Electronics and Communication Engineering & Technology (IJECET), 3(1): June 2012

Combining Power of MATLAB with SystemVerilog for Image and Video Processing ASIC Verification

Dhaval Modi, Harsh Sitapara, Rahul Shah, Ekata Mehul, Pinalkumar Engineer

Conference Papers ADVANCES IN NETWORK SECURITY AND APPLICATIONS, Communications in Computer and Information Science series 2011, 196(1). Pages 181-193

Abstract

The ultimate aim of ASIC verification is to obtain the highest possible level of confidence in the correctness of a design, to find design errors, and to show that the design implements the specification. The complexity of ASICs is growing exponentially, and the market is pressuring design cycle times to decrease. Traditional methods of verification have proven to be insufficient for digital image processing applications. We develop a new verification method based on SystemVerilog verification with MATLAB to accelerate verification. The co-simulation is accomplished using MATLAB and SystemVerilog coupled through the DPI. Image Resize design verification is used as a case study for this co-simulation method between SystemVerilog and MATLAB. The golden reference is made using MATLAB built-in functions, while the rest of the verification environment is in SystemVerilog. The goal is to find more bugs in the design than the traditional method of verification does, and to reduce the time to verify a video processing ASIC, the debugging time, and the coding length.

Enhancing verification capability using system Verilog and Matlab

Dhaval Modi, Harsh Sitapara, Rahul V Shah, Ekata Mehul, Pinalkumar Engineer

Conference Papers 2nd International conference on Signals, Systems and Automation (ICSSA-2011), January 2011

Integrating MATLAB with verification HDLs for functional verification of image and video processing ASIC

Dhaval Modi and Harsh Sitapara and Rahul V. Shah and Ekata Mehul and Pinalkumar Engineer

Journal Paper International Journal of Computer Science & Emerging Technologies, 2(2): 258-265, 2011

Abstract

The ultimate aim of ASIC verification is to obtain the highest possible level of confidence in the correctness of a design, to find design errors, and to show that the design implements the specification. The complexity of ASICs is growing exponentially, and the market is pressuring design cycle times to decrease. Traditional methods of verification have proven to be insufficient for digital image processing applications. We develop a new verification method based on SystemVerilog verification with MATLAB to accelerate verification. The co-simulation is accomplished using MATLAB and SystemVerilog coupled through the DPI. The Image Resize design is used as a case study for this co-simulation method between SystemVerilog and MATLAB. The golden reference is made using MATLAB built-in functions, while the rest of the verification blocks are in SystemVerilog. The goal is to find more bugs in the Image Resize design than the traditional method of verification does, and to reduce the time to verify a video processing ASIC, the debugging time, and the coding length.

Pinalkumar Engineer

Sardar Vallabhbhai National Institute of Technology, Surat
