ISSN No: 2456-2165
The rest of the paper is organized as follows: related work on face detection, mask detection, and explanation studies is detailed in Section 2. The materials used in the study, such as the images and the working environment, are described in Section 3. In Section 4, Convolutional Neural Networks, Transfer Learning, Class Activation Maps, and the evaluation metrics are described. Experiments and results, and the discussion, are given in Section 5 and Section 6, respectively. In Section 7, the conclusion and future works are explained.

II. RELATED WORK

To identify COVID-19, a number of studies with deep learning have been carried out [15-20], but these studies concern clinical findings. Taking advantage of deep learning, a system that monitors whether people wear a mask can be developed.

To detect face mask images, [21] proposed a hybrid deep learning and machine learning model. Deep learning is used to extract features, and a support vector machine (SVM) classifies the extracted features. The SVM classifier achieved 99.64%, 99.49%, and 100% testing accuracy on different datasets.

In [22], a face mask-wearing condition identification method is developed by combining super-resolution and classification methods for images. Their algorithm consists of four steps: pre-processing, face detection and cropping, super-resolution, and face mask-wearing identification. They achieved 98.70% accuracy using the proposed deep learning method.

In [44], a new method is proposed which computes and highlights the main components of the important representations from the layers. They claim up to 12% improvement on weakly supervised object localization.

To the best of our knowledge, mask detection and class activation maps were used together for the first time in [45]. They developed a system that monitors social distancing, face mask wearing, and face-touching conditions by combining a deep learning based imaging system and class activation maps.

III. MATERIAL

AI projects require hundreds of images to learn effectively, and computation resources to realize the mathematical operations. The collected images and the working environment are clarified in this section.

A. Creating Dataset
The images used in this study were collected from an imaging system created at the Huawei entrance and from open-source datasets [46-47]. Fig. 2 shows the imaging system created to take real-world face images. The system is detailed in the Working Environment section.
B. Working Environment
To take real-world face images, a camera setup was built at the Huawei entrance (Fig. 2). The camera used is a Huawei M2150-10-EI [48], which has a 5 MP image sensor, 1 TOPS of computing power, and 2560 (H) x 1920 (V) effective pixels, as shown in Table 2. It can capture face images and send them via File Transfer Protocol.

Table 2. Technical Specifications of the Camera
CPU: Hi3516D
Computing Power: 1 TOPS
Intelligent Analysis: Face and Person Detection
Effective Pixels: 2560 (H) x 1920 (V)
Video Encoding Format: H.265/H.264/MJPEG
Frame Rate: 30 FPS

To train the AI model, a GPU-based Linux server (CentOS 7, CUDA 11.2) with a Tesla T4 was used. As can be seen in Table 3, Python was selected as the programming language, PyTorch as the deep learning library, and ResNet-18 as the pre-trained model.

Table 3. Used Hardware and Software
Programming Language: Python
Deep Learning Library: PyTorch
Transfer Learning Model: ResNet-18
GPU: Tesla T4
CUDA: 11.2
Operating System: Linux CentOS 7

Fig 5. Simple CNN architecture including convolution, pooling and fully-connected layers.

As seen in Fig. 5, the convolutional structure includes convolution and pooling layers, which perform feature extraction and dimension reduction, respectively. After the features are extracted and reduced, they are classified by the neural layer, also known as the fully-connected layer.

Compared to traditional learning methods, end-to-end learning can be realized using the CNN architecture. The comparison of CNNs and traditional learning is demonstrated in Table 4.

Table 4. Comparison of CNNs and Traditional Learning
Operations | CNNs | Traditional Learning
Feature Extraction | Convolution | Local Binary Pattern
Dimension Reduction | Pooling | Linear Discriminant Analysis
Classification | Neural Layer | Artificial Neural Network

As seen in Table 4, in traditional methods all features must be extracted using algorithms such as Local Binary Pattern [51] and then reduced using embedded or filter-based algorithms such as Linear Discriminant Analysis [52].

After reduction of the features, a classifier such as an Artificial Neural Network or a Support Vector Machine must be used to classify the features into the desired classes [53]. With CNNs, all of these operations are realized automatically by the convolution, pooling, and neural layers.
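The convolution, pooling, and fully-connected pipeline described above can be sketched in PyTorch, the library used in this study. The layer widths and the 64x64 input size here are illustrative assumptions, not the paper's exact model:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Minimal CNN mirroring Fig. 5: feature extraction (convolution),
    dimension reduction (pooling), classification (fully-connected)."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # feature extraction
            nn.ReLU(),
            nn.MaxPool2d(2),                              # dimension reduction
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # For a 64x64 input, the feature maps are (32, 16, 16) at this point.
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)      # class scores: mask / no mask

model = SimpleCNN()
scores = model(torch.randn(1, 3, 64, 64))
print(scores.shape)  # torch.Size([1, 2])
```

Unlike the traditional pipeline in Table 4, no hand-crafted feature extractor or separate reducer is needed: the same module learns all three stages end to end.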
Fig 6. Transfer Learning Architecture. (2-5 are frozen and 6-9 are
trained.)
The Gradient-weighted Class Activation Maps need class-related weights. Since ResNet-18 has 512 filters of 1x1 size at the global average pooling layer (Fig. 8 Part 2) and two neurons in the neural layer (Fig. 8 Part 3), there are two 1x512 weight vectors, one for the mask class and one for the no-mask class. After the classification result is obtained, the weights of the desired class are taken. For example, if the mask class's activations are sought, the mask class's weights are taken (size 1x512). Class activation maps of the mask and no-mask classes are shown in Fig. 12 and Fig. 13.

According to the learning results with the test images, sensitivity and specificity values were obtained as 95.16% and 96.69%, respectively. The results show that the CNN model can successfully classify images into the mask and no-mask classes, and when we ask why an image was classified into the mask class, it answers by highlighting the region where the mask exists. The importance of this is that we can be sure there was no overfitting and the CNN model was trained correctly.
VII. CONCLUSION AND FUTURE WORKS