Development of an Aerial Fire Identification System Based on Visual Artificial Intelligence

To reduce fire losses, extinguishing and rescue must begin immediately. In dense areas, however, fire trucks are often unable to reach the fire site because of narrow road access. In such cases, drones that fly autonomously to the point of fire and then release fire-extinguishing bombs automatically can support fire disaster management. This requires a system that can identify whether a fire is present. This study explores the idea of identifying fire with a computer vision approach by building 8 identification models, one per dataset: day, night, day and night, thermal, day filtered, night filtered, day-and-night filtered, and thermal filtered. Each model was tested against a set of test data corresponding to each dataset. The YOLOv4 algorithm and Google Colaboratory were used, and each model took 8-10 hours to train. The results show that the day-and-night model was the most robust, having the highest average F1-score, 0.37, and it performed best on the thermal test data with an F1-score of 0.6. This can serve as a starting point for further study of how to obtain the most suitable dataset and test data.


Introduction
In DKI Jakarta, the most populous province in Indonesia, there were around 6,429 fires throughout 2020 according to statistics. The density of this area places houses very close together, which can accelerate the spread of fire from one house to another. To reduce fire losses, extinguishing and rescue must begin immediately. However, fire trucks are often unable to reach the fire site because of narrow road access. In such cases, drones that fly autonomously to the point of fire and then release fire-extinguishing bombs automatically can support fire disaster management. This requires a system that can identify whether a fire is present.
This study explores the idea of identifying fire with a computer vision approach by building 8 identification models, one per dataset: day, night, day and night, thermal, day filtered, night filtered, day-and-night filtered, and thermal filtered. Each model was tested against a set of test data corresponding to each dataset. The YOLOv4 algorithm and Google Colaboratory were used, and each model took 8-10 hours to train. Each dataset contains 300-450 images taken from a top view, covering both fire and non-fire scenes in both daylight and night conditions.

Confusion matrix
There are four categories in the confusion matrix: True Positive (TP), False Positive (FP), False Negative (FN), and True Negative (TN). TP means the model correctly identifies the desired object in the input image. FP means the model makes an error by identifying a non-object as an object. FN means the model makes an error by failing to identify an object that is present in the input image. TN means the model correctly identifies nothing when the input image contains no object.
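For object detection, these categories are typically assigned by matching each predicted bounding box to a ground-truth box using an Intersection over Union (IoU) threshold. The sketch below is illustrative, not the exact matching procedure used in this study; the `(x1, y1, x2, y2)` box format and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(preds, gts, thresh=0.5):
    """Greedily match predictions to ground truths; each ground truth
    may be claimed at most once. Returns (TP, FP, FN) counts."""
    tp, fp = 0, 0
    unmatched = list(gts)
    for p in preds:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= thresh:
            tp += 1
            unmatched.remove(best)   # this ground truth is now claimed
        else:
            fp += 1                  # no ground truth overlaps enough
    fn = len(unmatched)              # ground truths nobody matched
    return tp, fp, fn
```

A prediction that overlaps a ground-truth fire spot with IoU at or above the threshold counts as a TP; extra predictions become FPs, and unmatched fire spots become FNs.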

Precision and recall
Precision is the ratio of true positives to the total number of predictions; it measures how accurate the identification model is. Recall is the ratio of true positives to the total number of desired objects (true positives plus false negatives); it measures how well the model identifies the objects present in the input image.

F1-score
F1-score is the harmonic mean of precision and recall. The highest possible value is 1, which corresponds to perfect precision and recall.
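Combining the definitions above, precision, recall, and F1-score follow directly from the TP/FP/FN counts. A minimal sketch:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN),
    F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For example, a model with 3 true positives, 1 false positive, and 2 false negatives has precision 0.75, recall 0.6, and F1-score 2/3.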

Mean average precision (mAP)
AP is generally defined as the area under the smoothed precision-recall curve. This metric is commonly used to evaluate object detection models. In this study, COCO AP@0.50 is used, which applies the same IoU threshold of 0.5 as the Pascal VOC metric.
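The "area under the smoothed precision-recall curve" can be sketched with all-point interpolation, where the precision envelope is made monotonically non-increasing before integrating. This is a generic illustration of the metric, not the exact evaluation code used in this study.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve after smoothing the
    precision envelope (all-point interpolation). `recalls` must be
    sorted in increasing order, with `precisions` aligned to it."""
    # pad so the curve spans recall 0..1
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # smooth: precision at recall r_i becomes the max precision
    # achieved at any recall >= r_i
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # sum rectangular areas over the recall steps
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))
```

For a curve passing through (recall 0.5, precision 1.0) and (recall 1.0, precision 0.5), the smoothed area is 0.75.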

Results
This study explores the idea of identifying fire with a computer vision approach by building 8 identification models, each tested against 8 test sets; the datasets and test sets are day, night, day-night, thermal, day filtered, night filtered, day-night filtered, and thermal filtered. The images in every test set are different from those in the datasets, to avoid biasing the results.
Images for the datasets were collected from Google Images and Instagram and consist of 50 day images and 50 night images; half of them show fire with no smoke, while the others are non-fire images. For the thermal dataset, the images were edited in Photoshop to resemble thermal imagery. For the filtered datasets, the images were processed on the Google Colaboratory platform using the OpenCV library with a dehazer, a BGR-to-HSV conversion, and an inRange filter. In the inRange filter, the day filtered, night filtered, and day-night filtered datasets use a lower bound of (0, 100, 200), while the thermal filtered dataset uses a lower bound of (10, 100, 200); an upper bound of (30, 250, 255) is applied to all 4 filtered datasets. The images were then annotated and augmented on the Roboflow website, yielding 300-450 output images per dataset.
Test data were collected from the Kaggle website and then annotated on the Roboflow website; each test set contains 173 bounding boxes.

Discussion
The YOLOv4 algorithm and Google Colaboratory were used in this study, where each model took 8-10 hours to train, as presented in Tables 1 to 8 below. Tables 1 to 8 show the results of 64 test cases combining the 8 datasets and 8 test sets: day, night, day-night, thermal, day filtered, night filtered, day-night filtered, and thermal filtered. Representative outputs of several tested models are presented in Fig. 1, which shows their results on various new input images. From the figure, it can be seen that the models can identify many spots of fire (true positives), but some spots remain unidentified (false negatives).

Conclusions
In this study, 8 identification models were built, one per dataset: day, night, day and night, thermal, day filtered, night filtered, day-and-night filtered, and thermal filtered. Each model was tested against a set of test data corresponding to each dataset. The YOLOv4 algorithm and Google Colaboratory were used, and each model took 8-10 hours to train. Based on the results, it can be concluded that the day-and-night model was the most robust, having the highest average F1-score, which is 0.37. The model is good enough to be implemented on the corresponding test data but