Crosswalk detection for the outdoor navigation of people with visual impairment

Year: 2022
Author(s): Odyssefs Karatzaferis
Link: http://www.gdmc.nl/publications/2022/MScThessiOdyssefsKaratzaferis.pdf

ML Tags

Convolutional Neural Networks

Topic Tags

Remote Sensing / Object detection

Data augmentation

> Software & Plug-ins Used 
OpenStreetMap and Overpass Turbo for collecting open-source geographic data 
PDOK for aerial imagery 
YOLOv5 for the pre-trained CNN model 
Python in a Jupyter notebook for coding  
Colab virtual environment for model training 
ESRI’s ArcGIS Pro for pre- and post-processing and visualization
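
As a rough illustration of the data collection step, the sketch below builds an Overpass QL query that could be pasted into Overpass Turbo to list mapped pedestrian crossings in a bounding box. The function name, the bounding box and the choice of the `highway=crossing` tag are assumptions made for this example; the thesis may have used different tags or filters.

```python
def build_crossing_query(south, west, north, east, timeout_s=60):
    """Return an Overpass QL query for pedestrian crossing nodes in a bounding box.

    Overpass expects bounding boxes in (south, west, north, east) order.
    """
    if not (south < north and west < east):
        raise ValueError("bounding box must satisfy south < north and west < east")
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        # Assumption: crossings are selected with the generic highway=crossing tag.
        f'node["highway"="crossing"]({bbox});\n'
        "out body;"
    )


# Approximate, illustrative bounding box around Delft (not taken from the thesis).
query = build_crossing_query(51.97, 4.32, 52.03, 4.40)
print(query)
```

The returned nodes would still need to be checked against the aerial imagery, since a mapped crossing is not necessarily a painted zebra crosswalk.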

> Summary
The aim of this study is to utilise freely available aerial imagery and spatial data to identify the locations of zebra crosswalks in an area. This will ultimately support improved, independent navigation for people who are visually impaired, because routes can be adjusted to take crosswalk locations into account so that everyone can cross roads more safely. The main steps towards building the proposed system include:
Collecting training imagery dataset  
Choosing an appropriate algorithm  
Image quality control, grouping and labelling the images 
Implementing and training the CNN model  
Evaluating initial results  
Data augmentation  
Model fine-tuning  
Applying the trained model to crosswalk detection in Delft  
Testing the model on imagery of a different spatial resolution
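
The data augmentation step in the list above can be sketched as follows. This is a minimal example, assuming labels in the YOLO text format (class id, then box centre x, centre y, width and height, all normalised to the image size) and using simple flips, which suit aerial imagery because it has no fixed "up" direction. The function names are hypothetical and the thesis may have used other augmentations.

```python
def hflip_image(image):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [list(reversed(row)) for row in image]


def vflip_image(image):
    """Mirror an image, given as a list of pixel rows, top to bottom."""
    return [list(row) for row in reversed(image)]


def hflip_labels(labels):
    """Mirror YOLO boxes left to right: only the x centre changes."""
    return [(cls, 1.0 - x, y, w, h) for (cls, x, y, w, h) in labels]


def vflip_labels(labels):
    """Mirror YOLO boxes top to bottom: only the y centre changes."""
    return [(cls, x, 1.0 - y, w, h) for (cls, x, y, w, h) in labels]


# Tiny stand-in "image" (2 rows x 3 columns) with one crosswalk box (class 0).
image = [[1, 2, 3],
         [4, 5, 6]]
labels = [(0, 0.25, 0.5, 0.2, 0.1)]

flipped_image = hflip_image(image)
flipped_labels = hflip_labels(labels)
print(flipped_image)   # [[3, 2, 1], [6, 5, 4]]
print(flipped_labels)  # [(0, 0.75, 0.5, 0.2, 0.1)]
```

The key point is that each geometric change to an image must be applied to its bounding boxes as well, otherwise the augmented samples teach the model wrong locations.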

LIMITATIONS:  
The model tended to misidentify non-crosswalk linear objects as crosswalks, resulting in many false positive predictions. This issue was traced to unclear training data: the dataset contained not only images that explicitly depicted roads but also wider urban and suburban areas, where objects such as windows and solar panels confused the model. It can be addressed by filtering the predicted crosswalk locations using a road buffer (which was done as part of this research), by pre-processing the detection dataset to filter out non-road areas, or by adding appropriate images to the training dataset as background (true negative) samples.
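
The road buffer filter mentioned above could look roughly like the sketch below. It is not the thesis workflow, which relied on GIS tooling; it is a plain-Python stand-in that assumes detections are points and roads are centrelines, both in a projected coordinate system with metre units (for example the Dutch RD New system). The function names, the sample coordinates and the 10 m buffer are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def filter_by_road_buffer(detections, roads, buffer_m):
    """Keep only detections that lie within buffer_m of any road centreline."""
    kept = []
    for det in detections:
        near_road = any(
            point_segment_distance(det, a, b) <= buffer_m
            for line in roads
            for a, b in zip(line, line[1:])
        )
        if near_road:
            kept.append(det)
    return kept


# One straight road and three predicted crosswalk centres (made-up coordinates).
roads = [[(0.0, 0.0), (100.0, 0.0)]]
detections = [(50.0, 3.0), (50.0, 40.0), (104.0, 3.0)]
kept = filter_by_road_buffer(detections, roads, buffer_m=10.0)
print(kept)  # [(50.0, 3.0), (104.0, 3.0)]
```

The detection 40 m from the road, which could for instance be a solar panel on a roof, is discarded, while the two near the road are kept.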
The trained model was also tested on imagery with a higher spatial resolution and image quality than the imagery used to train it. While some additional crosswalk locations were identified this way, the predictions on the high-resolution image included a dramatically increased number of false positives. This was attributed to the sharper depiction of background linear objects at the higher image quality and spatial resolution, which made them more susceptible to misidentification. Training a new model on high-resolution imagery with an appropriately sized true negative sample would possibly help reduce the number of erroneous predictions. Such an approach would, however, greatly increase computational demands and make the designed system unsuitable for real-time detection.
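
One way to express this trade-off is through precision and recall: extra correct detections raise recall, while extra false positives lower precision. The sketch below uses the standard definitions; the counts are invented purely for illustration and are not results from the thesis.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented counts: the second case finds more crosswalks but adds many false positives.
baseline = precision_recall(tp=80, fp=20, fn=20)
high_res = precision_recall(tp=90, fp=110, fn=10)
print(f"baseline: precision={baseline[0]:.2f}, recall={baseline[1]:.2f}")
print(f"high-res: precision={high_res[0]:.2f}, recall={high_res[1]:.2f}")
```

In this made-up case recall improves slightly while precision drops sharply, which mirrors the qualitative behaviour described above.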

> Additional Visuals

Thesis Report Figure 17: Data training batch sample

Thesis Report Figure 18: Correct and incorrect predictions. a) true positive, b) false negative, c) false positive, d) true negative

Thesis Report Figure 20: Predictions comparison between the original (left images) and the fine-tuned (right images) models.

> Possible Applications
For ideas on how to implement some of the above-mentioned techniques, please see ‘Possible applications for students to try with Convolutional Neural Networks’.