Crosswalk detection for the outdoor navigation of people with visual impairment
Year 2022
Author(s) Odyssefs Karatzaferis
Link http://www.gdmc.nl/publications/2022/MScThessiOdyssefsKaratzaferis.pdf
ML Tags
Convolutional Neural Networks
Topic Tags
Remote Sensing / Object detection
Data augmentation
> Software & Plug-ins Used
OpenStreetMap and Overpass Turbo for collecting open-source geographic data
PDOK for aerial imagery
YOLOv5 for the pre-trained CNN model
Python in a Jupyter notebook for coding
Colab virtual environment for model training
ESRI’s ArcGIS Pro for pre- and post-processing and visualization
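As an illustration of the OpenStreetMap collection step, the sketch below builds an Overpass QL query that could be pasted into Overpass Turbo. It is only a sketch under assumptions: the summary does not state which tags or area the thesis queried, so the `highway=crossing` filter, the function name `build_crossing_query` and the bounding box around Delft are all illustrative choices.

```python
def build_crossing_query(south, west, north, east, timeout_s=60):
    """Return an Overpass QL query for pedestrian crossing nodes in a bounding box.

    Overpass bounding boxes are ordered (south, west, north, east) in WGS84 degrees.
    The tag filter highway=crossing is an assumption, not the thesis's stated query.
    """
    if not (south < north and west < east):
        raise ValueError("expected south < north and west < east")
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'node["highway"="crossing"]({bbox});\n'
        "out body;"
    )


# Rough, illustrative bounding box around Delft (assumed values).
query = build_crossing_query(51.96, 4.32, 52.03, 4.41)
print(query)
```

The query returns every mapped crossing node, not only zebra crossings, so the results would still need to be narrowed down (for example by further tags or by visual inspection) before being used as training locations.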
> Summary
The aim of this study is to use freely available aerial imagery and spatial data to identify the locations of pedestrian zebra crosswalks in an area. This will ultimately allow improved, independent navigation for people who are visually impaired, because routes can be adjusted to take crosswalk locations into account, helping everyone to cross roads more safely. The main steps towards building the proposed system are:
Collecting a training imagery dataset
Choosing an appropriate algorithm
Image quality control, grouping and labelling of the images
Implementing and training the CNN model
Evaluating initial results
Data augmentation
Model fine-tuning
Applying the trained model to crosswalk detection in Delft
Testing the model with imagery of different spatial resolutions
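The data augmentation step in the list above can be sketched as follows. The summary does not say which transforms the thesis applied, so a horizontal flip is assumed here purely for illustration, and the function names are hypothetical. The label layout (class, x_center, y_center, width, height, all normalised to the image size) is the YOLO annotation format that YOLOv5 reads.

```python
def hflip_image(image):
    """Mirror an image left-to-right; `image` is a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def hflip_yolo_labels(labels):
    """Mirror YOLO-format boxes (class, x_center, y_center, width, height).

    Coordinates are normalised to [0, 1], so only x_center changes.
    """
    return [(cls, 1.0 - xc, yc, w, h) for cls, xc, yc, w, h in labels]


# Tiny 2x3 "image" and one crosswalk box, both made up for the example.
image = [[0, 1, 2],
         [3, 4, 5]]
labels = [(0, 0.25, 0.5, 0.2, 0.1)]

flipped_image = hflip_image(image)
flipped_labels = hflip_yolo_labels(labels)
print(flipped_image)   # [[2, 1, 0], [5, 4, 3]]
print(flipped_labels)  # [(0, 0.75, 0.5, 0.2, 0.1)]
```

The point of the sketch is that the image and its labels must be transformed together; flipping only the pixels would leave every bounding box pointing at the wrong place.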
LIMITATIONS:
The model tended to misidentify non-crosswalk linear objects as crosswalks, resulting in many false positive predictions. This issue was traced to the composition of the training data: the images did not depict roads alone but also covered wider urban and suburban areas, where objects such as windows and solar panels confused the model. It can be addressed by filtering the predicted crosswalk locations with a road buffer (which was done as part of this research), by pre-processing the detection dataset to remove non-road areas, or by adding appropriate images to the training dataset as background (true negative) samples.
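The road-buffer filter described above can be illustrated with a minimal, pure-Python sketch. It is not the thesis workflow, which the software list suggests relied on GIS tooling for post-processing. The function names, the 10 m buffer distance and the sample coordinates are hypothetical, and all coordinates are assumed to be in a projected, metre-based system.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line and clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def filter_by_road_buffer(detections, roads, buffer_m):
    """Keep detection centres lying within buffer_m of any road centreline."""
    kept = []
    for det in detections:
        near_road = any(
            point_segment_distance(det, road[i], road[i + 1]) <= buffer_m
            for road in roads
            for i in range(len(road) - 1)
        )
        if near_road:
            kept.append(det)
    return kept


# One straight road and three predicted crosswalk centres (made-up values).
roads = [[(0.0, 0.0), (100.0, 0.0)]]
detections = [(50.0, 3.0), (50.0, 40.0), (120.0, 0.0)]
kept = filter_by_road_buffer(detections, roads, buffer_m=10.0)
print(kept)  # [(50.0, 3.0)]
```

A detection on a rooftop 40 m from the road is discarded, which is how the buffer removes false positives such as solar panels. The buffer distance is a trade-off: too narrow and genuine crosswalks on wide roads are lost, too wide and roadside clutter passes through.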
The trained model was also tested on imagery of higher spatial resolution and image quality than the imagery used to train it. While some additional crosswalk locations were identified this way, the predictions on the high-resolution imagery included a dramatically increased number of false positives. This was attributed to the sharper depiction of background linear objects at the higher image quality and spatial resolution, which made them more susceptible to misidentification. Training a new model on high-resolution imagery with an appropriately sized set of true negative samples might reduce the number of erroneous predictions. Such an approach would, however, greatly increase computational demands and make the designed system unsuitable for real-time detection.
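A short worked example shows why higher-resolution imagery raises computational demands: for a fixed tile size in pixels, the number of tiles needed to cover an area grows with the inverse square of the ground resolution. All numbers below (a 4 km by 4 km area, 500-pixel tiles, 25 cm and 8 cm per pixel) are assumed values for illustration, not figures from the thesis.

```python
def ceil_div(n, d):
    """Integer ceiling division for positive integers."""
    return (n + d - 1) // d


def tiles_needed(area_width_m, area_height_m, tile_px, resolution_cm_per_px):
    """Number of square, non-overlapping tiles needed to cover a rectangle."""
    tile_side_cm = tile_px * resolution_cm_per_px
    cols = ceil_div(area_width_m * 100, tile_side_cm)
    rows = ceil_div(area_height_m * 100, tile_side_cm)
    return cols * rows


# Hypothetical study area and resolutions.
coarse = tiles_needed(4000, 4000, tile_px=500, resolution_cm_per_px=25)
fine = tiles_needed(4000, 4000, tile_px=500, resolution_cm_per_px=8)
print(coarse)         # 1024
print(fine)           # 10000
print(fine / coarse)  # 9.765625
```

Under these assumptions the finer imagery needs almost ten times as many tiles for the same area, and every one of them has to be labelled for training or passed through the model at detection time.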
> Additional Visuals
Thesis Report Figure 17: Data training batch sample
Thesis Report Figure 18: Correct and incorrect predictions. a) true positive, b) false negative, c) false positive, d) true negative
Thesis Report Figure 20: Predictions comparison between the original (left images) and the fine-tuned (right images) models.
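The outcome categories shown in Figure 18 are the inputs to the usual detection metrics. The sketch below matches predicted boxes to ground-truth boxes by intersection over union (IoU) and derives precision and recall. The summary does not describe the thesis's evaluation procedure, so the greedy matching, the 0.5 IoU threshold, the function names and the sample boxes are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_outcomes(predicted, ground_truth, iou_threshold=0.5):
    """Greedily match predictions to ground-truth boxes; return (tp, fp, fn)."""
    unmatched = list(ground_truth)
    tp = 0
    for box in predicted:
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box is matched at most once
            tp += 1
    fp = len(predicted) - tp
    fn = len(unmatched)
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    """Precision and recall from outcome counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up boxes: one good detection, one spurious detection, one missed crosswalk.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(1, 1, 10, 10), (50, 50, 60, 60)]
tp, fp, fn = count_outcomes(predicted, ground_truth)
precision, recall = precision_recall(tp, fp, fn)
print(tp, fp, fn)         # 1 1 1
print(precision, recall)  # 0.5 0.5
```

True negatives do not enter either metric, which is why they are not counted here. A model with many false positives, as reported under the limitations, would show up as low precision.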
> Possible Applications
For ideas on how to implement some of the above-mentioned techniques, please see
‘Possible applications for students to try with Convolutional Neural Networks’