Crosswalk detection for the outdoor navigation of people with visual impairment

Year  2022 

Author(s)  Odyssefs Karatzaferis  

Link  http://www.gdmc.nl/publications/2022/MScThessiOdyssefsKaratzaferis.pdf 

ML Tags

Convolutional Neural Networks

Topic Tags

Remote Sensing / Object detection

Data augmentation

> Software & Plug-ins Used 


  • OpenStreetMap and Overpass Turbo for collecting open-source geographic data 

  • PDOK for aerial imagery 

  • YOLOv5 for the pre-trained CNN model 

  • Python in a Jupyter notebook for coding  

  • Colab virtual environment for model training 

  • ESRI’s ArcGIS Pro for pre- and post-processing and visualization

> Summary


The aim of this study is to use freely available aerial imagery and spatial data to identify the locations of pedestrian zebra crosswalks in a given area. This will ultimately support improved, independent navigation for people who are visually impaired: routes can be adjusted to take crosswalk locations into account, helping everyone to cross roads more safely. The main steps towards building the proposed system include:

  • Collecting a training imagery dataset

  • Choosing an appropriate algorithm  

  • Image quality control, grouping and labelling the images 

  • Implementing and training the CNN model  

  • Evaluating initial results  

  • Data augmentation  

  • Model fine-tuning  

  • Applying the trained model to crosswalk detection in Delft  

  • Testing the model with imagery of a different spatial resolution
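The data augmentation step can be illustrated with a small sketch. The summary does not say which augmentations were used in the thesis, so the flips below are an assumed example only. The sketch relies on the YOLO label format (class id, then box centre x, centre y, width and height, all normalised to [0, 1]); the toy nested-list image and the function names are hypothetical and chosen to keep the example dependency-free.

```python
from typing import List, Tuple

# One YOLO label row: (class_id, x_center, y_center, width, height),
# with the four box values normalised to the range [0, 1].
Label = Tuple[int, float, float, float, float]
# Toy greyscale image stored as rows of pixel values (assumption for the sketch).
Image = List[List[int]]


def flip_horizontal(image: Image, labels: List[Label]) -> Tuple[Image, List[Label]]:
    """Mirror the image left-to-right and mirror each box centre in x."""
    flipped_image = [row[::-1] for row in image]
    # Width and height are unchanged by a flip; only the centre moves.
    flipped_labels = [(c, 1.0 - x, y, w, h) for c, x, y, w, h in labels]
    return flipped_image, flipped_labels


def flip_vertical(image: Image, labels: List[Label]) -> Tuple[Image, List[Label]]:
    """Mirror the image top-to-bottom and mirror each box centre in y."""
    flipped_image = [row[:] for row in image[::-1]]
    flipped_labels = [(c, x, 1.0 - y, w, h) for c, x, y, w, h in labels]
    return flipped_image, flipped_labels


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6]]
    labels = [(0, 0.25, 0.5, 0.5, 0.25)]  # one crosswalk box, class 0
    print(flip_horizontal(image, labels))
    print(flip_vertical(image, labels))
```

Aerial imagery has no preferred "up" direction, so flips of this kind tend to produce plausible extra training samples, provided the labels are transformed together with the pixels.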


LIMITATIONS:  

  • The model tended to misidentify non-crosswalk linear objects as crosswalks, resulting in many false positive predictions. This issue was traced to unclear training data: the imagery covered not only roads but also wider urban and suburban areas, where objects such as windows and solar panels confused the model. It can be addressed by filtering the predicted crosswalk locations with a road buffer (which was done as part of this research), by pre-processing the detection dataset to filter out non-road areas, or by adding appropriate images to the training dataset as background (true negative) samples.

  • The trained model was also tested with imagery of higher spatial resolution and image quality than the imagery used to train it. While some additional crosswalk locations were identified this way, predictions on the high-resolution imagery included a dramatically increased number of false positives. This was attributed to the sharper depiction of background linear objects at the higher image quality and spatial resolution, which made them more susceptible to misidentification. Training a new model on high-resolution imagery with an appropriately sized set of true negative samples might help reduce the number of erroneous predictions. Such an approach would, however, greatly increase computational demands and make the designed system unsuitable for real-time detection.
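The road-buffer filter mentioned in the first limitation can be sketched in a few lines. The thesis lists ArcGIS Pro for post-processing, so this pure-Python version is only a stand-in under stated assumptions: detections are reduced to centre points, roads are polylines, all coordinates are in a projected system measured in metres, and the buffer distance is a hypothetical value, not one taken from the thesis.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in metres, projected coordinates assumed


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both ends coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def filter_by_road_buffer(
    detections: List[Point], roads: List[List[Point]], buffer_m: float
) -> List[Point]:
    """Keep only detections lying within buffer_m of any road segment."""
    kept = []
    for det in detections:
        near_road = any(
            point_segment_distance(det, a, b) <= buffer_m
            for road in roads
            for a, b in zip(road, road[1:])
        )
        if near_road:
            kept.append(det)
    return kept


if __name__ == "__main__":
    roads = [[(0.0, 0.0), (100.0, 0.0)]]  # one straight road centreline
    detections = [(50.0, 5.0), (50.0, 40.0), (105.0, 0.0)]
    # The detection 40 m from the road (e.g. a rooftop) is discarded.
    print(filter_by_road_buffer(detections, roads, buffer_m=10.0))
```

This brute-force check compares every detection with every road segment; for a city-scale road network a spatial index or a GIS buffer operation would likely be the more practical choice.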

> Additional Visuals



Thesis Report Figure 17: Data training batch sample



Thesis Report Figure 18: Correct and incorrect predictions. a) true positive, b) false negative, c) false positive, d) true negative



Thesis Report Figure 20: Predictions comparison between the original (left images) and the fine-tuned (right images) models.
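The outcome categories shown in Figure 18 are the usual inputs to standard detection metrics. As a rough sketch, precision, recall and F1 can be computed from the counts as below. The function name is hypothetical and the counts in the example are made up for illustration; they are not results from the thesis. True negatives do not enter these three metrics.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Precision, recall and F1 from true positive, false positive
    and false negative counts. Returns 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative counts only: many false positives lower precision
    # while leaving recall untouched.
    print(detection_metrics(tp=40, fp=60, fn=10))
```

A model that produces many false positives, as described in the limitations, would show up here as low precision even when recall is high.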

> Possible Applications


For ideas on how to implement some of the above-mentioned techniques, please see

‘Possible applications for students to try with Convolutional Neural Networks’
