Avasthi AI: Object Detection

I drive a lot, and hitting a pothole at highway speeds can be extremely dangerous. I always wondered whether AI could really detect potholes in the road, so I decided to give it a try. I always have a dashcam in my car that constantly records videos of my drives. I took a few of those videos, along with a pothole dataset available on the internet, to train a pothole detector.

I framed this as an object detection problem. As I approached it, it became clear that the dataset, in the form it came in, was not ready for training. Look at one of the pictures from the dataset.

The image contains a bunch of potholes. We could use this type of image if we were building a classifier that sorts images into two groups, one with potholes and one without.
Our intention is different. We want to locate the pothole in the frame, so we have to label our images. That is a time-consuming activity.
There are many tools available that help with tagging and labeling. I looked at quite a few of them and found VoTT from Microsoft to be a good tool. The fact that it can export tagged data in multiple formats was the icing on the cake.
So I got around to tagging a bunch of these images. At this point, the only object we are interested in is Pothole. After tagging, the image above looked like the image below.

We also did the same exercise with a bunch of dashcam videos. It is a really time-consuming exercise, and for that reason we did not end up with a substantial dataset.
Once the tagged and labeled dataset is ready, we need to export it to the tfrecord format. This is the format that TensorFlow is most familiar with, and it also makes it easy to merge multiple datasets into one.
Once we are done with this, we have a number of tfrecord files and a pbtxt file.

item {
 id: 1
 name: 'RoadBump'
}
item {
 id: 2
 name: 'Pothole'
}
item {
 id: 3
 name: 'People'
}
item {
 id: 4
 name: 'Truck'
}
item {
 id: 5
 name: 'Bus'
}
item {
 id: 6
 name: 'Car'
}
item {
 id: 7
 name: 'TwoWheeler'
}
item {
 id: 8
 name: 'AutoRickshaw'
}
item {
 id: 9
 name: 'BadRoad'
}
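
Before going further, it can be worth checking the label map mechanically. The following is a minimal sketch in Python 3, assuming the file uses the simple `item { id: N name: '...' }` layout shown above. The helper names `parse_label_map` and `check_label_map` are my own and are not part of the TensorFlow Object Detection API, and the regular expression is not a general protobuf text parser.

```python
import re

# Matches one `item { id: N name: '...' }` entry in the simple layout shown above.
# This is NOT a general protobuf text-format parser.
ITEM_RE = re.compile(
    r"item\s*\{\s*id:\s*(\d+)\s*name:\s*['\"]([^'\"]+)['\"]\s*\}"
)


def parse_label_map(text):
    """Return a list of (id, name) pairs in file order (hypothetical helper)."""
    return [(int(item_id), name) for item_id, name in ITEM_RE.findall(text)]


def check_label_map(pairs):
    """Raise ValueError if an id is below 1 or an id or name is repeated."""
    ids = [item_id for item_id, _ in pairs]
    names = [name for _, name in pairs]
    if any(item_id < 1 for item_id in ids):
        # The Object Detection API reserves id 0 for the background class.
        raise ValueError("label map ids should start at 1")
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate id in label map")
    if len(set(names)) != len(names):
        raise ValueError("duplicate name in label map")


if __name__ == "__main__":
    sample = "item {\n id: 1\n name: 'RoadBump'\n}\nitem {\n id: 2\n name: 'Pothole'\n}\n"
    pairs = parse_label_map(sample)
    check_label_map(pairs)
    print(pairs)
```

This kind of check matters most when label maps from several tagging sessions are merged, since a repeated id would silently mix two classes.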

Now we create two subdirectories, one to hold training data and another to hold evaluation data. We then split the tfrecord files randomly into two sets, one for training and another for evaluation. Both subdirectories can hold the same pbtxt file.
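
The split described above can be sketched in a few lines of Python 3. This is only an illustration: the function name `split_tfrecords`, the 20% evaluation fraction, and the fixed seed are my assumptions, not values I used for the actual run.

```python
import random
import shutil
from pathlib import Path


def split_tfrecords(src_dir, out_dir, eval_fraction=0.2, seed=42):
    """Copy *.tfrecord files from src_dir into out_dir/train and out_dir/eval.

    eval_fraction and seed are illustrative defaults (assumptions).
    Returns (train_files, eval_files) as lists of destination paths.
    """
    files = sorted(Path(src_dir).glob("*.tfrecord"))
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    rng.shuffle(files)

    # Keep at least one eval file whenever there is more than one file.
    n_eval = max(1, int(len(files) * eval_fraction)) if len(files) > 1 else 0
    groups = {"eval": files[:n_eval], "train": files[n_eval:]}

    result = {}
    for name, members in groups.items():
        dest = Path(out_dir) / name
        dest.mkdir(parents=True, exist_ok=True)
        result[name] = [Path(shutil.copy2(f, dest)) for f in members]
        # The same label map goes into both subdirectories.
        for label_map in Path(src_dir).glob("*.pbtxt"):
            shutil.copy2(label_map, dest)
    return result["train"], result["eval"]
```

Splitting by file rather than by individual record keeps the script simple, but it only gives a fair split if each tfrecord file holds a similar mix of images.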
The next step is to create a Google Cloud project. I am doing this work on an Ubuntu Linux machine. Download the Google Cloud SDK to make life easier. The Google Cloud developer tools website has good information on what needs to be done, but in summary, you need to authorize a user account.
The next step is to install TensorFlow. Even though I plan to run training on Google Cloud, I still need TensorFlow locally because some of its utilities are required to package the job.
In my experience, TensorFlow still works better with Python 2.7. It works with Python 3.5 and 3.6 as well, but if you have a choice, stay with Python 2.7.
You have a choice of installing TensorFlow or TensorFlow with GPU support. If you have a machine with CUDA, you need to install the CUDA 9.0 drivers. On Ubuntu 18.04, Nvidia installs CUDA 10.x by default, so you have to remove it and install the CUDA 9.0 drivers instead. Be careful when you uninstall the CUDA 10.x drivers: in my case, the removal took some important packages with it, and I had to reinstall them manually later.

$ sudo apt update
$ sudo apt install python-dev python-pip
$ sudo pip install -U virtualenv  # system-wide install
$ pip install --upgrade tensorflow      # install without GPU support
$ pip install --upgrade tensorflow-gpu  # or: GPU install

It may be prudent to set up a virtual environment so that you don't have to worry about broken dependencies. The TensorFlow website has good information on installation and is the best resource to follow.
The next step is to decide on the model. Since I am treating this as an object detection problem, I will start from an existing model checkpoint and use transfer learning.

$ git clone https://github.com/tensorflow/models

We go into the models/research directory; the code for object detection is in the object_detection directory. The TensorFlow model zoo contains pre-trained models. We decided to use faster_rcnn_inception_resnet_v2_atrous_coco as the base model for training. At this point, download the model and extract the archive into a directory. It is best to pick up the pipeline config file from the git repository. The config files are in the models/research/object_detection/samples/configs directory, and each model has a corresponding config file. I used faster_rcnn_inception_resnet_v2_atrous_coco.config for my model.
Now we need to set up our cloud environment. The first step is to create a bucket in Cloud Storage. The Google tutorial uses the following trick to name the bucket.

$ export PROJECT=$(gcloud config list project --format "value(core.project)")
$ export YOUR_GCS_BUCKET="gs://${PROJECT}-ml"

Within the cloud bucket, we create a directory named data. Within data, we create two subdirectories, train and eval, and copy the corresponding .tfrecord and pbtxt files into them. So ${YOUR_GCS_BUCKET}/data/train contains the training dataset and ${YOUR_GCS_BUCKET}/data/eval contains the evaluation dataset.
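
The bucket layout can be written down as a small Python 3 sketch. It assumes a local directory that already contains train and eval subdirectories; the helper names `plan_uploads` and `upload` are mine. The `upload` function simply shells out to `gsutil cp`, which ships with the Google Cloud SDK, and needs an authorized account to run.

```python
import subprocess
from pathlib import Path


def plan_uploads(local_data_dir, bucket):
    """Map local train/eval files to their gs:// destinations.

    bucket is a URL such as "gs://myproject-ml"; a trailing slash is ignored.
    Only .tfrecord and .pbtxt files are included.
    """
    root = bucket.rstrip("/")
    plan = []
    for split in ("train", "eval"):
        split_dir = Path(local_data_dir) / split
        for f in sorted(split_dir.iterdir()):
            if f.suffix in (".tfrecord", ".pbtxt"):
                plan.append((str(f), "%s/data/%s/%s" % (root, split, f.name)))
    return plan


def upload(plan):
    """Copy each planned file with `gsutil cp` (requires the Cloud SDK on PATH)."""
    for src, dst in plan:
        subprocess.run(["gsutil", "cp", src, dst], check=True)
```

Printing the plan before calling `upload` is a cheap way to confirm that nothing lands in the wrong subdirectory.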
We need to package pycocotools to be submitted along with the model.

$ cd models/research
$ bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools

We also need to add the models/research folder and the models/research/slim folder to the PYTHONPATH variable. From the models/research directory, we need to compile the protobuf files.

$ protoc object_detection/protos/*.proto --python_out=.

Now we generate all the archives needed to run the training. From the models/research directory, we run the following commands.

$ python setup.py sdist
$ (cd slim && python setup.py sdist)

Now we need to configure the pipeline config file. We use the faster_rcnn_inception_resnet_v2_atrous_coco.config file as the base. Basically, we need to replace every instance of PATH_TO_BE_CONFIGURED in the file with an appropriate value. We need to set the following values.

fine_tune_checkpoint: the cloud location of the model checkpoint.
input_path for training data: the cloud location of the training files. Wildcards are allowed here, for example gs://myproject-ml/data/train/*.tfrecord.
label_map_path for training data: for example gs://myproject-ml/data/train/tf_label_map.pbtxt.
input_path for eval data: the cloud location of the eval files. Wildcards are allowed here, for example gs://myproject-ml/data/eval/*.tfrecord.
label_map_path for eval data: for example gs://myproject-ml/data/eval/tf_label_map.pbtxt.
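
Editing these values by hand works fine, but the substitution can also be scripted. Below is a minimal Python 3 sketch under two assumptions: each field appears in the config as `key: "value"`, and the training reader appears before the eval reader in the file. The function names `set_field` and `fill_pipeline_config` are hypothetical helpers, not part of the Object Detection API, and the checkpoint location is passed in by the caller.

```python
import re

PLACEHOLDER = "PATH_TO_BE_CONFIGURED"


def set_field(config_text, key, values):
    """Replace the quoted value of each `key: "..."` occurrence, in file order.

    Raises ValueError if the number of occurrences differs from len(values).
    """
    pattern = re.compile(r'(\b%s:\s*")[^"]*(")' % re.escape(key))
    remaining = list(values)

    def repl(match):
        if not remaining:
            raise ValueError("more %r fields than values supplied" % key)
        return match.group(1) + remaining.pop(0) + match.group(2)

    result = pattern.sub(repl, config_text)
    if remaining:
        raise ValueError("fewer %r fields than values supplied" % key)
    return result


def fill_pipeline_config(config_text, checkpoint, data_root):
    """Fill in the five values listed above.

    Assumes the train reader comes before the eval reader in the file.
    data_root is e.g. "gs://myproject-ml/data".
    """
    text = set_field(config_text, "fine_tune_checkpoint", [checkpoint])
    text = set_field(text, "input_path", [
        data_root + "/train/*.tfrecord",
        data_root + "/eval/*.tfrecord",
    ])
    text = set_field(text, "label_map_path", [
        data_root + "/train/tf_label_map.pbtxt",
        data_root + "/eval/tf_label_map.pbtxt",
    ])
    if PLACEHOLDER in text:
        raise ValueError("some %s entries were not replaced" % PLACEHOLDER)
    return text
```

The final check is the useful part: a leftover PATH_TO_BE_CONFIGURED is caught locally instead of after the job has been submitted.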

For convenience, I also renamed the config file to pipeline.config. Now we upload the config file to the data directory in Cloud Storage.
Now we are ready to submit our job for training.

$ gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
      --job-dir=${YOUR_GCS_BUCKET}/train \
      --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
      --module-name object_detection.model_main \
      --config train.yaml \
      --runtime-version 1.10 \
      -- \
      --pipeline_config_path=${YOUR_GCS_BUCKET}/data/pipeline.config \
      --model_dir=${YOUR_GCS_BUCKET}/data/model_dir

Note that the last two command-line arguments come after -- and are passed on to the training module (object_detection.model_main) rather than to gcloud.
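
A long multi-line shell command is easy to get wrong, so here is a sketch of the same submission built as an argument list in Python 3. It mirrors the flags used above and makes the split at -- explicit. The function name `build_submit_command` is my own, and the sketch assumes the same package paths and train.yaml file as the command above.

```python
import getpass
import time


def build_submit_command(bucket, job_name=None):
    """Return the argv list for the training job submission shown above.

    Everything after the bare "--" is forwarded to object_detection.model_main.
    """
    if job_name is None:
        # Same idea as `whoami`_object_detection_`date +%s` in the shell version.
        job_name = "%s_object_detection_%d" % (getpass.getuser(), int(time.time()))
    packages = ",".join([
        "dist/object_detection-0.1.tar.gz",
        "slim/dist/slim-0.1.tar.gz",
        "/tmp/pycocotools/pycocotools-2.0.tar.gz",
    ])
    gcloud_args = [
        "gcloud", "ml-engine", "jobs", "submit", "training", job_name,
        "--job-dir=%s/train" % bucket,
        "--packages", packages,
        "--module-name", "object_detection.model_main",
        "--config", "train.yaml",
        "--runtime-version", "1.10",
    ]
    module_args = [
        "--pipeline_config_path=%s/data/pipeline.config" % bucket,
        "--model_dir=%s/data/model_dir" % bucket,
    ]
    return gcloud_args + ["--"] + module_args
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the models/research directory, which avoids shell quoting and line-continuation issues altogether.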
In my case, I ran the training for approximately 24 hours, and even with insufficient data the results were remarkable. Here are some of the images with detection in action.

As we can see, even a limited amount of training data and compute can produce remarkable results. As a next step, we need to tag and label a larger amount of data and then train for longer.


Saturday, 24 November 2018

A pothole detector
