Neural Networks (TF) + ROS2

Sutej Kulgod
5 min read

The use of CNNs for image processing has gained massive popularity in recent times due to their accuracy and robustness.

Another tool which has gained enormous acceptance in the robotics community is the all-powerful Robot Operating System 2 (ROS2). The introduction of ROS2 has made it easier to have multiple robots on the same ROS network and has facilitated the use of small embedded platforms as participants in the ROS environment.

I believe the marriage of these two popular tools was inevitable for our product, as we at Nymble strive to make use of cutting-edge technology to give cooking a new life. Our product uses a ROS2 network as its backbone and a deep neural network to detect the contents of dispensing boxes during automated cooking, or the contents of the pan during manual cooking.

One major hurdle we faced was integrating learning-based image detection into the ROS network on a mobile platform. ROS2 is bleeding-edge, and online resources on how to do this integration are scarce. Hence, I decided to draft this post to ease the process for those trying to achieve something similar in their own projects.

We used the following setup process:

  1. The first step is installing ROS2 from source. While installing ROS2 on the device, make sure you ament-ignore the packages you don't need and the packages that throw errors during compilation (see the first snippet after this list). After the ignore files have been placed in the packages to be skipped, follow the same procedure given on the ROS2 installation page.
  2. Installing TensorFlow onto the system is a little trickier, as you need to find the right version of TensorFlow for the Python version you want to use. Download the version you need for the armv7 processor from the following resource, and install it onto your device using the following command. (The example given here is for TensorFlow 1.8.0 and Python 3.5.)
    sudo pip3 install tensorflow-1.8.0-cp35-none-linux_armv7l.whl
    I would recommend going through the basic tutorials available for ROS2 and TensorFlow before heading further, as that will help you understand and execute things more efficiently.
  3. Train the TensorFlow graph on an external computer, save the graph, and transfer it onto the single-board computer. Even though it is possible to train the network on a single-board computer, I would recommend doing it on a more powerful machine or Google Colab to save time and to avoid the possibility of dealing with a slowed-down or crashed system.
    If you are planning to use transfer learning to retrain an existing module, avoid modules that take up more than 400MB of RAM when deployed (for example, Inception modules), as they will definitely crash the system and leave you high and dry. I would recommend using a suitable MobileNet module instead, as they are light and efficient.
    One thing to be cautious about while training the graph: decoding and resizing an image with TensorFlow functions and with OpenCV functions gives different pixel values, which can change the accuracy of the results by around 10–15% in most cases. To avoid this loss of accuracy, use the same method for loading and resizing the image while training and while loading the image for detection (see the loading sketch after this list).
  4. This brings us to the final and most crucial part of the process: integrating the image recognition code with ROS2. For integrating the detection into a ROS network, you need one node that loads the TensorFlow graph and passes the image through it to obtain the result, and another node that sends an image (or information about an image to be loaded) to the detection code. You can use Python, C++, or a combination of the two to get the nodes up and running.
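As a concrete example for step 1, skipping a package only requires dropping an empty ignore file into its directory. The workspace path here is illustrative, and newer colcon-based builds look for a COLCON_IGNORE file instead:

    touch ~/ros2_ws/src/<package_to_skip>/AMENT_IGNORE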
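For the pixel-value pitfall in step 3, the safest approach is a single shared loading routine used on both sides. Here is a minimal sketch, assuming OpenCV is used for decoding and resizing everywhere; the function name and the 224x224 size are illustrative:

#shared loader (sketch): call the same decode/resize code while building
#the training set and while preparing a frame for detection
import cv2

def load_image(path, size=224):
  image = cv2.imread(path)                #OpenCV decode (uint8, BGR)
  return cv2.resize(image, (size, size))  #OpenCV resize, not tf.image resizing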
[Image: ROS2 network]

For my project, I created a custom service with image data as the request and a string carrying the detection result as the response. The program which sends the image and receives the detection information acts as the ROS client, and the image detection program is the service. I would suggest using a similar architecture instead of a publisher/subscriber network, to avoid the complexities involved in matching each image to its result. I used Python for the image detection part of the code, and chose the message type std_msgs/UInt8MultiArray over sensor_msgs/Image because it makes it convenient to fiddle with individual pixel values and easy to change the orientation of the received image.
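For reference, here is what the custom ImgStr service definition could look like. This is a sketch: the field names im and veg are taken from the code below, and the file would live in the srv/ directory of the customsrv package.

#ImgStr.srv (assumed layout, matching the fields used in the code below)
std_msgs/UInt8MultiArray im
---
std_msgs/String veg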

The following code should give you a better picture of how you can set up a service for image detection.

#importing the required functions
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import cv2
import numpy as np
import tensorflow as tf
import rclpy
from rclpy.executors import SingleThreadedExecutor

#importing the custom service created
from customsrv.srv import ImgStr

#buffer for the received image (224x224, 3 channels, matching the network input)
img = np.zeros((224, 224, 3))

#creating a class for the service definition
class Service:
    def __init__(self):
        self.node = rclpy.create_node('image_client')
        self.srv = self.node.create_service(ImgStr, 'image_detect', self.detect_callback)

    def detect_callback(self, request, response):
        #loading the tensorflow graph
        model_file = "Tensorflow/Temp_fc/output_graph.pb"
        label_file = "Tensorflow/Temp_fc/output_labels.txt"
        input_layer = "Placeholder"
        output_layer = "final_result"
        graph = load_graph(model_file)
        print("received data")

        #unpacking the flat UInt8MultiArray back into a 224x224x3 image
        global img
        for i in range(224):
            for j in range(224):
                for k in range(3):
                    img[j, i, k] = request.im.data[k + j*3 + i*224*3]

        #converting the image into a tensor form
        t = read_tensor_from_image_file(img)

        input_name = "import/" + input_layer
        output_name = "import/" + output_layer
        input_operation = graph.get_operation_by_name(input_name)
        output_operation = graph.get_operation_by_name(output_name)

        with tf.Session(graph=graph) as sess:
            results = sess.run(output_operation.outputs[0], {
                input_operation.outputs[0]: t
            })
        results = np.squeeze(results)

        #picking the five most confident labels
        top_k = results.argsort()[-5:][::-1]
        labels = load_labels(label_file)
        response.veg.data = labels[top_k[0]]
        for i in top_k:
            print(labels[i], results[i])
        return response

#defining the load graph function
def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph

#defining the function which converts the image data into a tensor
def read_tensor_from_image_file(img):
    np_image_data = np.asarray(img)
    np_image_data = np.divide(np_image_data.astype(float), 255)
    np_final = np.expand_dims(np_image_data, axis=0)
    return np_final

#defining the function which reads the label file into a list
def load_labels(label_file):
    label = []
    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label

def main(args=None):
    rclpy.init(args=args)
    print("creating service")

    #create the service
    service = Service()

    #spinning the node with a blocking call so the service stays alive
    print("spinning")
    executor = SingleThreadedExecutor()
    executor.add_node(service.node)
    executor.spin()

if __name__ == '__main__':
    main()
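To round things off, here is a minimal sketch of what the client side could look like. This is not from the original post: the image path, node name, and use of cv2.imread are assumptions, and the flattening order simply mirrors the unpacking loop in the service above.

#minimal client sketch (assumed, not part of the original post)
import cv2
import rclpy
from customsrv.srv import ImgStr

def main(args=None):
    rclpy.init(args=args)
    node = rclpy.create_node('image_sender')
    client = node.create_client(ImgStr, 'image_detect')
    while not client.wait_for_service(timeout_sec=1.0):
        print('waiting for the detection service...')

    #load and resize with OpenCV, matching the training-time pipeline
    image = cv2.resize(cv2.imread('test.jpg'), (224, 224))

    request = ImgStr.Request()
    #flatten in the k + j*3 + i*224*3 order the service unpacks
    request.im.data = image.transpose(1, 0, 2).flatten().tolist()

    future = client.call_async(request)
    rclpy.spin_until_future_complete(node, future)
    print('detected:', future.result().veg.data)

    node.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()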

We at Nymble are committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better our work will be.

If you are interested in applying learning-based methods in robotics, write to us at hello@nymble.in!
