A.I. - Practical challenges in image recognition

Present-day AI is largely based on training artificial neural networks (NNs) on large training data sets. With this approach, an NN can be trained to classify a test image into one of a designated set of labels, or classes. The accuracy of the NN depends heavily on the amount of training data fed into it. As the training data set grows, the training process becomes increasingly resource hungry. Training multi-layer NNs on large data sets therefore calls for high-performance processors (CPUs and, commonly, GPUs) and large amounts of RAM.

NNs need the most compute power during the training phase. Once training is complete, the NN can run on reduced compute capacity, since inference needs only the forward pass and backpropagation is no longer required. This makes an NN a good candidate to run on cloud virtual machines (VMs), which offer elastic and scalable compute. However, this may not be as simple as it sounds. An NN must work as a single integrated entity even when its computation is distributed across multiple VMs. The neurons must be able to connect and interact seamlessly with other neurons, both within a layer and across layers. One possible solution is to interconnect the VMs in a single VXLAN and to create API adaptors on the VMs that receive and propagate activation and backpropagation signals to neighboring neurons that may sit on different VMs.
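As a rough illustration of the "single integrated entity" requirement, the sketch below splits the neurons of one layer across two workers and checks that the stitched-together result matches the single-machine result. The `Worker` class and its `handle_activation` method are hypothetical stand-ins for a VM and its API adaptor; here they are plain in-process calls rather than network requests, and only the forward (inference) pass is shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weights, biases, inputs):
    """Activations for a group of neurons; each row of `weights` is one neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


class Worker:
    """Hypothetical stand-in for one VM that holds a slice of a layer's neurons."""

    def __init__(self, weights, biases):
        self.weights = weights
        self.biases = biases

    def handle_activation(self, inputs):
        # In a real deployment this would be a network call to an API adaptor.
        return layer_forward(self.weights, self.biases, inputs)


def distributed_forward(workers, inputs):
    """Send the same inputs to every worker and stitch the slices back together."""
    outputs = []
    for worker in workers:
        outputs.extend(worker.handle_activation(inputs))
    return outputs


# A 4-neuron layer with 3 inputs (arbitrary illustrative values).
WEIGHTS = [
    [0.2, -0.1, 0.4],
    [0.7, 0.3, -0.5],
    [-0.6, 0.8, 0.1],
    [0.05, 0.9, -0.3],
]
BIASES = [0.1, -0.2, 0.0, 0.3]
INPUTS = [1.0, 0.5, -1.5]

single_machine = layer_forward(WEIGHTS, BIASES, INPUTS)

# Neurons 0-1 live on the first worker, neurons 2-3 on the second.
workers = [Worker(WEIGHTS[:2], BIASES[:2]), Worker(WEIGHTS[2:], BIASES[2:])]
split_across_vms = distributed_forward(workers, INPUTS)
```

Because each neuron in a layer depends only on the previous layer's outputs, the slices can be computed independently. The cost that this sketch leaves out is the communication between VMs for every layer, which is where the network design matters.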

Further, a need may arise for a pre-trained NN to learn to identify a new class of images. During training, the NN settles on a final set of neuronal weights through repeated feed-forward and backpropagation cycles. Once training is complete, the NN generally cannot learn a new image class through incremental training cycles that target only that class, because such cycles tend to overwrite the weights that encode the existing classes. To include a new image class, additional full rounds (epochs) of training are required on a well-shuffled training data set that includes sufficient samples of the new class alongside the existing ones. This process revises the previous neuronal weights and accommodates the new class in the final fully connected layer. It is often prudent to keep a few redundant neurons in the last layer of the NN that can be assigned to new image classes as future needs arise.
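The idea of reserving redundant output neurons can be sketched as follows. This is a minimal illustration under assumptions, not a prescribed design: the `OutputLayer` class, its `assign_spare` method, and the masked softmax are hypothetical names introduced here. Spare neurons are masked out of the prediction until a label is assigned to them.

```python
import math


def masked_softmax(logits, active):
    """Softmax over the active output neurons only; inactive ones get probability 0."""
    peak = max(l for l, a in zip(logits, active) if a)
    exps = [math.exp(l - peak) if a else 0.0 for l, a in zip(logits, active)]
    total = sum(exps)
    return [e / total for e in exps]


class OutputLayer:
    """Hypothetical final layer that reserves spare neurons for future classes."""

    def __init__(self, labels, spare):
        # A label of None marks a neuron that is reserved but not yet in use.
        self.labels = list(labels) + [None] * spare

    @property
    def active(self):
        return [label is not None for label in self.labels]

    def assign_spare(self, label):
        """Attach a new class label to the first unused neuron."""
        index = self.labels.index(None)  # raises ValueError if no spare is left
        self.labels[index] = label
        return index

    def predict(self, logits):
        probs = masked_softmax(logits, self.active)
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.labels[best], probs


layer = OutputLayer(["cat", "dog"], spare=2)
logits = [1.0, 2.0, 5.0, 0.0]  # illustrative raw outputs of the 4 neurons

# The third neuron has the largest logit but is still unassigned, so it is ignored.
label_before, probs_before = layer.predict(logits)

layer.assign_spare("horse")
label_after, probs_after = layer.predict(logits)
```

Assigning a label only changes the bookkeeping. The network still needs the full, shuffled retraining described above before the newly assigned neuron produces meaningful outputs.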

Another challenge in image recognition is image segmentation. A given picture may contain multiple objects, arranged in any fashion. They may overlap significantly, vary in size, be partly occluded or deformed, or be unevenly lit, all of which makes it difficult for the NN to identify them accurately. It then becomes necessary to segment the picture into individual object images. Often, however, it is difficult to identify an object's outline without knowing what the object is, which creates a chicken-and-egg situation. Various techniques are used for image segmentation, such as Canny edge detection and DeepMask, and considerable progress is being made in improving their accuracy.

Compared with 2D, it is easier to distinguish objects in 3D space. Human eyes can tell apart the objects we see partly because of our ability to sense and judge relative distances. Some recently launched mobile phones carry a dual-lens camera that helps to differentiate the distances of objects in view. The iPhone 7 Plus, for example, comes with a dual-lens camera that provides a depth-of-field effect, which may further aid image processing in an AI context. The advent of dual-lens 3D cameras should make image segmentation easier.
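To show why depth information can simplify segmentation, here is a minimal sketch that groups neighbouring pixels whose depths are close to each other. It uses simple region growing on a depth map, which is neither Canny edge detection nor DeepMask. The depth map is synthetic, and the function name `segment_by_depth` and the `max_gap` parameter are assumptions made for this example. In practice, the depth values would have to be estimated, for instance from the disparity between the two lenses.

```python
from collections import deque


def segment_by_depth(depth_map, max_gap):
    """Label 4-connected pixels whose depths differ by at most `max_gap`.

    Returns (labels, region_count), where labels has the same shape as depth_map.
    """
    rows, cols = len(depth_map), len(depth_map[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first region growing
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and labels[ny][nx] == -1
                        and abs(depth_map[ny][nx] - depth_map[y][x]) <= max_gap
                    ):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label


# Synthetic depth map in metres: a background at 5.0, a near object at 1.0,
# and a second object at 2.0 that touches the first one in the picture.
DEPTH = [
    [5.0, 5.0, 5.0, 5.0, 5.0],
    [5.0, 1.0, 1.0, 5.0, 5.0],
    [5.0, 1.0, 1.0, 2.0, 5.0],
    [5.0, 5.0, 2.0, 2.0, 5.0],
    [5.0, 5.0, 5.0, 5.0, 5.0],
]

labels, region_count = segment_by_depth(DEPTH, max_gap=0.5)
```

The two touching objects end up in different regions purely because they sit at different distances, even if they had identical colour and texture in the 2D picture. Real depth maps are noisy, so a fixed `max_gap` would need tuning or a more robust method.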

The discussion above highlights some of the implementation challenges in image recognition, along with potential solutions. More sophisticated architectures and algorithms are expected to improve the accuracy of NNs. Going forward, NNs should also require less training data as they learn to generalize better. AI services built on such NNs would enable end users to train them on their own data sets, making image recognition more flexible and easier to use.
