This is similar to teaching people to distinguish plant species by showing them photos. Successful object recognition improves manufacturing in two ways. The intermediate goal of a recognition system is to collect data about observed objects and events: accurate, unbiased statistics are the key to revealing opportunities for improvement in internal business processes. The ultimate goal of the system is the automatic execution of certain actions.
For example, if a computer sees a defect, it sends a command to a mechanism that removes the defective part from the line. In some cases, the system needs to send an alarm to personnel, indicating a detected abnormality. Multi-purpose object recognition software offered by BitRefine Group can serve both goals: it connects to automation hardware and to visual data analysis tools. The output of these tools, such as search queries, tables, and charts, provides a better picture of manufacturing processes. Today companies are starting their transformation towards a data-driven business to boost profits and gain competitive advantages.
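This control loop can be sketched as a simple dispatch on the recognition result. All names here (`Detection`, `dispatch`, the callbacks) are hypothetical stand-ins, not BitRefine's actual API:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "defect" or "ok"
    confidence: float  # classifier score in [0, 1]

def dispatch(detection, eject, alert, threshold=0.9):
    """Route a recognition result to automation hardware or to personnel.

    `eject` and `alert` are callbacks standing in for the real
    actuator and notification integrations.
    """
    if detection.label == "defect":
        if detection.confidence >= threshold:
            eject()        # confident: remove the part from the line
            return "ejected"
        alert()            # uncertain: let a human decide
        return "alerted"
    return "passed"
```

The threshold separates the two actions from the text: a confident defect triggers the removal mechanism, a borderline one raises an alarm instead.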
Big data is the core of this transformation, and computer vision is an essential source of such data. Today developers offer recognition platforms either as SaaS or as a classic software package. With SaaS, you upload visual data to the provider's server, which then returns a list of the objects it managed to recognize.
However, there are a few downsides. If you choose classic software, you first need a powerful computer, usually a high-end machine. Then you install the object recognition software, connect cameras or other sources of visual data, configure the system, and let it run. You also need a specialist who will keep an eye on the running modules and handle maintenance and other services typical for IT systems.
Outline of object recognition
The main advantage of an on-premises object recognition solution is that it lets you work with any type of visual data at any speed. At BitRefine, we realize that both options serve a specific purpose and are therefore valuable, so BitRefine offers its computer vision platform both as a SaaS solution and as on-premises software.
The client is free to choose the option that better meets their requirements. In a dynamic industry such as manufacturing, a client may be interested in both. For instance, some clients may want to use the SaaS service for a while before committing to an on-premises solution. The field of AI machine vision is growing rapidly, and many developers are already reaching clients with their solutions. This can overwhelm business leaders and create anxiety and confusion.
Some companies are convinced that they need to use an object recognition system but are not sure which one.
Here we offer general guidelines for choosing an optimal system from the available options. Machines that can see are able to solve a broad range of tasks. Usually, after a successful implementation in one area, a company immediately discovers several new areas where recognition software would fit perfectly. Therefore, from the very beginning a company should give priority to flexible, adaptable platforms over software with a limited list of detectable objects. Flexibility means significant savings in the future. Another aspect of flexibility is the ability of the recognition software to run on different hardware platforms.
Hardware developers follow AI trends and keep bringing to market various compute units optimized for artificial neural networks. If object recognition software runs only on a full-size Windows PC, you will not be able to leverage more powerful, yet more compact and cheaper, hardware units in the future.
A key consideration is not just the computer vision software but also the expert developers behind it. As we mentioned before, neural networks need training. This means you need either an in-house specialist or the services of an external agency, such as a system integrator or the developer of the computer vision software.
In this case, the client just needs to pass footage to the developers and fix the requirements. The developer will adapt the neural network, train it properly, and provide the client with a ready-to-go machine. Companies leverage computer vision to improve three general areas: increasing yields, reducing costs, and gaining competitive advantages by integrating AI into the products themselves. The integration of computer vision in various sectors, including manufacturing, is happening at a rapid pace.
Although in the beginning only big players could afford this kind of technology, now no company can afford to fall behind. Today general-purpose computer vision solutions have removed all previous entry barriers and allow even a small manufacturing company to put artificial intelligence at its service. There is a growing consensus among business leaders that this is the time to include computer vision in manufacturing.

Expert excerpt: machine vision system based on AI

Classic machine vision software needs hard-coded instructions to find and check a certain object in the image.
We model this process with a fully convolutional network [7], which we describe in this section. Because our ultimate goal is to share computation with a Fast R-CNN object detection network [2], we assume that both nets share a common set of convolutional layers. In our experiments, we investigate the Zeiler and Fergus model [32] (ZF), which has 5 shareable convolutional layers, and the Simonyan and Zisserman model [3] (VGG), which has 13 shareable convolutional layers. To generate region proposals, we slide a small network over the convolutional feature map output by the last shared convolutional layer.
This feature is fed into two sibling fully-connected layers: a box-regression layer (reg) and a box-classification layer (cls). Note that because the mini-network operates in a sliding-window fashion, the fully-connected layers are shared across all spatial locations.
At each sliding-window location, we simultaneously predict multiple region proposals, where the maximum number of possible proposals for each location is denoted as k. So the reg layer has 4k outputs encoding the coordinates of k boxes, and the cls layer outputs 2k scores that estimate the probability of object or not object for each proposal. (For simplicity we implement the cls layer as a two-class softmax layer; alternatively, one may use logistic regression to produce k scores.) The k proposals are parameterized relative to k reference boxes, which we call anchors.
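The output shapes described above can be sketched in plain NumPy by treating the reg and cls layers as per-location matrix multiplies over the sliding-window feature (the layer sizes here are illustrative, not the paper's exact configuration):

```python
import numpy as np

k = 9            # anchors per location (e.g. 3 scales x 3 aspect ratios)
C = 256          # channels of the intermediate sliding-window feature
H, W = 38, 50    # spatial size of the shared convolutional feature map

# Intermediate feature produced by the small network at every location.
feat = np.random.randn(H, W, C)

# The sibling layers are shared across locations, so each is just one
# weight matrix applied at every spatial position.
W_reg = np.random.randn(C, 4 * k)   # box-regression weights
W_cls = np.random.randn(C, 2 * k)   # object / not-object weights

reg = feat @ W_reg   # (H, W, 4k): 4 coordinates per anchor
cls = feat @ W_cls   # (H, W, 2k): 2 scores per anchor

print(reg.shape, cls.shape)  # (38, 50, 36) (38, 50, 18)
```

Because the same `W_reg` and `W_cls` are used at every position, this also illustrates why the fully-connected layers are shared across all spatial locations.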
An important property of our approach is that it is translation invariant, both in terms of the anchors and the functions that compute proposals relative to the anchors. If one translates an object in an image, the proposal should translate and the same function should be able to predict the proposal in either location.
As a comparison, the MultiBox method [27] uses k-means to generate anchors, which are not translation invariant, so MultiBox does not guarantee that the same proposal is generated if an object is translated. The translation-invariant property also reduces the model size: our output layer has about 2.8 × 10⁴ parameters.

Multi-Scale Anchors as Regression References
Our design of anchors presents a novel scheme for addressing multiple scales and aspect ratios. There have been two popular ways for multi-scale predictions. The first way is based on image/feature pyramids: the images are resized at multiple scales, and feature maps are computed for each scale. This way is often useful but is time-consuming. The second way is to use sliding windows of multiple scales (and/or aspect ratios) on the feature maps; it is usually adopted jointly with the first way [8].
As a comparison, our anchor-based method is built on a pyramid of anchors, which is more cost-efficient. Our method classifies and regresses bounding boxes with reference to anchor boxes of multiple scales and aspect ratios. It relies only on images and feature maps of a single scale, and uses filters (sliding windows) of a single size on the feature map.
Because of this multi-scale design based on anchors, we can simply use the convolutional features computed on a single-scale image, as is also done by the Fast R-CNN detector [ 2 ]. The design of multi-scale anchors is a key component for sharing features without extra cost for addressing scales.
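The pyramid of anchors at a single location can be sketched as the cross product of scales and aspect ratios; k = 3 × 3 = 9 anchors per position. The scale values below are illustrative:

```python
import numpy as np

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (len(scales) * len(ratios), 4) anchors as (x1, y1, x2, y2),
    all centred on (cx, cy). `ratios` is height / width."""
    anchors = []
    for s in scales:              # target side length: area is s * s
        for r in ratios:
            w = s / np.sqrt(r)    # widen for flat boxes
            h = s * np.sqrt(r)    # heighten for tall boxes
            anchors.append([cx - w / 2, cy - h / 2,
                            cx + w / 2, cy + h / 2])
    return np.array(anchors)

anchors = make_anchors(0, 0)
print(anchors.shape)  # (9, 4)
```

Each scale contributes three boxes of equal area but different shape, which is what lets a single-scale feature map cover objects of multiple sizes.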
For training RPNs, we assign a binary class label (of being an object or not) to each anchor. We assign a positive label to two kinds of anchors: (i) the anchor or anchors with the highest IoU overlap with a ground-truth box, or (ii) an anchor that has an IoU overlap higher than 0.7 with any ground-truth box. Note that a single ground-truth box may assign positive labels to multiple anchors. Usually the second condition is sufficient to determine the positive samples, but we still adopt the first condition because in some rare cases the second condition may find no positive sample. We assign a negative label to a non-positive anchor if its IoU ratio is lower than 0.3 for all ground-truth boxes.
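This labeling rule can be sketched with a vectorized IoU and the paper's 0.7 / 0.3 thresholds; anchors that satisfy neither condition are ignored during training:

```python
import numpy as np

def iou(boxes, gt):
    """IoU between each box in `boxes` (N, 4) and each box in `gt` (M, 4),
    boxes given as (x1, y1, x2, y2). Returns an (N, M) matrix."""
    x1 = np.maximum(boxes[:, None, 0], gt[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], gt[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], gt[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], gt[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / (area_b[:, None] + area_g[None, :] - inter)

def label_anchors(anchors, gt, pos_thr=0.7, neg_thr=0.3):
    """1 = object, 0 = background, -1 = ignored."""
    overlaps = iou(anchors, gt)               # (N, M)
    max_per_anchor = overlaps.max(axis=1)
    labels = np.full(len(anchors), -1)
    labels[max_per_anchor < neg_thr] = 0      # negative: low IoU with all gt
    labels[max_per_anchor >= pos_thr] = 1     # condition (ii): high IoU
    labels[overlaps.argmax(axis=0)] = 1       # condition (i): best anchor per gt
    return labels

labels = label_anchors(np.array([[0., 0, 10, 10], [0, 0, 8, 8]]),
                       np.array([[0., 0, 10, 10]]))
print(labels.tolist())  # [1, -1]
```

Applying condition (i) last guarantees every ground-truth box gets at least one positive anchor, exactly the rare case the text describes where condition (ii) alone would find no positive sample.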