Harbin Institute of Technology computer science graduate admission second-round exam (复试): Computer vision

Site editor, 免费考研网 / 2020-02-22

Computer vision

From Wikipedia, the free encyclopedia

Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions. A theme in the development of this field has been to duplicate the abilities of human vision by electronically perceiving and understanding an image.[5] Understanding in this context means the transformation of visual images (the input of the retina) into descriptions of the world that can interface with other thought processes and elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.[6] Computer vision has also been described as the enterprise of automating and integrating a wide range of processes and representations for visual perception.[7]

As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras, or multi-dimensional data from a medical scanner. As a technological discipline, computer vision seeks to apply its theories and models to the construction of computer vision systems.

Sub-domains of computer vision include scene reconstruction, event detection, video tracking, object recognition, object pose estimation, learning, indexing, motion estimation, and image restoration.

Representational and control requirements

Image-understanding systems (IUS) include three levels of abstraction: the low level includes image primitives such as edges, texture elements, or regions; the intermediate level includes boundaries, surfaces, and volumes; and the high level includes objects, scenes, or events. Many of these requirements are really topics for further research.

The representational requirements in designing an IUS for these levels are: representation of prototypical concepts, concept organization, spatial knowledge, temporal knowledge, scaling, and description by comparison and differentiation.

While inference refers to the process of deriving new, not explicitly represented facts from currently known facts, control refers to the process that selects which of the many inference, search, and matching techniques should be applied at a particular stage of processing. Inference and control requirements for IUS are: search and hypothesis activation, matching and hypothesis testing, generation and use of expectations, change and focus of attention, certainty and strength of belief, inference and goal satisfaction.[10]

Typical tasks of computer vision

Each of the application areas described above employs a range of computer vision tasks: more or less well-defined measurement or processing problems that can be solved using a variety of methods. Some examples of typical computer vision tasks are presented below.

Recognition

The classical problem in computer vision, image processing, and machine vision is that of determining whether or not the image data contains some specific object, feature, or activity. Different varieties of the recognition problem are described in the literature:

• Object recognition (also called object classification) – one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene. Blippar, Google Goggles and Like That provide stand-alone programs that illustrate this functionality.

• Identification – an individual instance of an object is recognized. Examples include identification of a specific person's face or fingerprint, identification of handwritten digits, or identification of a specific vehicle.

• Detection – the image data are scanned for a specific condition. Examples include detection of possible abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data which can be further analyzed by more computationally demanding techniques to produce a correct interpretation.
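
The two-stage idea in the Detection item (a cheap scan that proposes regions, followed by a costlier check on only those regions) can be sketched as follows. This is a minimal illustration in plain Python, assuming a grayscale image stored as a list of rows. The function names, the block size, and all thresholds are hypothetical choices made for the sketch and do not come from any particular system.

```python
def block_mean(image, top, left, size):
    """Mean intensity of a size x size block (the cheap test)."""
    total = sum(image[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size))
    return total / (size * size)


def cheap_candidates(image, size=4, mean_threshold=100):
    """Stage 1: scan non-overlapping blocks with a fast mean test."""
    rows, cols = len(image), len(image[0])
    return [(top, left)
            for top in range(0, rows - size + 1, size)
            for left in range(0, cols - size + 1, size)
            if block_mean(image, top, left, size) > mean_threshold]


def expensive_check(image, top, left, size=4, min_bright_fraction=0.9):
    """Stage 2: a stricter per-pixel test, run only on candidates."""
    bright = sum(1
                 for r in range(top, top + size)
                 for c in range(left, left + size)
                 if image[r][c] > 200)
    return bright / (size * size) >= min_bright_fraction


def detect(image, size=4):
    """Keep only the stage-1 candidates that also pass stage 2."""
    return [(top, left)
            for (top, left) in cheap_candidates(image, size)
            if expensive_check(image, top, left, size)]


# Synthetic 8x8 image: dark background, one uniformly bright block,
# and one block that is only half bright (passes stage 1, fails stage 2).
image = [[10] * 8 for _ in range(8)]
for r in range(0, 4):
    for c in range(0, 4):
        image[r][c] = 255
for r in range(4, 8):
    for c in range(4, 8):
        image[r][c] = 255 if c < 6 else 10

candidates = cheap_candidates(image)  # regions worth a closer look
detections = detect(image)            # regions confirmed by stage 2
```

In a real system the second stage would typically be a far heavier model than a pixel count; the point of the sketch is only that it runs on the few regions the first stage lets through.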

Currently, the best algorithms for such tasks are based on convolutional neural networks. An illustration of their capabilities is given by the ImageNet Large Scale Visual Recognition Challenge, a benchmark in object classification and detection with millions of images and hundreds of object classes. The performance of convolutional neural networks on the ImageNet tests is now close to that of humans.[11] The best algorithms still struggle with objects that are small or thin, such as a small ant on the stem of a flower or a person holding a quill in their hand. They also have trouble with images that have been distorted with filters (an increasingly common phenomenon with modern digital cameras). By contrast, those kinds of images rarely trouble humans. Humans, however, tend to have trouble with other issues. For example, they are not good at classifying objects into fine-grained classes, such as the particular breed of dog or species of bird, whereas convolutional neural networks handle this with ease.
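
To give a rough sense of the building blocks such networks stack, the sketch below implements one convolution, one ReLU, and one 2x2 max-pooling step in plain Python. It is an illustration only: a trained network learns its kernel weights from data, whereas the kernel here is hand-picked for the example, and the function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding.

    As in most neural-network libraries, this is cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Zero out negative responses."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the strongest response in each non-overlapping 2x2 window."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]) - 1, 2)]
            for r in range(0, len(feature_map) - 1, 2)]


# 5x5 image with a vertical dark-to-bright edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright transitions
# from left to right (an assumption for this sketch, not learned).
kernel = [[-1, 1],
          [-1, 1]]

pooled = max_pool_2x2(relu(conv2d_valid(image, kernel)))
# pooled == [[2, 0], [2, 0]]: the edge shows up in the left half only.
```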

Several specialized tasks based on recognition exist, such as:

• Content-based image retrieval – finding all images in a larger set of images that have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X), or in terms of high-level search criteria given as text input (give me all images that contain many houses, were taken during winter, and have no cars in them).

• Pose estimation – estimating the position or orientation of a specific object relative to the camera. An example application for this technique would be assisting a robot arm in retrieving objects from a conveyor belt in an assembly line situation or picking parts from a bin.

• Optical character recognition (OCR) – identifying characters in images of printed or handwritten text, usually with a view to encoding the text in a format more amenable to editing or indexing (e.g. ASCII).

• 2D code reading – reading of 2D codes such as Data Matrix and QR codes.

• Facial recognition

• Shape Recognition Technology (SRT) in people counter systems – differentiating human beings (head and shoulder patterns) from objects

[Figure: Computer vision for people counter purposes in public places, malls, shopping centres]
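
The "give me all images similar to image X" form of content-based image retrieval can be sketched with a very simple similarity measure. The example below assumes small grayscale images stored as lists of rows and ranks a database by histogram intersection with the query. The measure, the bin count, and the names `histogram`, `intersection` and `retrieve` are assumptions chosen for brevity; practical systems use much richer descriptors, and text-based search criteria are not covered here.

```python
def histogram(image, bins=4, max_value=256):
    """Normalized intensity histogram of a grayscale image."""
    counts = [0] * bins
    n_pixels = 0
    for row in image:
        for value in row:
            counts[value * bins // max_value] += 1
            n_pixels += 1
    return [count / n_pixels for count in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query, database, top_k=2):
    """Return the names of the top_k images most similar to the query."""
    q_hist = histogram(query)
    scored = [(intersection(q_hist, histogram(img)), name)
              for name, img in database.items()]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]


# Tiny hypothetical database of 2x2 grayscale images.
database = {
    "all_dark": [[20, 20], [20, 20]],
    "all_bright": [[230, 230], [230, 230]],
    "mostly_dark": [[20, 20], [20, 230]],
}
query = [[30, 30], [30, 240]]  # three dark pixels, one bright pixel

results = retrieve(query, database)
# results == ["mostly_dark", "all_dark"]
```

Histograms ignore where pixels sit in the image, so two very different scenes can score as similar; that limitation is one reason real retrieval systems rely on learned or spatially aware features.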

Computer vision system methods

The organization of a computer vision system is highly application dependent. Some systems are stand-alone applications which solve a specific measurement or detection problem, while others constitute a sub-system of a larger design which, for example, also contains sub-systems for control of mechanical actuators, planning, information databases, man-machine interfaces, etc. The specific implementation of a computer vision system also depends on whether its functionality is pre-specified or whether some part of it can be learned or modified during operation. Many functions are unique to the application. There are, however, typical functions which are found in many computer vision systems.

• Image acquisition – A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or colour images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance.[12]

• Pre-processing – Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually necessary to process the data in order to ensure that it satisfies certain assumptions implied by the method. Examples are

    • Re-sampling in order to ensure that the image coordinate system is correct.

    • Noise reduction in order to ensure that sensor noise does not introduce false information.

    • Contrast enhancement to ensure that relevant information can be detected.

    • Scale space representation to enhance image structures at locally appropriate scales.

• Feature extraction – Image features at various levels of complexity are extracted from the image data.[12] Typical examples of such features are

    • Lines, edges and ridges.

    • Localized interest points such as corners, blobs or points.

    More complex features may be related to texture, shape or motion.

• Detection/segmentation – At some point in the processing a decision is made about which image points or regions of the image are relevant for further processing.[12] Examples are

    • Selection of a specific set of interest points.

    • Segmentation of one or multiple image regions which contain a specific object of interest.

    • Segmentation of the image into a nested scene architecture comprising foreground, object groups, single objects or salient object parts (also referred to as spatial-taxon scene hierarchy).[13]

• High-level processing – At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object.[12] The remaining processing deals with, for example:

    • Verification that the data satisfy model-based and application specific assumptions.

    • Estimation of application specific parameters, such as object pose or object size.

    • Image recognition – classifying a detected object into different categories.

    • Image registration – comparing and combining two different views of the same object.

• Decision making – Making the final decision required for the application,[12] for example:

    • Pass/fail on automatic inspection applications

    • Match / no-match in recognition applications

    • Flag for further human review in medical, military, security and recognition applications
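
Several of the typical functions listed above can be chained into a toy inspection pipeline. The sketch below covers only a subset (noise reduction, contrast enhancement, segmentation, size estimation, and a pass/fail decision) and leaves out acquisition and explicit feature extraction. It assumes a grayscale image stored as a list of rows; the 3x3 median filter, the fixed threshold, the expected object size, and all function names are hypothetical choices for illustration.

```python
def median_filter_3x3(image):
    """Noise reduction: replace each pixel by the median of its 3x3
    neighbourhood (border pixels reuse the nearest valid index)."""
    rows, cols = len(image), len(image[0])

    def clamp(i, hi):
        return max(0, min(i, hi))

    return [[sorted(image[clamp(r + dr, rows - 1)][clamp(c + dc, cols - 1)]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1))[4]
             for c in range(cols)]
            for r in range(rows)]


def stretch_contrast(image):
    """Contrast enhancement: map the observed range linearly to 0..255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [[0 for _ in row] for row in image]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in image]


def segment(image, threshold=128):
    """Segmentation: mark pixels at or above the threshold as object."""
    return [[v >= threshold for v in row] for row in image]


def bounding_box(mask):
    """Parameter estimation: (top, left, bottom, right) of object pixels."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, on in enumerate(row) if on]
    if not coords:
        return None
    return (min(r for r, _ in coords), min(c for _, c in coords),
            max(r for r, _ in coords), max(c for _, c in coords))


def inspect(image, expected_size=4, tolerance=1):
    """Decision making: pass if the object's height and width are
    within `tolerance` pixels of `expected_size`."""
    box = bounding_box(segment(stretch_contrast(median_filter_3x3(image))))
    if box is None:
        return "fail", None
    top, left, bottom, right = box
    height, width = bottom - top + 1, right - left + 1
    ok = (abs(height - expected_size) <= tolerance
          and abs(width - expected_size) <= tolerance)
    return ("pass" if ok else "fail"), box


# Synthetic 8x8 image: background 50, a 4x4 object of intensity 200,
# and a single bright noise pixel in the top-right corner.
image = [[50] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        image[r][c] = 200
image[0][7] = 255

decision, box = inspect(image)
# Without noise reduction the stray pixel inflates the measured box.
raw_box = bounding_box(segment(stretch_contrast(image)))
```

Note that the median filter also rounds off the four corner pixels of the square, so the segmented region is slightly smaller than the true object even though its bounding box is unchanged. Trade-offs of this kind are why the pre-processing step has to match the assumptions of the later stages.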
