Spring 2014

Spatial Frequency Response of Plenoptic Cameras (TAKEN)

Plenoptic cameras [1], such as the recently released Lytro, are able to capture a 4D light field in a single exposure. This is achieved by inserting a microlens array between the main lens and the sensor. Each microlens position then records not just the total amount of light deposited at that location, but how much light arrives along each ray. This allows for exciting new image rendering options, such as setting focus and depth of field after capture, generating stereo images, etc.

Despite the recent commercialization of these cameras, no methods have yet been developed to measure their objective image quality characteristics, such as resolution and noise. For traditional digital cameras, the International Organization for Standardization (ISO) has published well-established standards. Taking the example of resolution, the ISO 12233 standard [2] defines a test target and an evaluation method for computing the spatial frequency response of the combined optics and sensor. However, since the images from a plenoptic camera are created computationally, the effective resolution varies with the computation, and the traditional testing methods no longer apply.

The purpose of this project is thus to develop, implement, and test methods for measuring the spatial frequency response of such light field cameras. In other words, how can we determine the real resolution of the camera and thus the real information content of light field images?
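
As a starting point, the core of the ISO 12233 slanted-edge analysis can be sketched in a few lines of MATLAB. The sketch below omits the standard's sub-pixel edge projection and binning, and assumes a 1-D oversampled edge-spread function esf (a hypothetical input) has already been extracted from a rendered light field image:

    % Minimal slanted-edge-style SFR sketch; esf is a hypothetical 1-D,
    % oversampled edge-spread function from a rendered image.
    lsf = diff(esf);                             % line-spread function
    n   = numel(lsf);
    w   = 0.54 - 0.46*cos(2*pi*(0:n-1)/(n-1));   % Hamming window, no toolbox
    lsf = lsf(:)' .* w;                          % reduce truncation ripple
    sfr = abs(fft(lsf));
    sfr = sfr / sfr(1);                          % normalize so SFR(0) = 1
    f   = (0:n-1) / n;                           % frequency in cycles/sample
    plot(f(f <= 0.5), sfr(f <= 0.5));
    xlabel('Spatial frequency (cycles/sample)'); ylabel('SFR');

For a plenoptic camera, such a measurement would have to be repeated over different refocusing and rendering settings, since each computation can yield a different effective resolution.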

 

References

[1] R. Ng et al., Light Field Photography with a Hand-held Plenoptic Camera, Stanford Tech Report CTSR 2005-02, 2005.

[2] ISO 12233:2000: Photography–Electronic Still-Picture Cameras–Resolution Measurements.

Prerequisites: Knowledge in signal and image processing, programming with MATLAB for signal processing.

Type of work: 40% research and 60% implementation

Level: MS thesis or semester project.

Supervisors: Loic Baboulaz ([email protected]) and Sabine Süsstrunk ([email protected])

 
 
 
Image Deblurring using Motion Sensors (TAKEN)
Synopsis: Deblurring an image by fusing camera and motion sensors’ data 

In order to take a sharp picture, it is important not to move or shake the camera during the exposure time. However, some camera shake is unavoidable, especially for pictures taken with hand-held devices such as mobile phones. Luckily, today's mobile phones (or "smart" phones) are equipped with various motion sensors, such as accelerometers, gyroscopes, and magnetometers. We claim that by using these sensors in conjunction with the camera of a mobile device, we can deblur an image. In this project you will:

• Use the iPhone app designed in our lab to collect image and motion sensor data (we will provide an iPhone, if necessary),

• Estimate the camera motion during the exposure time by integrating the information from the camera and the available motion sensors,

• Propose a method that uses this motion information to deblur the image (a minimal sketch of this step is given below).
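
To make the deblurring step concrete, here is a minimal non-blind sketch in MATLAB (Image Processing Toolbox). In the project the blur kernel would be derived from the estimated camera motion; here a synthetic linear-motion PSF and a placeholder input file stand in for it:

    % Minimal non-blind deblurring sketch; the synthetic motion PSF is a
    % stand-in for a kernel derived from gyroscope/accelerometer data.
    blurred = im2double(imread('blurred.jpg'));  % placeholder input image
    psf     = fspecial('motion', 15, 45);        % assumed: 15 px blur at 45 deg
    nsr     = 0.01;                              % assumed noise-to-signal ratio
    latent  = deconvwnr(blurred, psf, nsr);      % Wiener deconvolution
    imshow(latent);

Wiener deconvolution is only one possible choice here; proposing and justifying the actual deblurring method is part of the project.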

Deliverables: The final deliverables for the project will be the implemented motion estimation and deblurring algorithms, and a final report that explains the work done.

Prerequisites: Knowledge in image processing, programming in C++ using the OpenCV library OR programming with MATLAB for signal processing.

Type of work: 50% research and 50% implementation.

Number of students: 1.

Level: MS, semester project, Computer Science or Communication Systems.

Supervisor: Gokhan Yildirim ([email protected]).

 
 
 

Chromatic Aberration in the RGB+NIR camera (TAKEN)

Synopsis: In this project, we will study the chromatic aberration (CA) in the RGB and near-infrared (NIR) camera, where a single sensor is used to capture both images simultaneously.

Chromatic aberration is one of the main concerns in designing multi-spectral imaging systems, including color cameras. This phenomenon occurs because the refractive index of the lens changes with wavelength. Thus, the lens cannot focus light rays of different wavelengths at the same point. As a result, in color imaging with a simple lens, only the image of one channel is in focus and the other two channels are blurred. Two general approaches are usually followed to handle this problem in color imaging. The first is to correct the optical path by using compound lenses, in which several materials with different dispersions are combined to form the lens. The second is to minimize the chromatic aberration by processing the color image after acquisition.

In the case of RGB+NIR joint acquisition, the wavelength range of the captured light is 700 nm (from 400 to 1100 nm), more than twice the 300 nm range of color imaging (from 400 to 700 nm). Therefore, chromatic aberration in the RGB+NIR camera is more severe than in a commercial color camera. To address CA, in this project we will first investigate whether the correction techniques currently used in color imaging are applicable to RGB+NIR joint acquisition.

The first step of the project is to study compound lenses that can correct chromatic aberration in both the visible and NIR bands of the spectrum. In the next step, we will try to extend the post-processing techniques developed in color imaging [1] to attenuate the effects of CA in the RGB+NIR camera. Finally, to handle CA, we will modify the color filter array (CFA) and the demosaicing algorithm optimized in [2] for the RGB+NIR camera. To this end, we assume that some channels of the captured images suffer from distortions introduced by an imperfect lens, and adapt the CFA and demosaicing accordingly.
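
As a toy illustration of the post-processing direction, the following MATLAB sketch simulates longitudinal CA by defocusing the NIR channel and compensates it by transferring high frequencies from a sharp reference channel, in the spirit of the correction ideas in [1]. The inputs rgb and nir are hypothetical, and the defocus blur is assumed to be a known Gaussian:

    % Toy CA simulation and correction; rgb (in-focus color image) and nir
    % (NIR channel) are hypothetical inputs, the blur an assumed Gaussian.
    g    = im2double(rgb(:,:,2));                    % sharp reference channel
    h    = fspecial('gaussian', 15, 3);              % assumed defocus PSF
    nirB = imfilter(im2double(nir), h, 'replicate'); % simulated CA blur
    gLow = imfilter(g, h, 'replicate');
    nirC = nirB + (g - gLow);                        % add back high frequencies

In the real camera the blur is unknown and channel-dependent, which is precisely what makes the correction, CFA design, and demosaicing steps non-trivial.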

References:

[1] M. Singh and T. Singh. Joint chromatic aberration correction and demosaicing. Proceedings of IS&T/SPIE EI: Digital Photography, VIII, 2012.
[2] Y. M. Lu, C. Fredembach, M. Vetterli and S. Süsstrunk. Designing color filter arrays for the joint capture of visible and near-infrared images. International Conference on Image Processing, 2009.

Prerequisites: Basic knowledge about optics, image processing and demosaicing, and good MATLAB skills.

Type of work: 10% literature survey, 30% theory, 60% MATLAB implementation. 

Number of students: 1.

Level: BS or MS, semester project, Computer Science or Communication Systems.

Supervisor: Zahra Sadeghipoor ([email protected]). 
 
 
 

Handwritten Word Recognition using Hidden Markov Models and Recurrent Neural Networks (TAKEN)

Synopsis: In the problem of handwritten word recognition, the goal is the automatic conversion of a handwritten text document into a form that can be processed by any text-processing application. The recognition algorithm receives a handwritten text image as input and converts it into electronic text, which can then be used in information retrieval applications.

Many algorithms have been proposed in the literature to address this problem. Two of the most successful are the Hidden Markov Model (HMM) and the Recurrent Neural Network (RNN). Both have been used to model many types of sequential data, such as handwritten text and speech. In this project we investigate their use and applicability to historical handwritten text documents. The student(s) will:

  • Get familiar with the two models and their use in handwritten text recognition.
  • Implement the two models for historical handwritten text recognition (a toy HMM sketch is given below).
  • Evaluate their properties and accuracy on handwritten text datasets.
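
For a first contact with the HMM machinery, the toy sketch below uses MATLAB's Statistics Toolbox. It assumes a hypothetical front end that converts each word image into a sequence seq of quantized column features (integer symbols in 1..M); a real recognizer would train one model per word or character class:

    % Toy discrete-HMM sketch (Statistics Toolbox); seq is a hypothetical
    % sequence of quantized per-column features with values in 1..M.
    numStates = 6;  M = 16;
    A0 = rand(numStates);     A0 = A0 ./ repmat(sum(A0, 2), 1, numStates);
    B0 = rand(numStates, M);  B0 = B0 ./ repmat(sum(B0, 2), 1, M);
    [A, B] = hmmtrain(seq, A0, B0);        % Baum-Welch re-estimation
    [~, logLik] = hmmdecode(seq, A, B);    % log-likelihood for scoring

Recognition then amounts to scoring a test sequence against each class model and picking the class with the highest log-likelihood.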

References:

[1] Marti and Bunke. Text Line Segmentation and Word Recognition in a System for General Writer Independent Handwriting Recognition. Document Analysis and Recognition, 2001.
[2] Espana-Boquera et al. Improving Offline Handwritten Text Recognition with Hybrid HMM/ANN Models. Transactions on Pattern Analysis and Machine Intelligence, 2011.
[3] Graves et al. A novel Connectionist System for Unconstrained Handwriting Recognition. Transactions on Pattern Analysis and Machine Intelligence, 2009.

Deliverables: At the end of the semester, the student should provide a written report on the work done, together with the complete implementation of the recognition algorithms used to generate the results in the report.

Prerequisites: Basic knowledge of Image Processing and Machine Learning, programming with MATLAB for signal/image processing OR C++ using the OpenCV library.

Type of work: 30% research, 70% implementation.

Number of students: 1 or 2.

Level: MS, semester project, Computer Science or Communication Systems.

Supervisor: Nikolaos Arvanitopoulos ([email protected]).
 
 
 

Historical Document Image Segmentation (TAKEN)

Synopsis: The problem of document image segmentation involves extracting information about the layout components of a document page. A typical historical document contains several components, such as text regions, headlines, illustrations, and drawings. Segmentation is a necessary pre-processing step for any handwriting recognition system, where the goal is to recognize the handwritten words on a document page. Ideally, a document image segmentation algorithm should accurately separate text regions from all other components of the page, so that the input to the recognition algorithm is as noiseless and reliable as possible.

The goals of this project are:

  • To get familiar with the algorithms proposed in [1,2] for the document segmentation problem.
  • To implement them and compare them on a dataset of historical document pages (a naive baseline sketch is given below).
  • To report findings on their differences in accuracy and their applicability to the different documents under investigation.
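
As a naive baseline, far simpler than the methods in [1,2], text can be separated from large non-text elements by connected-component analysis. The MATLAB sketch below assumes page is a hypothetical grayscale scan with values in [0,1]:

    % Naive layout baseline; page is a hypothetical grayscale scan in [0,1].
    bw    = ~im2bw(page, graythresh(page));    % Otsu: ink = 1, background = 0
    cc    = bwconncomp(bw);                    % connected ink components
    stats = regionprops(cc, 'Area', 'BoundingBox');
    areas = [stats.Area];
    % Heuristic: blobs much larger than the median ink blob are more likely
    % illustrations or drawings than text characters.
    isText = areas < 10 * median(areas);

Such a heuristic fails on degraded historical pages, which is exactly where the methods in [1,2] are expected to do better.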

References:

[1] Mehri et al. Old document image segmentation using the autocorrelation function and multiresolution analysis. Document Recognition and Retrieval XX, San Francisco, USA, 2013.
[2] Cohen et al. Robust Text and Drawing Segmentation Algorithm for Historical Documents, HIP 2013.

Deliverables: At the end of the semester, the student should provide a written report on the work done, together with the complete implementation of the segmentation algorithms used to generate the results in the report.

Prerequisites: Basic knowledge of Image Processing, programming with MATLAB for signal/image processing OR C++ using the OpenCV library.

Type of work: 20% research, 80% implementation.

Number of students: 1.

Level: BS, semester project, Computer Science or Communication Systems.

Supervisor: Nikolaos Arvanitopoulos ([email protected]).

 

 

 

Games to Test Apps for Color Deficient Viewers (TAKEN)

Description:

Statistically, 7-10% of males have some form of red-green color deficiency [CBI; Wikipedia]. For these viewers, some shades of red may look like some shades of green, making it hard to distinguish between reds and greens, and hard to see the contrast between certain pairs of colors that appear very different to viewers with normal vision. We have developed an Android app to help color deficient viewers visualize the color contrasts that normal viewers can see. We propose a semester project to evaluate the app with a user study in the format of a game.

 

Games have been successfully used to perform user studies, motivating users to perform a task correctly [Von Ahn and Dabbish 2008]. We would like to measure our app's ability to help users visualize color contrasts. One way to do this would be to time how long it takes the user to distinguish between colors, find a specific color, or name a color patch. However, measuring response times, or any other time-dependent factor, would not be reliable on its own, because we cannot guarantee that the user will try to complete the task as fast as possible. We therefore propose to design a game that motivates users to respond in a timely manner, so that we can use time-dependent measures in our analysis. This semester project includes implementing the game by extending the current Android app's code, conducting the user study, and analyzing the results.

 

References:

[CBI] Color Blindness Information, Identification, Solutions. Color Blindness Facts & Statistics: Prevalence. http://www.colour-blindness.com/general/prevalence/

[Von Ahn and Dabbish 2008] L. Von Ahn and L. Dabbish. Designing Games With A Purpose. Communications of the ACM, 2008.

[Wikipedia] Color Blindness. http://en.wikipedia.org/wiki/Color_blindness

Prerequisites:  Android app programming, OpenGL GPU programming, basic knowledge of human color vision and color science.

Type of Work:  50% research, 50% implementation.

Number of students: 1.

Level: MS, semester project, Computer Science or Communication Systems.

 

Supervisor:  Cheryl Lau ([email protected]). 

 

 

 

Find the important objects in the image (TAKEN)

Synopsis: Incorporating context information to help find the desired objects in an image.

Object detection is a hot research topic in computer vision, because many applications rely on first finding the objects in an image. However, not all objects in an image are important to the user, and finding the desired objects is an open problem. In this project we try to infer the user's intention from context information provided by the user. This context information is then incorporated into object detection algorithms to find the desired objects in the image.

The purpose of this project is:

  • To get familiar with several state-of-the-art object detection algorithms.

  • To implement one or two object detection algorithms for a class of objects (a minimal sketch is given below).

  • To create an interface that takes as input the name of the objects users want to search for and outputs the detection results.

  • To evaluate the performance of the detection algorithms.
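
As an example of an off-the-shelf starting point, MATLAB's Computer Vision System Toolbox ships a Viola-Jones cascade detector [2]. The sketch below uses its default frontal-face model as a stand-in for a generic object class; the input file name is a placeholder:

    % Viola-Jones cascade detection sketch (Computer Vision System Toolbox);
    % the default frontal-face model stands in for a generic object class.
    detector = vision.CascadeObjectDetector();   % default: frontal faces
    img      = imread('scene.jpg');              % placeholder input image
    boxes    = step(detector, img);              % one [x y w h] row per hit
    imshow(img); hold on;
    for k = 1:size(boxes, 1)
        rectangle('Position', boxes(k,:), 'EdgeColor', 'y', 'LineWidth', 2);
    end
    hold off;

The interface developed in the project would map the object name entered by the user to the corresponding trained detector.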

 

References: 

[1] C. Zhang and Z. Zhang. A survey of recent advances in face detection. Technical report, Microsoft Research, 2010.

[2] P. Viola and M. Jones. Robust Real-time Object Detection. International Journal of Computer Vision, 2001.

Deliverables: At the end of the semester, the student should provide a written report on the work done, together with the implemented object detection algorithms and the interface.

Prerequisites: Basic knowledge of image processing and computer vision, programming with Matlab or C/C++ using OpenCV.

Type of work: 50% research and 50% implementation. 

Number of students: 1.

Level: MS, semester project, Computer Science or Communication Systems.

Supervisor: Bin Jin ([email protected]).

 

 

 

Professional vs non-professional photos (TAKEN)

Synopsis: With the prevalence of smartphones, everyone takes pictures. However, there is a big difference between a professional photographer and an amateur: professional photos distinguish themselves from the rest by their composition, color, and visual path. In this project we aim to study the differences between professional and non-professional photos, and to develop an algorithm that distinguishes between the two.
 
In this project you will:

  • Construct a database that contains both professional and non-professional images.
  • Study the differences between the two: composition, illuminance, color, etc.
  • Train a classifier to distinguish between the two sets (a toy sketch is given below).
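
As a sketch of the classification step, a few global features and a linear discriminant already give a baseline. Everything below (the feature choices, the names Xtrain, ytrain, Xtest) is a hypothetical starting point, not the project's final method:

    % Toy per-photo features (hypothetical choices): mean saturation, mean
    % brightness, and a rough edge/texture density.
    hsv  = rgb2hsv(im2double(img));
    f1   = mean2(hsv(:,:,2));                 % mean saturation
    f2   = mean2(hsv(:,:,3));                 % mean brightness
    e    = edge(rgb2gray(img), 'canny');
    f3   = nnz(e) / numel(e);                 % fraction of edge pixels
    feat = [f1 f2 f3];
    % With a feature matrix Xtrain, labels ytrain (1 = professional,
    % 0 = amateur), and test features Xtest, linear discriminant analysis
    % (Statistics Toolbox) gives a first baseline classifier:
    pred = classify(Xtest, Xtrain, ytrain);
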
References:
 
[1] P. Ensenberger. Focus on Composing Photos. Focal Press.
[2] L. Liu, R. Chen, L. Wolf, and D. Cohen-Or. Optimizing Photo Composition. Computer Graphics Forum, 29(2):469-478, 2010.
[3] R. Datta et al. Studying Aesthetics in Photographic Images Using a Computational Approach. ECCV 2006, Springer, pp. 288-301.
Deliverables: At the end of the semester, the student should provide a written report on the work done, as well as the data and the code.

Prerequisites: Basic knowledge of image processing and computer vision, programming with Matlab or C/C++ with OpenCV.

Type of work: 50% research and 50% implementation.

Number of students: 1.

Level: MS, semester project, Computer Science or Communication Systems.

Supervisor: Bin Jin ([email protected]).