Past projects

Credits detection in RTS videos

Description: TV shows, movies, and videos usually contain beginning and ending credits (génériques, in French). The goal of this project is to detect the time instants at which the beginning credits start and the ending credits end.

Note: This is an RTS (Radio Télévision Suisse) project.

Tasks:
1. Set up a video decoding scheme (ffmpeg or similar)
2. Code up a few basic algorithms for detecting credits frames (help with text detection will be provided)
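As a rough illustration of how per-frame detections could be turned into time instants, the sketch below assumes a hypothetical upstream detector has already labelled each decoded frame as credits or not. The names `find_runs`, `credit_times`, and the `min_len` noise threshold are illustrative assumptions, not part of the project specification.

```python
from typing import List, Optional, Tuple


def find_runs(flags: List[bool], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, for every
    run of consecutive True flags that is at least min_len frames long."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a run of credits frames begins
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None  # the run has ended
    # Close a run that extends to the last frame.
    if start is not None and len(flags) - start >= min_len:
        runs.append((start, len(flags)))
    return runs


def credit_times(flags: List[bool], fps: float,
                 min_len: int = 3) -> Optional[Tuple[float, float]]:
    """Return (start of first credits run, end of last credits run) in
    seconds, or None if no run is long enough. Assumes a constant frame rate."""
    runs = find_runs(flags, min_len)
    if not runs:
        return None
    return runs[0][0] / fps, runs[-1][1] / fps
```

Short isolated detections (fewer than `min_len` frames) are treated as noise and ignored; a real detector would likely need a more careful temporal filter.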

Deliverable:
An executable (command line or GUI) that takes videos as input and outputs the detected time instants.

Number of students: 1

Prerequisites:
– knowledge of image processing, computer vision, etc.
– coding skills in C/C++

Level:
MS Semester project, Computer Science or Communication Systems

Type of work: 30% research and 70% implementation

 

Supervisor: Radhakrishna Achanta (firstname.lastname@epfl.ch)  

 

Historical Document Image Segmentation (TAKEN)

Synopsis: The problem of Document Image Segmentation involves the extraction of information about the layout components of a document page. A typical historical document contains several components, such as text regions, headlines, illustrations and drawings. It is a necessary pre-processing step for any handwriting recognition system, where the goal is to recognize the handwritten words in a document page. Ideally, a document image segmentation algorithm should be able to accurately separate text regions from any other component of the document page so that the input to a recognition algorithm is as noiseless and reliable as possible.

The goals of this project are:

  • To get familiar with the algorithms proposed in [1,2] for the document segmentation problem.
  • To implement them and compare them on a dataset of historical document pages.
  • To report findings about their differences in accuracy and their applicability to the different documents under investigation.
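One possible way to compare the accuracy of two segmentation algorithms is pixel-level precision, recall, and F1 of the predicted text mask against a ground-truth mask. The sketch below is only an assumption about how such a comparison could be set up; the project does not prescribe a metric, and `pixel_scores` is a hypothetical helper name.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = text pixel, 0 = anything else


def pixel_scores(pred: Mask, truth: Mask) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of the text class, pixel by pixel.
    Both masks are assumed to have the same dimensions."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # text correctly labelled as text
            elif p and not t:
                fp += 1  # non-text labelled as text
            elif t and not p:
                fn += 1  # text that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Running the same function on the output of each algorithm for every page of the dataset gives per-page scores that can be averaged and compared.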

References:

[1] Mehri et al. Old document image segmentation using the autocorrelation function and multiresolution analysis. Document Recognition and Retrieval XX, San Francisco, USA, 2013.
[2] Cohen et al. Robust Text and Drawing Segmentation Algorithm for Historical Documents, HIP 2013.

Deliverables: At the end of the semester, the student should provide a written report on the work done, together with the full implementation of the segmentation algorithms used to generate the results in the report.

Prerequisites: Basic knowledge of Image Processing, programming with MATLAB for signal/image processing OR C++ using the OpenCV library.

Type of work: 20% research, 80% implementation.

Number of students: 1.

Level: BS, semester project, Computer Science or Communication Systems.

 

Supervisor: Nikolaos Arvanitopoulos (nick.arvanitopoulos@epfl.ch).

 

 

 

Handwritten Word Recognition using Hidden Markov Models and Recurrent Neural Networks (TAKEN)

Synopsis: In handwritten word recognition, the goal is to automatically convert a handwritten text document into a form that can be processed by any text-processing application. The recognition algorithm receives a handwritten text image as input and converts it into electronic text, which can be used for information retrieval applications.

Many algorithms have been proposed in the literature to address this problem. Two of the most successful are the Hidden Markov Model and the Recurrent Neural Network, which have been used to model many types of sequential data, such as handwritten text and speech. This project investigates their use and applicability to historical handwritten text documents. The student(s) will:

  • Get familiar with the two models and their use in handwritten text recognition.
  • Implement the two models for historical handwritten text recognition.
  • Evaluate their properties and accuracy on handwritten text datasets.
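A common way to quantify recognition accuracy is the character error rate: the Levenshtein edit distance between the recognized text and the reference transcription, divided by the reference length. The sketch below assumes this metric; the project description does not name one, and the function names are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the length of the reference."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

The same function can score the output of both models on identical test lines, which makes their error rates directly comparable.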

References:

[1] Marti and Bunke. Text Line Segmentation and Word Recognition in a System for General Writer Independent Handwriting Recognition. Document Analysis and Recognition, 2001.
[2] Espana-Boquera et al. Improving Offline Handwritten Text Recognition with Hybrid HMM/ANN Models. Transactions on Pattern Analysis and Machine Intelligence, 2011.
[3] Graves et al. A Novel Connectionist System for Unconstrained Handwriting Recognition. Transactions on Pattern Analysis and Machine Intelligence, 2009.

Deliverables: At the end of the semester, the student should provide a written report on the work done, together with the full implementation of the recognition algorithms used to generate the results in the report.

Prerequisites: Basic knowledge of Image Processing and Machine Learning, programming with MATLAB for signal/image processing OR C++ using the OpenCV library.

Type of work: 30% research, 70% implementation.

Number of students: 1 or 2.

Level: MS, semester project, Computer Science or Communication Systems.

 

Supervisor: Nikolaos Arvanitopoulos (nick.arvanitopoulos@epfl.ch).

 

 

 

Professional vs non-professional photos (TAKEN)

Synopsis: With smartphones everywhere, everyone takes pictures. However, there is a big difference between the work of a professional photographer and that of an amateur. Professional photos typically distinguish themselves by their composition, color, and visual path. In this project we aim to study the differences between professional and non-professional photos and to develop an algorithm that distinguishes the two.

In this project you will:

  1. Construct a database that contains both professional images and non-professional ones.
  2. Study the differences between these two: the composition, the illuminance, the color, etc.
  3. Train a classifier to distinguish between these two sets.
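As a minimal sketch of the last step, the code below trains a nearest-centroid classifier on per-image feature vectors. It assumes the features (for example, hypothetical composition and color scores) have already been extracted; the choice of classifier and the names `train` and `predict` are illustrative assumptions, not the method used in the project.

```python
from typing import Dict, List, Sequence


def centroid(rows: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def train(samples: Dict[str, List[Sequence[float]]]) -> Dict[str, List[float]]:
    """Compute one centroid per class label from its training vectors."""
    return {label: centroid(rows) for label, rows in samples.items()}


def predict(model: Dict[str, List[float]], x: Sequence[float]) -> str:
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(center: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(model, key=lambda label: sq_dist(model[label]))


# Toy feature vectors, made up purely for illustration.
model = train({
    "professional": [[0.9, 0.8], [0.8, 0.9]],
    "amateur": [[0.2, 0.3], [0.3, 0.1]],
})
label = predict(model, [0.85, 0.7])
```

A nearest-centroid model is only a baseline; it is useful mainly for checking whether the chosen features separate the two sets at all before trying a stronger classifier.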

References:

[1] Ensenberger, P. Focus on Composing Photos. Focal Press.
[2] Liu, L., Chen, R., Wolf, L., Cohen-Or, D. (2010, May). Optimizing photo composition. In Computer Graphics Forum (Vol. 29, No. 2, pp.469-478). Blackwell Publishing Ltd.
[3] Datta, Ritendra, et al. Studying aesthetics in photographic images using a computational approach. Computer Vision-ECCV 2006. Springer Berlin Heidelberg, 2006. 288-301.

Deliverables: At the end of the semester, the student should provide a written report on the work done as well as the data and the code.

Prerequisites: Basic knowledge of image processing and computer vision, programming with Matlab or C/C++ with OpenCV.

Type of work: 50% research and 50% implementation.

Number of students: 1.

Level: MS, semester project, Computer Science or Communication Systems.

Supervisor: Bin Jin (bin.jin@epfl.ch).

 
 
 
Games to Test Apps for Color Deficient Viewers (TAKEN)
 
Description:
 
Statistically, 7-10% of males have some form of red-green color deficiency [CBI; Wikipedia].  For these viewers, some shades of red may look like some shades of green, making it hard to distinguish between reds and greens.  It can be hard to see the contrast between certain pairs of colors that appear very different to viewers with normal vision.  We have developed an Android app to help color deficient viewers visualize the color contrasts that normal viewers can see.  We propose a semester project to evaluate the app with a user study in the format of a game.
 
Games have been successfully used to perform user studies, motivating users to perform a task correctly [Von Ahn and Dabbish 2008].  We would like to measure our app’s ability to help users to visualize color contrasts.  One way to do this would be to time how long it takes the user to distinguish between colors, find a specific color, or name a color patch.  However, measuring response times or any other factor that depends on time would not be reliable because we cannot guarantee that the user will try to complete the task as fast as possible.  We propose to design a game to motivate users to respond in a timely manner so that we can use time-dependent measures in our analysis.  This semester project includes implementing the game by extending the current Android app’s code, conducting the user study, and analyzing the results.
 
References:
 
[CBI] Color Blindness Information, Identification, Solutions. Color Blindness Facts & Statistics: Prevalence. http://www.colour-blindness.com/general/prevalence/
[Von Ahn and Dabbish 2008] L. Von Ahn and L. Dabbish. 2008. Designing Games With A Purpose. Communications of the ACM.
[Wikipedia] Wikipedia. Color Blindness. http://en.wikipedia.org/wiki/Color_blindness
 
Prerequisites:  Android app programming, OpenGL GPU programming, basic knowledge of human color vision and color science.
 
Type of Work:  50% research, 50% implementation.
 
Number of students: 1.
 
Level:  MS, semester project, Computer Science or Communication Systems.
 
Supervisor:  Cheryl Lau (cheryl.lau@epfl.ch).