Available Projects – Spring 2019

Simulation of moiré effects for counterfeit prevention

Synopsis:
EPFL and startup company Innoview Sàrl have developed counterfeit prevention features.
The proposed project consists of simulating a setup comprising lenslets and reflective elements and of verifying the quality of the resulting moiré shapes. This reflective moiré setup can be simulated by applying simple ray-tracing techniques and/or by using the Blender computer graphics rendering software.
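The moiré principle behind such a setup can be sketched independently of the full lenslet/reflector simulation. The snippet below (Python/NumPy for illustration only; the project itself targets Matlab, Java or Blender) superposes two gratings whose periods are arbitrary example values, and predicts the resulting beat period:

```python
import numpy as np

def grating(shape, period, angle_deg):
    """Sinusoidal grating with the given period (pixels) and orientation."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    theta = np.deg2rad(angle_deg)
    phase = (x * np.cos(theta) + y * np.sin(theta)) / period
    return 0.5 * (1 + np.cos(2 * np.pi * phase))

# Superposing two gratings with slightly different periods produces a
# moiré beat whose period is much larger than either grating's.
base = grating((256, 256), period=8.0, angle_deg=0)
revealer = grating((256, 256), period=8.5, angle_deg=0)
moire = base * revealer

# Beat period of two parallel gratings: p1*p2 / |p1 - p2| = 8*8.5/0.5 = 136 px.
beat_period = 8.0 * 8.5 / abs(8.0 - 8.5)
```

The same beat-period relation governs the shapes produced by the lenslet/reflector setup, which is what makes the resulting moiré a sensitive counterfeit-detection feature.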

Deliverables: Report and framework for running simulations

Prerequisites:
– coding skills in Matlab or Java.
– basic knowledge in 3D graphics

Level: BS, MS semester project

Supervisors:
Dr Romain Rossier, Innoview Sàrl, romain.rossier@innoview.ch, tel 078 664 36 44
Prof. hon. Roger D. Hersch, INM034, rd.hersch@epfl.ch, cell: 077 406 27 09


Recovery of a watermark hidden within a color image by an Android smartphone

Description:
Startup company Innoview Sàrl has developed software to recover, with a smartphone, a watermark hidden in a grayscale image that displays simple graphical elements such as a logo. The current project aims at carrying out a similar recovery, but for a watermark hidden in a full color image (e.g. the color photograph of the holder of a document).

Deliverables: Report and running prototype (Matlab and/or Android).

Prerequisites:
– knowledge of image processing / computer vision
– basic coding skills in Matlab and/or Java Android

Level: BS or MS semester project or possibly master project

Supervisors:
Dr Romain Rossier, Innoview Sàrl, romain.rossier@innoview.ch, tel 078 664 36 44
Prof. Roger D. Hersch, INM034, rd.hersch@epfl.ch, cell: 077 406 27 09


Recovery of watermarks on a curved surface by an Android smartphone

Startup company Innoview Sàrl has developed software that recovers a watermark by superposing, in software, a revealer layer on top of a base image acquired with the smartphone camera. The project aims at extending this work to recover watermarks printed on a curved surface, such as the label of a wine bottle.
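As a rough illustration of the geometry involved (not Innoview's actual method), a label wrapped around a cylinder can be unwarped column by column under an orthographic, frontal-view approximation; the radius and coordinates below are made-up values:

```python
import numpy as np

def unwrap_column(u, u0, radius_px):
    """Map a horizontal image coordinate u on a frontally viewed cylinder
    (orthographic approximation) to arc length on the unrolled label.
    u0 is the column of the cylinder axis, radius_px the radius in pixels."""
    x = np.clip((u - u0) / radius_px, -1.0, 1.0)
    return radius_px * np.arcsin(x)

# Columns near the cylinder's silhouette are compressed in the image and
# stretched by the unwarp; the centre column maps to arc length 0.
center = unwrap_column(100.0, 100.0, 80.0)
edge = unwrap_column(180.0, 100.0, 80.0)   # a quarter of the circumference
```

Undoing this compression before running the revealer is one plausible first step; a full solution would also have to estimate the cylinder pose from the image.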

Deliverables: Report and running prototype (Matlab and/or Android).

Prerequisites:
– knowledge of image processing
– basic coding skills in Matlab and Java Android

Level: BS or MS semester project

Supervisors:
Dr Romain Rossier, Innoview Sàrl, romain.rossier@innoview.ch, tel 078 664 36 44
Prof. Roger D. Hersch, INM034, rd.hersch@epfl.ch, cell: 077 406 27 09


Classification of Movies (Master Thesis Project or Semester Project)

Description: The goal of this project is to predict the specific tasks for each scene in a film by using neural networks. We use a three-channel approach in which training is performed on the video, audio, and text channels. The details of the specific tasks will be shared after application.

Tasks:
– Understand the literature and the state of the art.
– Revise the training set.
– Revise our multi-channel network and tune its parameters.
– Validate the model and improve its accuracy.
– Correlate the age profiles with trends in the industry.

References:
Sivaraman, K. S., and Gautam Somappa. “MovieScope: Movie trailer classification using Deep Neural Networks.” University of Virginia (2016).

Simões, Gabriel S., et al. “Movie genre classification with convolutional neural networks.” Neural Networks (IJCNN), 2016 International Joint Conference on. IEEE, 2016.

Deliverables: At the end of the semester, the student should provide a framework that gives the predictions of scenes for the given task.

Prerequisites: Experience in deep learning and computer vision, experience in Python, experience in Keras, Theano, or TensorFlow

Type of work: 40% research, 60% development and testing

Level: Master

Supervisor: Sami Arpa (sami.arpa@epfl.ch)


Deep Learning based Movie Recommendation System (Master Thesis Project or Semester Project)

Description: In this project, you will work on the recommendation system of Sofy.tv. Sofy.tv uses a recommendation system based on the recipes of the movies. These recipes are found through our multi-channel deep learning system. The goal of this project is to improve the recommendations by finding the best fits for each user's taste.

Tasks:
– Understand the literature and our framework.
– Revise our taste clustering system.
– Revise our matchmaking system between users and films.
– Test the revised model.
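As a toy illustration of the matchmaking step (the movie names and "recipe" embeddings below are invented, not Sofy.tv's actual representation), movies can be ranked by cosine similarity between a user taste vector and each movie's recipe vector:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(user_taste, recipes, k=2):
    """Rank movies by similarity of their recipe vectors to the user's taste."""
    scores = {title: cosine(user_taste, vec) for title, vec in recipes.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Two-dimensional toy recipes; real recipe embeddings would be much larger.
recipes = {"drama": np.array([1.0, 0.0]),
           "comedy": np.array([0.0, 1.0]),
           "dramedy": np.array([0.7, 0.7])}
top = recommend(np.array([0.9, 0.1]), recipes, k=2)
```

The project's actual matchmaking system is learned, but a similarity ranking of this kind is the baseline it improves upon.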

Deliverables: At the end of the semester, the student should provide an enhanced framework for the recommendation.

Prerequisites: Experience in deep learning and computer vision, experience in Python, experience in Keras, Theano, or TensorFlow. Basic experience in web programming.

Type of work: 50% research, 50% development and testing

Level: Master

Supervisor: Sami Arpa (sami.arpa@epfl.ch)


Reconstruction of 3D objects with planar mirrors

Synopsis:
3D reconstruction of objects has traditionally been done with a set of two or more cameras. We would instead like to replace the additional cameras with mirrors, allowing us to capture a scene from several viewpoints within a single image, using a single camera.

In this project, you will implement a system capable of 3D object reconstruction. You will start by designing and building a box composed of four mirrors that will later be placed in front of the camera. The reflections from this box of mirrors will provide four additional views on top of the central view [1]. The central view will be projected in the center of the camera image, and the additional views will be located on the sides of the camera image. As in traditional stereo systems, you will calibrate the setup, rectify the side, top and bottom views, and match image features along epipolar lines across the views [2]. Finally, you will perform triangulation to reconstruct the 3D scene.
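The final triangulation step can be sketched as a linear (DLT) least-squares problem. The Python/NumPy toy example below (the project allows Matlab, Python or C/C++) treats one mirror view as a shifted virtual camera; the intrinsics and point are made-up values:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.
    P1, P2: 3x4 projection matrices; x1, x2: pixel coordinates (x, y)."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)        # null vector of A is the 3D point
    X = Vt[-1]
    return X[:3] / X[3]                # homogeneous -> Euclidean

# Toy setup: identity intrinsics, second camera shifted by 1 along x
# (a rectified mirror view behaves like such a virtual camera).
K = np.eye(3)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])

X_true = np.array([0.2, 0.1, 5.0])
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
X_hat = triangulate(P1, P2, x1, x2)
```

With noisy real matches, the same SVD solution minimizes the algebraic error; a refinement step would minimize reprojection error instead.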

References:
[1] Joshua Gluckman and Shree K. Nayar. “Rectified catadioptric stereo sensors.” IEEE transactions on pattern analysis and machine intelligence (2002).
[2] Joshua Gluckman and Shree K. Nayar. “Catadioptric stereo using planar mirrors.” International Journal of Computer Vision (2001).

Deliverables: Report and running prototype.

Prerequisites:
– knowledge of computer vision
– coding skills in Matlab and Python or C/C++

Level: MS semester project

Type of work: 50% implementation and 50% research

Supervisor: Marjan Shahpaski (firstname.lastname@epfl.ch)


Simulation of a radial imaging system

Synopsis:
Radial imaging systems capture a scene from a large number of viewpoints within a single image, using a camera and a curved mirror. These systems can recover scene properties such as scene geometry, reflectance, and texture [1].

In this project, you will implement a system that simulates an image captured by such a setup. For a predefined set of camera parameters, a given cylinder length and diameter, and a known 3D scene, you will use ray tracing to simulate the virtual image that this “pinhole” camera would capture.
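Two of the basic ray-tracing ingredients, generating the ray through a pixel and reflecting it off a mirror surface, can be sketched as follows (Python/NumPy for illustration; the intrinsics and surface normal are made-up values):

```python
import numpy as np

def pixel_ray(u, v, K):
    """Direction of the ray through pixel (u, v) for intrinsics K
    (camera at the origin, looking along +z)."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return d / np.linalg.norm(d)

def reflect(d, n):
    """Reflect direction d about the unit surface normal n."""
    return d - 2.0 * np.dot(d, n) * n

# Toy intrinsics: focal length 500 px, principal point at (320, 240).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])

d = pixel_ray(320, 240, K)   # centre pixel: ray along the optical axis
# A mirror wall to the camera's right with inward normal (-1, 0, 0)
# folds a horizontal ray back across the axis.
r = reflect(np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0]))
```

For the cylindrical mirror, the per-pixel work is the same: intersect the ray with the cylinder, compute the local normal, reflect, and continue into the scene.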

References:
[1] Sujit Kuthirummal and Shree K. Nayar. “Multiview radial catadioptric imaging for scene capture.” ACM Transactions on Graphics (2006).
[2] Joshua Gluckman and Shree K. Nayar. “Catadioptric stereo using planar mirrors.” International Journal of Computer Vision (2001).

Deliverables: Report and running prototype.

Prerequisites:
– knowledge of computer vision
– coding skills in Matlab and Python or C/C++

Level: BS/MS semester project

Type of work: 50% implementation and 50% research

Supervisor: Marjan Shahpaski (firstname.lastname@epfl.ch)


Knowledge transfer between RGB-only and RGB + other modalities

Synopsis:
Deep neural networks are mostly trained on large RGB datasets. This limits their applicability to other modalities (depth, NIR, thermal, etc.) where additional types of information are available. We want to make use of additional data such as depth and NIR to improve the accuracy of conventional deep learning approaches.

The problem with multimodal data is the lack of datasets as large as those available for RGB images alone. The proposed technique should therefore not suffer from overfitting, and should benefit from the large RGB-only datasets.

In this project you will:
– Train networks on RGB and RGB+ data in a supervised or unsupervised fashion (baselines).
– Improve the RGB-only network using the additional modalities available, such as NIR, depth and thermal channels (datasets for NIR and thermal are available).
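The supervision-transfer idea of [1] can be sketched as a feature-matching loss between a frozen RGB network and the auxiliary-modality network being trained; the NumPy arrays below stand in for real activations:

```python
import numpy as np

def distillation_loss(f_rgb, f_aux):
    """Supervision-transfer loss: mean squared distance between the
    (frozen) RGB network's mid-level features and the auxiliary-modality
    network's features at the same layer, in the spirit of [1]."""
    return np.mean((f_rgb - f_aux) ** 2)

# Toy feature maps of shape (batch, channels, h, w).
rng = np.random.default_rng(0)
f_rgb = rng.standard_normal((2, 64, 8, 8))
f_aux = f_rgb + 0.1 * rng.standard_normal((2, 64, 8, 8))

loss = distillation_loss(f_rgb, f_aux)   # small, since f_aux tracks f_rgb
```

In training, minimizing this loss pushes the depth/NIR/thermal network toward the representation the RGB network learned from its much larger dataset.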

References:
[1] Cross Modal Distillation for Supervision Transfer https://arxiv.org/pdf/1507.00448.pdf

Deliverables:
A neural net that receives RGB+ input to improve the accuracy of a conventional net trained on RGB alone

Prerequisites:
Knowledge and experience in deep learning
RGB+NIR and RGB+Thermal datasets are available

Level: MS semester or thesis project

Type of work: 60% implementation and 40% research.

Supervisor: Fayez Lahoud, Siavash Bigdeli


Visual feedback for tone training

Synopsis:
“The sound of language can be divided into consonants, vowels and tones. As many as 70% of the world’s languages use tones to convey word meaning.” – Moira Yip, Tone.
However, in Latin- and most Germanic-based languages, tone is only used to convey emotion, in combination with stress and rhythm changes. This makes tonal languages difficult to learn when it comes to speaking and listening. In this project, we aim to create a tool (application) that helps people train their tones by providing clear and constructive feedback based on an understanding of where the user’s difficulties lie. We will first focus on Mandarin Chinese, as it has a simple tonal system comprising only 4 different contour tones.
We already have a setup ready to extract the pitch from a sound recording; we propose to build a tool that:

  • Takes a user recording and a reference recording, compares them, and provides feedback that helps the user improve
  • Studies the user’s history of successes/mistakes to adjust the learning schedule
  • Potentially helps tackle minimal-pair training for listening and speaking (e.g. shuang vs shuan / zuo vs cuo vs suo / …)

In this project you will:
Design and implement a distance measure to compute the difference between a user’s tones and the correct tonal pronunciation. Study the patterns of mistakes made by users to identify their weaknesses and propose adjustments to their learning schedule.
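One candidate distance measure, assuming pitch contours have already been extracted, is dynamic time warping, which tolerates differences in speaking rate; the contours below are synthetic stand-ins for real tone 2 (rising) and tone 4 (falling) recordings:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic-time-warping distance between two pitch contours (in Hz
    or semitones), robust to differences in speaking rate."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Synthetic contours: tone 2 rises, tone 4 falls.
rising = np.linspace(200, 300, 10)
falling = np.linspace(300, 200, 10)
slow_rising = np.linspace(200, 300, 20)   # same tone, spoken more slowly

d_same = dtw_distance(rising, slow_rising)   # small: same tone shape
d_diff = dtw_distance(rising, falling)       # large: opposite contours
```

A small DTW distance to the reference suggests a correct tone even when the user speaks faster or slower; which distance works best for feedback is exactly what the project should study.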

References:
[1] Tone visualization: http://www.sinosplice.com/life/archives/2008/01/21/seeing-the-tones-of-mandarin-chinese-with-praat

Deliverables:
Functional tool (Optional: Mobile application)

Prerequisites:
Comfortable with machine learning, speech processing
Knowledge of a tonal language could help you on the way

Level: Bachelor or Master

Type of work: 60% implementation, 40% research

Supervisor: Fayez Lahoud, Ruofan Zhou


Tracking and mapping in infrared video

Synopsis:
The goal of this project is to be able to localize and map an area using infrared (thermal) video sequences. In low-visibility conditions, such as darkness or smoke, visual methods are of little help, as most of the visible-light imagery becomes black. Frames from a thermal camera can instead be used to build a model that helps navigate people/robots through such areas.
There has been a lot of previous work on visual mapping; this project aims to study the literature, reproduce the state of the art, and then apply this knowledge to the problem of thermal mapping.

In this project you will:

  • Understand the previous literature and the state of the art
  • Implement a few existing methods
  • Transfer that knowledge to a thermal model

References:
[1] Parallel Tracking and Mapping for Small AR Workspaces. G. Klein, D. Murray. ISMAR 2007
[2] Real-Time 6-DOF Monocular Visual SLAM in a Large-scale Environments. H. Lim, J. Lim, H. Jin Kim. ICRA 2014.
[3] RGB-T SLAM: A flexible SLAM framework by combining appearance and thermal information. L. Chen, L. Sun, T. Yang, L. Fan, K. Huang and Z. Xuanyuan. ICRA 2017.

Deliverables: A set of implemented visual SLAM methods (at most 3) + a thermal SLAM

Prerequisites: Experience in tracking and localization, potentially usage of deep learning

Level: Master

Type of work: 70% implementation, 30% research

Supervisor: Fayez Lahoud


Task-driven super-resolution

Synopsis:
Super-resolution is a low-level vision problem, yet it can contribute to many high-level vision tasks such as detection. In this project you will incorporate the objective of a downstream task (such as object detection, image segmentation, or recognition) into the training of a super-resolution module, and evaluate whether task-driven super-resolution improves the accuracy of both the high-level task and the super-resolution itself.

Tasks:

  • Implement convolutional networks
  • Validate the model and test it on different datasets
  • Evaluate super-resolution results
  • Evaluate high-level task results
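The combined objective can be sketched as a weighted sum of a reconstruction loss and a downstream-task loss; the MSE terms and the weight lam below are illustrative placeholders for the real network losses:

```python
import numpy as np

def joint_loss(sr, hr, task_pred, task_target, lam=0.1):
    """Task-driven SR objective: pixel reconstruction loss plus a weighted
    downstream-task loss (both illustrated here as MSE; lam balances them)."""
    l_sr = np.mean((sr - hr) ** 2)
    l_task = np.mean((task_pred - task_target) ** 2)
    return l_sr + lam * l_task, l_sr, l_task

# Perfect reconstruction, imperfect task prediction: only the task term remains.
hr = np.ones((4, 4))
total, l_sr, l_task = joint_loss(hr, hr, np.array([0.8]), np.array([1.0]))
```

Backpropagating the task term through the super-resolution module is what makes the SR output "task-driven" rather than purely pixel-faithful; choosing lam is part of the evaluation.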

Deliverables: Report and implementation of deep convolutional networks

Prerequisites: knowledge of deep learning and computer vision; experience in pytorch/tensorflow/keras

Level: Master semester project or thesis

Type of work: 80% implementation, 20% research

Supervisor: Ruofan Zhou (ruofan.zhou@epfl.ch)


CNN deep learning on large image transformation

Synopsis:
The project revolves around CNN deep learning for transforming full images. For more information about the exact topic, you can discuss it in person with one of the supervisors. Generally, the method aims to transform images and is also concerned with perceived image quality, so both will be important factors in the project and in the assessment of your results.

Deliverables: Report and implementation of a deep convolutional approach

Prerequisites: Very strong background in deep learning (PyTorch), Python, perceptual image quality assessment, and adversarial training experience

Level: MS

Type of work: 80% research, 20% development and testing

Supervisor: Ruofan Zhou, Majed El Helou


Longitudinal chromatic aberration assessment tool

Description: The goal of this project is to evaluate longitudinal chromatic aberration from a single photo in the presence of lateral chromatic aberration. You will acquire a set of photos of printed edges placed at different depths in the scene. The objective is to remove the lateral chromatic aberration across the image, so that its full width can be used for assessing longitudinal chromatic aberration, which you will work on in the second part of the project. Your final results should then be compared with prior work to evaluate how well they match.

Tasks:
– Review the literature: PSF estimation protocols, lens assessment and chromatic aberration.
– Capture a dataset of edge images.
– Remove lateral chromatic aberration across the image.
– Evaluate longitudinal chromatic aberration from a single image where the lateral chromatic aberration was corrected.
– Evaluate the results on different lenses/cameras and compare to prior work.
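To first order, lateral chromatic aberration acts as a small per-channel magnification difference, so a crude correction radially rescales the R and B planes about the optical centre relative to G. The nearest-neighbour sketch below is illustrative only; a real implementation would use sub-pixel interpolation and a calibrated, possibly field-dependent scale:

```python
import numpy as np

def rescale_channel(channel, scale, center):
    """Radially rescale one colour plane about the optical centre via
    nearest-neighbour inverse mapping (first-order lateral-CA model:
    each channel is a slightly different magnification of the scene)."""
    h, w = channel.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    cy, cx = center
    src_y = np.clip(np.round(cy + (y - cy) / scale), 0, h - 1).astype(int)
    src_x = np.clip(np.round(cx + (x - cx) / scale), 0, w - 1).astype(int)
    return channel[src_y, src_x]

# Sanity check: identity scale leaves the plane untouched.
img = np.arange(25.0).reshape(5, 5)
same = rescale_channel(img, 1.0, (2, 2))
```

Once the colour planes are registered this way, residual edge colour fringing across the frame can be attributed to longitudinal chromatic aberration, which is what the second part of the project measures.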

References: http://www.imatest.com/docs/sfr_chromatic/

https://www.dxomark.com/dxomark-lens-camera-sensor-testing-protocol/

Deliverables: Report, dataset, implementation codes for lateral chromatic aberration removal and for single-image longitudinal chromatic aberration assessment.

Prerequisites: Comfortable reading codes in MATLAB (and writing), strong mathematical signal processing background, experience with hardware and (professional) image acquisition techniques.

Type of work: 80% research, 20% development and testing

Level: BS

Supervisor: Majed El Helou


LaTeX generation from a printed equation

Description: LaTeX is a powerful typesetting system that is extremely useful for technical documents, in particular those containing mathematical equations. However, once rendered, the output cannot be modified without access to the underlying code. Re-coding lengthy equations is time-consuming and error-prone. This project intends to allow a user to take a photograph of a printed equation and produce the LaTeX markup that generates it.
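One small piece of such a pipeline, the final markup-generation step once symbols have been recognised, can be sketched as follows; the token list is hand-made, and real systems typically learn this mapping end to end rather than using hand-written rules:

```python
def tokens_to_latex(tokens):
    """Convert a flat list of recognised symbols into LaTeX markup,
    wrapping the argument of ^ and _ in braces."""
    out = []
    i = 0
    while i < len(tokens):
        t = tokens[i]
        if t in ("^", "_") and i + 1 < len(tokens):
            # A script symbol consumes the next token as its argument.
            out.append(t + "{" + tokens[i + 1] + "}")
            i += 2
        else:
            out.append(t)
            i += 1
    return "".join(out)

latex = tokens_to_latex(["x", "^", "2", "+", "y", "_", "i"])
```

The harder parts of the project sit upstream of this step: segmenting the photographed equation into symbols and recognising each one despite perspective and lighting variations.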

Deliverables: Report and framework for generating LaTeX code

Prerequisites: Comfortable with image processing and dealing with math equations 

Type of work: 60% implementation, 40% research

Level: BS

Supervisor: Ruofan Zhou