Video Processing Internships

[OPEN][RI-ISL_2019-DM-VP-015] Speeding up video coding with deep learning

The topic of this internship is the development of deep-learning-based methods to speed up and improve state-of-the-art video codecs (namely VVC/H.266). The goal is to tackle the combinatorial problem arising with the new codecs, especially because of the enhanced block topologies they make available, and to reduce this combinatorial search without decreasing codec performance. Deep-learning-based methods have already proved their efficiency for the intra coding mode (see http://phenix.int-evry.fr/jvet/doc_end_user/documents/10_San%20Diego/wg11/JVET-J0034-v2.zip). Many extensions are possible, especially regarding the inter coding mode, dealing with motion field segmentation. The candidate should be familiar with current machine learning software packages and have a good background in image processing in general.

Skills : (deep) learning algorithms and software, programming (C++/Python), motion estimation

Keywords : video codec, machine learning, motion segmentation, image processing

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN][RI-ISL_2019-VP-019] Exploration of Advanced video compression technologies

MPEG and VCEG are jointly exploring, in a standardization project named JVET, future video coding technologies as successors to the AVC and HEVC video coding standards. Technicolor is deeply involved in this exploration process. The internship aims at exploring new tracks for improving video compression, mostly focused on inter prediction and coding. The research work will be based on the reference software developed by JVET. The internship will take place in a research team of several video coding and standardization experts. It will also benefit from the Technicolor experts working on HDR and color science.

Skills : video coding, signal and video processing, C++

Keywords : video coding, motion prediction, intra prediction, motion coding, intra coding, HEVC, JVET, HDR

[OPEN][RI-ISL_2019-CV-DM-VP-052] One-click video editing

In film post-production, digital makeup of actors to age or de-age them, add or remove scars and generally modify their appearance is a painstaking and time-consuming process, even for skilled artists. In this internship, the goal is to facilitate this process, by helping the development of automatic tools that allow artists to easily edit an actor's appearance across many scenes. The work will revolve around the concept of 'Unwrap mosaics', a novel representation that condenses the appearance of an actor within a video sequence into a single image, which can be edited and propagated back into the video. The candidate will build on a prototype of unwrap mosaics implemented at Technicolor. To ensure that the applied corrections are realistic, the internship will first target the development of an automatic color correction solution. More specifically, the goal is to compute automatically a suite of parametric corrections (color, contrast, sharpness) that will adjust the edit in all frames so that there is visually no seam after insertion. To further reduce the artistic effort required, the internship will also focus on the propagation of edits from one scene to other scenes where the same actor appears, managing both the geometric and the colorimetric corrections necessary. Both deep learning and traditional techniques can be tested and compared. The work will happen in the context of an ongoing and tight collaboration between Technicolor R&I and the film production teams in Hollywood specialized in VFX and digital makeup (https://www.technicolor.com/create/technicolor-los-angeles/vfx), addressing their ongoing needs in active and future projects.

Skills : machine learning, computer vision, video processing

Keywords : vfx, post-production, image processing

[CLOSED][RI-ISL_2019-DM-VP-039] Compression of Neural Networks

The objective of the internship is to study and develop algorithms for compressing neural network (NN) coefficients for transport or storage. The algorithms will be derived from state-of-the-art still-picture compression techniques. The efficiency of the NN compression algorithms will be evaluated using several off-the-shelf NN classifiers or image post-filters, by comparing the results obtained with the original NN with those obtained with the compressed NN.

After studying state-of-the-art publications on NN compression, the intern will test existing NN frameworks that support the quantization of their coefficients. Next, the intern will propose and develop new algorithms for compressing NN models.
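As a rough illustration of what coefficient quantization involves, the following minimal sketch applies uniform scalar quantization to a small list of weights and measures the reconstruction error. It is an assumption-laden toy, not the method targeted by the internship: the function names (`quantize`, `dequantize`, `max_abs_error`) and the sample weights are hypothetical, and real frameworks use more elaborate schemes.

```python
def quantize(weights, num_bits=8):
    """Uniformly map floats to integer codes in [0, 2**num_bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** num_bits - 1
    # Guard against a constant weight vector (zero dynamic range).
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Reconstruct approximate float weights from integer codes."""
    return [c * scale + lo for c in codes]


def max_abs_error(original, restored):
    """Largest absolute difference between two weight lists."""
    return max(abs(x - y) for x, y in zip(original, restored))


# Hypothetical weights standing in for one layer of a network.
weights = [-0.42, 0.0, 0.13, 0.97, -1.5, 0.58]
codes, scale, lo = quantize(weights, num_bits=8)
restored = dequantize(codes, scale, lo)
print("codes:", codes)
print("max reconstruction error:", max_abs_error(weights, restored))
```

With rounding to the nearest level, the reconstruction error of each weight is bounded by half the quantization step (`scale / 2`). Comparing the accuracy of the original and the dequantized model is the kind of evaluation described above.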

Skills : machine learning, coding, video processing

Keywords : machine learning, coding

[CLOSED][RI-ISL_2019-CG-CV-VP-017] Augmented Reality Point Clouds: Real Hologram Experience

New augmented reality devices (phones, MagicLeap…) and the new 3D video standard open the door to new AR services, making it possible to create applications that integrate real 3D animated characters into the real world. The goal of this internship is to create new experiences to evaluate and promote this kind of technology, mixing 3D video with real environments. The Star Wars holograms can now become real: “Help me. You're my only hope.”

Skills : creative, enthusiastic and proactive in making proposals; Computer Graphics / Computer Vision; Android / iOS programming; ARCore / ARKit / Unity; ability to write well-structured and documented code; good written and spoken English; teamwork skills

Keywords : Augmented Reality, Point Cloud, Android programming

[CLOSED][RI-IML_2019-CG-DM-VP-007] Deep 3D Object Localization & Tracking for Real / Virtual Fusion

To fuse virtual objects with real content (or the opposite), it is necessary to know the parameters of the virtual and real cameras. In particular, the 3D localization of the real camera and the real object needs to be computed and tracked to ensure temporal consistency and fast computing. Current solutions rely on visual pattern tracking. More recent technologies consider fusing information from various sensors, as used in automotive applications, such as accelerometer, gyroscope, GPS, LIDAR, radar… The goal of the internship is to improve current algorithms and propose innovative solutions for filtering and fusing the various signals. Deep learning could also be considered as an improvement of existing techniques.
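To give a concrete idea of what fusing signals from several sensors means, here is a minimal sketch of a scalar Kalman measurement update that combines two noisy position readings. It is only an illustration under strong assumptions: a static one-dimensional state, no motion model, and made-up readings and noise variances. The name `kalman_update` is hypothetical.

```python
def kalman_update(estimate, variance, measurement, meas_variance):
    """Scalar Kalman measurement update (static state, no motion model)."""
    # The gain weighs the measurement by how uncertain the current estimate is.
    gain = variance / (variance + meas_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


# Start from a very vague prior on the position.
estimate, variance = 0.0, 1e6

# Hypothetical (reading, noise variance) pairs from two sensors,
# e.g. a coarse one followed by a more precise one.
readings = [(10.4, 4.0), (10.0, 1.0)]
for measurement, meas_variance in readings:
    estimate, variance = kalman_update(estimate, variance,
                                       measurement, meas_variance)

print("fused position:", estimate)
print("fused variance:", variance)
```

The fused estimate lands closer to the more precise reading, and the fused variance is lower than that of either sensor alone, which is the basic benefit of fusion.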

Skills : Applied Mathematics, Signal Processing, Programming, Computer Vision, Machine/Deep Learning.

Keywords : Localization, sensors, filtering, machine learning, VFX, movies.

[CLOSED][RI-ISL_2019-CV-DM-VP-021] Deep Learning for Rotoscoping

Deep learning has recently enabled very efficient approaches in various fields (e.g., image/video processing, computer vision, audio processing). This internship proposal targets the development of deep learning approaches for high-end visual effects. In this context, both the interaction with a user (roto artist) and the efficient propagation of the effect throughout a whole sequence are key to achieving a highly accurate and efficient process. The proposal will target these two aspects, interaction and spatio-temporal propagation, in the context of deep learning segmentation and matting methods. The resulting algorithms might be integrated into professional VFX software to help the colorists.

Skills : machine learning, deep learning, computer vision, video/image processing, PyTorch, TensorFlow or Keras deep learning frameworks, Python or C++.

Keywords : machine learning (deep learning), video processing, computer vision, interaction, segmentation, tracking, rotoscoping, matting

[CLOSED][RI-ISL_2019-CV-DM-VP-022] Deep learning for 3D face rig

This internship proposal targets the development of deep learning approaches for high-end visual effects (generation and animation of 3D avatars for film studios). Recent techniques, such as MoFA, achieve good 3D face rig reconstruction from still images and videos. However, these face rigs only cover skin parts, missing the eyes and the mouth interior. To improve this, we propose to study the use of Generative Adversarial Networks (GAN) to fill in these parts.

Skills : machine learning, deep learning, computer vision, video/image processing, PyTorch, Python

Keywords : machine learning, deep learning, video processing, computer vision

[CLOSED][RI-HOME_2019-DM-VP-030] Speech transformation using cycle-consistent adversarial networks

In this internship we address various speech transformation tasks that modify a particular attribute of the speech signal. Potential attributes to be modified include age, identity, gender, accent, mood, etc.

At the beginning we will mostly concentrate on age modification (aging or de-aging), since this task is of great interest for movie production. The proposed approaches will be based on deep learning methods, and more specifically on cycle-consistent adversarial networks. We assume that some training data are provided. For example, to transform the speech of a 25-year-old man so that he sounds 60 years old, we suppose that we have a training set of speech signals comprising two subsets: one of 25-year-old speakers and another of 60-year-old speakers. However, to greatly increase the range of applications, we assume that the training dataset is “parallel-data-free”, i.e., the utterances pronounced in the two subsets are not necessarily the same, and thus cannot be aligned at the phoneme level.

The approach will consist of the following steps: (1) extracting speech-specific parameters, (2) transforming them into the target ones using a cycle-consistent adversarial network, and (3) resynthesizing the resulting speech from the transformed parameters. Prior to this processing, the cycle-consistent adversarial network should be pre-trained on the available parallel-data-free training dataset.
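The cycle-consistency idea behind step (2) can be sketched on toy data. The example below is a simplified illustration, not the internship's implementation: scalars stand in for speech parameter vectors, the mappings `G` (source to target) and `F_good` / `F_bad` (target to source) are hypothetical hand-written functions rather than trained networks, and the adversarial terms of the full objective are omitted. It only shows how the cycle term can be computed from two unpaired sets.

```python
def l1(a, b):
    """Mean absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, xs, ys):
    """L1 cycle loss: F(G(x)) should return x, and G(F(y)) should return y."""
    forward = l1([F(G(x)) for x in xs], xs)
    backward = l1([G(F(y)) for y in ys], ys)
    return forward + backward


# Two unpaired toy sets: no x corresponds to any particular y.
source = [1.0, 2.0, 3.0]
target = [10.0, 12.0]


def G(x):
    """Hypothetical source-to-target mapping."""
    return 2.0 * x + 1.0


def F_good(y):
    """Exact inverse of G, so every cycle returns to its starting point."""
    return (y - 1.0) / 2.0


def F_bad(y):
    """Imperfect inverse of G, so cycles drift away from the input."""
    return y / 2.0


loss_good = cycle_consistency_loss(G, F_good, source, target)
loss_bad = cycle_consistency_loss(G, F_bad, source, target)
print("cycle loss with consistent mappings:", loss_good)
print("cycle loss with inconsistent mappings:", loss_bad)
```

Because the loss only compares each sample with its own round trip, it never needs aligned pairs, which is what makes training possible on a parallel-data-free dataset.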

Skills : machine learning, audio processing, speech processing, Python or C++.

Keywords : speech transformation, deep learning, machine learning, cycle-consistent adversarial networks