[OPEN][RI-IML_2019-CG-DM-HCI-008] Reconstruction of Personalized 3D Human Body Model

This internship will focus on the 3D reconstruction of the human body. A virtual body makes it possible to generate personalized avatars, which are increasingly required in VR, AR, gaming and many other virtual applications. It helps increase embodiment and prevent cybersickness. The intern will join the Immersive Lab within the Virtual Production group at Technicolor Rennes. Several tools and pieces of software developed by the team are available and will be improved. The work will consist of: (1) studying the state of the art in 3D reconstruction of human avatars [1], and in particular evaluating a technique based on RGB-D video [2]; (2) improving the existing camera rig and reconstruction algorithm in order to ultimately provide a realistic digital double of a person.

[1] Achenbach, J., Waltemate, T., Latoschik, M. E., & Botsch, M. (2017). Fast generation of realistic virtual humans. In Proceedings of the 23rd ACM Symposium on Virtual Reality Software and Technology (VRST '17).
[2] Alldieck, T., Magnor, M. A., Xu, W., Theobalt, C., & Pons-Moll, G. (2018). Video Based Reconstruction of 3D People Models. arXiv preprint arXiv:1803.04758.

Skills  : Computer graphics, Python, Deep Learning, Math (optimization and geometry), English, motivated by research.

Keywords  : Geometry Processing, 3D reconstruction, deep learning

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN][RI-ISL_2019-DM-VP-015] Speeding-up video coding with deep-learning

The topic of this internship is the development of deep-learning-based methods to speed up or improve a state-of-the-art video codec (namely VVC/H.266). The goal is to tackle the combinatorial problem arising with the new codecs, in particular because of the enhanced block topologies they offer, and to reduce this combinatorial complexity without decreasing codec performance. Deep-learning-based methods have already proved their efficiency for the intra coding mode (see http://phenix.int-evry.fr/jvet/doc_end_user/documents/10_San%20Diego/wg11/JVET-J0034-v2.zip). Many extensions are possible, especially for the inter coding mode, which involves motion field segmentation. The candidate should be familiar with current machine learning software packages and have a good background in image processing in general.

Skills  : (deep) learning algorithms and software, programming (C++/Python), motion estimation

Keywords  : video codec, machine learning, motion segmentation, image processing

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN][RI-ISL_2019-CG-DM-018] Deep Geometry

3D models are often represented by very large numbers of points or triangles. This makes both storage and image synthesis inefficient, and therefore requires high-end GPUs to produce images. This internship will investigate the possibility of replacing large and detailed 3D models, normally represented as either triangle meshes or point clouds, with deep neural networks. Such networks could then be employed by a renderer to efficiently produce images.

Skills  : machine learning, 3D rendering, Python, C++.

Keywords  : machine learning, 3D rendering, Python, C++.

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN][RI-ISL_2019-CG-CV-DM-020] Aging 3D Character

In VFX production (film or advertising), the need to reconstruct an actor’s 3D face from video input is increasing. Over the last decade, the technology that extracts a 3D facial model from a flat image has improved significantly, although fine-scale mesoscopic detail may still be missed. With the recent growth of deep learning techniques, we believe that morphing a 3D character’s age should be possible by learning “(de-)aging” from data. Automating this pipeline would benefit the VFX industry by reducing manual labour. Our research team is based in Rennes and New York, and collaborates with engineers and artists located at The Mill, New York.

Skills  : Machine learning, Deep-learning, Computer Graphics, Computer Vision, Python, PyTorch, Maya.

Keywords  : deep network, visual effects, facial rig, 3D reconstruction, shape from shading

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN][RI-IML_2019-CV-DM-028] Deep learning for depth estimation from multi-view video

This internship tackles the issue of learning depth estimation from multi-view video. Deep learning is likely to outperform traditional depth estimation techniques, as it has for most Computer Vision tasks. The latest networks, whose training is supervised by the quality of synthesized views, exhibit impressive results in terms of depth maps. Yet state-of-the-art solutions process each set of successive frames independently. The goal of this internship is to shift to a temporally stable solution by enforcing the temporal consistency of the generated depth map stream. The working environment includes an existing dataset of light field video sequences [1] that will be used as ground truth for training. The neural network design will be based on existing deep learning approaches. Its architecture shall enable the extraction of the temporal redundancy occurring in the input images and enforce the temporal consistency of the output depth maps. To this end, a survey of conventional optical flow estimation approaches might be helpful.

[1] Sabater et al., ‘Dataset and Pipeline for Multi-View Light-Field Video’, CVPR 2017

Skills  : Machine Learning, Computer Vision, Image & video processing, Python

Keywords  : Deep learning, light fields, view synthesis, depth estimation, optical flow estimation

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN][RI-ISL_2019-CV-DM-VP-052] One-click video editing

In film post-production, digital makeup of actors to age or de-age them, add or remove scars and generally modify their appearance is a painstaking and time-consuming process, even for skilled artists. In this internship, the goal is to facilitate this process, by helping the development of automatic tools that allow artists to easily edit an actor's appearance across many scenes. The work will revolve around the concept of 'Unwrap mosaics', a novel representation that condenses the appearance of an actor within a video sequence into a single image, which can be edited and propagated back into the video. The candidate will build on a prototype of unwrap mosaics implemented at Technicolor. To ensure that the applied corrections are realistic, the internship will first target the development of an automatic color correction solution. More specifically, the goal is to compute automatically a suite of parametric corrections (color, contrast, sharpness) that will adjust the edit in all frames so that there is visually no seam after insertion. To further reduce the artistic effort required, the internship will also focus on the propagation of edits from one scene to other scenes where the same actor appears, managing both the geometric and the colorimetric corrections necessary. Both deep learning and traditional techniques can be tested and compared. The work will happen in the context of an ongoing and tight collaboration between Technicolor R&I and the film production teams in Hollywood specialized in VFX and digital makeup (https://www.technicolor.com/create/technicolor-los-angeles/vfx), addressing their ongoing needs in active and future projects.
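As a rough illustration of what one parametric correction could look like, the sketch below fits a gain (contrast) and an offset (brightness) so that the statistics of an edited patch match those of its surrounding frame region. It is only a minimal example under stated assumptions: the function names, the single-channel intensity lists and the toy values are hypothetical and are not taken from the Technicolor prototype.

```python
from statistics import mean, pstdev


def fit_gain_offset(source, target):
    """Gain and offset that map the source mean/std onto the target mean/std."""
    s_std = pstdev(source)
    # Guard against a flat patch, where the gain would be undefined.
    gain = pstdev(target) / s_std if s_std > 0 else 1.0
    offset = mean(target) - gain * mean(source)
    return gain, offset


def apply_correction(values, gain, offset):
    """Apply the affine correction to every intensity value."""
    return [gain * v + offset for v in values]


# Toy single-channel intensities (hypothetical values).
edited_patch = [0.2, 0.4, 0.6]     # pixels of the inserted edit
frame_region = [0.35, 0.5, 0.65]   # pixels around the insertion in one frame

gain, offset = fit_gain_offset(edited_patch, frame_region)
corrected = apply_correction(edited_patch, gain, offset)
```

In practice such a fit would be repeated per colour channel and per frame; sharpness matching would need a different parametric model.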

Skills  : machine learning, computer vision, video processing

Keywords  : VFX, post-production, image processing

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-ISL_2019-DM-VP-039] Compression of Neural Networks

The objective of the internship is to study and develop algorithms for compressing neural network (NN) coefficients for transport or storage. The algorithms will be derived from state-of-the-art still-picture compression techniques. The efficiency of the NN compression algorithms will be evaluated on several off-the-shelf NN classifiers or image post-filters, by comparing the results obtained with the original NN against those obtained with the compressed NN.

After having studied state-of-the-art NN-based compression publications, the intern will test existing NN frameworks supporting the quantization of their coefficients. Next, the intern will propose and develop new algorithms for compressing NN models.
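To make the notion of coefficient quantization concrete, here is a minimal sketch of uniform scalar quantization applied to a list of weights, together with the reconstruction error it introduces. The function names, the bit width and the toy weights are assumptions chosen for illustration; real frameworks and the algorithms developed during the internship may work quite differently.

```python
def quantize(weights, num_bits=8):
    """Uniformly quantize floats to integer codes in [0, 2**num_bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = (1 << num_bits) - 1
    # Guard against constant weights, where the range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Map integer codes back to approximate float weights."""
    return [lo + c * scale for c in codes]


def max_abs_error(original, restored):
    """Largest absolute difference between original and restored weights."""
    return max(abs(a - b) for a, b in zip(original, restored))


weights = [-0.8, -0.1, 0.0, 0.25, 0.9]  # toy values, not from a real model
codes, lo, scale = quantize(weights, num_bits=4)
restored = dequantize(codes, lo, scale)
error = max_abs_error(weights, restored)
```

With rounding to the nearest level, the error per weight is bounded by half the quantization step, which is the kind of trade-off the comparison between original and compressed NN would measure.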

Skills  : machine learning, coding, video processing

Keywords  : machine learning, coding

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-HOME_2019-DM-001] Weakly supervised multiple instance learning

We have been developing sensors that can detect the presence of people based on supervised learning, for instance geophones and microphones. However, a sensor does not always transfer well to a new environment, and it is not easy to adapt it to a new context, because one cannot expect the user to provide a precise ground truth for each step or for each trace of presence in any sensor. We therefore propose to study the feasibility of using a new learning paradigm for the home setting: multiple instance learning. This learning method is inherently weakly supervised: it needs a ground truth expressing that “there is nothing in a series of observations” or “there is something in a range of observations”, even if one cannot say precisely where in that range something can be detected. With such an algorithm the user only has to say “I am leaving the house” or “I am back”, or this can be detected automatically from their phone, for instance. When nobody is home, the system learns the background noise of the house, and when somebody is home, it learns the specific noises that a person causes on the sensor. Note that the background noise can include noises caused by neighbors (in a condo), by pets, or by traffic outside. Examples of signals to exploit: geophones, microphones, low-resolution cameras, discrete sensors. The objective of the internship is to develop a new algorithm, run experiments and publish the results in a research paper.
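The bag-level supervision described above can be sketched in a few lines. The example below is a deliberately simple stand-in, not the algorithm to be developed: it assumes each bag is a list of scalar sensor energies, that label 0 means "nobody home" and 1 means "somebody home", and it learns a single threshold from the empty-house bags. All names and values are hypothetical.

```python
def fit_threshold(bags, labels):
    """Learn a detection threshold from bag-level labels only.

    Under the standard multiple instance assumption, every observation in a
    negative bag is negative, so the largest value seen in negative bags is
    an estimate of the background noise level.
    """
    background = [x for bag, y in zip(bags, labels) if y == 0 for x in bag]
    return max(background)


def predict_bag(bag, threshold):
    """A bag is positive if at least one observation exceeds the threshold."""
    return int(max(bag) > threshold)


def locate(bag, threshold):
    """Indices of the observations that triggered the detection."""
    return [i for i, x in enumerate(bag) if x > threshold]


# Toy training data: only one label per bag, none per observation.
bags = [[0.1, 0.2, 0.15], [0.1, 0.9, 0.2], [0.3, 0.25, 0.1], [0.2, 0.1, 0.7, 0.8]]
labels = [0, 1, 0, 1]

threshold = fit_threshold(bags, labels)
new_bag = [0.2, 0.28, 0.6]
is_present = predict_bag(new_bag, threshold)
where = locate(new_bag, threshold)
```

Even though the labels never say which observation contains a person, the model can point to individual observations at prediction time, which is the property that makes this paradigm attractive for the home setting.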

Skills  : Signal Processing, Machine Learning, Neural Networks. Python programming. Knowledge of Scikit Learn. Able to read and write research papers. Fluent in written English. Interested in mathematics and signal processing.

Keywords  : Learning algorithm, weakly supervised algorithms

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-IML_2019-CG-DM-VP-007] Deep 3D Object Localization & Tracking for Real / Virtual Fusion

To fuse virtual objects with real content (or the opposite), it is necessary to know the parameters of the virtual and real cameras. In particular, the 3D localization of the real camera and the real object needs to be computed and tracked to ensure temporal consistency and fast computing. Current solutions rely on visual pattern tracking. More recent technologies consider fusing information from various sensors, as is done in automotive applications, such as accelerometer, gyroscope, GPS, LIDAR, radar… The goal of the internship is to improve current algorithms and propose innovative solutions for filtering and fusing the various signals. Deep learning could also be considered as an improvement on existing techniques.

Skills  : Applied Mathematics, Signal Processing, Programming, Computer Vision, Machine/Deep Learning.

Keywords  : Localization, sensors, filtering, machine learning, VFX, movies.

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-HOME_2019-DM-NW-003] Network congestion control based on deep reinforcement learning

Congestion control is one fundamental building block of networking protocols. Over the last decades, network researchers have designed a wide variety of congestion control algorithms that each target a specific environment or application (e.g. datacenter, Wi-Fi, video…) or optimize a specific aspect of networking performance (bufferbloat, latency, bandwidth...). Most congestion control schemes were designed as classical deterministic algorithms based on expert knowledge, using hardwired and predefined control responses. Only recently have machine-learning-based congestion control schemes been proposed by the community. These efforts are quite recent and do not employ the latest deep reinforcement learning techniques. Deep Reinforcement Learning denotes the use of deep learning, a powerful class of learning algorithms, to develop reinforcement learning algorithms: algorithms that attempt to learn how to control a system optimally. These algorithms have been in the spotlight lately as they have achieved impressive results in a variety of tasks, such as beating human experts at Go, competing against them in cooperative video games, or reducing energy consumption by 40% in data centers. Considering the recent advances in Deep Reinforcement Learning and the availability of emulation and evaluation platforms for congestion control schemes, we believe that there is room to design a congestion control scheme based on deep reinforcement learning. The objective of this internship is therefore to design and evaluate deep-reinforcement-learning-based congestion control schemes. As Technicolor is a home gateway and set-top-box manufacturer, the evaluation of the proposed algorithms will focus specifically on home environments.

Skills  : .

Keywords  : Deep learning, reinforcement learning algorithm

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-ISL_2019-DM-HCI-016] Gesture Recognition by Deep Learning

Action and gesture recognition are attracting growing interest in several application domains, mainly in human-machine interaction such as automotive, games and digital TV user interfaces. The goal of this internship is to explore and propose a new neural-network-based framework for gesture recognition in digital TV applications, where features are extracted in both the spatial and temporal domains.

Skills  :
- Matlab/Python/C programming, ideally with image processing expertise
- Ability to write well-structured and documented code
- Good written and spoken English
- Excellent teamwork skills, as the internship forms part of a larger project involving many team members
- Ability to work independently

Keywords  : Machine Learning, Deep Learning, SVM, Clustering

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-ISL_2019-CV-DM-VP-021] Deep Learning for Rotoscoping

Very recently, deep learning has enabled very efficient approaches in various fields (e.g., image/video processing, computer vision, audio processing). This internship proposal targets the development of deep learning approaches for high-end visual effects. In this context, both the interaction with a user (roto artist) and the efficient propagation of the effect throughout a whole sequence are key to achieving a process that is both highly accurate and efficient. The proposal will target these two aspects, interaction and spatio-temporal propagation, in the context of deep learning segmentation and matting methods. The resulting algorithms might be integrated into professional VFX software to help the colorists.

Skills  : machine learning, deep learning, computer vision, video/image processing, PyTorch, TensorFlow or Keras deep learning frameworks, Python or C++.

Keywords  : machine learning (deep learning), video processing, computer vision, interaction, segmentation, tracking, rotoscoping, matting

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-ISL_2019-CV-DM-VP-022] Deep learning for 3D face rig

This internship proposal targets development of deep learning approaches for high-end visual effects (generation and animation of 3D avatars for film studios). Recent techniques, such as MoFA, achieve good 3D face rig reconstruction from still images and videos. However, these face rigs only cover skin parts, missing eyes and mouth interior. To improve this, we propose to study the use of Generative Adversarial Networks (GAN) to fill these parts.

Skills  : machine learning, deep learning, computer vision, video/image processing, PyTorch, Python

Keywords  : machine learning, deep learning, video processing, computer vision

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-ISL_2019-CG-CV-DM-024] Extraction of quadrupeds motion parameters from video

The goal of the internship is to apply deep learning techniques to the extraction of motion parameters of quadrupeds from video. In order to cope with the lack of ground truth, the approach will build upon both weakly supervised and unsupervised learning. Biomechanical knowledge, or possibly a tiny manually annotated dataset, might also be exploited. The motivation for this work is to develop a statistical model of the motion of some quadrupeds in order to synthesize plausible animation.

The context of this work is the VFX workflow of the animated movie industry. This work is part of an effort to automate the currently very manual process.

The objective is to design the model and the learning methodology for extracting the 3D coordinates of a moving quadruped in video.

The expected outcomes of the internship are:
- A model with the quantitative evaluation of its performance
- A description of the approach which might lead to a publication or patent
- A demo which will visually display the produced 3D animation

References
- Zhou, Xingyi, Qixing Huang, Xiao Sun, Xiangyang Xue, and Yichen Wei. "Weakly-supervised Transfer for 3D Human Pose Estimation in the Wild." arXiv preprint arXiv:1704.02447 (2017).
- Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." In European Conference on Computer Vision, pp. 483-499. Springer International Publishing, 2016.

Skills  : deep learning

Keywords  : deep learning

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-IML_2019-CV-DM-029] Deep learning for light field video synthesis

This internship tackles the issue of learning light field video synthesis from a subset of the views of a camera array. In the last few years, numerous deep learning architectures have proven efficient for still image synthesis, including for large-baseline setups. The next step consists in shifting to video, i.e. enforcing temporal stability and consistency. The working environment includes an existing dataset of light field video sequences [1]. The intern’s research and development will build on the existing deep learning literature on view synthesis, such as [2]. The network design shall successfully exploit the redundancy occurring in successive input images and explicitly enforce the temporal consistency of the output. To this end, a survey of conventional optical flow estimation approaches might be helpful.

[1] Sabater et al., ‘Dataset and Pipeline for Multi-View Light-Field Video’, CVPR 2017
[2] Flynn et al., ‘DeepStereo: Learning to Predict New Views from the World's Imagery’, CVPR 2016

Skills  : Machine Learning, Computer Vision, Image & video processing, Python

Keywords  : Deep learning, light fields, view synthesis, depth estimation, optical flow estimation

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-HOME_2019-DM-VP-030] Speech transformation using cycle-consistent adversarial networks

In this internship we are addressing various speech transformation tasks targeting modifying a particular attribute in the speech signal. Potential attributes to be modified may include age, identity, gender, accent, mood, etc.

At the beginning we will mostly concentrate on age modification (aging or de-aging), since this task is of great interest for movie production. The proposed approaches will be based on deep learning methods, and more specifically on cycle-consistent adversarial networks. We assume that some training data are provided. For example, if we would like to transform the speech of a 25-year-old man so that it sounds as if he were 60 years old, we suppose that we have a training set of speech signals comprising two subsets: one of 25-year-old speakers and another of 60-year-old speakers. However, to greatly increase the range of applications, we assume that the training dataset is “parallel-data-free”, i.e., the utterances pronounced in the two subsets are not necessarily the same, and thus cannot be aligned at the phoneme level.

The approach will consist of the following steps: (1) extracting speech-specific parameters, (2) transforming them into the target ones using a cycle-consistent adversarial network, and (3) resynthesizing the resulting speech from the transformed parameters. Prior to this processing, the cycle-consistent adversarial network should be pre-trained on the available parallel-data-free training dataset.
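The cycle-consistency idea that makes parallel-data-free training possible can be illustrated with a toy example. In the sketch below, `g` stands for a young-to-old mapping and `f` for an old-to-young mapping; both are hypothetical scalar functions standing in for networks that would operate on speech parameter vectors. Only the cycle term is shown; a complete cycle-consistent adversarial objective also includes adversarial losses, which are omitted here.

```python
def cycle_consistency_loss(g, f, xs, ys):
    """Mean L1 error after a round trip through both mappings.

    xs are samples from the source domain, ys from the target domain.
    The two sets do not need to be paired or aligned.
    """
    forward = sum(abs(f(g(x)) - x) for x in xs) / len(xs)
    backward = sum(abs(g(f(y)) - y) for y in ys) / len(ys)
    return forward + backward


# Toy mappings (assumptions for illustration only).
def g(x):
    return 2.0 * x + 1.0          # source -> target


def f_good(y):
    return (y - 1.0) / 2.0        # exact inverse of g


def f_bad(y):
    return y / 2.0                # ignores the offset, so cycles drift


xs = [0.0, 1.0, 2.0]  # unpaired source samples
ys = [1.0, 3.0, 5.0]  # unpaired target samples

loss_good = cycle_consistency_loss(g, f_good, xs, ys)
loss_bad = cycle_consistency_loss(g, f_bad, xs, ys)
```

The loss is zero only when the two mappings undo each other, which is what constrains the transformation to preserve content while changing the targeted attribute.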

Skills  : machine learning, audio processing, speech processing, Python or C++.

Keywords  : speech transformation, deep learning, machine learning, cycle-consistent adversarial networks

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-IML_2019-CG-CV-DM-032] Joint deep completion of geometry and texture for Mixed Reality applications

We are proposing a 6-month internship in the Mixed Reality team, focusing on 3D scene completion with deep learning. The internship could be continued as a PhD scholarship. Scanning indoor scenes with RGB-D sensors usually results in incomplete scenes, with many geometry and texture details missing. These are not good enough for high-end Augmented and Mixed Reality applications. Classical approaches try to extrapolate scanned information towards missing regions using basic priors on existing patterns. Deep learning, on the other hand, provides a powerful framework to learn patterns from existing 3D scans and 2D images, from local details to global contextual information, which can be exploited to reconstruct missing parts. We aim at developing a multi-purpose tool for scene completion, based on deep learning, combining both colour and geometry information, respecting constraints provided by scanned regions, and taking scene classification and semantics into account. The solution will be integrated into a larger pipeline, and it will be used for Virtual Reality, Mixed Reality and Diminished Reality applications.

Skills  : Computer Vision, Machine Learning, Image/Video Processing, 3D geometry, C++/Python, fluent English, good team spirit and communication skills

Keywords  : Deep learning, 3D modeling, scene completion, inpainting, semantic labelling, Augmented reality

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-HOME_2019-CV-DM-HCI-033] Advanced deep learning methods for audio event detection in domestic environment

The internship addresses the detection of audio events in domestic environments for emerging real-life applications to be implemented within a set-top box.

This task, which has been benchmarked in DCASE challenges (see [1] for DCASE 2018), has attracted a lot of attention in the past few years. With the advances in deep neural networks (DNN) and the release of large-scale audio datasets, numerous approaches have been investigated in the literature, including both supervised and weakly supervised [2] methods. Building on the DCASE 2019 challenge and its benchmarked datasets, the internship aims to build a state-of-the-art DNN model that performs inference accurately. Several settings might be considered: single-channel vs. multichannel inputs, and supervised vs. weakly supervised learning where the annotations are noisy and/or incomplete. The intern will conduct both research and implementation while investigating the use of advanced DNN architectures and data augmentation strategies for the considered tasks. Depending on the actual work and the results obtained, the work may conclude with participation in the DCASE 2019 challenge and the submission of a scientific publication to an international conference or workshop.

[1] IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE), http://dcase.community/challenge2018/.
[2] Romain Serizel et al., “Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments,” Proc. DCASE2018 Workshop, July 2018. https://hal.inria.fr/hal-01850270.

Skills  : Machine learning (deep learning), audio processing, Python.

Keywords  : Machine learning (deep learning), audio signal processing, weakly supervised learning, acoustic event detection.

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-IML_2019-CG-CV-DM-040] Change Detection & 3D Model Update for Mixed Reality applications

Mixed Reality (MR) applications are based on the creation of a new world that merges 3D virtual assets with the real environment of the user. Reconstructing 3D point clouds or meshes to represent the real world is key, and for the apps to be even more realistic and engaging, MR experiences should not be restricted to static environments but should also adapt to temporal changes. This internship will focus on the detection of geometric changes between different observations of the same scene captured at different instants. Semantic segmentation based on deep learning will serve in this process as a powerful tool to identify objects that have been moved, introduced or removed. The oldest 3D model will be updated with the most recent observations.

The intern will be included in the Immersive Lab within the Mixed Reality group at Technicolor Rennes. The proposed solution will be integrated into a larger pipeline and used for MR demonstrations. The internship could be continued as a PhD scholarship. Applicants should be strongly motivated by research.

Skills  : Computer vision, machine learning, 3D geometry, C++/Python, fluent English, good team spirit and communication skills

Keywords  : Deep learning, 3D modelling, semantic labelling, augmented reality

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.

[CLOSED][RI-IML_2019-CG-CV-DM-043] 3D scene relighting for Mixed Reality applications

Mixed Reality (MR) applications are based on the creation of a new world that merges 3D virtual assets with the real environment of the user. These assets are often virtual objects or figures that can move in the real scene. Other MR applications can consist of retexturing or relighting the scene. The latter application is the subject of this internship.

The study will focus on realistic relighting effects in a mixed scene in presence of real lights. The scenarios will include the insertion of virtual lights to produce realistic effects (shadows, shading, specular effects…) as well as the removal of real lighting effects. This requires the estimation of lighting as well as surface reflectance properties in the real scene via image and 3D processing. Moreover, the stability of rendered lighting effects over time will be addressed.

The intern will be included in the Immersive Lab within the Mixed Reality group at Technicolor Rennes. The proposed solution will be integrated into a larger pipeline and used for MR demonstrations. The internship could be continued as a PhD scholarship.

Skills  : computer vision, 3D geometry, C++/Python, machine learning, fluent English, good team spirit and communication skills

Keywords  : lighting and reflectance modelling, 3D modelling, rendering & relighting, mixed reality, deep learning

 

This internship is located in Rennes, France. If interested, please apply at stage.technicolor@technicolor.com  by sending us your resume and a cover letter with the internship reference in the email subject line.