[CLOSED] [HOME_DM_006] Reversing automatic solution recommendations in online discussions 

Technicolor manufactures and distributes a large number of set-top boxes, broadband modems and gateways. End users discuss these devices in online forums to troubleshoot issues that they experience in their homes. Often, other users provide the solutions to these issues. However, some forums automatically recommend problem-fixing solutions based on the question asked or on the user's profile [1]. Such automatic solution finding relies on machine learning or recommendation algorithms. In the spirit of recent work that tries to uncover how an algorithm or machine learning model works from the mere observation of its recommendations [2,3,4], this internship tries to understand how the answers are selected.

This internship deals with understanding the algorithms that personalize end users' web experiences. The objective is to understand, reverse-engineer and possibly mimic the solution-finding algorithm simply by observing the questions and the automatic answers provided in a forum.

The internship consists of proposing and evaluating several algorithms and machine learning approaches to identify the underlying recommendation algorithm. To validate and generalize the approach, we will test and extend the proposed methods on other, non-forum data.
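As an illustration of what "mimicking by observation" can look like, here is a minimal sketch, not the method expected from the internship. It assumes the forum's recommender can be queried as a black box: the `black_box` function, the answer identifiers and the example questions are all hypothetical stand-ins. A nearest-neighbour surrogate is fitted on observed (question, answer) pairs and its agreement with the black box is measured on held-out questions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn a question into a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def black_box(question):
    """Hypothetical stand-in for the forum's unknown recommender."""
    q = question.lower()
    if "wifi" in q:
        return "FIX_RESET_WIFI"
    if "reboot" in q:
        return "FIX_FIRMWARE_UPDATE"
    return "FIX_CONTACT_SUPPORT"


class NearestNeighbourSurrogate:
    """Predicts the answer attached to the most similar observed question."""

    def __init__(self):
        self.examples = []

    def fit(self, questions, answers):
        self.examples = [(bag_of_words(q), a) for q, a in zip(questions, answers)]
        return self

    def predict(self, question):
        vec = bag_of_words(question)
        return max(self.examples, key=lambda ex: cosine(vec, ex[0]))[1]


# Observations: questions seen on the forum and the answers it returned.
observed = [
    "my wifi keeps dropping",
    "modem will reboot every night",
    "remote control is broken",
    "wifi is slow upstairs",
    "gateway does a reboot randomly",
]
surrogate = NearestNeighbourSurrogate().fit(
    observed, [black_box(q) for q in observed]
)

# Agreement between surrogate and black box on questions not seen before.
held_out = ["wifi signal is weak", "box does reboot at night", "remote control lost"]
agreement = sum(surrogate.predict(q) == black_box(q) for q in held_out) / len(held_out)
print(f"agreement with the black box: {agreement:.2f}")
```

The agreement rate on held-out questions is one possible way to quantify how well a candidate model mimics the observed recommender.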

Skills : machine learning, algorithm design, programming (python)

 

This internship is located in Rennes, France. If interested, please apply at stage.rennes@technicolor.com by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN] [HOME_DM_007] Product reputation modeling, from noisy www observations 

In 2015, Technicolor shipped a total of 31.8 million products and is number one worldwide in solutions for broadband modems and gateways. Technicolor has an obvious interest in understanding end users' satisfaction, problems and complaints.

The goal of this internship is to build a reputation model [1] for newly launched products. It covers both system and modeling aspects.

To build the reputation, we need to track the evolution of end users' online discussions (e.g. the number of posts on web forums) and the sentiment (positive or negative) of the general comments about a given product. Other sources of information are the publicly available search engine rankings for a product, the evolution and criticality of top-ranked complaints, and forum posts. Collecting such information requires a large-scale web crawler such as Apache Nutch (already up and running), an indexing engine such as Solr, and text analysis tools.

Algorithmically, the modeling can be inspired by epidemic spreading models [2]. As the search engine results for a given product already encode a form of importance (through the ranking decision), those results may be leveraged by the model as a dynamic, pre-computed observation. Natural Language Processing is also an important building block. For example, opinion mining and topic detection can be used to recognize an outbreak of complaints caused, for instance, by a particular software update of the product.
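To make the epidemic analogy concrete, here is a minimal sketch of a discrete-time SIR-style model applied to complaints. It is only an illustration under stated assumptions, not the model of [2] nor the one to be built: the compartments (users who have not complained, users actively complaining, users whose issue is resolved), the function name and all parameter values are hypothetical.

```python
def simulate_complaint_outbreak(users, initial_complainers, contact_rate,
                                resolve_rate, steps):
    """Discrete-time SIR-style simulation.

    s: users who have not complained yet (susceptible)
    i: users actively complaining (infected)
    r: users whose issue has been resolved (recovered)
    """
    s = float(users - initial_complainers)
    i = float(initial_complainers)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        # New complaints grow with the contacts between s and i.
        new_complainers = contact_rate * s * i / users
        # A fixed fraction of active complaints gets resolved at each step.
        newly_resolved = resolve_rate * i
        s -= new_complainers
        i += new_complainers - newly_resolved
        r += newly_resolved
        history.append((s, i, r))
    return history


# Hypothetical parameters, e.g. an outbreak following a software update.
history = simulate_complaint_outbreak(
    users=1000, initial_complainers=5, contact_rate=0.5, resolve_rate=0.1, steps=60
)
active = [i for _, i, _ in history]
peak_step = active.index(max(active))
print(f"active complaints peak at step {peak_step}")
```

In such a setting, the observed volume of forum posts or negative comments would play the role of the active-complaint curve, and the rates would be fitted to the data.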

The resulting reputation model can be validated against sales volumes and an internal database of failed and returned products.

 

Skills : modeling, data mining, Python, curiosity!

Keywords : Modeling, Text mining, Large-scale web mining

 

This internship is located in Rennes, France. If interested, please apply at stage.rennes@technicolor.com by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN] [HOME_DM_021] Family member identification through IoT sensors for the home 

Identifying family members within a smart home is a key step towards personalizing each room (music, temperature…) to the preferences of the individuals currently in it. Identification in a home is best performed with sensors that are fixed (as wearables are often not worn at home) and non-invasive (no camera).

Typical recognition systems require labelled data to be trained, that is, data where the system knows who is in the room. Continuously providing labels to the system can be a heavy burden for customers. An alternative is to use unsupervised learning algorithms that are able to differentiate individuals from unlabelled data. Once the algorithm can discriminate between individuals, only one label per individual needs to be requested. This internship will investigate such an approach.
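The "cluster first, then ask for one label per individual" idea can be sketched as follows. This is a minimal illustration assuming k-means as the unsupervised algorithm (the internship may well use something else); the two-dimensional features, their values and the names are hypothetical.

```python
import math


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iterations):
        assignment = assign(points, centroids)
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assign(points, centroids)


# Hypothetical unlabelled footstep features: (step interval in s, step energy).
samples = [
    (0.50, 0.20), (0.52, 0.22), (0.48, 0.19),
    (0.80, 0.70), (0.82, 0.68), (0.79, 0.73),
]
centroids, assignment = kmeans(samples, k=2)

# Only one label per cluster is requested from the household.
names = {assignment[0]: "Alice", assignment[3]: "Bob"}


def identify(sample):
    """Name attached to the cluster whose centroid is nearest to the sample."""
    return names[assign([sample], centroids)[0]]


print(identify((0.51, 0.21)), identify((0.81, 0.70)))
```

Here the customer labels two samples in total, instead of labelling every recording as a supervised approach would require.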

The intern will:

- Review existing unsupervised approaches for identification from sensors;

- Using available Technicolor data sets, evaluate these algorithms and compare them to supervised approaches on individual sensors: microphones, geophones, accelerometers;

- Design and evaluate an approach based on different types of sensors used jointly.

Skills : Machine learning background, development skills (python), English

Keywords : biometric, smart home, identification, unsupervised learning, sensor

 

This internship is located in Rennes, France. If interested, please apply at stage.rennes@technicolor.com by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN] [HOME_DM_022] Bandit algorithms for smart home WiFi 

Our connected lives and the advent of the internet of things will lead to a rise in the number of connected devices present in a home and to increasingly varied usage patterns. In parallel, several connection channels (cable, optical fibre, 4G…) are typically available. A natural question is how to optimally allocate these devices and/or applications across these channels based on cost, bandwidth, latency, energy… The problem is further complicated when the channel properties and the device usage are considered as random variables.

Reinforcement learning (RL) is a machine learning approach in which an agent selects actions, learns by interacting with a system and gradually optimizes the behaviour of this system under uncertainty. It has shown potential for networking applications such as dynamic channel selection. This internship will investigate the potential of RL for home networks of the future.
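As a simple instance of the bandit setting named in the title, here is a minimal sketch of the UCB1 algorithm applied to channel selection. It is an illustration only: each channel is assumed to deliver a transmission successfully with a fixed, unknown probability, and the probabilities, function name and number of rounds are hypothetical.

```python
import math
import random


def ucb1(success_probabilities, rounds, seed=0):
    """Run UCB1 on Bernoulli channels and return how often each was chosen."""
    rng = random.Random(seed)
    k = len(success_probabilities)
    pulls = [0] * k
    total_reward = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            # Try every channel once before using the confidence bounds.
            arm = t - 1
        else:
            # Pick the channel with the highest upper confidence bound.
            arm = max(
                range(k),
                key=lambda a: total_reward[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        # Simulated feedback: 1 if the transmission succeeded, else 0.
        reward = 1.0 if rng.random() < success_probabilities[arm] else 0.0
        pulls[arm] += 1
        total_reward[arm] += reward
    return pulls


# Hypothetical success probabilities for three channels.
pulls = ucb1([0.3, 0.5, 0.8], rounds=5000)
print("times each channel was selected:", pulls)
```

The agent never sees the true probabilities; it balances exploring rarely used channels against exploiting the one that has performed best so far.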

More specifically, the objectives are:

- to create a controller within a home network simulator, potentially based on public or in-house data;

- to identify and evaluate relevant RL algorithms;

- to develop new approaches tailored to the problem.

Skills : Mathematical, analytical and modelling capabilities (a machine learning background is a plus), development skills (python), English

Keywords : reinforcement learning, time series, control

 

This internship is located in Rennes, France. If interested, please apply at stage.rennes@technicolor.com by sending us your resume and a cover letter with the internship reference in the email subject line.

[OPEN] [HOME_DM_023] Home presence detection based on wifi data analysis 

Technicolor has been involved for several years in data mining projects based on data collected from basic sensors, but also from more atypical ones that raise interesting research questions. Technicolor has a strong presence within the home through its leading position in the gateway market. Wifi extenders are also about to be deployed to enhance wifi coverage within the home, where many devices connect and disconnect every day.

It would therefore be valuable to infer absence/presence, or even better the number of people, by monitoring wifi signal variation over time. This would avoid additional sensors such as motion detectors, which increase installation effort and cost and thus dramatically decrease user acceptance.

The expectations of the internship are as follows:

• To provide a state of the art

• To extract data (such as RSSI, MAC addresses, labels, etc.) from an existing large dataset

• To analyse these data with time series analysis tools to infer absence/presence, or even the number of people, by correlating them with the daily connections/disconnections of devices per weekday/working day. Ideally, the analysis could also infer who is at home from their personal devices

• To provide a presence/absence inference algorithm

• To validate it with an experiment that collects new data within the home environment

The challenge lies in finding a relevant algorithm to detect absence/presence, or even better the level of presence, from the number of connected devices.
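A very simple baseline for this inference can be sketched as follows, assuming a log of connection/disconnection events and a known list of personal devices. The event format, the MAC addresses and the rule "someone is home if at least one personal device is connected" are hypothetical simplifications; RSSI variation is deliberately left out.

```python
def presence_timeline(events, personal_devices, horizon):
    """Number of connected personal devices at each minute in [0, horizon).

    events: iterable of (minute, mac_address, "connect" or "disconnect").
    """
    pending = sorted(events)
    connected = set()
    timeline = []
    index = 0
    for minute in range(horizon):
        # Apply every event that happened up to the current minute.
        while index < len(pending) and pending[index][0] <= minute:
            _, mac, kind = pending[index]
            if mac in personal_devices:  # ignore TVs, printers, etc.
                if kind == "connect":
                    connected.add(mac)
                else:
                    connected.discard(mac)
            index += 1
        timeline.append(len(connected))
    return timeline


# Hypothetical MAC addresses: two phones and one television.
phones = {"aa:00:00:00:00:01", "bb:00:00:00:00:02"}
events = [
    (0, "aa:00:00:00:00:01", "connect"),
    (2, "bb:00:00:00:00:02", "connect"),
    (3, "cc:00:00:00:00:03", "connect"),  # television, not a personal device
    (5, "aa:00:00:00:00:01", "disconnect"),
    (7, "bb:00:00:00:00:02", "disconnect"),
]
timeline = presence_timeline(events, phones, horizon=10)
someone_home = [count > 0 for count in timeline]
print(timeline)
print(someone_home)
```

The device count gives a rough level of presence; a real solution would need to handle phones left at home, sleeping devices and RSSI fluctuations.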

Skills : Data mining - Good level of English enabling easy reading of scientific literature and drafting of documentation - Pandas, Python 2/3, supervised/unsupervised machine learning, Linux OS, Raspberry Pi

Keywords : RSSI, machine learning, time series, data mining

 

This internship is located in Rennes, France. If interested, please apply at stage.rennes@technicolor.com by sending us your resume and a cover letter with the internship reference in the email subject line.