The R&D group
Our goal is to apply cutting-edge NLP techniques to financial text analysis. To do so, we work closely with well-known universities in Paris and the UK. We regularly attend the relevant conferences in the field and participate in academic challenges. Our group is very international; the working language is either English or French.
The candidate will join a group that is mostly focused on semantic text analysis. We use machine learning on top of word embedding representations, together with dependency parsing, POS tagging and syntactic analysis. The goal is to solve problems that sit between slot filling, entity linking and knowledge base creation. The techniques therefore vary depending on the exact information to extract.
We are looking for a highly independent, collaborative and motivated person with previous experience in the field (PhD in NLP or 5+ years of experience in NLP projects). The candidate will take ownership of a part of the project from start to deployment, which means:
- define the subtasks to solve
- get validation/training data for these tasks: collect a large set of relevant text and define an annotation task on Mechanical Turk or with another crowdsourcing company. Define a protocol for cleaning and validating the resulting tags
- define the architecture of the solution: how the solutions to each task are coordinated, in a way that is not too time/resource consuming
- define a way to measure numerically the performance of the complete solution
- deploy the solution so it can be tested
Depending on the profile, the person will be in charge of a group of 1-2 junior data scientists to develop the project.
The candidate should know at least the following:
- NLP: neural word embeddings, named entity recognition (NER), sentence splitting
- Machine learning: LSTM, CNN, kNN, spectral clustering, PCA
- Project management: previous use of project management tools such as Jira, Trello, etc., and of git version control for the code
- Programming: Python (sklearn, Keras)
Nice to know:
- NLP: relation extraction, multi-lingual word embeddings, sentence embeddings, automatic text summarisation, text generation
- Machine learning: auto-encoders, generative models, recommendation systems, reinforcement learning, planning
- Project management: definition of unit testing for ML code
- Programming: TensorFlow, Theano