C.V.
Didier Guillevic
- Ph.D. in Computer Science (Machine Learning): Concordia University (Canada)
- M.Sc. in Electrical Engineering: ESIEE (France)
- Tripartite: ESIEE (France), Karlsruhe Institute of Technology (Germany), Essex U. (England)
- Machine Learning, Deep Learning, Software 2.0, Natural Language Processing, (Generative) “Artificial Intelligence”, Graph Learning, Privacy Preserving Machine Learning
Contact
- Montréal, Québec, Canada, didier@guillevic.net, didier.guillevic.net
Expertise
- Machine Learning, Deep Learning, Software 2.0
- Generative “Artificial Intelligence” (Large Language Models, …)
- Natural Language Processing: traditional (pre-2010) as well as deep learning
- Information Retrieval: lexical, semantic
- Languages: Python, C++, ANSI C, R, Unix shell
- Other languages: Scala, Java, Matlab, Octave, JavaScript
- Databases: MongoDB (NoSQL), PostgreSQL
- Cloud computing: AWS, Oracle Grid Engine, multi-threaded and distributed computing environments
- Tools: Linux, Unix, git, scikit-learn, spaCy, GNU Makefile, Anaconda
- Deep Learning: PyTorch, HuggingFace
- Graph: NetworkX, Deep Graph Library, PyTorch Geometric, graph-tool, Neo4j
- Web app: Streamlit, Gradio, Dash, R Shiny, REST API, Flask
- Data visualization: Bokeh, Dash Cytoscape, UMAP
- Big Data (to a lesser extent): Hadoop, MapReduce, Apache Spark, Spark MLlib, TensorFlow
Professional Experience
Government of Canada
- Scientist
- Canada (Montréal), 2017-present
Idilia Inc.
- Machine Learning Expert: Natural Language Processing - Machine learning
- Canada (Montréal), 2002-2017
- Reporting to the Head of Research and Development (R&D)
- Commercial product in natural language processing for the semantic analysis of texts.
- Database with billions of sense-annotated queries, tweets and documents.
- Adsquirl, a product designed for marketing agencies to drive advertising campaigns:
- precise semantic meaning of keywords
- negative keyword generation using Idilia’s database of sense-annotated queries.
- long-tail query generation by matching the precise meaning of seed keywords with Idilia’s database of sense-annotated queries.
- Product for the extraction and filtering of information coming from social media.
- Working with large data sets: billions of tweets, queries and documents.
- Knowledge graph (Idilia Language Graph), with millions of word senses and several hundred million links.
- Designed a document classification module to extract named entities from Wikipedia and insert them into Idilia’s taxonomy. Linear classifier trained on millions of documents, with several hundred thousand features and several hundred labels. Database: MongoDB.
- Designed modules for named entity recognition (Hidden Markov Models and rule-based), capitalization (HMMs, N-grams, data sources), word sense disambiguation, lexical analysis, classification and confidence scoring.
- REST API for some of the modules.
- Software designed to be deployed on thousands of servers and to serve thousands of queries per second. It had to be fast, scalable, distributed, multi-threaded and very reliable in its predictions with respect to its confidence scores.
- Tools: Linux, C++, boost C++ libraries, Python, Jupyter (IPython), MongoDB, GNU Make, Git, Oracle Grid Engine, JSON, Flask-RESTful
- Machine Learning: libsvm, liblinear, scikit-learn, graph-tool, spaCy, gensim, Neural Networks, Hidden Markov Models, Random Decision Forests.
Locus Dialogue (acquired by Nuance)
- Software Engineer - Machine Learning: Speech Recognition and Machine Learning
- Canada (Montréal), 2000-2002
- Reporting to the Research and Development (R&D) Manager
- Team responsible for the post-processing of the Automatic Speech Recognition (ASR) output.
- Hidden Markov Models, Gaussian models, generalized linear models, neural networks.
- Software deployed at close to 1,000 installations worldwide, serving approximately half a billion calls annually. Reliability of predictions was extremely important.
- Design and implementation of a new version of the post-processing module.
- Responsible for implementation, machine learning tools, evaluation, testing and improvements.
- Significantly enhanced the robustness and performance of the post-processing module shipped in the Locus Dialogue speech products.
- Tools: Windows, Cygwin, C++, ClearCase
- Machine Learning: Neural Networks, Hidden Markov Models, Generalized Linear Models
NEC Central Research Labs
- Member of the Research Staff: Machine learning, Pattern recognition
- Japan (Kawasaki), 1997-2000
- Reporting to the Research and Development (R&D) Manager
- Pattern Recognition Laboratory.
- R&D activities for the NEC postal sorting machines shipped worldwide.
- Design and implementation of software deployed in Finland’s postal sorting facilities to handle hundreds of millions of mail pieces per year. Recognition results had to have an extremely low error rate.
- Responsible for the full development cycle of the handwritten digit and word recognition modules.
- U.S. patent for a word lexicon reduction system: US Patent 6834121.
- Tools: Linux, EWS-UX, HP-UX, C++
- Machine Learning: Neural Networks, Hidden Markov Models.
Xerox Research Center Webster
- Software development for Xerox optical character recognition systems.
- Tools: Unix, ANSI C
ACSIEL (formerly SYCEP)
- Manager of the German Office: Export
- Germany (Munich), 1987-1989
- Reporting to the Company’s President in Paris, France
- Acted as a liaison between the French manufacturers of electronic components and their German customers and distributors.
- Assisted the French companies in establishing their distribution channels in Germany.
- Assisted the French companies in their dealings with the German Institute for Standardization (DIN).
(Formal) Education
- Ph.D. Computer Science
- Concordia University, Montréal, Canada
- Machine Learning, Artificial Neural Networks, Hidden Markov Models
- M.Sc. Electrical Engineering
- ESIEE (France)
- Selected to join the Tripartite Programme
- Karlsruhe Institute of Technology (Germany)
- Essex University (England)
- Mathematics, telecommunications, robotics, pattern recognition, statistical learning
(Continuing) Education
Graph Learning
- CS224W: Machine Learning with Graphs (Stanford): PyG, NetworkX, Graph Neural Network
- Mining Massive Datasets (Stanford): Locality Sensitive Hashing (LSH)
Generative “Artificial Intelligence”
Quantum Computing
- Understanding Quantum Computers (Keio University)
- 2021 Qiskit Global Summer School on Quantum Machine Learning
Privacy Preserving Machine Learning
- Secure and Private AI (Udacity)
- Local and global differential privacy, federated learning, encryption via secure multi-party computation, homomorphic encryption
Deep Learning
- Deep Learning with PyTorch (Udacity): PyTorch
- Deep Learning (Collège de France)
- Deep Learning (Udacity/Google): TensorFlow
- CS224n: Deep Learning for Natural Language Processing (Stanford)
- CS231n: Convolutional Neural Networks for Visual Recognition
Machine Learning - Natural Language Processing
- Machine Learning (Stanford)
- Statistical Learning (Stanford)
- Scalable Machine Learning (UC Berkeley): Apache Spark
- Natural Language Processing (Stanford)
- Natural Language Processing (Columbia)
Data Science
- Introduction to Data Science (U. of Washington): Apache Hadoop, NewSQL
- Data Manipulation at Scale: Systems and Algorithms (U. of Washington): Exact Test, …
- Data Science and Analytics in Context (Columbia)
Computer Science - Statistics - Data Structures - Algorithms
- Calculus: Single Variable (Penn): THE course… A gem…
- Algorithms, Part I (Princeton): UnionFind, stack, queue, sort, BSTs, …
- Algorithms, Part II (Princeton): Graphs, …
- Networked Life (Penn)
- Model Thinking (Michigan)
- Functional Programming Principles in Scala (Polytechnique Lausanne)
Learning / Education
- Teaching Character (Relay GSE): Grit
- Learning How to Learn (UC San Diego): Focused versus Diffuse Mode of Thinking
- How to Learn Math (Stanford): Fixed versus Growth Mindset
Languages
- French (mother tongue)
- English (bilingual)
- German (fluent, 3 years in Germany)
- Japanese (intermediate, 3 years in Japan)
Personal Experiences
- Work and study in France, Germany, England, Canada, USA and Japan.
- Sports: running, cycling, dragon boat.