Number of vacancies: 2
Preference for Location: London, UK
About Pangaea Data Limited
Pangaea Data Limited provides a machine-learning-based software product to customers in the biopharmaceutical and healthcare industries for faster identification of patient cohorts based on phenotypes (clinical characteristics and symptoms) from electronic health records (EHRs) and unstructured doctors’ notes. This is critical for detecting patients at risk of disease, finding genes linked to a phenotype in the context of drug or biomarker discovery, recruiting patients for clinical trials, conducting real-world evidence (RWE) studies and matching the right patients with the right drugs. The company's product has demonstrated that it is able to find the right patient cohorts at 50x the speed and with 30% higher accuracy when compared to alternative methodologies such as rule-based natural language processing (NLP), keyword extraction and manual means. The company is headquartered in London, was founded by serial entrepreneurs who have raised more than £130 million through their work, and is advised by leading experts from industry, Imperial College London and Stanford University. Pangaea's investors include a UK-based Deep Tech fund and renowned angels who founded several UK- and US-headquartered unicorns. Pangaea has access to 90 million patient electronic health records through its partnerships with hospitals and other such providers in the US, UK, Europe, South America and Asia Pacific.
As a Data Scientist in Machine Learning and Natural Language Processing, you will be part of the research team. We need our researchers to develop cutting-edge technology that can be applied in our products. You’ll also get a chance to work closely with engineers to productise your research. Strong research skills and knowledge of Machine Learning, especially Natural Language Processing, are essential.
What You'll Do
Communicate with other researchers, the engineers and the product team on requirements.
Research and develop cutting-edge technology in Text Mining and Natural Language Processing (NLP).
Apply NLP technology to medical applications.
A university qualification (Bachelor's, Master's or Doctorate), with at least two years of university study completed in Computer Science, Medical Informatics or a related field.
Experience (classroom/work) in Machine Learning, Natural Language Processing, Algorithmic Foundations of Optimization, Data Science, Data Mining and/or Bioinformatics.
Experience with general-purpose programming languages: Python, C++, Java, R.
Experience (classroom/work) with popular ML frameworks: TensorFlow or PyTorch.
Nice to Have
Experience with research communities and/or efforts, including having published papers (being listed as an author) at AI/ML/NLP/CV conferences (e.g. NeurIPS, ICML, ICLR, ACL, CVPR, KDD).
Relevant work experience, including internships, full-time industry experience or work as a researcher in a lab.
Perks and Benefits
Flexible hours and remote working are considered
Allocations for continuous learning and development
Base salary coupled with commissions based on sales revenue, plus stock options
Opportunity to grow with the company into a senior executive role
Please email your CV to firstname.lastname@example.org outlining your relevant experience.
The interview process will include three rounds. The first round will be a coding (software programming) interview, which aims to evaluate your coding and algorithm design skills. You will be asked to solve 2-3 algorithm problems online, independently and within a limited time. The second and third rounds will be interviews with our senior management, which will focus on your technical and personal skills and during which you will be expected to answer questions about your CV.
Pangaea Data’s headquarters is in London (UK) with teams in San Francisco (US) and Hong Kong. For more information please visit www.pangaeadata.ai.
Pangaea Data is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, colour, sex, sexual orientation, gender identity or expression, religion, national origin or ancestry, age, disability, marital status, pregnancy, protected veteran status, protected genetic information, political affiliation, or any other characteristics protected by local laws, regulations, or ordinances.