Natural Language Generation (NLG) Engineer

Preference for Location: London or San Francisco
Number of vacancies: 1

About Pangaea Data Limited

Pangaea Data Limited provides a machine-learning-based software product to customers in the biopharmaceutical and healthcare industries for faster identification of patient cohorts based on phenotypes (clinical characteristics and symptoms) from electronic health records (EHRs) and unstructured doctors’ notes. This is critical for detecting patients at risk of diseases, finding genes linked to a phenotype in the context of drug or biomarker discovery, recruiting patients for clinical trials, conducting real world evidence (RWE) studies and matching the right patients with the right drugs. The company’s product has demonstrated that it can find the right patient cohorts at 50x the speed and with 30% higher accuracy than alternative methodologies, such as rule-based natural language processing (NLP), keyword extraction and manual review. All such work and metrics are published in high-impact, peer-reviewed journals accessible through https://www.pangaeadata.ai/insights/.

Pangaea is based in San Francisco, London and Hong Kong. It was founded by serial entrepreneurs who have raised more than £130 million through their work, and it is advised by leading experts from industry, Imperial College London and Stanford University. Pangaea’s investors include leading deep tech and life science funds and serial entrepreneurs who founded several UK- and US-headquartered unicorns. Pangaea has access to more than 500 million patient electronic health records through its partnerships with hospitals and other such providers in the US, UK, Europe, South America and Asia Pacific.

Role Description and Responsibilities

As a Natural Language Processing (NLP) Engineer with a focus on Natural Language Generation (NLG), you will be a core member of the technical team. In this role you will research, develop and productionise cutting-edge NLG technology, and you will also have the opportunity to work on NLP problems and products alongside our engineering team. Strong research experience and knowledge in NLP, especially NLG, are essential for this role.

Key responsibilities

  • Research and develop cutting-edge NLG technology and products.
  • Apply Natural Language Generation technology to medical applications.
  • Communicate with end users and colleagues about requirements.

Mandatory Requirements

Technical Skills:

  • University degree (bachelors, masters, and/or doctorate) with a minimum of two years of university-level Computer Science, Medical Informatics or similar.
  • Experience in machine learning, natural language processing and the algorithmic foundations of optimization.
  • Experience in Natural Language Generation, including but not limited to summarization, machine translation, chatbots and sequence-to-sequence models.
  • Experience with general-purpose programming languages: Python, C++, Java.
  • Experience with popular ML frameworks: TensorFlow, PyTorch, Scikit-learn, HuggingFace Transformers, etc.

Personal traits:

  • Ability to effectively communicate complex ideas, both verbally and in writing, in English.
  • Strong intuition for what makes products a joy to use.
  • Ability to effectively communicate with end-users and understand their varying requirements.
  • Strong interpersonal skills and the ability to engage people at all levels of both our organization and the end user’s organization (especially the front line).
  • Ability to work collaboratively in a team environment and collaborate efficiently with colleagues in different time zones.

Nice to Have:

  • Experience with research communities and/or efforts, including having published papers (as a listed author) at AI/ML/NLP/CV conferences (e.g. NeurIPS, ICML, ICLR, ACL, CVPR, KDD, etc.) or in biomedical journals.

Perks and Benefits

  • Flexible working schedule and the ability to work in London and San Francisco.
  • Highly competitive salary depending on experience.
  • Package of attractive benefits, including stock options and discretionary bonus.
  • You will join a dedicated, highly renowned team that offers you the opportunity to grow and develop your professional skills and profile.
  • You will have the opportunity to learn about building a rapidly growing business from experienced professionals and serial entrepreneurs.


The interview process will include at least four rounds. The first round will be an online coding (software programming) interview, which aims to evaluate your coding and algorithm design skills. In this online assessment you will be asked to independently solve 2-3 algorithm problems. The second to fourth rounds will be interviews with our senior management. These rounds will focus on your technical and personal skills, and you will be expected to answer questions about your CV.

Application Contact Information

This position will be highly visible to senior management within our company and at our customer organisations.

Your application should be in English and include your CV along with a cover letter highlighting your experience with biopharmaceutical companies and healthcare organisations.

Please send your latest resume along with a cover letter to careers@pangaeadata.ai.

General Information

Pangaea Data is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, colour, sex, sexual orientation, gender identity or expression, religion, national origin or ancestry, age, disability, marital status, pregnancy, protected veteran status, protected genetic information, political affiliation, or any other characteristics protected by local laws, regulations, or ordinances.