Natural Language Processing (NLP) Researcher

Preference for Location: London, UK

About Pangaea Data Limited

Pangaea Data provides a novel AI-driven product that has been clinically proven to characterize patients in a federated, privacy-preserving and scalable manner. For example, Pangaea helped characterize cachexia in cancer patients, which led to the discovery of 6x more undiagnosed, miscoded and at-risk cachectic cancer patients with 95% accuracy, with the potential to save £1 billion annually and improve outcomes. Additionally, US-based healthcare systems have applied Pangaea to measure health inequity across the US by characterizing patients and their journeys based on tumour genomic testing results, demographics and social indicators from patient records. Clinicians at pharmaceutical companies are applying Pangaea to discover new clinically actionable insights, which have helped them find new drug targets, define new end points for clinical trials, understand relationships between drugs and adverse events, and find more patients for clinical trials and during the launch of new therapies. The founders (Dr. Vibhor Gupta and Prof. Yike Guo) are based between South San Francisco and London and have attracted $200 million through their research.

The Role

Pangaea is looking for a talented researcher to join its founding technical team to design and develop novel state-of-the-art NLP/NLG algorithms that can potentially be productized in Pangaea’s core product (Pangaea’s Intelligence Extraction and Summarization – PIES). A strong research background and knowledge and skills in Natural Language Processing/Generation, especially Large Language Models (LLMs), are essential.

Key Responsibilities

Technical Responsibilities:

  • Understand the problems, pain points, challenges and requests in industry and academia in NLP/NLG.
  • Define research statements and write research plans and grant applications (when needed).
  • Research, design, develop and optimise novel cutting-edge NLP/NLG algorithms to address real-world challenges such as extraction, prediction and generation.
  • Monitor the impact of new research to determine whether it achieved the goals set out for it at the start.
  • Publish original research at top conferences and in top journals.
  • Clearly communicate research plans to internal stakeholders and support colleagues on productisation.
  • Understand the high-level company vision and goals, and make sure these are reflected in ongoing research.

As an NLP researcher, you will be involved in leadership, research direction decisions and coordination between teams that go into the above process.

Personal Traits:

  • A strong intuition for what makes products a joy to use.
  • Empathy for how different users will need different things out of a product at different stages, and how to effectively serve these different needs in one product.
  • Strong communication and mediation skills.
  • Strong people skills and the ability to engage all levels of the organization (especially the front line).
  • Ability to work collaboratively in a team environment.
  • Ability to communicate complex ideas effectively, both verbally and in writing, in English.
  • A strong research and software engineering background with machine learning expertise to understand how the user-facing product will tie into research, backend and architectural decisions.

Technical Skills:

  • A university research qualification (Doctorate) or equivalent research experience in industry in Natural Language Processing, Machine Learning and Deep Learning.
  • Experience with research communities and/or efforts, including having published papers (being listed as author) at AI/ML/NLP/CV conferences (e.g. NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, CVPR, KDD, etc.) and/or in biomedical journals.
  • Experience with general-purpose programming languages: Python, C++, Java, etc.
  • Experience with deep learning, machine learning and NLP frameworks such as PyTorch (or TensorFlow), Hugging Face Transformers and scikit-learn.
  • Experience with training and tuning of large language models.
  • Experience working in Linux.

Nice to Have:

  • Experience in productising the latest research outputs in machine learning and deep learning, especially NLP/NLG.
  • Experience with cloud platforms such as AWS, Azure, Google Cloud Platform.
  • Experience in writing research grant applications.

Perks and Benefits

  • Flexible working hours.
  • Salary dependent on experience.
  • Benefits include private medical insurance, life insurance and travel cards.
  • You would join a small, dedicated and fast-growing team.
  • You will have the opportunity to learn about building a startup business from experienced professionals and serial entrepreneurs.
  • We are currently supported by serial entrepreneurs and angel investors. You will have the opportunity to experience an investment life cycle for a startup and meet leading venture capitalists.
  • We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Application Contact Information

Please send your latest resume along with a cover letter to

General Information

Pangaea Data’s headquarters is in London (UK) with teams in San Francisco (US) and Hong Kong. For more information, please visit

Pangaea Data is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, colour, sex, sexual orientation, gender identity or expression, religion, national origin or ancestry, age, disability, marital status, pregnancy, protected veteran status, protected genetic information, political affiliation, or any other characteristics protected by local laws, regulations, or ordinances.