IFC E T Consultant - Environment

Job Description

IFC — a member of the World Bank Group — is the largest global development institution focused on the private sector in emerging markets. We work in more than 100 countries, using our capital, expertise, and influence to create markets and opportunities in developing countries. In fiscal year 2023, IFC committed a record US$43.7 billion to private companies and financial institutions in developing countries, leveraging the power of the private sector to end extreme poverty and boost shared prosperity as economies grapple with the impacts of global compounding crises. For more information, visit www.ifc.org.

The ESG Sustainability Advice and Solutions Department (CEG) is IFC’s center of excellence in the area of Environment, Social, and Governance (ESG) and offers a range of expertise to help IFC’s Investment and Advisory clients identify and solve complex ESG risk-related challenges, and to find value-added opportunities in their business operations. More details can be found at www.ifc.org/sustainability.

CEG has a long history of innovation centered on integrating ESG considerations into emerging markets. A dedicated Innovation and Data Science function within the Department has been mandated to serve as a learning, technology exploration, experimentation, and knowledge hub to explore, test, and understand emerging technologies, using design thinking and lean-agile principles to enable IFC to fully harness the Digital Age and achieve its mission.

CEG is building an Artificial Intelligence-powered platform known as Machine Learning ESG Analyst (MALENA) to create ESG risk and impact assessment capacity at scale. More details can be found on the MALENA Page.

The purpose of these Terms of Reference is to hire an extended-term consultant (ETC) to support the MALENA team and the CEG Data Science work program in building version 2.0 Machine Learning/Deep Learning models, developing emergent use cases such as Generative AI, and creating real-time inference pipelines for beta testers.

The detailed roles and responsibilities are listed below.

Roles and Responsibilities:

• Report to the CEG Innovation Lead.
• Work independently in coordination with the CEG Lead Data Scientist.
• Work with ESG stakeholders to understand complex business problems and connect those problems with solvable Data Science solutions.
• Audit the different text data assets of the Department and determine how to analyze these data assets for insights.
• Clean and prepare text data to enable natural language understanding.
• Prepare high-quality training data with appropriate coverage of the ESG business domain.
• Advise on and apply different natural language processing (NLP) techniques to analyze unstructured data, including information retrieval, question answering, and text generation within the purview of IFC’s ESG business domain.
• Work independently to build ingestion processes to prepare, extract, and annotate a variety of unstructured data sources (social media, news, internal/external documents, images, video, voice, emails, financial data, and operational data).
• Build data automation and integration solutions to ease business problem-solving and enable data source connections and data quality assurance.
• Leverage a variety of tools and approaches to meet complex business objectives, including statistical NLP, information retrieval/extraction, Machine Learning/Deep Learning, large language modeling, machine translation, and semantic search.
• Advise on automated ways to label unstructured data from various data sources.
• Experiment with multiple machine learning and large language models (LLMs) and choose the optimal model for training or fine-tuning.
• Follow industry trends in the data science and AI domains and execute proofs of concept with advanced techniques.
• Be familiar with working in a lean-agile framework with teams that deliver business value incrementally.

Selection Criteria

• Ph.D. in Computer Science, Data Science, or Data Engineering, with a minimum of seven years of experience working in and leading teams of AI, data, and analytics professionals to deliver business-driven analytics projects using natural language processing and machine learning on unstructured data. Candidates holding a Master’s Degree and at least 10 years of relevant work experience may also be considered.

• Expert knowledge of data preparation, machine learning, deep learning, natural language processing, and the ability to discuss mathematical formulations, alternatives, and impact on modeling approaches.

• In-depth understanding of text analytics and Natural Language Processing concepts such as Lemmatization, Word Segmentation, Part-of-Speech Tagging, Stemming, Named-Entity Recognition, Word2Vec, and Doc2Vec.

• Demonstrated experience fine-tuning pre-trained large language models, such as GPT-3, BERT, Meta's Llama, and Mistral, on a specific task or dataset to improve performance.

• Expert knowledge in creating a new language model from scratch, including designing a new architecture, collecting and cleaning training data, and training the model.

• Expertise in developing applications that use large language models, such as chatbots, recommendation systems, or automated summarization tools.

• Expertise in performing prompt engineering to improve large model outcomes.

• Experience in building multimodal models, i.e., models that integrate and analyze data from multiple sources or modalities, such as text, images, audio, video, and other types of structured and unstructured data.

• Ability to quickly use and implement the latest NLP research and approaches.

• Advanced expertise in Python (PySpark) and specialized machine learning libraries/packages for implementing machine learning models.

• Proficient in Python (PySpark) data analysis and ML libraries like pandas, NumPy, and scikit-learn.

• At least seven years of experience using one or more of the following deep learning frameworks: TensorFlow, Keras, MLflow, PyTorch, etc.

• Extensive work experience with one or more of the standard NLP models and tools, such as Google BERT, RoBERTa, spaCy, NLTK, Stanford CoreNLP, and LangChain.

• Proficient with parallel processing APIs such as Apache Spark and PySpark.

• Proficient in machine translation and Optical Character Recognition (OCR) for complex document processing (PDF, Word documents, scanned documents).

• Extensive work experience in building Jupyter notebooks for conducting data science operations.

• Substantive experience performing data science tasks (data discovery, cleaning, model selection, validation, and deployment).

• Mastery of coding artificial intelligence methods and restructuring, refactoring, and optimizing code for efficiency.

• Expert understanding of development practices such as testing, code design, complexity, and code optimization.

• Experience with cloud platforms (Azure Databricks and AWS).

• Outstanding problem-solving skills and effective verbal/written communication.

• Demonstrated ability to work in multi-disciplinary teams.

• Capacity to work independently.

• Understanding of non-financial risks, including environmental, social, and governance risks, is a plus.

• Strong expertise in MLOps and model serving at scale, and knowledge of innovative ways of deploying ML models, such as plugins, SharePoint add-ons, mobile apps, desktop apps, and APIs.

• Proficient with taxonomy management platforms such as PoolParty or FIBO.

World Bank Group Core Competencies

We are proud to be an equal opportunity and inclusive employer with a dedicated and committed workforce, and do not discriminate based on gender, gender identity, religion, race, ethnicity, sexual orientation, or disability.

Learn more about working at the World Bank and IFC, including our values and inspiring stories.

This position is no longer open.
