Consultancy for Development of a Meta-dataset of Wheat and Maize Fertilizer

International Maize and Wheat Improvement Center (CIMMYT)
Consultancy
Posted 1 hour ago
Job Description

1        Background

Under the CGIAR Sustainable Farming Science Program, this activity aims to compile, extract, validate, and merge published wheat and maize fertilizer evidence from Nepal and India into a harmonized, CAROB-aligned agronomy dataset as part of farm-level diagnostic evidence generation.

Fertilizer trial and demonstration papers often include complex treatment structures, including recommended fertilizer doses, farmer practice, site-specific nutrient management, nutrient omission plots, and yield-gap comparisons. In addition, relevant evidence may cover binding constraints related to agronomy, soil, and micronutrients, with boron and zinc being particularly critical.

 Existing evidence is scattered across peer-reviewed papers, project reports, theses, institutional datasets, and local knowledge sources. Standardizing this evidence is necessary to support diagnostic protocols, spatial analyses, farm-level constraint characterization, and context-specific intervention targeting.

2        Overall objective

To develop a reproducible, quality-controlled, CAROB-aligned meta-dataset of wheat and maize fertilizer treatment trials from Nepal and India, suitable for subsequent meta-analysis, nutrient-use-efficiency analysis, yield response estimation, and decision-support applications, while also generating standardized diagnostic evidence products for identifying farm-level binding constraints.

3        Program alignment

This activity contributes to the CGIAR Sustainable Farming Science Program under AoW03: Farm-level diagnostic evidence on soil, plant, water, and socio-economic constraints. It directly supports the generation, standardization, and dissemination of diagnostic frameworks, spatial analyses, and evidence products that characterize farm-level binding constraints and guide context-specific intervention targeting. The assignment will contribute to standardized diagnostic protocols, one per country, and diagnostic databases that can feed into interactive dashboards.

4        Activity 1 - Online search

4.1      Literature search and corpus compilation

  1. Develop a search protocol for retrieving peer-reviewed articles, reports, theses, project documents, and datasets on wheat and maize fertilizer trials in Nepal and India.
  2. Search relevant scientific and grey-literature sources using structured keyword combinations covering:

    • wheat, maize; fertilizer trials; nitrogen, phosphorus, potassium, sulfur, zinc, micronutrients; nutrient omission plots; site-specific nutrient management;
    • recommended dose of fertilizer; farmer fertilizer practice; Nutrient Expert;
    • India and Nepal; CSISA wheat datasets; soil diagnostic reports; weed, pest, and disease trial evidence; crop nutrition and human nutrition linkages.

  3. Screen studies using clear inclusion and exclusion criteria.
  4. Compile a literature inventory with bibliographic details, study location, year, crop season, treatment types, available variables, and data-extraction status.
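
The structured keyword combinations described above could be generated programmatically so that every search string is logged in the protocol. The following is a minimal sketch, assuming a generic AND/OR boolean syntax; the grouping of keywords and the names `or_group` and `build_query` are illustrative, and each database may need its own syntax.

```python
from itertools import product

# Keyword groups taken from the list above; the grouping itself is an assumption.
CROPS = ["wheat", "maize"]
TOPICS = [
    "fertilizer trial",
    "nutrient omission plot",
    "site-specific nutrient management",
]
COUNTRIES = ["Nepal", "India"]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine one OR group per concept with AND."""
    return " AND ".join(or_group(group) for group in groups)


# One broad query covering every combination at once.
broad_query = build_query(CROPS, TOPICS, COUNTRIES)

# One narrow query per crop/topic/country combination, useful for logging
# the hit count of each combination in the search protocol.
narrow_queries = [
    build_query([crop], [topic], [country])
    for crop, topic, country in product(CROPS, TOPICS, COUNTRIES)
]

print(broad_query)
print(len(narrow_queries), "narrow queries")
```

Keeping the keyword groups in one place means the search protocol and the executed searches cannot drift apart.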

4.2      Review papers and develop CAROB-aligned extraction rules

  1. Review selected papers, tables, figures, appendices, supplementary files, and reports.
  2. Define a CAROB-aligned extraction schema for trial-level, site-level, treatment-level, and observation-level data.
  3. Develop search and extraction rules to capture:
    • study ID and source;
    • country, state/province, district, site, coordinates where available;
    • year, season, irrigation status, cropping system;
    • wheat or maize variety or cultivar;
    • experimental design, replication, plot size;
    • treatment name and treatment code;
    • N, P, K, S, Zn, B and other nutrient rates;
    • fertilizer source, timing, and method of application;
    • treatment comparisons, including control, N omission, P omission, K omission, recommended dose, farmer practice, SSNM, and enhanced-efficiency fertilizers;
    • grain yield, biomass, harvest index, nutrient uptake, profitability, and NUE indicators.
  4. Use Python and AI-assisted text extraction to identify candidate data from PDFs and tables, while retaining transparent scripts and logs.
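
As an illustration of a rule-based extraction step that keeps its source reference, the sketch below parses a colon-separated nutrient rate from a treatment label. It assumes the three numbers are N:P2O5:K2O in kg/ha, which must be confirmed for each paper; `TreatmentRecord` and its field names are hypothetical placeholders, not CAROB variable names.

```python
import re
from dataclasses import dataclass, asdict
from typing import Optional

# Matches three colon-separated numbers such as "120:60:40".
NPK_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*:\s*(\d+(?:\.\d+)?)\s*:\s*(\d+(?:\.\d+)?)"
)


@dataclass
class TreatmentRecord:
    """Hypothetical treatment-level record; field names are placeholders."""
    study_id: str
    treatment_name: str
    n_rate: Optional[float]     # kg N per ha
    p2o5_rate: Optional[float]  # kg P2O5 per ha, as reported (assumed order)
    k2o_rate: Optional[float]   # kg K2O per ha, as reported (assumed order)
    source_locator: str         # page, table, or figure the text came from
    needs_review: bool          # True when no rate could be parsed


def parse_treatment(study_id, text, source_locator):
    """Parse one treatment label; unparsed labels are flagged, not dropped."""
    name = text.strip()
    match = NPK_PATTERN.search(text)
    if match is None:
        return TreatmentRecord(study_id, name, None, None, None, source_locator, True)
    n, p, k = (float(group) for group in match.groups())
    return TreatmentRecord(study_id, name, n, p, k, source_locator, False)


rdf = parse_treatment("S001", "RDF (120:60:40 kg N:P2O5:K2O/ha)", "Table 2, p. 5")
fp = parse_treatment("S001", "Farmer practice", "Table 2, p. 5")
print(asdict(rdf))
print(asdict(fp))
```

Labels that do not match, such as "Farmer practice", are kept with `needs_review` set so that they reach manual checking instead of disappearing from the dataset.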

4.3      Manual checking and quality assurance

  1. Manually verify extracted values against original papers and appendices.
  2. Check consistency of units, treatment labels, nutrient-rate conversions, yield units, site names, seasons, and replication structure.
  3. Flag ambiguous, missing, or non-extractable information.
  4. Maintain a QA log documenting:
    • corrected values; assumptions; unresolved issues; studies excluded after full review; confidence level of extracted records.
  5. Ensure that all extracted records remain traceable to page, table, figure, or appendix source.
  6. Document whether each source contributes to fertilizer response evidence, soil diagnostics, weed/pest/disease diagnostics, micronutrient constraints, or crop–human nutrition linkages.
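
Some of the unit and nutrient-rate checks listed above can be automated before manual verification. The sketch below derives the oxide-to-element factors from standard atomic masses and flags implausible yields; the plausibility bounds, the record keys, and the function names are assumptions chosen for illustration.

```python
# Standard atomic masses (g/mol) used to derive oxide-to-element factors.
ATOMIC_MASS = {"P": 30.974, "K": 39.098, "O": 15.999}

P2O5_TO_P = 2 * ATOMIC_MASS["P"] / (2 * ATOMIC_MASS["P"] + 5 * ATOMIC_MASS["O"])
K2O_TO_K = 2 * ATOMIC_MASS["K"] / (2 * ATOMIC_MASS["K"] + ATOMIC_MASS["O"])

# Conversion factors to kg/ha; one quintal (q) is 100 kg.
YIELD_FACTORS = {"kg/ha": 1.0, "t/ha": 1000.0, "q/ha": 100.0}

# Assumed plausibility bounds for cereal grain yield (kg/ha); adjust per protocol.
YIELD_BOUNDS_KG_HA = (0.0, 20000.0)


def yield_to_kg_ha(value, unit):
    """Convert a reported yield to kg/ha."""
    return value * YIELD_FACTORS[unit]


def qa_flags(record):
    """Return a list of human-readable QA flags for one extracted record."""
    flags = []
    if not record.get("source_locator"):
        flags.append("missing source locator")
    unit = record.get("yield_unit")
    value = record.get("yield_value")
    if unit not in YIELD_FACTORS:
        flags.append("unknown yield unit")
    elif value is None:
        flags.append("missing yield value")
    else:
        low, high = YIELD_BOUNDS_KG_HA
        if not low <= yield_to_kg_ha(value, unit) <= high:
            flags.append("yield outside plausible range")
    return flags


# A value of 45 labelled t/ha is implausible and may really be 45 q/ha.
suspect = {"yield_value": 45, "yield_unit": "t/ha", "source_locator": "Table 3"}
print(round(P2O5_TO_P, 3), round(K2O_TO_K, 3))
print(qa_flags(suspect))
```

Flags produced this way can be written to the QA log next to the manual correction, so that each changed value keeps a documented reason.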

5        Activity 2 - Data harmonization and merging

  1. Convert study-level extractions into standardized CAROB-style CSV files.
  2. Harmonize variable names, data types, units, treatment codes, and metadata.
  3. Merge all curated study datasets into a single master dataset.
  4. Include source identifiers, study identifiers, and processing flags to allow full traceability.
  5. Produce a reproducible Python workflow, with optional R compatibility, for loading, checking, and merging datasets.
  6. Deliver a clean final dataset, raw extraction files, QA logs, and metadata documentation.
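
The harmonize-and-merge steps above might look like the following sketch. It uses only the Python standard library to stay dependency-free; a pandas-based workflow would be an equally reasonable choice. The header mapping and the harmonized column names are hypothetical and should be replaced by the names defined in the agreed CAROB-aligned data dictionary.

```python
import csv
import io

# Hypothetical mapping from study-specific headers to harmonized names.
COLUMN_MAP = {
    "Treatment": "treatment",
    "N (kg/ha)": "N_fertilizer",
    "Yield (kg/ha)": "yield",
}


def load_study(study_id, handle):
    """Read one study CSV, rename columns, and add traceability fields."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {
            COLUMN_MAP.get(key, key): (value or "").strip()
            for key, value in raw.items()
        }
        # Processing flag: "True" when any extracted field is empty.
        row["flag_missing"] = str(any(value == "" for value in row.values()))
        row["study_id"] = study_id
        rows.append(row)
    return rows


def merge_studies(*studies):
    """Stack study rows and return (ordered column names, merged rows)."""
    merged = [row for rows in studies for row in rows]
    other = sorted({key for row in merged for key in row} - {"study_id"})
    return ["study_id"] + other, merged


# In-memory stand-ins for two study-level CSV files (made-up values).
study_a = io.StringIO("Treatment,N (kg/ha),Yield (kg/ha)\nRDF,120,4500\n")
study_b = io.StringIO("treatment,N_fertilizer,yield\nFarmer practice,80,\n")

columns, merged = merge_studies(
    load_study("S001", study_a), load_study("S002", study_b)
)

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=columns, restval="")
writer.writeheader()
writer.writerows(merged)
print(out.getvalue())
```

Because every row carries `study_id` and a processing flag, any value in the master file can be traced back to its study-level extraction file.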

6        Expected sub-activity reports

Deliverable and Description

  1. Inception note and search protocol: Search strings, databases/sources, screening criteria, target variables, and extraction workflow
  2. Literature inventory: Bibliographic database of wheat and maize fertilizer trial papers and reports from Nepal and India, including CSISA data, soil diagnostic reports, weed, pest and disease reports, theses, and micronutrient evidence
  3. CAROB-aligned extraction template: Data dictionary, variable definitions, treatment coding rules, and unit conventions, including diagnostic constraint categories
  4. AI/Python extraction workflow: Scripts or notebooks for PDF/table extraction, text search, treatment parsing, and structured output
  5. Manually checked study datasets: Study-level CSV files with corrected and verified extracted data
  6. QA and audit log: Documentation of checks, corrections, assumptions, exclusions, and unresolved issues
  7. Merged master dataset: CAROB-aligned wheat and maize fertilizer trial dataset for Nepal and India
  8. Final technical report: Summary of search results, extraction coverage, dataset structure, limitations, and recommendations for meta-analysis

Requirements

1.1      Qualifications and Experience

The consultant or consultancy team should have:

  1. Proficiency in Python/R, including:
    • pandas or equivalent data-handling packages; PDF/table extraction; data cleaning; reproducible workflows; structured exports to CSV or similar formats.
  2. Experience with AI-assisted evidence extraction, including use of LLMs, prompt design, text-mining workflows, semantic search, or human-in-the-loop extraction systems.
  3. Demonstrated experience in meta-analysis or systematic evidence synthesis, preferably involving agronomic trials, crop yield response, nutrient management, or fertilizer experiments.
  4. Knowledge of agronomy, soil science, crop science, agricultural statistics, or a closely related field is preferred.
  5. Strong understanding of fertilizer trial design, including nutrient omission plots, N-rate trials, recommended dose comparisons, farmer practice, site-specific nutrient management, and multi-location field trials.
  6. Familiarity with agronomy data standards, preferably CAROB nomenclature or comparable crop-trial harmonization standards.
  7. Experience with South Asian cereal systems, especially wheat/maize-based systems in Nepal and India.
  8. Ability to document assumptions clearly and produce reproducible, auditable code and datasets.

1.2      Desirable qualifications

Experience with Python/R, Git/GitHub, Zotero/Mendeley, PRISMA-style screening, OCR correction, geocoding of trial locations, nutrient-use-efficiency indicators, and fertilizer response modeling will be an advantage.

1.3      Reporting and supervision

The consultant will report to the CIMMYT technical lead. Regular progress meetings will be held to review the search inventory, extraction rules, sample extractions, QA results, and final merged dataset.

Benefits

The consultancy is for a fixed-term contract of 1.5 months.

This position is recruited locally. Applicants must have the legal right to work in the duty station country at the time of application. CIMMYT does not provide visa sponsorship or work permit support for locally recruited staff roles.

This position is no longer open.