Georgina Smith/CIAT

Semantics and words matter for analyzing gender and social inclusivity in development projects


The Agrobiodiversity Index team and the Enhancing Sustainability Across Agricultural Systems team of the Water, Land and Ecosystems program (WLE FG5) have been puzzled by how to capture efforts that are hard to quantify or "invisible". For example, how do you measure commitment to something? We set out to test whether text mining and artificial intelligence could reveal the power of words. What we found was a powerful tool for holding our discipline to account and helping it progress more equitably.

Text mining is a relatively simple technique. It consists of automatically extracting any word, paragraph, or sentence from digital texts in several formats (PDFs, scanned documents, or Word documents). The challenge is defining "what to extract" and "how to analyze" it.

Identifying "what to extract" requires a lot of back and forth among researchers, practitioners, and policymakers. We often talk about the same things but use different words, which impedes understanding and collaboration.

Regarding "how to analyze", there is room for experimentation and trial and error. Our team tried different methods but found that simple scoring can convey a clear message while identifying critical gaps in commitments towards agrobiodiversity usage and protection.
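To make "simple scoring" concrete, here is a minimal sketch in Python. It assumes one possible scoring rule, the share of a keyword list that a document mentions at least once. The keyword list, the function name `keyword_score`, and the plural-matching rule are illustrative assumptions, not the team's actual method.

```python
import re

# Hypothetical keyword list; in practice this comes out of the back and
# forth among researchers, practitioners, and policymakers.
KEYWORDS = ["agrobiodiversity", "crop diversity", "landrace", "seed bank"]


def keyword_score(text, keywords=KEYWORDS):
    """Return (score, found): the share of keywords present in the text
    and the list of keywords that were found."""
    if not keywords:
        return 0.0, []
    lowered = text.lower()
    found = []
    for kw in keywords:
        # Match whole words only, allowing an optional plural "s" (an assumption).
        pattern = r"\b" + re.escape(kw) + r"s?\b"
        if re.search(pattern, lowered):
            found.append(kw)
    return len(found) / len(keywords), found


# Made-up sentence standing in for a policy document.
score, found = keyword_score("National seed banks conserve crop diversity.")
print(f"score={score:.2f}, found={found}")
```

A score like this says nothing about the quality of a commitment. It only shows which topics a document mentions and which it leaves out, which is where the gaps become visible.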

The Agrobiodiversity Index team and the WLE FG5 team are interested in supporting the adoption of this technique in different projects and for different purposes. For example, text mining can be widely adopted among CGIAR Research Centers and development agencies that want to draw attention to the progress or effect of policies. Additionally, text mining could support the assessment of projects to see whether they address or ignore key topics, notably gender.

Despite the importance of gender-sensitive policies and approaches, 'gender-blind' and biased terms still dominate global discussions and agendas. For example, we see on a daily basis terms such as 'man-made' or 'man-power,' which leave at least half of humanity out of the story. These may seem to be only semantics, but such details shape the way we think, which in turn affects how we research, write, and implement projects and changes.

Text mining analysis has already begun to show what is present in, and what is missing from, restoration projects on gender issues. The Restoring Degraded Landscapes (RDL) team is working to understand how gender is presented in the existing land restoration literature and policy environment, with Ethiopia as a case study. A quick screen of titles and keywords across 279 peer-reviewed articles on restoration in Ethiopia shows limited use of terms such as inclusion (or inclusiveness), women, youth, gender, elder, power dynamics or rights.
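A quick screen of this kind can be sketched in a few lines of Python. The terms are the ones listed above. The regular-expression variants, the function name `screen`, and the three sample records are made-up assumptions for illustration and do not come from the 279 articles.

```python
import re

# Terms named in the text; the regex variants (plurals, "woman") are assumptions.
TERM_PATTERNS = {
    "inclusion": r"\binclusi(?:on|veness)\b",
    "women": r"\bwom[ae]n\b",
    "youth": r"\byouths?\b",
    "gender": r"\bgender\b",
    "elder": r"\belders?\b",
    "power dynamics": r"\bpower dynamics\b",
    "rights": r"\brights\b",
}


def screen(records):
    """Count how many records mention each term at least once."""
    counts = {term: 0 for term in TERM_PATTERNS}
    for record in records:
        text = record.lower()
        for term, pattern in TERM_PATTERNS.items():
            if re.search(pattern, text):
                counts[term] += 1
    return counts


# Made-up "title; keywords" strings standing in for article metadata.
sample = [
    "Exclosures and soil carbon recovery; restoration, soil",
    "Women's participation in watershed rehabilitation; gender, restoration",
    "Land tenure rights and tree planting; rights, agroforestry",
]

counts = screen(sample)
for term, n in counts.items():
    print(f"{term}: {n} of {len(sample)} records")
```

Counting records rather than raw word occurrences keeps one article that repeats a term many times from dominating the picture.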

Word cloud with the title and keywords from 279 peer-reviewed research articles on restoration in Ethiopia

Deep-seated gendered bias permeates all aspects of our societies. In rural communities, women and men, young and old, rich and poor typically access, use, and benefit from crops, trees, lands, and other resources differently and unequally. Therefore, sustainable development projects such as those aiming to restore ecosystem services must account for gender differences and for their varied impacts on women and men. If projects fail to do so, they can widen gender gaps and, more generally, bolster inequality.

Text mining will help to bring such gender biases to light. The Agrobiodiversity Index team and the WLE FG5 team are partnering with the RDL team to build "in-house" capacity and to expand the applications of text mining. This process includes a four-hour training on how to conduct text mining analysis, as well as support for identifying the list of keywords and the documents that will be mined. This is the method being used in the Ethiopia case study.

Analyzing the presence and/or absence of words is not necessarily, by itself, an analysis of efficiency or efficacy. Still, the inclusion of certain words and concepts in development agendas and policies creates an enabling environment, which is a first step towards sorely needed action.

Stay tuned for the results of this critical research. Feel free to contact Ermias about the overall project, and Wuletawu or Dawit about the case study in Ethiopia. Contact Roseline Remans if you are interested in using the text mining tools in your research or project. Also, contact Natalia or Sarah for further details on text mining applications, including assessing countries' commitment towards using and safeguarding agrobiodiversity (9 countries) and the commitment in national biodiversity strategies and action plans (NBSAPs) to protect and use agrobiodiversity (119 countries).


Thrive blog is a space for independent thought and aims to stimulate discussion among sustainable agriculture researchers and the public. Blogs are facilitated by the CGIAR Research Program on Water, Land and Ecosystems (WLE) but reflect the opinions and information of the authors only and not necessarily those of WLE and its donors or partners.

WLE and partners are supported by CGIAR Trust Fund Contributors, including: ACIAR, DFID, DGIS, SDC, and others.

Experiment to determine soil loss. Debre Berhan, central Ethiopia.