European Commission
Directorate-General for Research and Innovation (DG RTD)
Not applicable
Not applicable
Not applicable
RTD-GENDERINRESEARCH@ec.europa.eu
RTD-PUBLICATIONS@ec.europa.eu
Not applicable
Not applicable
21/12/2022
21/12/2022
21/12/2022
She Figures provides a range of comparable, pan-European statistics on gender equality in Research and Innovation, and has been released every three years since 2003.
A large portion of the indicators included in She Figures present and explore the following themes:
Each edition of She Figures also aims to provide better understanding of emerging issues by introducing additional indicators.
Further information about She Figures publications, including downloadable reports and other publications, can also be found on the webpage of the Publications Office of the European Union.
The Gender Statistics Database includes She Figures indicators since 2015 (reference year 2012).
Not applicable
The publications within Web of Science™ (WoS™) and Scopus™ encompass a diverse range of research areas. Web of Science includes approximately 12,000 peer-reviewed publications in the fields of natural sciences and engineering (NSE), health sciences (HS) and social sciences and humanities (among other areas), while Scopus includes publications from more than 40,000 journals in all major research fields (including physical sciences, health sciences, life sciences and social sciences).
For a full overview of all the statistical concepts and definitions related to the She Figures publications users are referred to:
Below we list only the statistical concepts and definitions that are relevant to understand and interpret the She Figures indicators included in the Gender Statistics Database.
Active authors:
In the context of these indicators, “active authors” are defined as those who produced 10 or more papers in the last 20 years (2000-2019) and at least 1 paper in the last 5 years (2015-2019), OR those who produced 4 or more papers in the last 5 years (2015-2019).
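The rule above can be sketched as a small predicate. This is an illustrative reading of the definition, not code from the She Figures handbooks; the function name `is_active_author` and the input format (a mapping from publication year to number of papers) are assumptions made for the example.

```python
def is_active_author(papers_by_year: dict) -> bool:
    """Apply the 'active author' rule to one author's publication record.

    papers_by_year maps a publication year to the number of papers
    published in that year (hypothetical input format).
    """
    # Papers in the 20-year window 2000-2019 and the 5-year window 2015-2019.
    last_20 = sum(n for year, n in papers_by_year.items() if 2000 <= year <= 2019)
    last_5 = sum(n for year, n in papers_by_year.items() if 2015 <= year <= 2019)

    # Criterion A: 10+ papers in 20 years AND at least 1 paper in the last 5 years.
    long_term = last_20 >= 10 and last_5 >= 1
    # Criterion B: 4+ papers in the last 5 years.
    recent = last_5 >= 4
    return long_term or recent
```

For instance, an author with 9 papers in 2001 and 1 paper in 2016 qualifies under the first criterion, while an author with 10 papers in 2001 and none since 2015 does not qualify.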
Fractional authorship:
“Fractional authorship” is a means of distributing publication and citation counts equally across multiple authors. For example, a given publication with 2 authors would be counted as 0.5 publications for each author.
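A minimal sketch of fractional counting follows, assuming each publication is represented simply as a list of author identifiers. The function name `fractional_counts` and the input format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fractional_counts(publications):
    """Distribute each publication equally across its authors.

    publications is an iterable of author lists (hypothetical format),
    e.g. [["A", "B"], ["A"]]. Returns a dict of author -> fractional count.
    """
    counts = defaultdict(float)
    for authors in publications:
        if not authors:
            continue  # skip records with no author information
        share = 1.0 / len(authors)  # each author gets an equal share
        for author in authors:
            counts[author] += share
    return dict(counts)
```

With one two-author paper by A and B and one single-author paper by A, author A is credited with 1.5 publications and author B with 0.5, so the total still equals the 2 publications in the input.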
Gender dimension of a publication:
Research that explores the gender dimension is defined as research that “integrates sex and gender analysis into research, whereby ‘sex’ refers to basic biological characteristics of females and males and ‘gender’ refers to cultural attitudes and behaviours that shape ‘feminine’ and ‘masculine’ behaviours, products, technologies, environments, and knowledge.” This definition is provided by the Gender Innovation Project (http://ec.europa.eu/research/swafs/pdf/pub_gender_equality/gendered_innovations-KINA25848ENC.pdf#view=fit&pagemode=none).
Compound Annual Growth Rate (CAGR):
Compound annual growth rate (CAGR) is defined as the constant year-over-year growth rate over a specified period of time. Starting with the first value in a series and applying this rate for each of the time intervals yields the final value of the series. Throughout, the term CAGR is also referred to as ‘(yearly) growth rate’.
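The definition corresponds to the standard formula CAGR = (last value / first value)^(1 / number of intervals) - 1. The sketch below shows this under the assumption that both values are positive; the function and parameter names are illustrative and do not come from the She Figures handbooks.

```python
def cagr(first_value: float, last_value: float, n_intervals: int) -> float:
    """Constant per-interval growth rate linking first_value to last_value.

    n_intervals is the number of yearly steps between the two values
    (for example, 2 for a series running from 2017 to 2019).
    """
    if first_value <= 0 or last_value <= 0 or n_intervals <= 0:
        raise ValueError("values and number of intervals must be positive")
    return (last_value / first_value) ** (1.0 / n_intervals) - 1.0


# Applying the rate at each interval reproduces the final value.
rate = cagr(100.0, 121.0, 2)
reconstructed = 100.0 * (1.0 + rate) ** 2
```

Here a series rising from 100 to 121 over two yearly intervals has a growth rate of about 10 % per year, and compounding that rate twice from 100 recovers 121 (up to floating-point rounding).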
Field-Weighted Citation Impact (FWCI):
FWCI indicates the citation impact of a publication by comparing the actual number of citations it received with the expected number of citations for publications of the same document type (article, review or conference proceedings paper), publication year and subject field. When an article is classified in two or more subject fields, the harmonic mean of the actual and expected citation rates is used. The indicator is therefore always defined with reference to a global baseline of 1.0 and intrinsically accounts for differences in citation accrual over time, in citation rates for different document types (reviews typically attract more citations than research articles, for example) and in subject-specific citation frequencies.
In general, the Field-Weighted Citation Impact (FWCI) for a publication is defined as:
FWCI = Ci/Ei
Where:
Ci: citations received by publication i
Ei: expected number of citations received by all similar publications in the publication year plus following 3 years
When a similar publication is allocated to more than one discipline, the harmonic mean is used to calculate Ei.
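As a hedged illustration of the ratio above, the sketch below computes FWCI for a single publication from its citation count and the expected citation counts of the disciplines it is allocated to, taking the harmonic mean of those expected values to obtain Ei. The function name `fwci` and the input format are assumptions for this example; the actual computation in Scopus may differ in detail.

```python
from statistics import harmonic_mean


def fwci(citations: float, expected_by_field: list) -> float:
    """FWCI = Ci / Ei for one publication.

    citations is Ci, the citations received by the publication.
    expected_by_field holds one expected citation count per discipline the
    publication is allocated to (hypothetical input); Ei is their harmonic
    mean, which equals the single value when there is only one discipline.
    """
    if not expected_by_field:
        raise ValueError("at least one expected citation count is required")
    expected = harmonic_mean(expected_by_field)
    if expected == 0:
        raise ValueError("expected citations must be greater than zero")
    return citations / expected
```

For a publication with 12 citations allocated to two disciplines with expected counts of 4 and 12, Ei is the harmonic mean 6, giving an FWCI of 2.0, that is, twice the global baseline of 1.0.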
For indicators on authorship, the statistical units are the authors of publications as recorded in the Web of Science™ (WoS™) and Scopus™ abstract and citation databases.
For the indicator on gender dimension in research and innovation content, the statistical units are publications as recorded in the Web of Science™ (WoS™) and Scopus™ abstract and citation databases.
For indicators on authorship, the statistical population is all academic authors with a first name available on peer-reviewed publications within the Web of Science™ (WoS™) and Scopus™ abstract and citation databases (depending on the edition of She Figures) in each EU-27 Member State, EU associated country, EU candidate country and the UK. Author IDs for which no first name data was available were not included in the analysis. For indicators on active authorship, only active authors are included.
For the indicator on gender dimension in research and innovation content, the statistical population is all publications as recorded in the Web of Science™ (WoS™) and Scopus™ abstract and citation databases.
The EU Member States, in addition to candidate countries (Albania, North Macedonia, Montenegro, Serbia and Turkey) and Associated Countries (Armenia, Bosnia and Herzegovina, Faroe Islands, Georgia, Iceland, Israel, Moldova, Norway, Switzerland, Tunisia, Ukraine and the UK).
The time coverage of data in She Figures publications varies by indicator. Detailed description of time coverage for each indicator is available in:
Not applicable
Indicators on authorships and co-authorships: ratios (women to men) and percentages (for the compound annual growth rates - CAGR)
Indicator on gender dimension in research and innovation content: percentage
The reference period varies by indicator. The specific reference period for the data provided is clearly stated in each indicator description.
The EU is committed to advancing gender equality in the area of research and development. In particular, the promotion of gender equality and gender mainstreaming in research is a clear objective and a legal obligation under the EU framework programme for research and innovation (Reg. 1291/2013).
More recently, the 2020 ERA Communication renewed the EU’s commitment to gender equality and gender mainstreaming in research through deepening existing priorities and initiatives.
Not applicable
Not applicable
Some She Figures indicators use the names of scientific authors (included in the Web of Science™ (WoS™) and Scopus™ databases) or of patent applicants (from the PATSTAT database). However, the information on the names is only used to infer the sex of the author/applicant and is then aggregated at the country level. Hence, no direct identification of a person is possible from the indicators in the She Figures.
The She Figures reports are published one year after the data collection. Corresponding datasets containing the indicators from the She Figures included in the Gender Statistics Database are available through the EU Open Data Portal at the following links:
Not applicable
The European Commission (Directorate-General for Research and Innovation) makes datasets freely available to the public. Datasets are made available no later than one year after completion of data collection.
She Figures datasets and accompanying materials are made available online via the EU Open Data Portal:
She Figures data collection takes place every three years. Publications and corresponding data files are disseminated one year after the data collection.
No regular news releases.
The She Figures publications are not published as an online database. Data files related to She Figures indicators 2015, 2018 and 2021 are available from the European Open Data Portal.
Some She Figures indicators are computed from survey micro-data. The underlying micro-data are not available for public access.
Not applicable
She Figures 2015, 2018 and 2021 contain methodological appendices detailing data sources and methods. Additionally, these editions of the She Figures are accompanied by specific handbooks, where users can find extensive information on the sources and the construction of each indicator.
The She Figures handbooks can be found at the following links:
Information on all aspects of data quality is available in the handbooks accompanying the She Figures 2015 and 2018 publications:
To ensure high quality of the data, a quality framework was devised. As part of this framework, three different dimensions were considered in selecting indicators: relevance, accuracy and availability. Each indicator was evaluated by grading it for each dimension and by an overall assessment. Details on the data quality framework can be found in the handbooks accompanying the 2015 and 2018 She Figures publications:
Based on the European Statistical System (ESS) quality criteria, the She Figures indicators can be considered of high quality in terms of relevance, timeliness and punctuality. They are highly relevant for a wide range of users, including national governments, the EU, and international and national non-governmental organisations. They use the most recent available data to describe the current situation in individual countries and at the EU level, and are published no later than one year after data collection.
Some weaknesses have been identified in terms of accuracy and comparability over time / across countries for some She Figures indicators. Further details on this issue are provided at points 13. and 15. below.
The users of She Figures data include EU policy makers, national governments, and international organisations. The publications provide an insight into the situation regarding gender equality in Research and Innovation at the pan-European level. They aim to give an overview of the gender equality situation in research and innovation, using a wide range of indicators to examine the impact and effectiveness of the policies implemented in this area.
Not applicable.
The She Figures indicators are complete compared to relevant regulations and guidelines.
As noted in the She Figures handbook 2018, bibliometric indicators can only give insights about the outputs of male and female authors in different countries and should not be interpreted as an accurate measure of scientific production. Especially in the social sciences and humanities, a considerable proportion of research outputs take the form of books, monographs, and non-textual media. As the She Figures bibliometric indicators refer only to journal articles, they cannot fully account for the scientific production of researchers in those fields. Moreover, bibliometric indicators can only account for the scientific production of named authors and not of all “researchers” (based on the OECD, 2015 definition, provided at point 3.4 above) in a country. For instance, research in the corporate sector would not be included in these bibliometric indicators. The bibliometric indicators exploring the sex and gender dimension of research output can be considered accurate. For the 2018 She Figures, the output generated by the different bibliometric queries used to construct these indicators was validated by a pool of gender experts, and the share of false positives, based on the expert assessment, was less than 1%. Despite this accuracy, these indicators can still suffer from bias because they depend on the fields of publication represented in the bibliometric databases used as data sources.
Finally, both bibliometric indicators and indicators on inventions and innovations are computed from databases that do not explicitly indicate the sex of the author/patent applicant. Different methodologies were used to infer the sex of authors/applicants for the Scopus/Web of Science and PATSTAT databases. Details can be found in the She Figures handbooks 2015, 2018 and 2021. Overall, these procedures ensure a good level of accuracy for the matched names but may fail to provide matches between authors and sexes in some cases.
Sampling errors are also reported for bibliometric and invention indicators. They assume that the bibliometric database is a random sample of all publications in each subfield, and that the patent database is a random sample of all patent applications in each IPC category. Sampling errors were used to compute confidence intervals for the bibliometric and invention indicators, which are reported in the She Figures publications.
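To make the idea concrete, the sketch below computes a confidence interval for a proportion (for example, the share of publications with a gender dimension in one subfield) under the random-sample assumption. The exact method used in She Figures is described in the handbooks and is not reproduced here; the normal-approximation (Wald) interval, the 95 % level and the function name `proportion_ci` are assumptions chosen purely for illustration.

```python
import math


def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    successes / n is the observed share; z = 1.96 corresponds to an
    approximate 95 % confidence level (an assumption of this sketch).
    Returns (lower, upper), clipped to the range [0, 1].
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    standard_error = math.sqrt(p * (1.0 - p) / n)  # sampling error of p
    lower = max(0.0, p - z * standard_error)
    upper = min(1.0, p + z * standard_error)
    return lower, upper
```

With 50 matching publications out of 100, the observed share is 0.5, the sampling error is 0.05 and the interval runs from roughly 0.402 to 0.598. The Wald interval is known to behave poorly for very small samples or shares close to 0 or 1, which is one reason an actual implementation may use a different method.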
Non-sampling errors for the She Figures indicators included in the Gender Statistics Database may be related to processing errors such as cleaning errors or mis-assignment of gender or the presence of outliers.
The She Figures handbooks 2015, 2018 and 2021 detail all the coherence and validation checks that were carried out to detect potential non-sampling errors and guarantee accuracy of the data.
The She Figures data collections take place every three years. Data refer to the most recent point in time available (this varies by data source).
Punctuality is 100%, as the She Figures publications are released according to schedule.
Bibliometric indicators are partially comparable across countries. Although the methodology used to compute the indicators is the same across countries, the extent to which the names of the authors were matched to their sexes varies by country. Details on the percentage of matched sex-name pairs can be found in the She Figures handbooks 2015, 2018 and 2021.
Comparability over time of the She Figures indicators included in the Gender Statistics Database was established based on Appendix 1 of the She Figures 2021 publication, which provides a correspondence table between the She Figures 2015, 2018 and 2021 editions.
For indicators that, despite having the same name, were not comparable between the two data collections (e.g., because of major methodological changes) only the most recent year of data was included in the Gender Statistics Database (i.e., the data reported in the 2021 She Figures).
Among the She Figures indicators that are comparable over time, the Gender Statistics Database includes:
Cross-domain coherence cannot be established for bibliometric and innovation indicators, as the data sources used in the She Figures are the only ones available at the EU level to assess these phenomena.
The She Figures indicators on researchers with “precarious working contracts” are not fully coherent with the labour market indicators published by Eurostat, because the definition of precarious working contracts used by Eurostat is different from the one used in the She Figures. Eurostat defines ‘precarious working contracts’ as contracts lasting three months or less, while the She Figures defines them as covering researchers without a contract, with fixed-term contracts of up to one year, or with other non-fixed-term, non-permanent contracts.
Each She Figures indicator included in the Gender Statistics Database has full internal coherence, as it is based on the same data source. Data sources differ across indicators.
Data for the She Figures indicators in the Gender Statistics Database were collected by the European Commission, Directorate-General for Research and Innovation, from the EC MORE Survey on the Mobility of Researchers, the Worldwide Patent Statistical Database (PATSTAT) of the European Patent Office (EPO), and the Web of Science™ and Scopus™ abstract and citation databases.
No cost burden has been placed on individual countries or EU Member States for the collection of the She Figures indicators.
Variables are edited and corrected at the data entry stage based on a set of logical edits. No revisions are made to the data after publication.
There is no fixed revision schedule.
Bibliometric indicators were computed in She Figures 2015 from Web of Science™ and in She Figures 2018 and 2021 from the Scopus™ abstract and citation database.
Triennial
Bibliometric indicators were computed in She Figures 2015 from Web of Science™ and in She Figures 2018 and 2021 from the Scopus™ abstract and citation database. Specific procedures, described in the She Figures handbooks 2015, 2018 and 2021, are used to attribute a sex to the names of the authors in the database. Bibliometric queries (validated by external experts) are used to establish the gender dimension in the content of scientific publications.
The She Figures handbooks 2015, 2018 and 2021 detail all the coherence and validation checks that were carried out to detect and correct potential non-sampling errors and outliers and to guarantee the accuracy of the data.
For information on data compilation processes for each She Figures indicator, please consult:
Not applicable
Not applicable