Document Types

Document Types in the SHE Corpus – A User’s Guide

The SHE Corpus interface enables users to limit search results to particular document formats (not to be confused with document file formats such as .pdf or .docx).

Different document formats overlap, sometimes extensively, and it is notoriously difficult to define any document format with complete precision. What definitions there are, as well as lists of document formats, also tend to vary considerably across domains. The SHE Corpus focuses on the domain of health and its intersection with sustainability, but recognizes that the domain of health also intersects with many other sites of interaction, including the legal and political domains.

For our purposes, the relevant document formats are broadly defined as follows.

* Act: This is a bill that has been passed into law. See for example the Irish Health (Regulation of Termination of Pregnancy) Act 2018. See also Bill.

* Article: this label is strictly reserved for articles in scholarly journals such as The Lancet, BMJ, Feminists@Law, Medical Teacher, etc.

* Bill: A proposed law that has not yet been passed and so has not yet become an act. See for example the Irish Health (Regulation of Termination of Pregnancy) Bill 2018. See also Act.

* Blog: is an informational website or section of a website, owned and maintained by an individual or institution. Posts on blog sites are relatively short and typically displayed in reverse chronological order. They may be written by a single individual, as in the case of the Jonathan Cook blog, or different individuals, as in the case of the Science-based Medicine blog. Many scholarly journals, publishers and other large institutions maintain a blog section on their website, e.g. the BMJ Global Health Blog and the Cambridge Core Blog. This document type is not to be confused with ‘web content’, as defined in the context of SHE and in this document.

* Book: is traditionally a printed document consisting of a substantial number of pages, with a cover, and bound or sewn together. Today, books are often published in both printed and electronic form, and sometimes only in electronic form. The SHE Corpus occasionally includes individual chapters from a book, without including the book as a whole. These chapters are also tagged as ‘Book’.

* Committee_debate: Transcript of a debate held in a parliamentary committee. See for example the Joint Committee on Health discussion on the report of the review of the operation of the Irish Health (Regulation of Termination of Pregnancy) Act 2018.

* Communications: in the SHE Corpus this category covers a somewhat diverse range of document formats of relatively short length. They include memos, letters, and briefing notes. These texts may be published online as web pages, or they may be downloadable as .pdf files. Unlike Info sheets, they are generally internal-facing documents.

* Constitution: Refers to the constitution of a national, subnational, or supranational political entity, such as a country, a state in a federal country, or a union of nation states. The constitution sets out the foundational principles and rules by which the entity in question should be governed. The constitution of a sovereign country ranks at the top of the legal hierarchy. Examples include the Constitution of Ireland and the National Constitution of Argentina.

* Convention: Conventions are binding formal agreements. When a convention has been ratified by a sufficient number of states, it enters into force and becomes legally binding on the states that have signed and ratified it. A good example is the UN Convention on the Rights of the Child.

* Declaration: is an official, public announcement of an institutional or group position on an important issue, often labelled explicitly as Declaration, Manifesto, Press Release, Statement, (White) Paper, Submission, or Agreement; this category also includes supplements to and comments on such documents. Some declarations/statements are issued by expert individuals on a specific topic, an example from the SHE Corpus being the Anand Pain Report. Examples of institutional-level documents that fall under this category include the São Paulo Declaration on Planetary Health, Wellcome Trust’s Public Health Workshop on the Department of Health White Paper: Healthy Lives, Healthy People, Abortion Rights Campaign’s Submission to the Oireachtas Health Committee regarding the General Scheme of the Protection of Life during Pregnancy Bill 2013, and Amnesty International’s Press Release Argentina: Legalization of abortion is an historic victory. See also Resolution.

* Explanatory memorandum: These documents explain the content of a legislative proposal, outlining its objective and implications, and are intended to be more accessible to the general public. They go by different names in different national contexts.

* Guideline: this label is reserved for practical, consensus-driven, official guidelines issued by an institution such as a medical association or the World Health Organization, addressed to clinicians, and published on the website of the association/institution. Guidelines are based on the best available research evidence, typically randomized controlled trials and systematic reviews. Examples include the WHO Guidelines for Malaria and the Centers for Disease Control and Prevention guidelines on Anogenital Warts.

* Info sheets: this category covers info sheets, fact sheets, explainers, Q&As, flyers, pamphlets and other short, usually one- or two-page texts posted on the websites of various institutions and grassroots organizations. The texts may or may not be downloadable as separate documents. Examples include the Population Research Institute’s Mexico City Policy: An Introduction, the Centers for Disease Control and Prevention’s HPV and Oropharyngeal Cancer, and Amnesty International’s Youth Fact Sheet: Q&A.

* Judgement: A judgement is the final decision handed down by a court in a specific court case. See for example Dobbs v. Jackson Women’s Health Organization (2022) by the US Supreme Court, and A, B and C v. Ireland by the European Court of Human Rights. The judgement may include concurring and dissenting opinions, in addition to the majority opinion.

* Manual: is a document that gives instructions on a particular course of action, covering anything from policies to performing an activity. This category also includes educational material of various lengths and some guidelines that are not necessarily informed by systematic reviews and not published by medical associations. Examples include the Canadian Institutes of Health Research’s Deliberative Priority Setting – A CIHR KT Module and A Guide to Researcher and Knowledge-User Collaboration in Health Research. See also Guideline.

* Newsletter: is a document containing information about the recent activities of an organisation and produced regularly for the benefit of its members. A newsletter typically consists of several items of varying lengths. It may also be titled Review or Bulletin. Examples include the Newsletter of Health Poverty Action, the Population Research Institute Review and the People’s Health Movement Dispatch Bulletins.

* Online_magazine: is a magazine published exclusively or primarily on the web. Examples in the SHE Corpus include The Conversation, Counterpunch, The Nation, Truthout, and POZ Magazine.

* Oral_argument: These documents are transcripts of the oral hearing that forms part of court proceedings. This is where the parties to the case present their arguments, and judges can ask questions of the parties’ attorneys. See for example the transcript of the oral argument before the US Supreme Court in Dobbs v. Jackson Women’s Health Organization.

* Parliamentary debate: Transcript of a debate held in a legislative body at the federal or state level. See for example the transcript of a part of the debate on the Health (Regulation of Termination of Pregnancy) Bill in the Irish parliament. This category includes both debates that do not end in a vote and debates held specifically to reach a legislative decision.

* Report: is a published document that presents information on a particular topic in an organized format for a specific audience and purpose. Reports are written by professionals but are typically attributed to an organisation (as in the case of many WHO reports on various topics). Some reports follow the IMRAD structure typical of scientific articles, others do not. They typically feature an executive summary or abstract as well as a table of contents and a set of references at the end. Examples include the European Centre for Disease Prevention and Control Report on Vaccine-preventable diseases and immunisation: Core competencies and the WHO Report of the WHO global technical consultation on public health and social measures during health emergencies.

* Resolution: is a document that details a set of official decisions or policies on which members of an institution agree. This category also includes draft resolutions, reports produced as part of the process of drafting resolutions, and agreements. Examples include the WHO Global Action on Patient Safety Resolution, the United Nations Global Media and Information Literacy Week: draft resolution, and the Paris Agreement.

* Speech/official statement: These are transcripts of speeches or official statements given by senior political actors, e.g. heads of government. See for example the speech by the Irish Taoiseach (Head of Government) Leo Varadkar in which he announced that a referendum on abortion would be held in Ireland. 

* Third party written argument: This is a document submitted by a third party, sometimes called an amicus curiae (“friend of the court”), as an intervention in a court case. Such documents are a means by which interest groups and other interested parties, who are not parties to the judicial proceedings, can attempt to influence judicial decisions. Examples include amicus briefs submitted to the US Supreme Court, e.g. the amicus brief submitted by the National Women’s Law Center in the case of Dobbs v. Jackson.

* Web-content: this category covers any relatively short stretch of text that appears on a webpage but is not downloadable as a separate document and is not labelled as e.g. press release, statement, blog, etc. Examples include Medact’s Safe and Legal Abortions: A Woman’s Right to Choose, and the World Economic Forum’s University health hubs can help us meet the Sustainable Development Goals.