Rethinking Systematic Literature Reviews as the Gold Standard for Interdisciplinary Topics

Susan M. Drake, Joanne L. Reid, Michael J. Savage

Brock University, Canada

Education Thinking, ISSN 2778-777X – Volume 1, Issue 1 – 2021, pp. 27–42. Date of publication: 3 November 2021.


Cite: Drake, S. M., Reid, J. L., & Savage, M. J. (2021). Rethinking Systematic Literature Reviews as the Gold Standard for Interdisciplinary Topics. Education Thinking, 1(1), 27–42.

Declaration of interests: The authors declare that they have no conflicts of interest.

Authors’ notes: Susan Drake is a Full Professor at Brock University. Her primary research interest is integrated curriculum and its connections to student academic achievement and motivation. Joanne Reid worked for many years for the Education Quality and Accountability Office in Ontario. Since her retirement, she has worked as a part-time instructor at Brock University, teaching courses on assessment and evaluation. Michael Savage is an Associate Professor at Brock University and a licensed clinical psychologist in the province of Ontario. His research interests include mental health and wellness in educational settings and positive education.

Copyright notice: the authors of this article retain all their rights as protected by copyright laws. However, sharing this article – although not in adapted form and not for commercial purposes – is permitted under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives BY-NC-ND 4.0 International license, provided that the article’s reference (including author names and Education Thinking) is cited.

Journal’s area(s) of research addressed by the article: 42-Mixed Research Methods; 49-Qualitative Research Methods; 67-Methodology of Literature Reviews.


As a team of diverse researchers, we sought a method to write a substantive literature review that could influence policy on integrated/interdisciplinary curriculum (IC). Simultaneously we engaged in action research during this process to improve as researchers. In two attempts to conduct a rigorous systematic literature review, we encountered numerous obstacles: multiple and amorphous definitions; dependency on authors’ keyword choices; the challenge of consistent application of inclusion criteria; our reluctance to include overlapping studies and to exclude respected qualitative studies; determining if the studies reflected true curriculum integration; and finally, measurement and validity issues. We concluded that systematic reviews may not be as surgical as we had hoped, but instead, can be messy and limiting. Our struggles serve as cautions for researchers investigating interdisciplinary topics such as IC. We offer our process and lessons learned for consideration: loosening inclusion criteria boundaries, ‘slow thinking’, and a prismatic approach to reviewing literature.


Literature Review, Integrated Curriculum, Action Research, Systematic Literature Review, Interdisciplinary Topics

It was a six-year journey with many unexpected twists and turns. Our goal was to write a substantive literature review on integrated or interdisciplinary curriculum (IC) that would help educators make policy decisions on implementing such curriculum. A substantive state-of-the-art literature review is an integral part of the research process that advances knowledge and theory-building (Webster & Watson, 2002). For us, a substantive review is one with the credibility and heft capable of influencing policy; we assumed that any type of review could be substantive.

The three main types of reviews are traditional, critical, and systematic (Wolgemuth et al., 2017). A traditional review offers an overview of the literature but has no systematic review methods. Similarly, a critical review examines the primary literature but without a systematic review process. Research syntheses identify evidence-based practices and combine findings in a systematic way. A systematic review is intended to be updateable, rigorous, transparent, and reproducible; it offers a body of evidence from primary research results that can shape and inform research practice (Zawacki-Richter et al., 2020).

The purpose of this paper is to explore the practical and conceptual obstacles we encountered in our efforts to write a substantive literature review, with the goal of guiding educational policy. In pursuit of this goal, we attempted to write three literature reviews: one traditional narrative and two systematic reviews. The process and lessons learned are described in this paper. We hope they may alert other researchers to the challenges of undertaking a review that has credibility and heft. Our findings may be particularly important for those who are researching topics that are complex, interdisciplinary, and disparate in nature such as ours (Snyder, 2019; Zawacki-Richter et al., 2020).

Action Research

The origin of this project – the writing of a traditional narrative review on integrated (often defined as interdisciplinary) curriculum – was the beginning of deep discussions that seemed crucial to our understanding about what evidence should count. The evidence we gathered was not as rigorous as we would have liked. This perception turned our efforts toward a systematic literature review – the current gold standard for literature reviews. Disappointed with the results of the first attempt and determined to write a substantive review, we returned to the drawing board to redo the systematic review from scratch. The process and lessons learned are described in this paper.

A first “aha” was that a literature review that broadly describes the field (as in our traditional review) was not the only way to go about it. Indeed, the literature review itself can be considered a research process that systematically investigates a question (Newman & Gough, 2020). Our team brought diverse skills to the research: Michael had extensive experience in quantitative research, Susan in qualitative. Joanne’s background was in conceptual research. These differences in experiences and world views created an interesting balance and sparked a continual revisiting and reworking of our research methods.

A second “aha” was that our discussions and processes during the systematic review offered a rich database for action research, allowing a deeper understanding of the issues we repeatedly encountered in the project. Thus, we consciously engaged in a simultaneous avenue of research in the spirit of living action research, which focuses on improving one’s practice (McNiff, 2016; Whitehead, 2020). Our question: “How can we improve our practice as researchers?”

As action researchers, we followed the action research cycle: observe, reflect, act, evaluate, modify, and move in a different direction (McNiff & Whitehead, 2010). We met regularly as a team for over five years to update each other on our progress. At each meeting, there was a lively discussion over recurrent problems with the selection of studies, especially the suitability of the articles, reports, chapters, dissertations, books, and conference papers for inclusion in the review. We collected data through field notes on these discussions and documented the decisions, and the reasons for these decisions.

A generic qualitative method was used to analyse the data, as suggested by Merriam and Tisdell (2016), resulting in an iterative and dynamic process throughout the data analysis. Emergent codes were identified and revisited in our discussions. Finally, themes (categories) were created. Ongoing results of the systematic reviews were presented at national conferences (Drake et al., 2018, 2019).

Traditional Narrative Literature Review

The catalyst for this paper was a narrative literature review that culminated in a report for International Baccalaureate (IB) schools (Drake et al., 2015). We collected and assessed articles, books and dissertations related to IC, mostly in North America. We followed what we understood as the standard process of reviewing existing literature and then grouping it into several coherent categories. In that report, we included qualitative and quantitative studies and categorized the literature into the following areas: diversity of definitions from multidisciplinary to transdisciplinary; research studies; the relationship of integrated or interdisciplinary curriculum (IC) and grades K-12 student success (e.g., academic achievement, student engagement); descriptions of classes and programs; implementation; and anecdotal descriptions and explorations of the benefits and challenges of implementation.

We were pleased by the volume of literature related to IC and its variability across decades and across many subject areas and grade levels. However, much of the literature involved anecdotal narratives by enthusiastic practitioners and qualitative case studies (see, for example, Brand & Triplett, 2012; Vars, 1993). Indeed, critics of narrative review claim it is not objective (Zawacki-Richter et al., 2020). Evaluating our traditional review, we decided the studies in our IB overview were not convincing enough to withstand scrutiny in the era of accountability and evidence-based decision making that had permeated education since the end of the last century.

We questioned what type of review would be better. We noted a general shift across the humanities and social science fields from the traditional narrative reviews to systematic reviews. It seemed that the systematic literature review was now considered the gold standard across different fields including education (Timmermans & Berg, 2003; Hammersley, 2020). Many consider reviews that are not systematic as lacking rigour (Snyder, 2019). Thus, we decided that we needed to complete a systematic review to be useful to policymakers, one that focused on concrete evidence of student success (academic achievement) in integrated programs.

The First Attempt at a Systematic Review

There are multiple ways to address an evidence-based literature review (Newman & Gough, 2020). The main criterion is that a systematic method is followed. Choices include a research synthesis of best evidence, meta-analysis or narrative review, metanarrative, meta-study, critical interpretive synthesis, or critical construct synthesis (Grant & Booth, 2009; Wolgemuth et al., 2017). The most powerful literature review is the meta-analysis, which summarizes studies using statistical comparisons (Higgins & Green, 2011). Hattie (2012), for example, identified 250 influences that work best in schools to improve learning using effect sizes in an influential meta-analysis.

We knew we needed rigorous criteria and a more structured decision-making model if we were to provide a substantive review for policy makers’ consideration. Unfortunately, we had not found enough appropriate studies of IC for a meta-analysis. But a systematic literature review that was a synthesis of best evidence studies seemed possible. A systematic review is supposedly objective and “is an extremely efficient method for obtaining the ‘bottom line’ of what works and what doesn’t” (Uman, 2011, p.57).

We started over and applied Uman’s (2011) steps for a systematic review. First, we determined the criteria for inclusion and exclusion. Studies that did not meet these criteria were excluded. For example, studies on implementation with no reliable data about achievement were not considered. Teacher testimonials were also excluded.

Searching and Screening Process, Data Sources, Study Selection

Using the inclusion and exclusion criteria, we individually searched databases including Academic Search Complete, Education Research Complete, ERIC, Scholars Portal E-Journals, as well as journals such as Educational Researcher and Journal of Experiential Education included in Sage Journals Online. Initial search words included “curriculum” along with “multidisciplinary”, “interdisciplinary”, “transdisciplinary”, “fused”, and “cross-curricular”. Further searching considered the application of IC to “environmental curriculum”, “sustainability curriculum”, “service learning”, and “International Baccalaureate”.

Additionally, we searched within the reference lists of meta-analyses, systematic reviews, and relevant dissertations. We went to the original study if it was a secondary reference; if we could not find the original source, we contacted the researchers or the foundation that funded the research. A shared chart (Google Doc) with several categories was developed to record findings.

This review proceeded over many months. We had expected that a rigorous application of the inclusion criteria would provide a clear-cut triage and a quasi-surgical determination of acceptable evidence. Through our individual search-and-screen strategy, over 250 studies were identified for potential inclusion in the analysis. Each researcher read all 250 studies individually to analyse and interpret data. We then met as a group to compare our interpretations. A study was included only when all three reviewers agreed through consensus that it fit the criteria. Having more than one researcher involved in the process increases inter-rater reliability for quality (Uman, 2011).
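The unanimous-agreement rule described above can be sketched in a few lines of Python. This is only an illustration: the study identifiers and votes are hypothetical, and the function name `consensus_included` is our own. In practice, agreement was reached through group discussion rather than a mechanical tally.

```python
from typing import Dict, List


def consensus_included(votes: Dict[str, List[bool]], n_reviewers: int = 3) -> List[str]:
    """Return the IDs of studies that every reviewer judged to fit the criteria."""
    included = []
    for study_id, decisions in votes.items():
        # Each study must carry exactly one decision per reviewer.
        if len(decisions) != n_reviewers:
            raise ValueError(f"{study_id}: expected {n_reviewers} decisions")
        # Unanimity: a single 'exclude' vote keeps the study out.
        if all(decisions):
            included.append(study_id)
    return included


# Hypothetical screening decisions (True = fits the inclusion criteria).
votes = {
    "study_A": [True, True, True],
    "study_B": [True, False, True],
    "study_C": [False, False, False],
}

print(consensus_included(votes))
```

A unanimity rule of this kind is strict by design: any single dissent excludes a study, which is one reason borderline cases generated so much discussion.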

We agreed upon 15 articles. Fifteen is an exceptional number for a systematic review (Malouf & Taymans, 2016), yet we were left with a gnawing sense of dissatisfaction. Here’s why.

Issues Resulting from the First Systematic Review

As we encountered ever more quandaries, our frustration grew. Many obstacles challenged our attempt at consensus, as described briefly below.

a) Multiple and amorphous definitions. Defining IC is notoriously difficult (Adlar & Filhan, 1997; Davison et al., 1995). We stumbled over the definition of “subject”. Is gardening a subject (Dirks & Orvis, 2005)? Is literacy itself a subject, a component of Language Arts, or a skill set embedded in all subjects (Guthrie et al., 2013)? What about “the Arts”? Is an arts program that includes visual arts, dance, and music already an integrated program because dance, let’s say, is a subject? Are biology, physics and chemistry different subjects or components under the umbrella of the subject “science”? The infusion of social-emotional learning (Elias, 2003), environmental stewardship, 21st Century competencies and Information Technology programs into “traditional” subjects also gave us pause. Should we consider such programs as subjects?
b) Too few studies. Given our focus on evidence of effectiveness through academic achievement, we accepted only quantitative studies because academic results generally are reported using numbers. Most of the qualitative studies did not fit the criteria in one aspect or another. We set aside studies that presented rich qualitative data (see, for example, MacMath et al., 2010; Rennie et al., 2012).
c) Overlapping studies. We sometimes found that a strong study was repeated under slightly different titles. One example was Grouws et al. (2013) and Tarr et al. (2013). We wondered if we should count the multiple studies that applied the same program, such as Concept-Oriented Reading Instruction (CORI), as unique stand-alone studies or as versions of a not-quite-the-same study conducted in different contexts (see, for example, Guthrie & Klauda, 2014; Guthrie et al., 2009; Guthrie et al., 2007). We wondered if other systematic reviewers recognized overlapping studies.
d) Confounding variables. The relationship of “subjects” to evaluating the effectiveness of an integrated curriculum is further complicated by possible interrelated variables. If an integrated approach increases student engagement as some claim (Ahar, 1997; Alexander et al., 2008; Catterall et al., 2012; Guthrie et al., 2013) and if the experience of school is more like an ecosystem than a series of separate, unrelated episodes, then where does the effect of any curriculum start and stop?
e) True integration? In some studies, we questioned whether an intervention was truly integrated or whether the curriculum was mainly about one subject (often one on the testing roster), with another subject fused into it. Arts educators both applaud and critique the infusion of the arts as a support for mainstream subjects (Bintz & Gumike, 2018; Cosenza, 2005). Fingon (2011) described an example of enhancing physical education by reading about athletes. Howes et al. (2009) cautiously endorsed the integration of literacy and science, but questioned whether reading skills, a focus of the national testing agenda, trumped the science content. As they put it, reading about animals to improve decoding skills (literacy) is not the same as reading about animals to find answers to research questions (inquiry). We shared this hesitation.
f) Measuring apples and oranges. Our inclusion criteria specified that there be a comparison and that the measurement tool be some form of arm’s length objective assessment such as a standardized test. While such a tool allowed for more reliable comparisons, it raised validity issues when an integrated program, by definition, did not rely on disciplinary instruction and assessment. In their case studies, Rennie et al. (2012) found that students who studied in a disciplinary program did better in disciplinary assessments compared to students in an integrated program. However, the students in an integrated program outperformed their disciplinary counterparts in twenty-first century competencies such as problem-solving.
g) Acceptance of “respected” studies. We excluded some studies that had wide acceptance by other scholars if we could not find the original publications. We questioned our audacity in excluding such frequently cited studies and noted that they were in our traditional narrative review. In insisting on reviewing the original study rather than trusting its evaluation by a researcher who had included it in a literature review, we spent countless hours tracking down dissertation studies that the author had just thrown out (for example, Hartzler, 2000), or articles that could not be found (Reardon, 2005; Toops, 1954), or pestering researchers who did not respond to our pleas.

Although we believed we had performed a review in a systematic way and it confirmed our hopes that IC and academic achievement were positively associated, we were disillusioned. As in quicksand, the more we struggled, the deeper we sank. We had spent hours testing assumptions and clarifying our communal interpretation of the inclusion criteria. This had meant re-evaluating earlier decisions against revised criteria. We were not confident that we had met our goal; we still wanted to provide a “gold standard” review. Following the action research cycle, we agreed that we needed to undertake a second action research cycle. We thought after that first attempt that we could be more consistent and confident in doing so.

The Second Systematic Review

  We decided to redo the systematic review from scratch. Perhaps our research skills had been lacking. As well, new articles were constantly being added to the database making it hard to keep up. This was to be a brand-new start.


In many ways, the second systematic review matched the process we followed during the first attempt. However, this time the process was much more absolute, with no discursive meandering. The rules were the rules. This time we selected seven databases that the university education library specialist recommended. We reviewed the original inclusion and exclusion list of criteria from our first attempt at a systematic review (Uman, 2011). We discussed previous difficulties with these criteria, modifying as necessary, believing that now we had a clear understanding of what the criteria meant in practice. This time we vowed we would not second-guess them. We searched for journal articles, government and educational organization reports and graduate education theses. Books were not included. We did not look beyond the results of our search.

  We followed the instructions to the letter to find our assigned databases. In an effort to avoid the winding discursive conversations of systematic review #1, we vowed not to do any searches using anything but the keywords we agreed on in the exact combination stated in the instructions – even if the search returned very few, or no, results. Each person searched two or three databases using the common search terms: pairing academic achievement with integrated curriculum, then interdisciplinary curriculum, and finally transdisciplinary curriculum.
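The fixed pairing of search terms described above can be illustrated with a short Python sketch. The term lists come from the text; the quoting and the `AND` operator are assumptions about database query syntax, which the text does not specify, and `build_queries` is a hypothetical helper name.

```python
from typing import List

# Terms taken from the search protocol described in the text.
OUTCOME_TERM = "academic achievement"
CURRICULUM_TERMS = [
    "integrated curriculum",
    "interdisciplinary curriculum",
    "transdisciplinary curriculum",
]


def build_queries(outcome: str, curriculum_terms: List[str]) -> List[str]:
    """Pair the outcome term with each curriculum term, in the stated order.

    Assumption: exact-phrase quotes joined by AND; real databases may
    use a different syntax.
    """
    return [f'"{outcome}" AND "{term}"' for term in curriculum_terms]


for query in build_queries(OUTCOME_TERM, CURRICULUM_TERMS):
    print(query)
```

Fixing the query list in advance makes the search reproducible across reviewers and databases, at the cost of missing studies whose authors chose other keywords.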

Individually we recorded our findings on a new master list in Google Drive while keeping meticulous track of our own search process. Again, after reading all potential selections, we met to analyse and interpret each piece to determine if it met the criteria needed for inclusion; if an article did not meet the criteria, it was not included. Lobbying for the inclusion of an article that did not meet the criteria was a thing of past efforts.

Issues Resulting from the Second Systematic Review

When we compared the search results of the first systematic review to the second one, we found that the final list included the same articles, but there were many more dissertations in the second round than in the first one. We assumed that dissertations represented good research as they had passed through a committee and external examiners much like a refereed article would have. Surprised by the amount of solid research completed by doctoral students, we wondered why so little of it appeared later as publications. Similarly, Greer et al. (2014) found many dissertations in their review of student learning disabilities and on-line learning but noted that little of it was ever published. We wondered: were journals not interested in the topic of integrated curriculum or had the doctoral researchers just not persisted through to publication?

  The problem of multiple and amorphous definitions persisted. In the second systematic review, our keywords defined the parameters. We accepted the articles that the search engines provided. No more asking people to search in their closets for old data sets. No more diversions. Since our keywords included “integrated”, “interdisciplinary” or “transdisciplinary”, we eliminated the terms “multidisciplinary”, “fused”, “cross-curricular”, “environmental curriculum”, “sustainability curriculum”, “service learning”, and “International Baccalaureate”. Offloading decision-making to the keywords chosen by the studies’ authors eliminated our need to decide. Yet we still wondered if the programs described really were interdisciplinary. We also wondered if other researchers of IC had questioned the authenticity of the integration or if they had accepted the researchers’ description. How reliable are keywords for any study, and particularly an interdisciplinary one?

Other issues also revolved around keywords. Seemingly obvious omissions emerged. Where were the articles on student success in Finland’s celebrated 2016 educational reform of cross-curricular phenomenological teaching (Taimela & Halkilahti, 2020)? In research on the transdisciplinary subject-free Finnish curriculum, the emphasis was on student acquisition of the transversal competencies (an updated definition of the 21st century competencies). We pondered if this research was missing because a) it was not published in academic arenas, b) Finland was not measuring students with quantitative measures, or c) our keywords did not match the keywords of those researchers. Transdisciplinarity is hardly the stuff of traditional subject areas and perhaps that’s why this research did not surface in our search (Eronen et al., 2019). Previous Finnish success was measured with international measures such as PISA and TIMSS, but such descriptions are embedded in reports or books.

As we culled through potential articles for inclusion, it seemed that some areas, such as IC and the Arts, were over-represented. STEM or STEAM seemed, by definition, like a natural area for integration, yet despite our expectations, few articles on STEM or STEAM emerged from the searches. Perhaps researchers used STEM or STEAM in their publications, thereby implying the obvious idea of integration without explicitly saying so, and thus did not use our search terms as their keywords. We wondered how important a balance across the subject areas was to a policy-maker.

  The niggling questions around what constituted a subject still haunted us. After discussion, we rejected some IC studies that had qualified in the data search during Systematic Review #2 because we did not see the integration of at least two “subjects”. For example, was integrating digital mapping tools into geography really two subject areas (Cyvin, 2013)? Cyvin’s study looked at digital tools to accomplish geography curriculum objectives. We decided that digital tool use is not its own discipline/subject – and in this case, technology was not taught as a discipline. Was argumentation infused into mathematics really two subject areas (Cross, 2009)? No: mathematical argumentation is defined as the development of metacognitive and critical thinking skills and not a unique subject area. Is linking reading in geography with English Language Learners (ELL) really integration (Hinde et al., 2011)? From our perception, the study looked at strategies for ELLs to improve achievement in geography but reading itself was not treated as a “subject” so there was no real integration.

  Included studies were once again quantitative. We accepted this inevitability as part of the parameters that we had invoked given our research question and our conviction (inclusion criteria) that there needed to be an objective comparison to establish compelling evidence. Academic achievement is described using numbers such as grades or standardized test scores. Yet, so much of the literature consists of personal narratives of experiences with IC, which we dismissed because they were not “objective”.

We acknowledged that this quest for objective quantitative comparative data was leaving out a large part of the picture of effective IC. “To simplify the world enough that it can be captured by numbers means throwing away a lot of detail. The inevitable omissions can bias the data …” (Fry, 2021, p. 71). Surely the sheer quantity of the stories – largely positive – should carry weight, perhaps even more weight, than the few rigorous studies we had found over 100 years of history.

Discussion: Rethinking Systematic Literature Reviews

Perhaps, in hindsight, we were hoist by our own petard. How many of our issues were self-propelled? Were we too ambitious? Did we lack the skills or technological expertise? Did we really need a systematic review? Was that even possible? Hammersley (2020) suggests that attempting an exhaustive review is one of the pitfalls of a systematic review and we had indeed fallen into it in our first systematic review. Our penchant for second-guessing ourselves and engaging in countless hours of discussion over the acceptance of borderline articles was seemingly our worst weakness – especially in Systematic Review #1. Somehow, we believed that we could leave no stone unturned, or our review would not be taken seriously. Yet these same discussions may be considered a strength – a characteristic of thoughtful research and critical thinking.

  Perhaps it was not all the fault of us as naive researchers, but rather the nature of the beast for difficult-to-define interdisciplinary topics. Systematic reviews can have failings even following their own terms (Thompson, 2015). Lind (2020) noted that proceeding with a systematic review “requires deep knowledge of the method and a willingness to be reflexive and open about its messy realities; to tell of errors that researchers have made and judgements they have formed” (p. 66).

  Reviewing the wide variety of emerging ways to do a systematic review indicates that one size does not fit all. For example, emerging evidence syntheses also include realist reviews, mixed methods reviews, concept analyses and scoping reviews. As Snyder (2019) suggested, broad-based topics that are conceptualized differently and studied within diverse disciplines are not always suitable for systematic reviews and can indeed hinder the full process.

Researchers conducting a systematic review on student engagement and educational technology in higher education found problems searching for what they defined as broad and fuzzy concepts within an interdisciplinary domain (Bedenlier et al., 2020). A review of student learning disabilities and on-line learning revealed unclear definitions and a lack of precise descriptions of the study design (Greer et al., 2014). The researchers found it difficult to know exactly what learning disability a student had or what type of online learning was implemented. As well, it was often challenging to decipher exactly what the authors of the reviewed studies had done.

  Similar issues plagued our study. Based on our definitions, many IC programs were not really as integrated as the researchers claimed. Additionally, we often came up against opaque descriptions of research procedures which prompted us to eliminate such studies from our inclusion list.

  Did our inclusion criteria establish objectivity? When so many articles with interesting insights into IC had to be discarded, we realized that our literature reviews were not that objective at all. They were bound by our own criteria, which reflected our biases. The same holds true for other systematic literature reviews. For our purposes and for our topic, thinking of the systematic review process as offering the gold standard of objectivity was an illusion.

Is Systematic Review Too Narrow?

We began this study in an age of accountability when standardized scores were a valued objective measure. In six years, the landscape of education changed. Accountability now goes beyond academic achievement to include socio-emotional learning (SEL), mental health, dealing with COVID-19, grit and resilience. The literature about IC shows a host of other benefits (e.g., engagement, professional growth) beyond academic success, but we focused strictly on achievement. Teaching the whole person requires asking different, perhaps more difficult, but more meaningful questions that cannot be answered through standardized testing.

  In previous work, we had defined one approach to interdisciplinary curriculum as one or more subjects integrated with such 21st century skills as communication, critical thinking, creative thinking or digital literacy (see, for example, Drake et al., 2015). Transversal and 21st century competencies are a part of most educational jurisdictions’ policy. Indeed, a governmental evidence-based report of 21st century skills (a topic often connected to IC) was a narrative review that did not offer any method for its selection of evidence (Lamb et al., 2017). Within that study, the researchers used Hattie’s effect sizes from a meta-analysis on factors influencing effective teaching to point out that 21st century competencies are related to academic achievement in varying degrees, with problem solving having the largest relationship. There was no attempt to ground this review in a justification for its evidence selection.

  For the systematic reviews, we found studies that integrated these 21st century skills. However, in contradiction to our past selves, we excluded studies if they involved one subject only. Thus, we reluctantly excluded studies we previously had considered valuable and relevant to the current context.

  As the landscape has changed so have the reviews. Systematic reviews emerged from the evidence-based medicine movement of the 1980s and the primacy of randomized controlled trials (RCTs) as the best source of evidence for the efficacy of interventions. RCTs and systematic reviews are still regarded as the best sources of evidence in the applied health sciences (Hammersley, 2020). In education, however, RCTs are much rarer and, as we have noted, qualitative research studies are much more common. As a result, the systematic review method has recently faced a lot of criticism from education scholars and researchers who are increasingly calling for alternative ways of synthesizing and analyzing research findings (Hammersley, 2020).

Doing Things Differently

Was looking at statistics for academic scores the right path to understanding IC’s impact and thoughtful policy making? Had we fallen into the trap described by Kahneman (2011) in Thinking, fast and slow? He suggested humans often swap a difficult question for an easy one without realizing that they are doing so. For example, in education we ask what makes “good” education and then measure that goodness with standardized measures for content acquisition. This does not really address the question of what makes education “good” and may lead to practices that are not necessarily good at all (Fry, 2021). Similarly, using only academic scores may not actually address the question of the efficacy of IC for student achievement, let alone student growth and health.

  What might we do differently? It seems that our discursive conversations over the years could be considered in the light of fast and slow thinking. Our systematic reviews on academic achievement involved fast thinking: an easy way to a quick answer. We had adopted the current practices of fields such as psychology, medicine, and health care where such reviews are prominent. Our thoughtful discussions, which repeatedly delayed the work, could be thought of as slow thinking and effective collaboration. So many times, for example, we talked about expanding our inclusion criteria, but stuck begrudgingly to the original in the name of objectivity, believing that objectivity would be more convincing to education policy makers. Ironically, educators make sense of evidence using their professional judgement in their own contexts and not just by its rigour or objectivity (Nelson & Campbell, 2017). Slow thinking led us to a more in-depth understanding of the efficacy of IC. Yet our fast-thinking end-product did not reflect that depth.

  For us, a next step would be to expand the inclusion criteria to allow existing rich qualitative work. Systematic reviews do accept qualitative studies. For example, to discover which factors influence the personal experience of receiving a mental health diagnosis, researchers did a systematic review of the qualitative data from all of the stakeholders including the person with the diagnosis, the clinician, the carer and family (Perkins et al., 2018). The findings supported an individualised, collaborative, and holistic approach to mental health diagnosis. In a similar review of IC, practitioners’ narratives could find a rightful place.

Lessons Learned – Researching for Deep Understanding of an Interdisciplinary Topic

What insights can we offer other researchers who wish to provide a rigorous literature review for an interdisciplinary topic?

First, for a topic that is ill-defined itself, consider your inclusion criteria thoughtfully as these will be the ties that bind. Boundaries for the criteria for inclusion should be loosely drawn, although the research should be “good evidence” provided by rigorous research. Perhaps Black and Wiliam’s (1998, 2010) work offered a more realistic model. They completed a highly influential substantive literature review about formative assessment, selecting 250 from 650 articles, indicating that “the boundary for the research reports and reviews that have been included has been loosely rather than tightly drawn. The principal reason for this is that the term formative assessment does not have a tightly defined and widely accepted meaning” (1998, p. 7). This lack of precise definition is also true of IC and offers a good argument to extend beyond the narrow limits we were using for the systematic review. Along with Malouf and Taymans (2016), we suggest expanding the boundaries of acceptability if we want to make a difference. We prefer a lattice fence over a solid brick wall.

Second, interdisciplinary topics lend themselves to thoughtful research – slow thinking. Allow time for discussion and thinking and rethinking. As action researchers, we had completed another cycle with Systematic Review #2. Initially we would have been happy to put forward a systematic review to support or negate our “hypothesis” that IC increases student achievement. However, as we became mired in the messiness, we understood that a systematic review was only a starting point or just one facet of a prism. It could not be the only form of evidence we could accept. A new cycle of research needed to begin.

Third, consider a prismatic approach. For next steps, we need a multifaceted approach. One facet would be our existing systematic review of best evidence synthesis. A second facet would be to develop an historical review surveying the time periods when IC was in and out of favour. A third facet would be a traditional substantive literature review in the spirit of Black and Wiliam (1998). Ultimately, all three facets would be interwoven and situated in their political, social and philosophical contexts. We liken this approach to a prism that captures the light and breaks it into its different colours. A prismatic approach to investigation may avoid reducing research to a dry sludge.

  Using living action research, we reflect on how we have improved our practice as researchers. We honed our critical thinking skills for evaluating others’ research. Perhaps the real lesson learned was how difficult it was to be held hostage to inflexible criteria. Armed with these lessons, we go forward as more sensitive and careful researchers.

References

Adler, M., & Flihan, S. (1997). The interdisciplinary curriculum: Reconciling theory, research and practice. National Research Center on English Learning and Achievement, State University of Albany.

Arhar, J. (1997). The effects of interdisciplinary teaming on teachers and students. In J. Irvin (Ed.), What current research says to the middle-level practitioner (pp. 49-56). National Middle School Association.

Alexander, J., Walsh, P., Jarman, R., & McClune, B. (2008). From rhetoric to reality: Advancing literacy by cross-curricular means. The Curriculum Journal, 19(1), 23-35.

Bedenlier, S., Bond, M., Buntins, K., Zawacki-Richter, O., & Kerres, M. (2020). Learning by doing? Reflections on conducting a systematic review in the field of educational technology. In O. Zawacki-Richter, M. Kerres, S. Bedenlier, M. Bond, & K. Buntins (Eds.), Systematic reviews in educational research: Methodology, perspectives and application (pp. 111-129). Springer eBook.

Bintz, W. P., & Gumike, M. (2018). Interdisciplinary curriculum: Using poetry to integrate reading and writing across the curriculum. Middle School Journal, 49(3), 36-48.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.

Black, P., & Wiliam, D. (2010). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 92(1), 81-90.

Brand, B. R., & Triplett, C. F. (2012). Interdisciplinary curriculum: An abandoned concept? Teachers and Teaching: Theory and Practice, 18(3), 381-393.

Catterall, J. S., Dumais, S. A., & Hampden-Thompson, G. (2012). The arts and achievement in at-risk youth: Findings from four longitudinal studies. National Endowment for the Arts Research, University of California at Los Angeles.

Cosenza, G. (2005). Implications for music educators of an interdisciplinary curriculum. International Journal of Education and the Arts, 6(9), 1-7.

Cross, D. I. (2009). Creating optimal mathematics learning environments: Combining argumentation and writing to enhance achievement. International Journal of Science and Mathematics Education, 7, 905-930.

Cyvin, J. (2013). Challenges related to interdisciplinary use of digital mapping technology in primary and lower secondary schools. Norwegian Journal of Geography, 67(3), 162-171.

Davison, D. M., Miller, K. W., & Metheny, D. L. (1995). What does integration of science and mathematics really mean? School Science and Mathematics, 95(5), 226-230.

Dirks, A. E., & Orvis, K. (2005). An evaluation of the junior master gardener program in third grade classrooms. HortTechnology, 15(3), 443-447.

Drake, S. M., Savage, M. J., Reid, J. L., Bernard, M., & Beres, J. (2015). An exploration of transdisciplinarity in the PYP. Hague, Netherlands: International Baccalaureate.

Drake, S. M., Savage, M. J., & Reid, J. (2018). Exploring the effectiveness of interdisciplinary curriculum through a systematic literature review. Presentation at the Canadian Society for the Study of Education Annual Conference in Regina, Saskatchewan, May 30, 2018.

Drake, S. M., Reid, J., & Savage, M. J. (2019). Exploring what counts as “evidence” for a substantive literature review. Presentation at the Canadian Society for the Study of Education Annual Conference in Vancouver, British Columbia, Canada, June 4, 2019.

Elias, M. J. (2003). Academic and social-emotional learning. International Academy of Education, International Bureau of Education.

Eronen, L., Kokko, S., & Sormunen, K. (2019). Escaping the subject-based class: A Finnish case study of developing transversal competencies in a transdisciplinary course. The Curriculum Journal, 30(3), 264-278.

Fingon, J. C. (2011). Integrating children’s books and literacy into the physical education curriculum. Strategies: A Journal for Physical and Sports Educators, 24(4), 10-13.

Fry, H. (2021, March 29). What really counts? The New Yorker, 70-73.

Grant, M., & Booth, A. (2009). A typology of reviews: An analysis of 14 review types and associated methodologies. Health Information and Libraries Journal, 26(2), 91-108.

Greer, D., Rice, M., & Dykman, B. (2014). Reviewing a decade (2004-2014) of published, peer-reviewed research on online learning and students with disabilities. In R. Ferdig & K. Kennedy (Eds.), Handbook of research on K-12 online learning and blended learning (pp. 135-159). Carnegie Mellon University ETC Press.

Grouws, D. A., Tarr, J. E., Chavez, O., Sears, R., Soria, V. M., & Taylan, R. D. (2013). Curriculum and implementation effects on high school students’ mathematics learning from curricula representing subject-specific and integrated content organization. Journal for Research in Mathematics Education, 44(2), 416-463.

Guthrie, J. T., McRae, A., & Klauda, S. L. (2007). Contributions of concept-oriented reading instruction to knowledge about interventions for motivations in reading. Educational Psychologist, 42, 237-250.

Guthrie, J. T., McRae, A., Coddington, C. S., Klauda, S. L., Wigfield, A., & Barbosa, P. (2009). Impacts of comprehensive reading instruction on diverse outcomes of low-achieving and high-achieving readers. Journal of Learning Disabilities, 42(3), 195-214.

Guthrie, J. T., Klauda, S. L., & Ho, A. N. (2013). Modeling the relationships among reading instruction, motivation, engagement, and achievement for adolescents. Reading Research Quarterly, 48(1), 9-26.

Guthrie, J. T., & Klauda, S. L. (2014). Effects of classroom practices on reading comprehension, engagement and motivations for adolescents. Reading Research Quarterly, 49(4), 387-416.

Hammersley, M. (2020). Reflections on the methodological approach of systematic reviews. In O. Zawacki-Richter, M. Kerres, S. Bedenlier, M. Bond, & K. Buntins (Eds.), Systematic reviews in educational research: Methodology, perspectives and application (pp. 23-39). Springer eBook.

Hartzler, D. S. (2000). A meta-analysis of studies conducted on integrated curriculum programs and their effects on student achievement. Unpublished dissertation. Indiana University.

Hattie, J. (2012). Visible learning: A synthesis of over 800 meta-analyses related to achievement. Routledge.

Higgins, J. P. T., & Green, S. (Eds.). (2011). Cochrane Handbook for Systematic Reviews of Interventions (Version 5.1.0). The Cochrane Collaboration.

Hinde, E. R., Osborn Popp, S. E., Jimenez-Silva, M., & Dorn, R. I. (2011). Linking geography to reading and English language learners’ achievement in US elementary and middle school classrooms. International Research in Geographical and Environmental Education, 20(1), 47-63.

Howes, E., Lim, M., & Campos, J. (2009). Journeys into inquiry-based elementary science literacy practices, questioning and empirical study. Science Education, 93(2), 189-217.

Kahneman, D. (2011). Thinking, fast and slow. Allen Lane.

Lamb, S., Maire, Q., & Doecke, E. (2017). Key Skills for the 21st Century: An evidence-based review. New South Wales Department of Education.

Lind, M. (2020). Teaching systematic review. In O. Zawacki-Richter, M. Kerres, S. Bedenlier, M. Bond, & K. Buntins (Eds.), Systematic reviews in educational research: Methodology, perspectives and application (pp. 55-68). Springer eBook.

MacMath, S., Roberts, J., Wallace, J., & Chi, X. (2010). Curriculum integration and at-risk students: A Canadian case study examining student learning and motivation. British Journal of Special Education, 37(2), 87-94.

Malouf, D. B., & Taymans, J. M. (2016). Anatomy of an evidence base. Educational Researcher, 45(8), 454-459.

McNiff, J., & Whitehead, J. (2010). You and your action research project. Routledge.

McNiff, J. (2016). You and your action research project. Routledge.

Merriam, S. B., & Tisdell, E. J. (2016). Qualitative research: A guide to design and implementation (4th ed.). Jossey-Bass.

Nelson, J., & Campbell, C. (2017). Evidence-informed practice in education: meanings and applications. Educational Research, 59(2), 127-135.

Newman, M., & Gough, D. (2020). Systematic reviews in educational research: Methodology, perspectives and application. In O. Zawacki-Richter, M. Kerres, S. Bedenlier, M. Bond, & K. Buntins (Eds.), Systematic reviews in educational research: Methodology, perspectives and application (pp. 3-22). Springer eBook.

Perkins, A., Ridler, J., Browes, D., Peryer, G., Notley, C., & Hackman, C. (2018). Experiencing mental health diagnoses: A systematic review of service users, clinician and carer perspectives across clinical settings. The Lancet Psychiatry, 5, 747-764.

Reardon, C. (2005). Deep in the arts of Texas: Dallas public schools are boosting student achievement by integrating arts into the curriculum. Ford Foundation Report, 36(1), 23-29.

Rennie, L., Venville, G., & Wallace, J. (2012). Knowledge that counts in a global community: Exploring the contribution of integrated curriculum. Routledge.

Snyder, H. (2019). Literature review as research methodology: An overview and guidelines. Journal of Business Research, 104, 333-339.

Taimela, I., & Halkilahti, H. (2020). Phenomenon-based learning: A look at how Finland teaches skills for the future. Canadian Teacher Magazine, 3, 7-10.

Tarr, J. E., Grouws, D. A., Chavez, O., & Soria, V. M. (2013). The effects of content organization and curriculum implementation on students’ mathematics learning in second-year high school courses. Journal for Research in Mathematics Education, 44(4), 683-729.

Thompson, C. (2015). Sins of omission and commission in systematic reviews in nursing: A commentary on McRae et al. (2015). International Journal of Nursing Studies, 52(7), 1277-1278.

Timmermans, S., & Berg, M. (2003). The gold standard: The challenge of evidence-based medicine and standardization in health care. Temple University Press.

Toops, M. (1954). Core program does improve reading proficiency. Educational Administrators Supplement, 40, 484-603.

Uman, L. S. (2011). Systematic reviews and meta-analyses. Journal of the Canadian Academy of Child and Adolescent Psychiatry, 20(1), 57-59.

Vars, G. F. (1993). Interdisciplinary Teaching: Why and How. National Middle School Association.

Webster, J., & Watson, R. T. (2002). Analyzing the past to prepare for the future: Writing a literature review. MIS Quarterly, 26(2), xiii-xxiii.

Whitehead, J. (2020). Contributing to moving action research to activism with living theory research. Canadian Journal of Action Research, 20(3), 55-73.

Wolgemuth, J. R., Hicks, T., & Agosto, V. (2017). Unpacking assumptions in research synthesis: A critical construct synthesis approach. Educational Researcher, 46(3), 131-139.

Zawacki-Richter, O., Kerres, M., Bedenlier, S., Bond, M., & Buntins, K. (2020). Introduction. In O. Zawacki-Richter, M. Kerres, S. Bedenlier, M. Bond, & K. Buntins (Eds.), Systematic reviews in educational research: Methodology, perspectives and application (pp. v-xiv). Springer eBook.
