J Eng Teach Movie Media > Volume 25(2); 2024 > Article
Spring and Takeda: Teaching Phrasal Verbs and Idiomatic Expressions Through Multimodal Flashcards*


This study investigates the effectiveness of various practice modes in learning multiword units, specifically phrasal verbs and idiomatic expressions, through a multimodal online flashcard delivery system that allows students to read, write, listen to, and speak the target multiword vocabulary items. Based on data from 229 first-year Japanese EFL university students, it examines how practice in reading, writing, speaking, and listening modes impacts short-term memorization and long-term comprehension of multiword units. The analysis involved multiple regression using random forests to ascertain significance, alongside dominance analysis with relative weights for determining relative importance. Results indicate that while productive practice modes (writing and speaking) are crucial for short-term memorization, register-specific practice modes (speaking and listening) play a significant role in long-term acquisition. This study highlights the importance of integrating multimodal flashcard practice with practical use activities in the classroom to enhance vocabulary learning outcomes. Finally, the results suggest that encouraging students to interact with their target language through multiple modes is meaningful for greater acquisition of multiword vocabulary. These findings contribute to a deeper understanding of effective strategies for EFL learners in acquiring multiword vocabulary units and underscore the importance of multimodal approaches in language education.


Multiword vocabulary units are commonly used sequences of words which may or may not take on unexpected meanings when combined. Examples of such multiword vocabulary units include formulaic chunks like collocations (i.e., words that tend to be strung together such as “reach a consensus” or “be widely adopted”), phrasal verbs (i.e., a verb and particle that tend to produce new meaning such as “wind up” or “get back”), and idiomatic expressions (i.e., chunks of language that form a meaning unpredictable from the parts such as “once in a blue moon” or “kick the bucket”). Though there is some discrepancy as to how exactly to classify each expression, and there is some inherent overlap (Crowley et al., 2023; Haugh & Takeuchi, 2023), many studies have pointed to the prevalence of these chunks of language in everyday English (e.g., Garnier & Schmitt, 2015; Martinez & Schmitt, 2012), suggesting their importance in English learning.
In recent years, increased attention has been directed towards the significance of vocabulary acquisition in EFL learning (e.g., Boers, 2021; Brown et al., 2022), but more specifically towards the acquisition of multiword vocabulary units such as formulaic chunks (e.g., Boers, 2021; Simpson-Vlach & Ellis, 2010), phrasal verbs (e.g., Haugh & Takeuchi, 2023; Spring, 2018, 2019), and idiomatic expressions (e.g., Crowley et al., 2023; Wolter, 2020). While single-word vocabulary is undoubtedly important in EFL and often correlates with higher general proficiency (e.g., Brown et al., 2022; Kyle & Crossley, 2015; McLean et al., 2020), recent studies have also pointed to the fact that learning multiword vocabulary units can help increase oral fluency (Wolter, 2020) and the ability to comprehend indirect and idiomatic language (Crowley et al., 2023; Hamagami et al., 2024). Although the importance of acquiring multiword vocabulary units in EFL learning is well recognized, there remains a significant gap in understanding the most effective methods for learning these units.
Several methods of learning EFL vocabulary have been suggested over the years. While some proponents argue that incidental learning is a large driver of vocabulary learning (e.g., Lee & Pulido, 2017), others point out that learners with particular L1-L2 pairings may be susceptible to negative transfer, which would affect their incidental learning of multiword vocabulary units. For example, Inagaki (2002) has shown that L1 Japanese learners have difficulty learning motion verbs with path prepositions (e.g., “He swam under the bridge”) because no such structure exists for motion expressions in Japanese, which causes the learners to misinterpret these motion expressions as location expressions (i.e., to mean “He was already under the bridge and happened to be swimming”). Therefore, it is uncertain how effective incidental vocabulary learning is for different types of multiword vocabulary units. Other studies suggest the importance of explicit vocabulary learning, which can be done through purposeful lessons and prepared materials (e.g., Kim, 2019; Koh, 2020; Seo, 2014) or vocabulary learning tools such as flashcards (e.g., Kaku, 2018; Nakata, 2020) or gap-fill activities (e.g., Strong & Boers, 2019; Strong & Leeming, 2024). However, most of the aforementioned vocabulary learning methods tend to be single-mode tools, whereas more recent studies have suggested that online flashcards can be created to be multimodal, containing multiple-choice questions (e.g., Takeda, 2023), writing exercises (e.g., Strong & Leeming, 2024), speaking and listening exercises (e.g., Spring & Takeda, 2023), and even video examples (e.g., Hamagami et al., 2024). Nevertheless, the majority of these studies still focus on single-word vocabulary.
Therefore, this study investigates the effectiveness of learning multiword vocabulary using different modes provided by a single multimedia online flashcard delivery system, with the aim of understanding how specific modes influence learning outcomes.


1. Multiword Vocabulary in EFL

While general vocabulary acquisition is crucial in second language learning and EFL proficiency, there has been a growing focus on acquiring multiword vocabulary units alongside single words in recent years (e.g., Boers, 2021; Crowley et al., 2023; Wolter, 2020). Learning vocabulary in formulaic chunks—sequences of commonly used words, which may or may not carry idiomatic meanings when combined—has been proposed to improve contextual vocabulary acquisition, leading to enhanced fluency and more precise language production (e.g., Boers, 2021; Haugh & Takeuchi, 2023; Koh, 2020; Simpson-Vlach & Ellis, 2010). However, distinctions have been made between general collocations of words that co-occur without creating new and unexpected meanings from the individual parts (e.g., “relate to,” “show a trend”), and units of individual words that combine to produce meanings beyond their literal interpretations, such as phrasal verbs (e.g., “look around,” “throw away”) and idiomatic expressions (e.g., “kick the bucket,” “be in someone’s shoes”; Crowley et al., 2023; Sag et al., 2002).
Phrasal verbs are typically defined as combinations of a verb and a particle, which can either be adverbial or prepositional, that come together to form a new meaning, often related to motion, change, aspect, or idiomatic expression (Gardner & Davies, 2018; Haugh & Takeuchi, 2023; Spring, 2018, 2019). A number of studies indicate that EFL learners with verb-framed L1s, such as Japanese and Korean, often struggle to learn these units due to a lack of awareness of how the verb and particle combine to create new meanings (Haugh & Takeuchi, 2023; Spring, 2018, 2019; Yasuda, 2010). Phrasal verbs have also been gaining increased attention in EFL teaching and learning studies because of their prevalence in English and their polysemy, which make them essential but very difficult to acquire for many EFL learners (e.g., Al-Otaibi, 2019; Birdsell & Kavanagh, 2023; Rudzka & Ostyn, 2003; Spring, 2018, 2019; White, 2012; Yasuda, 2010).
Idiomatic expressions are not clearly defined, but share some similarities with phrasal verbs, especially in their figurative interpretations. While the term “idiomatic” encompasses various linguistic phenomena like metaphors and similes, this paper specifically focuses on non-compositional expressions, which have distinct literal and figurative meanings (Titone & Connine, 1999). For the sake of clarity, we exclude phrasal verbs, which consist of a single verb and a single particle (Gardner & Davies, 2018), from the category of idiomatic expressions to create completely separate categories. Although there is less research on the impact of L2 learning on idiomatic expressions in general, Cooper (1999) highlights the challenges that learners face with them in EFL. Additionally, scholars such as Haugh and Takeuchi (2023) and Yasuda (2010) have observed that the idiomatic meanings of phrasal verbs pose particular difficulties for EFL learners. However, idiomatic expressions themselves have not received nearly as much attention in EFL teaching and learning studies as phrasal verbs, given the large number of papers on learning the latter, specifically, and the rather sparse number dedicated to learning the former.
Finally, it is important to note that possessing a deeper understanding of multiword vocabulary, which encompasses phrasal verbs and idiomatic expressions, correlates with higher overall EFL proficiency (e.g., Boers, 2021; Kim, 2019), as well as higher scores on standardized tests (Crowley et al., 2023). Therefore, while multiword vocabulary may not receive as much emphasis as single-word vocabulary, it remains a crucial aspect of language learning that warrants investigation.

2. Using Online Tools to Learn Vocabulary

Traditional, deliberate vocabulary study methods, such as word lists in textbooks or stacks of paper flashcards, have received criticism because they are monomodal and often decontextualized (e.g., Barcroft, 2020). However, a number of studies have suggested that online vocabulary learning tools can be used to acquire L2 vocabulary, specifically in the L2 context. In particular, technological advances in online flashcards have allowed for various practice modes that can provide contextualized multiple-choice questions (e.g., Nakata, 2020; van den Broek et al., 2023), gap-fill activities (e.g., Strong & Boers, 2019; Strong & Leeming, 2024), and technology-infused speaking and listening practice (e.g., Spring & Takeda, 2023). One particular advantage suggested for digital flashcards over more traditional paper-based study is increased motivation. For example, Ying et al. (2021) found that young learners showed more interest in studying after implementing online flashcards as a part of their foreign language class. Similarly, Spring and Takeda (2023) found that university EFL learners found flashcards motivating but noted that adding point systems to the flashcards improved their motivational impact. Another advantage of digital flashcards is that they can encourage learners’ item recall when answers are shown on a delay (e.g., van den Broek et al., 2023). Specifically, van den Broek et al. (2023) have shown that when learning a foreign language, providing a cue followed by correct answers or choices after a short pause increases the odds of students remembering the words in the future. While paper-based flashcards or covering answers in a book could also provide this sort of practice, teachers cannot control student behavior and so cannot ensure that students are practicing as efficiently as possible.
Conversely, digital flashcards allow for delayed actions, forcing students to undergo recall before answers are shown (e.g., Strong & Leeming, 2023). Finally, some studies have suggested that the interactive nature of digital flashcards can make them more engaging. With paper-based practice, students generally only have to turn a page or shuffle a paper flashcard. However, digital flashcards can be modified to force students to click on correct answers (e.g., Nakata, 2020; van den Broek et al., 2023), write answers (e.g., Spring & Takeda, 2023; Strong & Leeming, 2024), or even watch and interact with videos (Hamagami et al., 2024). This is important because it is also widely thought that increased student engagement leads to higher motivation to learn, especially in online learning environments (e.g., Martin & Bolliger, 2018).
However, it should be mentioned that the effectiveness of these online tools seems to depend on how much they activate memory recall (van den Broek et al., 2023), a learner’s initial and corrected guesses (Strong & Boers, 2019), and the practice mode used (Spring & Takeda, 2023). Furthermore, though Strong and Boers (2019) and Strong and Leeming (2024) show the effects of such online tools on the learning of phrasal verbs, specifically, most studies to date have focused on the use of these tools for learning single-word vocabulary items (Nakata, 2020; Spring & Takeda, 2023; van den Broek et al., 2023), and there are no studies that we are aware of that explore the use of such tools to learn idiomatic expressions that are not phrasal verbs. This is important because idiomatic expressions are often more likely to be used in spoken contexts (Zareva, 2016), suggesting that text-based practice methods such as gap-fill activities and multiple-choice flashcards might be less effective for acquiring these items for use in oral contexts, where they are particularly relevant.

3. Learning Vocabulary Through Multiple Modes

In order for vocabulary to be truly acquired, learners should be able to process the word in reading, writing, listening, and speaking modes. For this reason, Nation (2020) recommends that vocabulary learning take place across these multiple skills. However, in the context of learning a foreign language, learners frequently encounter vocabulary in reading and writing situations, primarily because these skills are often taught using textbooks, and in the Asian EFL context, many learners attempt to memorize vocabulary through paper-based word lists such as vocabulary books (e.g., Yoshitomi et al., 2006). To give students more access to auditory and oral vocabulary practice, there has been a push from EFL practitioners to incorporate videos, audio, and other multimedia resources into learning curriculums and teaching practices (e.g., Kaku, 2018; Kim, 2019; Koh, 2020; Spring, 2019). Theoretically, this is very important to complement the more traditional reading and writing focus often found in Asian EFL contexts. However, despite these efforts, it has been difficult to point to the exact benefit of learning vocabulary through multimedia in a controlled learning context.
One reason it has been difficult to isolate the benefits of speaking activities for multiword vocabulary acquisition is a limitation shared by many flashcards and online tools: they tend to offer mostly reading or writing practice. For example, Quizlet flashcards often simply allow learners to pair L1 meanings to L2 words, but are generally based on visual input, making them reading focused (e.g., Kaku, 2018). However, studies such as Strong and Leeming (2024) and Nakata (2020) have also suggested that similar online tools can be created in which students write their answers. Furthermore, Takeda (2023) suggested that flashcards that include speaking and listening questions (based on text-to-speech listening and automatic speech recognition technology) may promote language acquisition. Spring and Takeda (2023) conducted an initial study that compared how multimodal flashcards that encouraged reading, writing, speaking, and listening could be utilized in the EFL classroom and verified that such multimodal usage helped students to obtain higher scores on subsequent quizzes. Furthermore, they discovered that using a variety of modes, and writing in particular, when studying was more effective than multiple-choice reading-based practice alone. However, their study focused on derivational single-word vocabulary and did not include a delayed test in the target context, so it is still unclear if the same trend will be seen for multiword vocabulary units, which are more common in oral registers. Furthermore, it is also not yet known how multimodal practice will affect students’ performance on a delayed test that requires contextual listening and understanding of the multiword vocabulary units.
Based on the previous studies introduced above, there is reason to believe that learning multiword vocabulary units could have a discernible impact on learners’ general EFL proficiency and that learning them through online tools could facilitate this learning. However, it is still not yet clear what the impact of learning through multiple modes is on the ability to (1) remember the units initially, or (2) listen to and comprehend these multiword vocabulary units in conversations later. In order to investigate this in detail and uncover the role of various modes in EFL multiword vocabulary acquisition, we pose the following research questions:
1) Which modes of practice (reading, writing, speaking, and listening) had the greatest impact on initial memorization of multiword vocabulary units?
2) Which modes of practice (reading, writing, speaking, and listening) had the greatest impact on the delayed ability to listen to and comprehend multiword vocabulary units in conversations?


1. Participants

A total of 229 L1 Japanese EFL learners who were first-year university students agreed to participate in this study. They gave informed consent and were allowed to withdraw from the study at any time, in accordance with the ethical review board at the authors’ institution. The participants had taken the TOEFL ITP® test of general academic English proficiency one month prior to the study and exhibited a wide range of scores (367-653; M = 500.3, SD = 39.53), with most students being around a high CEFR A2 level to a low CEFR B1 level.
The students participated in the activities described in this study as part of their university designated general English education classes. Students took one class focused on increasing listening and speaking skills taught by one of the two authors, and one class focused on reading and writing skills taught by another teacher. Students in the authors’ classes had to learn lists of phrasal verbs and idiomatic expressions as designated by their curriculum and course textbook, Pathways to Academic English, 4th edition (Spring & Scura, 2023). The expressions for the textbook were selected from Martinez and Schmitt (2012) and Simpson-Vlach and Ellis (2010) based on co-occurrence in both lists and suitability to the definitions of phrasal verbs and idiomatic expressions provided in Section 2.1. Learning these multiword vocabulary units was included in the speaking and listening centric class due to the fact that the selected expressions are argued to be common in spoken academic contexts.
Classroom time for both authors consisted of using worksheets containing pair and group speaking exercises, listening to audio files or watching videos meant to contextualize the target vocabulary, and later assigning weekly homework meant to practice the target vocabulary and skills through reading, writing, listening, and speaking.

2. Vocabulary Learning Tool

In order to help students learn and memorize the lists of phrasal verbs and idiomatic expressions, the authors created a multimodal flashcard system in which students could learn the multiword vocabulary units in L1-L2 matching, reading, writing, listening, and speaking modes. The model and technology mirror those of Spring and Takeda (2023), and an example is shown in Figure 1. The L1-L2 matching flashcards presented learners with the definition of the target multiword vocabulary unit in their L1 and four randomized multiple-choice options, one of which was the target unit. The reading flashcards were nearly identical, but learners were prompted with a sentence containing a blank in which one of the multiple-choice options would be acceptable. The writing flashcards also contained a sentence with a blank, but then asked students to write the missing piece of the phrase or expression, following Strong and Boers (2019). The speaking and listening flashcards presented users with a short phrase or sentence containing the target multiword vocabulary unit in context as well as “hear” and “speak” buttons. Upon clicking the “hear” button, students heard the phrase or sentence pronounced by in-device text-to-speech technology. Upon clicking the “speak” button, the user’s in-device automatic speech recognition application programming interface (API) was utilized to record the user’s voice as they attempted to pronounce the phrase or sentence. The flashcard then analyzed the text output from the API and checked for the multiword vocabulary unit, considering inclusion of the unit as a “correct” pronunciation.
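The correctness check for the speaking mode can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `normalize` and `contains_unit` are hypothetical, and the choices to ignore case and punctuation and to require the words of the unit to appear contiguously are assumptions, since the text only states that inclusion of the unit in the recognized text counts as "correct".

```python
import re
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation.

    Assumption: case and punctuation in the recognized text are ignored.
    """
    return re.findall(r"[a-z']+", text.lower())


def contains_unit(transcript: str, target_unit: str) -> bool:
    """Return True if the words of target_unit occur contiguously in transcript.

    Assumption: the unit must appear as an unbroken word sequence, so a
    separated phrasal verb (e.g., "throw it away") would not match.
    """
    words = normalize(transcript)
    unit = normalize(target_unit)
    n = len(unit)
    if n == 0:
        return False
    return any(words[i:i + n] == unit for i in range(len(words) - n + 1))


# Hypothetical speech-recognition output for a card targeting "get back".
recognized = "I will get back to you tomorrow."
is_correct = contains_unit(recognized, "get back")
```

Matching on whole word tokens rather than raw substrings avoids counting a phrase such as "forget backpacks" as a correct production of "get back".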
Students were provided with login credentials so that their data could be tracked and managed by their teachers. Each multiple-choice click, writing attempt, listening click, and speaking click was recorded and the number of attempts and correct usages were available for students to look at and for teachers to check. For this study, the authors asked students to use the flashcards and do a minimum of 30 total attempts before weekly quizzes on sets of the multiword vocabulary units but did not dictate the mode that students were required to use. The authors downloaded their students’ data just before the quizzes and once again before their final tests at the end of the semester.

3. Testing and Data Collection

Students took two quizzes of phrasal verbs, each containing 16 phrasal verbs, and four quizzes of idiomatic expressions, each containing between 11 and 20 expressions. The quizzes were administered weekly in the classroom through Google Forms, for a total of six weeks. The quizzes were created multimodally to promote the acquisition of the units in a variety of contexts (Spring & Takeda, 2023; Uchihara, 2022). Specifically, each quiz contained four question modes containing five questions each:
1) Listening questions in which students had to listen to a short phrase or sentence and choose the multiword vocabulary unit that they heard from four options (multiple-choice)
2) Meaning matching questions in which students had to choose the correct meaning of the target multiword vocabulary unit (multiple-choice)
3) Context matching questions in which students had to read a sentence containing a blank and choose the multiword vocabulary unit that best completed the sentence (multiple-choice)
4) Writing questions in which students had to read a sentence containing part of the multiword vocabulary expression and then fill in the missing piece of the expression (written, free answer)
Later, during the final class of a 15-week semester, the authors gave a final test to their students that contained 12 short conversation questions and 4 long conversation questions. All of these items were designed to test students’ delayed ability to listen to the multiword vocabulary units studied throughout the course and comprehend their meaning in the given context, with half of the questions designed to test the comprehension of phrasal verbs in context and the other half designed to test idiomatic expressions. The test also included other questions to test different skills not relevant to this study, but we isolated the data for these questions to obtain a measure of how well students had acquired the ability to listen to phrasal verbs and idiomatic expressions and understand them in context.
For this study, the total number of multimodal flashcard clicks was used to gauge the amount of practice undertaken in each mode. Additionally, we assessed students’ average quiz scores to measure their initial memorization of the multiword vocabulary items. As mentioned above, we considered students’ final test scores to evaluate their overall delayed ability to listen to and comprehend the target items in context. Lastly, we used students’ TOEFL ITP® test scores taken before the learning had begun to represent students’ initial general English proficiency.

4. Data Analysis

To assess the impact of different practice modes via flashcards on learning, we employed Mizumoto’s (2023) method of multiple regression with random forests to determine significance, along with dominance analysis with relative weights for relative importance. This statistical method was used because random forests are considered more accurate than p values in multiple regression models for indicating predictive significance, and because dominance analysis helps counteract covariance when uncovering the relative importance of each variable in the model. The first set of regression models used average quiz scores as the outcome (dependent) variable and the second set used final test scores. Both sets of models included the four practice modes and TOEFL ITP® scores as the predictor (independent) variables. The TOEFL ITP® test scores were included to account for initial English proficiency. We expected TOEFL ITP® scores to be the largest predictors of quiz and test scores as the learning phase took place over a matter of just seven weeks. We then considered any other variables confirmed as significant by the random forests to have had a significant impact on learning. Furthermore, we used the relative importance metrics as a way to check which practice modes had the most impact on overall learning. We then make generalized comments based on these analyses and the descriptive data, and contextualize the results with reference to other similar studies.
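The idea of relative importance can be made concrete with a small, self-contained sketch. It is not the procedure used in this study: it omits the random-forest significance step entirely, and in place of relative weights it computes general dominance weights, that is, each predictor's average gain in R² across all subsets of the other predictors. All function names are hypothetical, and the toy values are invented purely for illustration; they are not data from this study.

```python
from itertools import combinations
from typing import Dict, List, Sequence


def center(v: Sequence[float]) -> List[float]:
    """Subtract the mean so that the regression intercept drops out."""
    mean = sum(v) / len(v)
    return [e - mean for e in v]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(u, v))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns: List[Sequence[float]], y: Sequence[float]) -> float:
    """R^2 of an ordinary least squares fit of y on the given columns."""
    if not columns:
        return 0.0
    xs = [center(c) for c in columns]
    yc = center(y)
    xtx = [[dot(a, b) for b in xs] for a in xs]
    xty = [dot(a, yc) for a in xs]
    beta = solve(xtx, xty)
    return dot(beta, xty) / dot(yc, yc)


def general_dominance(predictors: Dict[str, Sequence[float]],
                      y: Sequence[float]) -> Dict[str, float]:
    """Average incremental R^2 of each predictor over all subsets of the others."""
    names = list(predictors)
    weights = {}
    for name in names:
        others = [n for n in names if n != name]
        by_size = []
        for k in range(len(others) + 1):
            gains = []
            for subset in combinations(others, k):
                base = [predictors[n] for n in subset]
                gains.append(r_squared(base + [predictors[name]], y)
                             - r_squared(base, y))
            by_size.append(sum(gains) / len(gains))
        weights[name] = sum(by_size) / len(by_size)
    return weights


# Invented toy data (NOT from the study): eight hypothetical learners.
toy = {
    "toefl": [480, 510, 455, 530, 500, 470, 520, 495],
    "writing_clicks": [5, 20, 0, 12, 30, 8, 15, 3],
    "speaking_clicks": [0, 10, 2, 0, 25, 4, 6, 1],
}
toy_quiz = [60, 85, 50, 80, 95, 62, 84, 66]
weights = general_dominance(toy, toy_quiz)
```

The weights sum to the R² of the full model, so each one can be read as that predictor's share of the explained variance. This is what allows practice modes to be ranked against one another even when their click counts are correlated.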


The descriptive statistics regarding students’ clicks, average quiz scores, and final test scores are presented in Tables 1 and 2. They show that multiple-choice questions were the most popular practice mode and that there was a wide range of both average quiz scores and final test scores. These initial results are similar to those of Spring and Takeda (2023) and suggest that when given a choice, students default to doing less-active reading and L1-L2 matching questions (i.e., multiple-choice questions), perhaps because they are easiest or most familiar to students in the Japanese EFL context. Furthermore, it is clear that while most students did at least some practice with the online flashcards, some did not use either the phrasal verbs or the idiomatic expressions flashcards at all, and most students did not use the listening or speaking practice modes for either vocabulary type. In addition, the results suggest that the quizzes were easier for students than the final tests. This could be due in part to the fact that the final tests required listening ability in addition to knowledge of the multiword vocabulary units. However, it could also be due in part to the fact that students simply forgot some of the target items by the end of the semester.
The first two multiple regression analysis models significantly predicted students’ average quiz scores for both phrasal verbs (F = 18.23; p < .01; R² = .282) and idiomatic expressions (F = 15.33; p < .01; R² = .263), explaining 28% and 26% of the respective variances. The results of this analysis are presented in Tables 3 and 4 and graphically represented in Figures 2 and 3. The random forests confirmed that all of the modes had some impact on the average quiz scores for both phrasal verbs and idiomatic expressions. However, the relative importance analyses show that after adjusting for initial English proficiency (i.e., TOEFL ITP® scores), the two modes of practice that had the greatest impact on average quiz scores were writing and speaking, which held true for both phrasal verbs and idiomatic expressions. This suggests that although initial English proficiency was important for quiz scores, productive practice, specifically speaking and writing, was the most helpful for students to remember the multiword vocabulary items in the short term.
The second two multiple regression analysis models significantly predicted students’ final test scores for both phrasal verbs (F = 37.65; p < .01; R² = .456) and idiomatic expressions (F = 33.53; p < .01; R² = .447), explaining 46% and 45% of the respective variances. The results of this analysis are presented in Tables 5 and 6 and graphically represented in Figures 4 and 5. The random forests confirmed that in addition to TOEFL ITP® scores, writing, listening, and speaking practice impacted final test scores of phrasal verbs, and that only listening and speaking practice impacted final test scores of idiomatic expressions. Furthermore, relative importance scores showed that while TOEFL ITP® scores had the most predictive power within the model, as expected, the numbers of listening clicks and speaking clicks were far more important for the delayed final test scores than multiple-choice or writing practice. This means that for studying both phrasal verbs and idiomatic expressions, speaking and listening practice had much more of an impact on students’ final test scores than multiple-choice (i.e., reading) or writing practice.


First, the analysis of the average quiz scores suggests that writing and speaking had the most impact on short-term memorization. Specifically, these two practice modes had the most explanatory power in the average quiz score regression models, and this was equally true for both phrasal verbs and idiomatic expressions. This finding aligns partially with the results of Spring and Takeda (2023), who identified writing as the most crucial mode for learning derivative vocabulary. The heightened importance of speaking practice in this study could be attributed to the nature of multiword vocabulary expressions. Unlike single-word derivations, multiword vocabulary expressions are language chunks. Hence, learners could have received additional benefit from producing the entire chunk through the speaking practice mode, as opposed to only part of it in the writing mode. Furthermore, the results of this study show some symmetry with those of Strong and Leeming (2024) in that gap-fill exercises helped learners to improve their phrasal verb knowledge, which could be explained in part by the trial-and-error nature of the writing practice mode.
Comparing the results of the predictive models of average quiz scores to those of final test scores suggests that speaking and listening had more impact on long-term acquisition or on domain-specific comprehension. Specifically, the relative importance of flashcard usage was much lower when predicting final test scores than when predicting quiz scores, and this trend was the same for both phrasal verbs and idiomatic expressions. This finding indicates two possibilities: either students started forgetting the target multiword items after several weeks, or the in-context listening questions on the final test were considerably more challenging than the rather straightforward quiz questions. While we cannot be certain which possibility is the more likely explanation, we do find it particularly noteworthy that speaking and listening practice modes significantly predicted performance on the final test, surpassing the predictive value of writing or multiple-choice modes. This discrepancy in final test score predictability versus quiz score predictability might be attributed, at least partially, to the fact that practicing listening and speaking modes aligned more directly with the oral format of the test questions. Though there were listening questions on the quizzes as well, those questions simply tested the ability of students to pick out the multiword vocabulary unit that they heard, rather than their ability to listen to the units and comprehend them in context. Therefore, there was likely some influence from the style of the questions asked in quizzes as opposed to tests, but it is also clear that practicing in the oral register allowed for greater long-term effects when tested later in the same register.
Second, it is also noteworthy that while writing practice had positive short-term effects on item knowledge retention, this knowledge did not necessarily transfer over to listening comprehension ability, as measured by students’ final tests. This could also be due in part to the difference in registers between writing and listening comprehension. However, it might also be due in part to the fact that while writing helped prepare students for straightforward form-recognition questions on the quizzes, it did not help prepare them for understanding the meanings in broader contexts.
Taken together, the findings of this study highlight the importance of employing various modes in learning multiword vocabulary. First, active practice such as writing is important for short-term retention of multiword vocabulary. Second, oral practice is valuable for enhancing long-term acquisition, including the ability to comprehend multiword vocabulary in conversational or oral-based English contexts. These findings carry several implications. First, they suggest that educators should promote interactive and multimodal engagement with the target language. Second, they suggest that while acquiring vocabulary is an important endeavor, engaging solely in reading-based activities or L1-L2 paired practice is not enough to promote true acquisition. Rather, students need to actively use the target language, particularly in its intended context or register; i.e., just reading vocabulary books will not promote speaking and listening skills. While this notion has long been somewhat intuitive among EFL educators, this study presents controlled evidence that a variety of practice modes, specifically practice in the target register, is extremely important. Lastly, the results of this study indicate that students often gravitate to the easiest or most familiar modes of learning. Therefore, teachers must encourage and guide their students to interact with the target language in a variety of ways. While multimodal flashcards, as demonstrated in this study, can aid learning, their limited impact on delayed final test scores as compared with short-term quiz scores suggests they may be most effective as a supplementary tool. In other words, online practice should be considered the starting point to promote initial learning, but subsequent practical activities in the classroom may be essential for long-term proficiency gains.


This study investigated the effects of different practice modes (reading, writing, speaking, and listening) on the short-term retention and long-term comprehension of multiword vocabulary units. Students studied phrasal verbs and idiomatic expressions through online reading, writing, speaking, and listening flashcards over several weeks and subsequently took a series of multimodal quizzes to determine their short-term retention of the multiword vocabulary units. Weeks later, students took contextualized final tests to determine their lasting vocabulary acquisition. The results suggest that utilizing multimodal flashcards can benefit the learning of multiword vocabulary, with an impact on immediate quiz scores and a sustained, albeit weaker, impact on contextual comprehension. The study also revealed that while using a wider variety of study modes had positive impacts on students, their most preferred mode was clearly simple multiple-choice reading and L1-L2 pairing practice. Consequently, we recommend encouraging students to use more of the various modes available and incorporating multimodal flashcard practice alongside practical classroom activities to reinforce learning.
It should be noted that this study has a number of limitations. First, the data from our multiple-choice flashcards are somewhat problematic because the answers were not presented on a timed delay, and the absence of such a delay can hinder recall (e.g., Strong & Leeming, 2024; van den Broek et al., 2023). In addition, our data did not distinguish how many multiple-choice clicks were L1-L2 meaning matching and how many were contextual practice. Addressing these points in the future might provide further insights into learner practice preferences and how the two types of multiple-choice practice influence learning outcomes differently. Second, because we were unable to collect quantitative data on this aspect, it remains unknown to what extent students’ engagement in classroom activities or homework contributed to their acquisition of phrasal verbs and idiomatic expressions. For instance, students engaged in various activities such as watching videos containing these expressions, completing worksheets, and practicing listening exercises for homework. However, we were unable to determine the level of active participation from students in these tasks. Collecting more data on these activities in the future could shed light on these factors, allowing for a better understanding of the variance in the models and providing additional insights into integrating multimodal flashcard practice with in-class and homework activities. Nevertheless, the fact that the random forests confirmed some of the flashcard practice modes as influential on both quiz and test scores shows that this practice plays at least a mild role in student learning, and therefore we can still recommend this type of practice.
Based on the results and limitations discussed above, we would also like to recommend that future studies continue to examine how various modes of practice impact learners’ skills and explore what can be done to further motivate students to not only use the tools more, but also use a wider variety of modes. Some possible areas that could be studied regarding the former include a one-to-one matching of practice mode and testing mode, e.g., examining if writing practice modes will promote writing recall, specifically, more than other modes of practice. Some areas that should be further explored regarding motivation include gamification, increased interactivity, and adaptive features such as intelligent tutoring systems.

Examples of Phrasal Verbs Flashcards
Random Forests and Relative Importance of Phrasal Verb Average Quiz Scores
Note: MC = multiple choice, sh = shadow
Random Forests and Relative Importance of Idiomatic Expressions Average Quiz Scores
Note: MC = multiple choice, sh = shadow
Random Forests and Relative Importance of Phrasal Verb Final Test Scores
Note: MC = multiple choice, sh = shadow
Random Forests and Relative Importance of Idiomatic Expression Final Test Scores
Note: MC = multiple choice, sh = shadow
Table 1.
Descriptive Statistics for Clicks and Average Scores of Quizzes and Final Tests (Phrasal Verbs)

| Statistic | Multiple choice clicks | Writing clicks | Listening clicks | Speaking clicks | Quiz score | Final test score |
|---|---|---|---|---|---|---|
| Range | 0-1657 | 0-960 | 0-305 | 0-496 | .6-1 | .0625-1 |
| Average | 142.7 | 45.7 | 13.1 | 18.5 | .8376 | .6809 |
| Median | 115 | 11 | 0 | 0 | .85 | .6875 |
| SD | 139.4 | 89.2 | 35.1 | 45.9 | .0874 | .1799 |

Table 2.
Descriptive Statistics for Clicks and Average Scores of Quizzes and Final Tests (Idiomatic Expressions)

| Statistic | Multiple choice clicks | Writing clicks | Listening clicks | Speaking clicks | Quiz score | Final test score |
|---|---|---|---|---|---|---|
| Range | 0-1142 | 0-1033 | 0-370 | 0-348 | .4875-1 | .0625-1 |
| Average | 215.0 | 71.7 | 20.5 | 24.1 | .8765 | .6898 |
| Median | 184 | 25 | 0 | 0 | .9125 | .6875 |
| SD | 145.0 | 119.5 | 57.4 | 57.1 | .1044 | .1826 |

Table 3.
Regression Model Statistics Predicting Phrasal Verb Average Quiz Scores

| Predictor | B | β | VIF | t | Random forest | Relative importance |
|---|---|---|---|---|---|---|
| TOEFL ITP® score | .000878 | .397 | 1.022 | 6.968 | Confirmed | 55.1% |
| Multiple choice clicks | -.000034 | -.055 | 2.428 | -.631 | Confirmed | 6.1% |
| Writing clicks | .000238 | .243 | 2.573 | 2.735 | Confirmed | 17.7% |
| Listening clicks | .000175 | .070 | 1.330 | 1.103 | Confirmed | 5.7% |
| Speaking clicks | .000337 | .177 | 1.348 | 2.746 | Confirmed | 15.4% |
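
As a reading aid for Table 3, the standardized coefficient β of a predictor in a linear regression relates to its unstandardized coefficient B through β = B × SD(predictor) / SD(outcome). The sketch below applies this to the click predictors, assuming the standard deviations in Table 1 were computed on the same sample as the regression (the article does not state this explicitly). The variable and function names are illustrative, and the TOEFL ITP® predictor is omitted because its standard deviation is not reported in this section.

```python
# Standard deviations from Table 1 and coefficients from Table 3 of this article.
# Assumption: Table 1 SDs describe the same sample used for the regression.
SD_QUIZ = 0.0874  # SD of phrasal verb average quiz scores

PREDICTORS = {
    "multiple_choice": {"B": -0.000034, "sd": 139.4, "beta_reported": -0.055},
    "writing":         {"B": 0.000238,  "sd": 89.2,  "beta_reported": 0.243},
    "listening":       {"B": 0.000175,  "sd": 35.1,  "beta_reported": 0.070},
    "speaking":        {"B": 0.000337,  "sd": 45.9,  "beta_reported": 0.177},
}


def standardized_beta(b, sd_x, sd_y):
    """Convert an unstandardized slope B into a standardized coefficient."""
    return b * sd_x / sd_y


recomputed = {
    name: standardized_beta(p["B"], p["sd"], SD_QUIZ)
    for name, p in PREDICTORS.items()
}

for name, beta in recomputed.items():
    reported = PREDICTORS[name]["beta_reported"]
    print(f"{name:16s} recomputed={beta:+.3f} reported={reported:+.3f}")
```

Under that assumption, the recomputed values agree with the reported β column to within rounding of the published B values, which illustrates why writing and speaking clicks carry the largest standardized effects among the flashcard modes despite their small raw coefficients.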

Table 4.
Regression Model Statistics Predicting Idiomatic Expression Average Quiz Scores

| Predictor | B | β | VIF | t | Random forest | Relative importance |
|---|---|---|---|---|---|---|
| TOEFL ITP® score | .001025 | .379 | 1.032 | 6.414 | Confirmed | 59% |
| Multiple choice clicks | .000027 | .038 | 1.291 | .570 | Confirmed | 4.1% |
| Writing clicks | .000187 | .214 | 1.659 | 2.911 | Confirmed | 22.1% |
| Listening clicks | -.000029 | -.016 | 1.129 | -.261 | Confirmed | 0.8% |
| Speaking clicks | .00022 | .120 | 1.271 | 1.871 | Confirmed | 14% |

Table 5.
Regression Model Statistics Predicting Phrasal Verb Final Test Scores

| Predictor | B | β | VIF | t | Random forest | Relative importance |
|---|---|---|---|---|---|---|
| TOEFL ITP® score | .00303 | .667 | 1.022 | 13.177 | Confirm | 95% |
| Multiple choice clicks | .00005 | .040 | 2.428 | .516 | Reject | 0.45% |
| Writing clicks | -.0000325 | -.016 | 2.573 | -.204 | Confirm | 0.38% |
| Listening clicks | -.0005637 | -.011 | 1.330 | -1.941 | Confirm | 3.26% |
| Speaking clicks | .000326 | .083 | 1.348 | 1.455 | Confirm | 0.82% |

Table 6.
Regression Model Statistics Predicting Idiomatic Expression Final Test Scores

| Predictor | B | β | VIF | t | Random forest | Relative importance |
|---|---|---|---|---|---|---|
| TOEFL ITP® score | .000002 | .653 | 1.032 | 12.186 | Confirm | 93% |
| Multiple choice clicks | .000011 | .001 | 1.291 | .021 | Reject | 0.04% |
| Writing clicks | -.000314 | .007 | 1.659 | .104 | Reject | 0.53% |
| Listening clicks | .000271 | -.099 | 1.129 | -1.806 | Confirm | 3.60% |
| Speaking clicks | .003083 | .085 | 1.271 | 1.457 | Confirm | 2.83% |


Al-Otaibi, G. M. (2019). A cognitive approach to the instruction of phrasal verbs: Rudzka-Ostyn’s model. Journal of Language and Education, 5(2), 4-15. https://doi.org/10.17323/jle.2019.8170.
Barcroft, J. (2020). Key issues in teaching single words. In S. Webb (Ed.), The Routledge handbook of vocabulary studies. pp 479-492. Routledge: https://doi.org/10.4324/9780429291586.
Birdsell, B. J., & Kavanagh, B. (2023). Turned on a quasi-experimental design with phrasal verbs: Does the type of learning intervention matter? In R. J. Dickey & H. Lee (Eds.), AsiaTEFL proceedings 2023: Papers from the 21st AsiaTEFL conference. pp 165-179. AsiaTEFL.
Boers, F. (2021). Evaluating second language vocabulary and grammar instruction: A synthesis of the research on teaching words, phrases, and patterns. Routledge.
Brown, D., Stoeckel, T., Mclean, S., & Stewart, J. (2022). The most appropriate lexical unit for L2 vocabulary research and pedagogy: A brief review of the evidence. Applied Linguistics, 43(3), 596-602. https://doi.org/10.1093/applin/amaa061.
Cooper, T. C. (1999). Processing of idioms by L2 learners of English. TESOL Quarterly, 33(2), 233-262. https://doi.org/10.2307/3587719.
Crowley, K., Haugh, S., & Spring, R. (2023). An examination of correlations between multiword expression interpretability and general proficiency test scores. APU Journal of Language Research, 8, 46-61. https://doi.org/10.34409/apujlr.8.1_47.
Gardner, D., & Davies, M. (2018). Sorting them all out: Exploring the separable phrasal verbs of English. System, 76, 197-209. https://doi.org/10.1016/j.system.2018.06.009.
Garnier, M., & Schmitt, N. (2015). The PHaVE list: A pedagogical list of PVs and their most frequent meaning senses. Language Teaching Research, 19(6), 645-666. https://doi.org/10.1177/1362168814559798.
Hamagami, K., Spring, R., Nakamura, S., & Otsuki, A. (2024). Short videos for learning indirect speech: Impacts on outcomes and effects of video modulation. In A. Leis & M. Wilson (Eds.), Screen media in language education (in progress). Castledown.
Haugh, S., & Takeuchi, O. (2023). Learner knowledge of English phrasal verbs: Awareness, confidence, and learning experiences. International Journal of Applied Linguistics, 34(2), 656-671. https://doi.org/10.1111/ijal.12523.
Inagaki, S. (2002). Japanese learners’ acquisition of English manner-of-motion verbs with locational/directional PPs. Second Language Research, 18(3), 3-27. https://doi.org/10.1191/0267658302sr196oa.
Kaku, J. (2018). Discourse analysis and electronic flashcard software: Using films and TV dramas on DVD as learning materials and Anki as an individual study tool. Teaching English Through Movies: ATEM Journal, 23, 3-16.
Kim, S. (2019). The effects of using images and video clips for vocabulary learning: Single words vs phrasal verbs. STEM Journal, 20(2), 19-42. https://doi.org/10.16875/stem.2019.20.2.19.
Koh, S. (2020). Initial formulaic sequence approach for EFL learners: Based on film corpus-based patterns. STEM Journal, 21(4), 1-17. https://doi.org/10.16875/stem.2020.21.4.1.
Kyle, K., & Crossley, S. A. (2015). Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Quarterly, 49(4), 757-786. https://doi.org/10.1002/tesq.194.
Lee, S., & Pulido, D. (2017). The impact of topic interest, L2 proficiency, and gender on EFL incidental vocabulary acquisition through reading. Language Teaching Research, 21(1), 118-135. https://doi.org/10.1177/1362168816637381.
Martinez, R., & Schmitt, N. (2012). A phrasal expression list. Applied Linguistics, 33(3), 299-320. https://doi.org/10.1093/applin/ams010.
Martin, F., & Bolliger, D. U. (2018). Engagement matters: Student perceptions on the importance of engagement strategies in the online learning environment. Online Learning, 22(1), 205-222. https://doi.org/10.24059/olj.v22i1.1092.
McLean, S., Stewart, J., & Batty, A. O. (2020). Predicting L2 reading proficiency with modalities of vocabulary knowledge: A bootstrapping approach. Language Testing, 37(3), 389-411. https://doi.org/10.1177/0265532219898380.
Mizumoto, A. (2023). Calculating the relative importance of multiple regression predictor variables using dominance analysis and random forests. Language Learning, 73(1), 161-196. https://doi.org/10.1111/lang.12518.
Nakata, T. (2020). Learning words with flash cards and word cards. In S. Webb (Ed.), The Routledge handbook of vocabulary studies. pp 304-319. Routledge: https://doi.org/10.4324/9780429291586.
Nation, I. S. P. (2020). The different aspects of vocabulary knowledge. In S. Webb (Ed.), The Routledge handbook of vocabulary studies. pp 15-29. Routledge: https://doi.org/10.4324/9780429291586.
Rudzka, B., & Ostyn, P. (2003). Word power: Phrasal verbs and compounds. Mouton de Gruyter.
Sag, I., Baldwin, T., Bond, F., Copestake, A., & Flickinger, D. (2002). Multiword expressions: A pain in the neck for NLP. In A. Gelbukh (Ed.), Lecture notes in computer science: Vol. 2276. Computational linguistics and intelligent text processing. pp 1-15. Springer: https://doi.org/10.1007/3-540-45715-1_1.
Seo, J. (2014). A study on language development through movie utterances: On the basis of Desperate Housewives. Teaching English Through Movies: ATEM Journal, 19, 89-103. https://doi.org/10.24499/atem.19.0_89.
Simpson-Vlach, R., & Ellis, N. C. (2010). An academic formulas list: New methods in phraseology research. Applied Linguistics, 31(4), 487-512. https://doi.org/10.1093/applin/amp058.
Spring, R. (2018). Teaching phrasal verbs more efficiently: Using corpus studies and cognitive linguistics to create a particle list. Advances in Language and Literary Studies, 9(5), 121-135. https://doi.org/10.7575/aiac.alls.v.9n.5p.121.
Spring, R. (2019). Using short animations to help teach phrasal verbs from a cognitive typology standpoint: Implications from two years of comparative data. STEM Journal, 20(4), 105-122. https://doi.org/10.16875/stem.2019.20.4.105.
Spring, R., & Scura, V. (Eds.). (2023). Pathways to academic English (4th ed.). Tohoku University Press.
Spring, R., & Takeda, J. (2023). The effect of multimodal flashcards on L2 derivational vocabulary knowledge: A preliminary analysis of attempts and quiz question modes. Japan Association of Language Education and Technology Kanto Chapter, 8, 1-24. https://doi.org/10.24781/letkj.8.0_1.
Strong, B., & Boers, F. (2019). The error in trial and error: Exercises on phrasal verbs. TESOL Quarterly, 53, 289-319. https://doi.org/10.1002/tesq.478.
Strong, B., & Leeming, P. (2024). Evaluating the application of a gap-fill exercise on the learning of phrasal verbs: Do errors help or hinder learning? TESOL Quarterly, 58(2), 726-750. https://doi.org/10.1002/tesq.3248.
Takeda, J. (2023). Pathways to academic English: Multi-modal online tools: Initial analysis of advantages and student feedback. Tohoku University, Institute for Excellence in Higher Education Annual Bulletin, 9, 43-48.
Titone, D. A., & Connine, C. M. (1999). On the compositional and noncompositional nature of idiomatic expressions. Journal of Pragmatics, 31, 1655-1674. https://doi.org/10.1016/s0378-2166(99)00008-9.
Uchihara, T. (2022). How does the test modality of weekly quizzes influence learning the spoken forms of second language vocabulary? TESOL Quarterly, 57(2), 595-617. https://doi.org/10.1002/tesq.3176.
van den Broek, G. S. E., Gerritsen, S. L., Oomen, I. T. J., Velthoven, E., van Boxtel, F. H. J., Keter, L., & van Gog, T. (2023). Optimizing multiple-choice questions for retrieval practice: Delayed display of answer alternatives enhances vocabulary learning. Journal of Educational Psychology, 115(8), 1087-1109. https://doi.org/10.1037/edu0000810.
White, B. J. (2012). A conceptual approach to the instruction of phrasal verbs. The Modern Language Journal, 96(3), 419-438. https://doi.org/10.1111/j.1540-4781.2012.01365.x.
Wolter, B. (2020). Key issues in teaching multiword items. In S. Webb (Ed.), The Routledge handbook of vocabulary studies. pp 493-510. Routledge: https://doi.org/10.4324/9780429291586-31.
Yasuda, S. (2010). Learning phrasal verbs through conceptual metaphors: A case of Japanese EFL learners. TESOL Quarterly, 44(2), 250-273. https://doi.org/10.5054/tq.2010.219945.
Ying, Y., Marchelline, D., & Wijaya, G. (2021). Using technology-flashcards to encourage students learning Mandarin. Journal of Physics: Conference Series, 1764, 012138. https://doi.org/10.1088/1742-6596/1764/1/012138.
Yoshitomi, A., Umino, T., & Negishi, M. (2006). Readings in second language pedagogy and second language acquisition: In Japanese context. John Benjamins Publishing.
Zareva, A. (2016). Multi-word verbs in student academic presentations. Journal of English for Academic Purposes, 23, 83-98. https://doi.org/10.1016/j.jeap.2016.07.001.
Copyright © 2024 by The Society for Teaching English through Media.
