A Semi-Autonomous Pronunciation Learning Framework: Integrating Motivation and Skill Development Through Online Video Tasks
Abstract
This study designs and evaluates the My Vocabulary Database (MVD) activity, a semi-autonomous, ICT-enhanced out-of-class assignment for pronunciation courses that integrates individual learning (video-based vocabulary extraction, speech-style shadowing) and collaborative learning (cloud sharing, peer evaluation). Addressing the limited out-of-class second language (L2) exposure in EFL contexts, the MVD framework targets Self-Determination Theory needs—competence through brief instructor feedback and relatedness via peer monitoring. Implemented with Japanese university English majors, the study examined perceived outcomes for listening, pronunciation, and vocabulary, the motivational role of recording submissions and feedback, and the effects of collaborative elements. Data were collected through online surveys and analyzed using descriptive statistics and chi-square tests. Survey results indicated high perceived effectiveness: nearly all participants reported gains in listening, vocabulary, and pronunciation, expressed strong appreciation for concise feedback, and showed over 99% intention to continue video-based study. Findings suggest that combining learner autonomy (personal choice), competence (supported through concise instructor feedback), and relatedness (peer/instructor interaction) sustains motivation and supports habit formation. Overall, structured ICT tasks that blend individualized practice with social accountability foster engagement and autonomy more effectively than either approach alone, reinforcing the value of integrating diverse elements within the Self-Determination Theory framework to promote lifelong L2 learning.
I. INTRODUCTION
In EFL contexts such as in Japan, a long-standing challenge is how learners can achieve sustained exposure to English beyond formal classes. When naturalistic input is limited, it is difficult for learners to improve pronunciation, listening comprehension, and vocabulary. Therefore, it is essential to increase both the amount and quality of learning time outside the classroom. Self-directed learning mediated by ICT tools has attracted attention as a promising response to this challenge. However, autonomy should not be regarded merely as an internal individual attribute; its maintenance and development necessitate adequate scaffolding from others (often referred to as autonomy support), as Dörnyei (2003) emphasizes, as well as satisfying the need for relatedness through interaction with peers (Deci & Ryan, 2012). Self-Determination Theory (SDT) (Deci & Ryan, 1985, 2012) is central to this study because it conceptualizes motivation as the fulfilment of three basic psychological needs—autonomy, competence, and relatedness. These needs provide a useful framework for understanding how semi-autonomous tasks can foster sustained engagement in out-of-class learning contexts. Therefore, the primary goal of this study is to design a learning activity grounded in Second Language Acquisition (SLA) principles and informed by SDT, which effectively integrates individual and collaborative elements to satisfy learners’ basic psychological needs—autonomy, competence, and relatedness.
Audiovisual media offer learning opportunities at the discourse and pragmatic levels that are often lacking in pedagogically simplified English materials, through rich situational contexts and important pragmatic cues (Sherman, 2003). Furthermore, the growth of streaming platforms has been reported to help maintain motivation and increase time on task by enabling learners to view L2 content repeatedly according to their interests (Dizon, 2018). These findings suggest the need to incorporate key requirements such as learner choice (Dörnyei, 2003), rich language input (Sherman, 2003), minimal instructor support (Dizon, 2018), and mutual accountability among peers (Deci & Ryan, 2012) into the design of out-of-class learning in EFL settings. Moreover, making visible how these components function is indispensable for sustaining self-directed learning.
Despite growing research on ICT-mediated autonomous learning, few studies have examined how semi-autonomous tasks can integrate both individual and collaborative elements to enhance motivation in pronunciation-focused EFL courses. This gap highlights the need for empirical research that examines how learners respond to ICT-based assignments designed to promote autonomy and sustained engagement outside the classroom.
To address this challenge, Ramsden (2020) developed a “semi-autonomous” learning task, Video Report, in which learners summarize key points and engage in reflection using video materials. The study showed the efficacy of semi-autonomous learning with minimal instructor support. Ramsden and Matsui (2025) further extended the Video Report approach by implementing an out-of-class assignment that combined viewing subtitled videos with a cloud-shared spreadsheet in translation and TOEIC classes. They reported its contribution to vocabulary acquisition and the establishment of out-of-class study habits.
Building on this line of inquiry, this study designed and implemented an out-of-class assignment suitable for pronunciation classes, the My Vocabulary Database (MVD) activity. This activity comprises the following four core components:
(a) Individual learning with subtitled video watching and extraction of target vocabulary
(b) Creating a vocabulary database (MVD) through cloud-based sharing
(c) Audio recordings following the shadowing task and concise instructor feedback
(d) Production of a one-minute video presentation as a capstone for learning, followed by peer evaluation
This paper sought to answer the following research questions:
RQ1: What perceived outcomes does MVD yield for learners’ listening, pronunciation, and vocabulary acquisition?
RQ2: To what extent are recording submissions and instructor feedback related to learner motivation and intention to continue learning?
RQ3: Do the collaborative elements of MVD—peer monitoring through shared spreadsheets and peer evaluation through on-demand, one-minute video viewing—affect learner motivation?
This paper examines the design and effects of MVD. It first reviews prior literature and the theoretical underpinnings, then details the implementation procedures of MVD as an out-of-class assignment for a pronunciation course. Subsequently, the results of a learner survey conducted with 167 English majors at a Japanese university are reported. Finally, the implications of a semi-autonomous task that integrates individual and collaborative learning to foster learner motivation and autonomy are discussed.
II. LITERATURE REVIEW AND THEORETICAL BACKGROUND
1. Developments in Motivation Research
Motivation has been established as a central factor shaping learning outcomes in SLA research. Gardner and Smythe (1975) distinguished between integrative (aimed at identifying with the target culture) and instrumental (based on pragmatic goals) orientations. They showed that attitude and motivation are related to learners’ persistence or attrition in L2 studies. Dörnyei (1998, 2003) developed L2 motivation research into a situated approach, emphasizing that the quality of the learning environment and task design directly influence learners’ motivation.
SDT was originated by Deci and Ryan (1985) and has since been developed across a range of fields. Deci and Ryan (2012) consolidate three basic psychological needs as the core of the framework sustaining learners’ motivation and well-being: (1) autonomy, the desire to act on the basis of one’s own choices; (2) competence, the desire to exercise and demonstrate ability through successful task completion; and (3) relatedness, the desire to experience connection and a sense of belonging with others. When these requirements are satisfied, optimal motivation and sustained learning engagement are possible.
Furthermore, SDT distinguishes intrinsic motivation (driven by inherent interest and enjoyment) from extrinsic motivation (driven by external rewards or evaluations) (Deci & Ryan, 2012). In their study of Canadian L2 learners, Noels et al. (2000) demonstrated that learners’ perceptions of competence and freedom of choice (autonomy) were positively related to intrinsic motivation and negatively related to L2 anxiety, thereby strengthening their intention to persist in L2 learning. Collectively, these findings support the view that the SDT framework, which encompasses basic psychological needs (autonomy, competence, and relatedness), effectively models motivation and learning persistence in L2 acquisition.
When learners perceive that no effort will improve their outcomes, a state of learned helplessness (or amotivation) can arise (Dörnyei, 1994). This finding aligns with the reformulated learned helplessness theory, which explains that when people feel their actions have no effect on outcomes, they tend to lose motivation and give up (Abramson et al., 1978). In L2 settings, it is associated with reduced persistence when success expectancy is low (Dörnyei, 1998). In Japanese universities, while an extrinsic motive (earning course credit) is present during the semester, once that goal has been achieved, many students lose reasons to continue, and some lapse into amotivation at the end of the course, making out-of-class learning difficult.
2. The Significance of Using Authentic Materials
In SLA research, conditions for effective input have been discussed from multiple perspectives. Krashen’s (1985) Input Hypothesis posits that comprehensible input provides the foundation for language acquisition. Meanwhile, Schmidt’s (1990) Noticing Hypothesis stresses the importance of learners’ conscious attention to linguistic form. Against this theoretical backdrop, authentic video materials, such as films and TV dramas, are considered to play an important role in language acquisition. King (2002) argues that feature films are an effective instructional resource in EFL contexts, enhancing learners’ intrinsic motivation and fostering their intercultural understanding. Quaglio (2009) further shows that sitcoms provide a rich repository of naturally occurring conversational language. Kobayashi (2017) points out that TV series may be more suitable than films for language learning, as they may offer greater repetition of expressions across episodes, an abundance of natural colloquial language, a consistent context, and the relative ease of forming a sustained viewing habit. These findings indicate that films and TV dramas offer learners natural and diverse language situations, serving as powerful materials that support sustained learning.
The recent spread of streaming services reinforces this trend. Dizon (2018) conducted a learner study using Netflix and found that regular viewing fostered vocabulary learning and listening comprehension. Moreover, Dizon and Thanyawatpokin (2021) reported that dual subtitles in Language Learning with Netflix significantly promoted vocabulary learning and supported listening comprehension more than other onscreen textual aids. These findings are directly relevant to the present study, as MVD also integrates subtitled video viewing as a central component of out-of-class learning. Both studies highlight how personalized, interest-driven exposure to authentic media through streaming platforms can enhance comprehension, engagement, and motivation. Furthermore, they illustrate how advances in ICT have expanded access to authentic English input and provided a foundation for sustained learning.
In EFL environments, including Japan, a persistent gap has been noted between pedagogically adjusted classroom materials and real-world English use. Kobayashi (1999, 2006) argues that learners’ dependence on textbooks can lead to vulnerability in real-world communicative situations. Kobayashi (2006) explicitly identifies this as “helplessness.” Moreover, in EFL contexts, studies tend to remain confined to school subjects, and a salient L2 community is often absent (Dörnyei, 2003). Consequently, learners are prone to feeling a disconnect between classroom instruction and practical language use, which makes it difficult to sustain motivation.
In this context, incorporating authentic materials into out-of-class assignments is crucial for supporting learners’ sustained engagement in learning. Utilizing video materials that simulate authentic language use both inside and outside the classroom effectively prepares learners for communication beyond the classroom and holds particular significance in EFL contexts. Thus, authentic video materials are positioned as a practical means not only to meet theoretical conditions such as comprehensibility, noticing, authenticity, and multimodality but also to enhance learner motivation and support sustained learning. Materials that use films, TV dramas, and streaming services may offer effective solutions to the challenges faced in the EFL context.
However, much of the existing evidence focuses primarily on in-class activities, leaving relatively limited empirical knowledge on the design of sustained out-of-class use and maintaining motivation. This study seeks to bridge this gap by designing and evaluating a semi-autonomous out-of-class assignment that integrates collaborative elements centered on authentic videos and ICT.
III. PATHWAYS FOR SUPPORTING AUTONOMOUS LEARNING WITH ICT
1. ICT-Supported Autonomous Learning
The proliferation of streaming services and online materials has enabled learners to access authentic English outside the classroom at their own pace. However, as Benson (2013) pointed out, autonomy is not fostered by a laissez-faire approach; rather, it requires structured support. Holec (1981, p. 3) defines autonomy as “the ability to take charge of one’s own learning.” It is posited as the capacity to take personal responsibility for goal setting, content/method selection, and monitoring/evaluation of one’s progress. To promote learner autonomy, it is necessary to provide meaningful choices regarding learning content and methods, and to build support for those choices (see Deci & Ryan, 2012; Dörnyei, 2003; Noels et al., 2000).
Visualizing learning progress (via self-assessment and feedback), goal setting, and learner cooperation can help sustain motivation (Dörnyei, 2003). In addition, learner–learner interactions can create conditions that enhance learning quality and performance (Akiyama, 2017). Learners seek connections with their peers, and learning within these relationships becomes a major factor in sustaining continued engagement. This aligns with SDT’s need for relatedness (Deci & Ryan, 2012). Therefore, for persistence outside class, it is crucial to integrate a light form of peer-to-peer mutual accountability into assignment design.
In response to these issues, Ramsden (2020) proposed the Video Report activity, which requires learners to select a video, summarize its content, and reflect on their language use. The activity offers a framework for autonomous learning that has proved effective and that holds learners accountable even outside class.
2. Positioning of the Present Study
Previous research has demonstrated the effectiveness of semi-autonomous tasks that impose learner accountability in ensuring the continuity of out-of-class learning in EFL contexts. Ramsden’s (2020) Video Report adopted a format requiring learners to select videos and report on them to the instructor through summarization and reflection, thereby preventing the setbacks often associated with laissez-faire autonomous learning. Building on this framework, Ramsden and Matsui (2025) adapted the task for translation and TOEIC classes, combining subtitled video viewing with a cloud-shared spreadsheet for vocabulary extraction and database creation, and demonstrated its effectiveness in vocabulary acquisition and the formation of study habits.
However, these prior implementations were limited to vocabulary-focused designs and did not support the integrated acquisition of other language components. They also showed room for improvement in addressing “learned helplessness”—a known inhibitor of sustained learning (Abramson et al., 1978; Dörnyei, 2003). When learners feel that effort does not lead to results, motivation declines rapidly, and continuing out-of-class learning becomes difficult. To address this problem, it is essential to design interventions that satisfy competence and relatedness, as articulated in SDT (Deci & Ryan, 2012).
The MVD proposed in this study is designed to meet this theoretical requirement. Specifically, brief instructor feedback on the post-shadowing recording task signals the possibility of improvement and supports learners’ self-efficacy, thereby fulfilling their sense of competence. Simultaneously, peer evaluation of the video presentation and the cloud-shared vocabulary database satisfy relatedness through learner–learner interaction and alleviate feelings of isolation. This task design is expected to prevent the emergence of amotivation and contribute to the maintenance of motivation.
Figure 1 illustrates how the MVD introduced in this study enhances and sustains learner motivation. Learners enhance their motivation by pursuing interest and enjoyment through video viewing, sharing information with instructors and peers, and connecting with the course content. In addition, by extracting authentic English expressions and up-to-date language use from videos, they experience authenticity, which can further strengthen their motivation. Ultimately, this cyclical structure leads learners toward autonomous learning and contributes to the consolidation of study habits.
The MVD integrates individual and collaborative learning by implementing both elements online. Specifically, in each session, learners submit a recorded task, and the instructor provides positive feedback and a minimal set of points for improvement via chat. This exchange is intended to foster a growth mindset without undermining learner motivation, while simultaneously cultivating a sense of connection with the instructor. Thus, this structure integrates personalization and collaboration within a single task, thereby incorporating a cycle of motivation, habit formation, and skill development into a semi-autonomous out-of-class assignment.
IV. MVD IMPLEMENTATION DESIGN
1. Details of the MVD Framework
This section provides a detailed explanation of the MVD framework, drawing on the present implementation. MVD was implemented over 12 weeks within a one-semester elective course that primarily focused on pronunciation improvement. The participants were 167 first-year English majors at a private university in Japan enrolled in four sections of the same course offered across the spring and autumn semesters.
Within the MVD framework, learners select video materials aligned with their purposes and preferences, watch them, collect vocabulary items that they deem necessary, and create an original vocabulary list (MVD) using a cloud spreadsheet. Additionally, learners shadow sentences containing the collected vocabulary, modeling the video speakers’ utterances, and submit the resulting audio recordings.
One cycle of the assignment runs from video viewing to submission of the audio recording file, and the cycle was conducted as an out-of-class task from Week 2 through Week 13 of the 14-week semester. The number of expressions collected in the vocabulary list was at the learner’s discretion; however, for the recording task, learners selected one item from the list, namely “the phrase they would most like to use in daily life.” Table 1 summarizes the overall structure of the MVD activity, including its main components, procedures, and the platforms used for implementation.
2. Vocabulary Database Created and Shared via a Cloud Spreadsheet
Learners select a 5–10-minute video from streaming services, including YouTube, Netflix, Amazon Prime Video, and Disney+. The selection pool includes movies, TV series, anime dubbed in English, celebrity interviews, and social media video posts. Learners are encouraged to choose content relevant to their interests.
Learners view the selected clips in three stages. The first viewing is with Japanese subtitles, focusing on overall comprehension and enjoyment. The second viewing is with English subtitles, with learners reading the captions while parsing the audio. The third viewing is for extracting expressions from the video that are useful in everyday life (the criterion of usefulness being determined by the learner). There is no quota for the number of items extracted; the only rule is that learners must engage in the task at least once per week.
Learners record the extracted words and phrases, together with the source sentence and translation, in a cloud spreadsheet and enter metadata such as the date, platform, and genre. They are instructed to complete all tasks online, and the spreadsheet is shared among groups of five to ten students to promote “peer monitoring.” Here, learners mutually check one another’s progress, viewed materials, and extracted vocabulary. The instructions given to the students are as follows:
• Instructions for Students to Create MVD
1) Cooperatively share individual spreadsheets within groups for peer monitoring.
2) Choose a 5–10-minute video clip from a streaming service based on your personal interests.
3) View the video with Japanese subtitles, and then with English subtitles.
4) Use a cloud-based spreadsheet to record expressions, creating a repository.
Note: There is no required number of items to list, but students must complete the task once per week.
A sample of the spreadsheet is presented in Figure 2 below.
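The record structure and weekly rule described in this subsection can be sketched in code. The following is a minimal illustration only, not the spreadsheet layout actually used in the course: the class name `MvdEntry`, its field names, and the helper `weeks_with_entries` are hypothetical, the sample rows are invented for demonstration, and the weekly check assumes ISO calendar weeks.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MvdEntry:
    """One spreadsheet row. Field names are illustrative assumptions."""
    viewed_on: date        # date metadata
    platform: str          # e.g., "YouTube", "Netflix"
    genre: str             # e.g., "drama", "interview"
    expression: str        # extracted word or phrase
    source_sentence: str   # sentence in which the expression appeared
    translation: str       # learner's translation


def weeks_with_entries(entries):
    """Return the set of (ISO year, ISO week) pairs containing at least one entry."""
    return {tuple(e.viewed_on.isocalendar()[:2]) for e in entries}


# Invented sample rows, for demonstration only.
sample = [
    MvdEntry(date(2024, 4, 15), "YouTube", "interview", "figure out",
             "I'm still trying to figure it out.", "(learner's translation)"),
    MvdEntry(date(2024, 4, 17), "Netflix", "drama", "hang in there",
             "Hang in there, okay?", "(learner's translation)"),
    MvdEntry(date(2024, 4, 24), "Netflix", "drama", "no way",
             "No way, that can't be true.", "(learner's translation)"),
]

active_weeks = weeks_with_entries(sample)
print(sorted(active_weeks))
```

Because there is no quota on the number of items, the check counts weeks rather than entries: two rows in the same week satisfy the rule exactly as one row does.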
3. Course-Integrated Pronunciation Practice and Audio Recording Tasks
Following the creation of the MVD, learners select one sentence (or a coherent segment of dialogue) each week from the extracted set. They then shadow the audiovisual material, imitating the speakers’ utterances, and submit the audio files. This shadowing is positioned not as mere repetition but as “speech-style mimicry,” where learners imitate the style of the video speaker (e.g., actor, presenter). Learners are instructed to choose a role model—someone they feel they would like to emulate—and to faithfully reproduce their rhythm, intonation, and emotional expressions. The audio recording task should focus on English-like rhythm and intonation, rather than on the clarity of individual phonemes (vowels and consonants). The instructions given to learners are as follows:
• Instructions to Learners for the Audio Recording Task
1) Select a sentence or short dialogue from the transcriptions in the spreadsheet for shadowing practice.
2) Use the video for practice, emphasizing English rhythm over precise articulation.
3) Submit an audio file via the Microsoft Teams chat function to receive comments.
The workflow of the audio recording task is shown in Figure 3.
For the submitted recording files, the instructor returns brief but positive comments and a minimal set of points for improvement via chat. This brief exchange is intended to raise awareness of improvement without undermining learners’ motivation, while simultaneously cultivating a sense of connection with the instructor.
Although MVD was primarily designed as an out-of-class assignment, the first five minutes of each class session in the present implementation were set aside for students to briefly share the videos they selected and the expressions they extracted, and to exchange opinions. This in-class activity fostered learner cohesion and enhanced mutual accountability. While priority was placed on English rhythm and intonation for the audio recording task, learners were also instructed, when feasible, to attend to the phonetics and phonology topics covered in each lesson to align with the pronunciation course content.
4. Final Task: On-Demand Video Presentation
At the end of the semester, learners complete a final on-demand assignment by creating a one-minute English presentation video and sharing it online. In this course implementation, the videos were shared with the entire class using Microsoft Teams. For the presentation, learners select one work from those viewed for MVD during the term and explain why that specific work is effective for English learning.
To create the presentation, the choice of tools and methods is left to the learners’ discretion (e.g., producing a video with PowerPoint, filming on a smartphone, recording in Zoom, or editing via a combination of methods). The submission format must be a video file, such as MP4. Specific instructions for learners are as follows:
• Instructions for students to create presentation videos
1) Create a one-minute presentation video.
2) You may use any production method (e.g., PowerPoint, smartphone, Zoom, or any combination).
3) Choose exactly one video you watched for your Video Report and introduce it.
4) Deliver the entire presentation in English.
5) Submit the video file (e.g., MP4) to the designated Microsoft Teams folder by the deadline.
• Components to be included in the presentation
1) Title of the recommended video (Japanese title)
2) Title of the recommended video (English/original title)
3) Video genre (e.g., drama, comedy, horror, documentary, interview)
4) Platform on which the video is available (e.g., Netflix, Amazon Prime Video, YouTube)
5) Reason for recommending the video
6) How this video contributes to your English learning
The use of AI and translation functions was permitted for creating the presentation videos in the classroom implementation. However, students were instructed to avoid overly difficult vocabulary and expressions that their classmates might not comprehend. It was therefore emphasized that students should prioritize “presenting in English that the audience can understand.”
In the present implementation, assignment details were presented in the latter half of the course (Week 9). Students were then required to submit a draft outline that included the items in (4) above. The instructor provided feedback on the outline and, where necessary, encouraged students to adjust the direction or focus of the presentation.
Presentation videos were shared online at the class level (via Microsoft Teams), and learners viewed the presentations on demand. After viewing, learners provided peer feedback to their pre-assigned group members, and all comments were compiled into a file and submitted to the instructors. This arrangement discouraged unconstructive comments and officially positioned peer feedback as an assessment element.
In this implementation, the students were divided into groups of approximately ten, and the Microsoft Teams chat function was used for peer feedback. This marked the conclusion of the out-of-class learning cycle. Learners then conducted a self-assessment, and the instructor provided the final evaluation of the MVD content and the on-demand presentation.
V. RESULTS
Participation in the post-course reflection survey was voluntary. As the survey involved self-assessment of learning progress, it was not anonymous. Participants were informed that their names would not appear in any publication, that the survey results would have no effect on their course grades, and that their data would be used for research purposes only with their consent. In addition, items other than the self-assessment were optional, and participants could choose whether to respond. The analysis was conducted based on the data collected under these ethical conditions. The results presented in this section address the research questions by describing learners’ responses to the MVD implementation and the motivational factors identified through both quantitative and qualitative data.
1. Results of the Learner Survey on MVD Implementation
1) Learner Demographics and Viewing Habits
The proficiency distribution of the 167 participants showed that the B1 level was the largest group with 82 students (49.1%), followed by A2 with 73 students (43.7%). Only 12 students (7.2%) were classified as A1 or B2 (Table 2). Overall, most learners were at the lower to mid-intermediate level.
Regarding video-viewing habits before and after the start of the MVD tasks, more than 90% reported an increase in viewing, with 39.5% stating it “increased considerably” and 52.1% “increased slightly.” “No change” was reported by 7.8% of the participants, and very few reported a decrease. These changes in viewing habits are summarized in Table 3.
2) Perceived Learning Effectiveness
Regarding perceived learning effects, students evaluated video viewing positively in three specific areas: vocabulary learning, listening improvement, and grammar learning. Vocabulary showed the strongest perceived benefit, with 100% of students rating video viewing as either “very helpful” (59.9%) or “somewhat helpful” (40.1%). Listening improvement was also rated highly, with 98.8% of students reporting positive effects (61.1% “very helpful” and 37.7% “somewhat helpful”). In addition, 83.8% of students considered video viewing helpful for grammar learning. Table 4 summarizes these positive responses for each of the three skill areas.
These results can be interpreted within the framework of Krashen’s (1985) Input Hypothesis and Schmidt’s (1990) Noticing Hypothesis. Learners’ reports of improved listening and vocabulary suggest that frequent video viewing provided rich, comprehensible input. Furthermore, the task elements of “vocabulary extraction” and “shadowing recording” appear to have heightened learners’ attention to linguistic forms, suggesting that the task promoted noticing during language processing.
In terms of preferred viewing mode, “English audio + English subtitles” overwhelmingly dominated at 74.4%, far exceeding no subtitles (9.8%) and L1 subtitles (15.9%).
3) Evaluation of Pronunciation Improvement and Task Design
In response to the question, “In the Video Report task, you watched English-audio videos, practiced shadowing, and submitted a recording of your performance. Do you think this method helped improve your pronunciation?” perceived pronunciation improvement was high, with 53.9% “strongly agree” and 43.7% “somewhat agree.” Concretely, “natural English shadowing” (65.3%) and “regular viewing” (45.5%) were reported as effective strategies.
Regarding the recording tasks overall, 97.6% reported that they were effective for improving pronunciation: 48.5% stated this was because the tasks “compel practice,” and 35.9% because they “raise motivation.” Concerning teacher feedback on the recording tasks, 83.2% indicated that “it is better to have feedback even if only brief,” which suggests that learners valued even minimal instructional intervention.
4) Continuation Intent and Habit Formation
Regarding intent to continue, over 99% indicated some commitment, with 74.3% wishing to continue “voluntarily” and 25.1% “if assigned.” Crucially, 94.0% answered that they would like to continue “as entertainment” as well, not purely for learning purposes. This suggests that the habit of watching English-language videos had become integrated into their daily lives.
2. Perceptions of the Collaborative Learning Activities
1) Spreadsheet Sharing
Regarding spreadsheet sharing, students’ reasons for agreement were diverse: 64 students (38.3%) agreed that they “wanted to see others’ expressions,” 44 (26.3%) “wanted to know the titles of videos others were watching,” and 3 (1.8%) cited “other reasons.” Conditional approval (“I would share if instructed”) was given by 33 students (19.8%). Only 22 students (13.2%) were opposed, with most stating that they “did not want others to see their expressions.”
2) Presentation Video Sharing
As for sharing the one-minute presentation video as a final task, 45 students (26.9%) agreed, and 87 (52.1%) said they “would permit it if instructed,” meaning that approximately 80% were positive toward sharing when conditional approval was included. This result indicates that many learners viewed video sharing favorably as an element of collaborative learning. However, 35 students (21.0%) opposed it, revealing that a segment of the learners felt anxious about the visibility of the video task.
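The arithmetic behind the "approximately 80%" figure can be reproduced from the counts reported above. The sketch below is illustrative only; the dictionary labels and the `percent` helper are names introduced here, not part of the survey instrument.

```python
N_RESPONDENTS = 167  # respondents, as reported in the text

# Response counts for sharing the one-minute presentation video.
counts = {
    "agree": 45,
    "permit if instructed": 87,
    "oppose": 35,
}


def percent(count, total=N_RESPONDENTS):
    """Share of respondents as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: percent(n) for label, n in counts.items()}

# Unconditional and conditional approval combined.
positive_share = percent(counts["agree"] + counts["permit if instructed"])

print(shares)
print(positive_share)
```

The combined share is computed from the summed counts (132 of 167) rather than by adding the two rounded percentages, which avoids compounding rounding error.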
3) Analysis of Open-Ended Responses (Qualitative Results)
Of 167 participants, 73 (43.7%) provided open-ended responses. Among them, 30 (41.1%) described “gratitude/satisfaction,” 19 (26.0%) “perceived learning effects,” and 18 (24.7%) “changes in motivation”; together, these 67 responses (91.8%) gave an overall positive evaluation. Typical comments included “I deepened my understanding in a lecture focused on pronunciation” and “I discovered the enjoyment of continuing to learn through films.” A small number of responses (6.8%) were requests for improvement, such as “increase instruction on English rhythm and stress.” Table 6 summarizes the distribution of the qualitative responses.
A chi-square test comparing the proportions of positive and negative responses showed that positive responses were significantly more frequent, χ²(1) = 50.97, p < .001, consistent with the trends observed in the quantitative results.
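The reported statistic is consistent with a chi-square goodness-of-fit test against equal expected frequencies. The sketch below is one way to reproduce it, under two assumptions that are not stated explicitly in the text: that 67 of the 73 open-ended responses (91.8%) were counted as positive and the remaining 6 as non-positive, and that the expected counts were equal. For one degree of freedom, the upper-tail p-value equals erfc(sqrt(χ²/2)), so only the standard library is needed.

```python
import math


def chi_square_equal_expected(observed):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Assumed split of the 73 open-ended responses (not given directly in the text).
positive = 67      # 30 + 19 + 18 responses coded as positive
non_positive = 6   # assumption: all remaining responses

chi2 = chi_square_equal_expected([positive, non_positive])
p = p_value_df1(chi2)

print(f"chi2(1) = {chi2:.2f}, p < .001: {p < 0.001}")
```

With these assumed counts, the statistic rounds to 50.97, matching the reported value.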
VI. DISCUSSION
The MVD implemented in this study was designed to combine individual and collaborative learning elements. The survey results indicate that individual tasks (such as vocabulary extraction and shadowing) support the autonomy and personalization of learning. By contrast, collaborative tasks (including peer monitoring and feedback) tend to enhance learners’ motivation and instill a sense of responsibility in group members. This suggests that integrating individual and collaborative elements may effectively sustain learner motivation. These findings build on previous semi-autonomous learning frameworks (e.g., Ramsden & Matsui, 2025) by indicating that such an approach can also work in pronunciation-focused courses. This extension suggests that semi-autonomous frameworks can be flexibly adapted to different learning goals and course types, thereby broadening their pedagogical applicability in EFL contexts.
A second key finding relates to feedback and motivation. Although the teacher’s feedback was minimal, its motivational effect was pronounced. The learners felt that their efforts were acknowledged through concise comments, which contributed to their continued engagement. For instructors, sending detailed feedback to every student in large classes each week is highly labor-intensive. However, the survey results suggest that even short comments delivered via chat were sufficient to sustain learners’ motivation. Instructors therefore do not necessarily need to provide lengthy and detailed evaluations; effective support can be offered through concise and regular responses. Furthermore, the motivational effect of learner choice and peer interaction observed in this study supports Dörnyei’s (2003) argument that task design and autonomy support are crucial for sustaining engagement in EFL contexts. This connection highlights how integrating structured peer collaboration within semi-autonomous frameworks can strengthen motivation and sustain engagement in out-of-class learning.
A further point concerns the role of authentic materials. Streaming services provide the learners with authentic input that is highly motivating and easily accessible. Furthermore, the combined use of Japanese and English subtitles functioned as scaffolding for comprehension, supporting a gradual transition toward more autonomous viewing. This finding underscores the critical role of authentic materials in facilitating autonomous learning.
A limitation also emerged with respect to autonomy, which points to a challenge for future work. Although many learners expressed a willingness to continue studying after the assignment ended, their tendency to rely on structured tasks highlighted the difficulty of achieving full autonomy within a single semester. To help learners become independent lifelong learners, it is necessary to combine organizational support with phased scaffolding.
Finally, an additional consideration relates to collaborative sharing. While sharing via spreadsheets was widely accepted, some learners felt anxious about self-presentation in relation to video sharing. This result suggests that although collaborative learning activities hold educational value, tasks that involve high performance visibility may generate psychological resistance. Therefore, careful implementation that considers learners’ psychological safety is required. According to Edmondson (1999, p. 354), psychological safety refers to “a shared belief that the team is safe for interpersonal risk taking.” In education, it means that learners must feel comfortable expressing themselves without fear of negative evaluation. In the present study, maintaining psychological safety requires a careful balance between visibility and privacy when learners share recordings and peer feedback. Such consideration is vital for maintaining trust and motivation in technology-mediated collaborative learning. Nevertheless, the interaction between the individual learning tasks included in the MVD design (vocabulary extraction and shadowing) and the collaborative learning tasks (peer monitoring and feedback) appeared to contribute to sustaining learners’ motivation. Accordingly, task design should maximize the benefits of collaborative sharing while safeguarding psychological safety and accommodating these varied learner responses.
VII. CONCLUSION
This study demonstrates the potential of a learning approach that integrates ICT-based tasks to reinforce in-class learning while fostering autonomy through semi-autonomous out-of-class assignments. The strategic use of ICT proved effective in ensuring convenience and enabling learners to engage with tasks easily.
Learners perceived the video-based MVD as effective for improving listening, vocabulary, and pronunciation, and as enhancing their motivation. A key finding was that combining monitoring with both individual and collaborative learning elements appeared to help learners sustain engagement more than either approach alone would. Furthermore, teacher feedback positively influenced motivation. Ultimately, cultivating learner autonomy through university courses appears to be essential for students as they transition into society.
Grounded in SDT (Deci & Ryan, 1985, 2012), this study demonstrates how fulfilling the three basic psychological needs—autonomy, competence, and relatedness—enhances sustained motivation in EFL learning. The mechanisms built into the MVD activity concretely address these needs: teacher feedback fosters competence by signaling that improvement is achievable, while cloud-based sharing and peer feedback promote relatedness and a sense of belonging. Additionally, learners’ freedom to select video materials based on personal interests contributes to autonomy and self-regulation. Together, these factors form a motivational cycle that supports persistence in out-of-class learning.
These findings also extend prior research (e.g., Dörnyei, 2003; Ramsden, 2020; Ramsden & Matsui, 2025) by showing that semi-autonomous frameworks integrating individual and collaborative learning can be effectively applied in pronunciation-focused contexts. Furthermore, the use of authentic audiovisual materials aligns with findings by King (2002), Quaglio (2009), and Dizon (2018), emphasizing their role in promoting intrinsic motivation and sustained engagement.
The semi-autonomous, ICT-enhanced assignment presented in this study incorporates mechanisms that concretely satisfy the “competence” and “relatedness” emphasized in SDT. Specifically, teacher feedback signals to learners that improvement is possible and supports their sense of competence, whereas cloud-based sharing and peer feedback help them feel connected to their classmates. Furthermore, combining these elements with video viewing tailored to learners’ personal interests creates powerful factors that sustain motivation.
In conclusion, this study highlights the importance of designing ICT-enhanced learning models that balance individual choice, peer collaboration, and instructor feedback within an SDT-based framework. Future research should explore how this framework can be applied to EFL learners of various proficiency levels, particularly those at the lower levels (e.g., CEFR A1), where learner autonomy is often less developed. In addition, longitudinal studies would be valuable to examine how the sustained use of semi-autonomous ICT tasks contributes to long-term language development and learner autonomy.
Notes
The term semi-autonomous is used because minimal scaffolding is provided by the instructor, and although the task is conducted outside the classroom, it is assessed as part of the course; therefore, it is not fully autonomous learning.
Language Learning with Netflix was later renamed Language Reactor. URL: https://www.languagereactor.com/
For films or TV dramas, learners are instructed to segment the material at an appropriate point into a 5–10-minute portion and view that segment.