Oral Communication Skills and Literacy Gains Through AI Conversational Practice Across Proficiency Levels
Abstract
This study investigated the effects of AI conversational practice on university students’ speaking and listening skills, ChatGPT literacy, and perceptions of ChatGPT use across different English proficiency levels. Participants were divided into low- and high-proficiency groups. Both groups participated in oral practice with ChatGPT, targeting speaking and listening skills. Quantitative data were collected through pre- and post-listening and speaking tests and a 20-item ChatGPT literacy questionnaire that measured perceptions of usefulness, ease of use, affective responses, and strategic use of ChatGPT. Paired-samples t-tests, independent-samples t-tests, and ANCOVA were conducted to examine within-group gains and between-group differences. Qualitative data were obtained from open-ended survey responses and were analyzed using thematic analysis. The results showed that both proficiency groups demonstrated statistically significant improvements in speaking and listening skills, although high-proficiency learners exhibited larger gains in both areas. Both groups also showed statistically significant growth in ChatGPT literacy, and no statistically significant differences between proficiency levels were found. The findings highlight that learners at different proficiency levels derive distinct affective versus cognitive benefits from ChatGPT, underscoring the need for proficiency-sensitive instructional design and supporting its use as a supplementary tool to enhance oral proficiency and ChatGPT literacy.
I. INTRODUCTION
The rapid development of artificial intelligence (AI) has profoundly influenced all areas of contemporary society, with education undergoing a swift transition toward technology-based learning environments. In this context, the effective integration of AI tools and platforms into classrooms has become essential to provide learners with meaningful learning experiences, signaling a paradigm shift in pedagogy. Foreign language education, in particular, has increasingly embraced AI not merely as a supplementary tool but as a means of fostering critical thinking, creative problem-solving, and digital literacy, competencies regarded as central to 21st-century learning (Kim & Kim, 2025). Zhai et al. (2024) further emphasize that the ability to understand and evaluate AI technologies constitutes a vital skill in modern society.
Among the various generative AI applications, ChatGPT has emerged as one of the most influential tools in language education. Developed by OpenAI and based on large language models and generative pre-trained transformer (GPT) technology, ChatGPT has been widely adopted since its release in November 2022. Although not originally designed for translation, it has been applied across multiple language pairs (Khoshafah, 2023) and is now used for diverse tasks such as answering questions, generating texts, correcting grammar, summarizing, problem solving, and translation (Kim et al., 2025).
ChatGPT literacy plays a critical role in enabling language teachers to design effective instructional materials and provide pedagogical support that responds to learners’ diverse proficiency levels and individual needs (Liu, 2025). For instructors, such literacy supports the development of data-informed instructional practices and the provision of individualized feedback aligned with students’ linguistic readiness. For learners, ChatGPT literacy functions as a practical competence that allows them to engage meaningfully with adaptive AI technologies, benefit from personalized feedback, and interact strategically with ChatGPT as a learning resource.
The pedagogical potential of ChatGPT can be interpreted through Gass et al.’s (2020) model of second language acquisition, which views language learning as an interactive process involving input, interaction, and feedback. By providing accessible input, immediate feedback, and opportunities for meaningful interaction, ChatGPT supports both the cognitive and social dimensions of foreign language learning.
In educational contexts, ChatGPT enables teachers to design personalized learning environments (Baidoo-Anu & Ansah, 2023), supports text and dialogue generation (Crosthwaite & Baisa, 2023), and provides immediate feedback that enhances immersive language learning (AlAfnan et al., 2023; Choe, 2023; Im, 2023; Yu & Yoo, 2024). Moreover, studies report that ChatGPT-assisted instruction increases both intrinsic and extrinsic motivation (Ali et al., 2023; Aydın Yıldız, 2023; Kim et al., 2025) and sustains learners’ engagement and interest in language study (Song & Kim, 2025).
Empirical research has generally reported positive effects of AI integration across the four language skills—reading, writing, listening, and speaking—although findings remain mixed depending on factors such as task type, learner variables, and instructional design. While several studies highlight improvements in learner achievement (Han, 2023; Kim & Park, 2023; Yu & Yoo, 2024), others suggest that AI-based instruction does not always yield significant differences compared to traditional methods (Kim et al., 2025). Affectively, AI tools have been shown to expand opportunities for communication, enhance motivation, and provide interactive and personalized learning experiences (Im, 2023; Jeong, 2024). Learners also report that engaging with ChatGPT promotes self-directed learning and contributes to the development of critical and creative thinking by requiring them to analyze, evaluate, and creatively apply AI-generated information (Kim & Kim, 2025; Li et al., 2025). Nevertheless, limitations such as occasional inconsistencies in content, reduced accuracy, and the need for faster response times have been noted (Kim & Kim, 2025; Lee et al., 2024; Xiao & Zhi, 2023).
Despite these promising findings, the existing literature remains limited in several respects. Most studies to date have focused on short-term interventions with small samples and have often examined only one or two language skills or specific tasks, making it difficult to generalize the results. In addition, relatively few studies have investigated differential effects across proficiency levels or systematically examined learners’ AI-related literacy, indicating a need for further research that adopts more diverse samples and multi-dimensional outcome measures. Moreover, little attention has been paid to how ChatGPT-assisted speaking practice shapes learners’ oral communication skills and their development of ChatGPT literacy. To address these gaps, the present study examines the impact of oral practice with ChatGPT on students’ speaking and listening outcomes, their ChatGPT literacy, and their perceptions of ChatGPT use, focusing on differences across English proficiency levels.
Accordingly, the study addresses the following research questions:
1. To what extent does ChatGPT-assisted oral practice influence students’ speaking and listening skills across different English proficiency levels?
2. How does ChatGPT-assisted oral practice affect students’ ChatGPT literacy according to their proficiency level?
3. How do students’ perceptions of ChatGPT-assisted oral practice differ depending on their proficiency level?
II. LITERATURE REVIEW
1. Enhancing Language Skills via ChatGPT Use
Technologically, ChatGPT distinguishes itself from earlier AI tools through its foundation in advanced large language models and generative pre-trained transformer architectures. This enables it to surpass previous chatbots in versatility, depth of understanding, and applicability, offering creative outputs across diverse tasks and presenting new possibilities for foreign language education.
From a theoretical perspective, the use of ChatGPT aligns with Gass et al.’s (2020) model of second language acquisition, which conceptualizes language learning not as mere knowledge transmission but as a complex cognitive and social process. The model emphasizes the stages of input, comprehension, interaction, output, and feedback, illustrating how learners internalize language and develop communicative competence. Gass et al. particularly highlight the central role of interaction and feedback in accelerating language development, as learners actively engage in communication, correct errors, and repeatedly practice language use in meaningful contexts. In this regard, ChatGPT functions as an effective tool by providing rich linguistic input, real-time feedback, and opportunities for meaningful interaction, thereby creating a balanced environment that integrates cognitive approaches with social contexts. Ultimately, Gass et al.’s framework offers a theoretical justification for the integration of technology-based interactive learning, positioning ChatGPT as a pedagogically sound innovation in foreign language education.
Recent studies have examined the pedagogical effects of ChatGPT across multiple language skills, highlighting its potential to support oral proficiency as well as cognitive aspects of language learning. Within the field of second language acquisition, oral proficiency is commonly conceptualized as the ability to communicate verbally in a functional and accurate manner in the target language. A high level of oral proficiency further entails the ability to transfer linguistic knowledge to novel contexts and communicative situations (Omaggio, 1986). Building on this conceptualization, recent work has increasingly explored how AI-driven conversational agents such as ChatGPT can be used to foster learners’ oral skills.
In the area of speaking, ChatGPT has been shown to promote oral communication through interactive, simulated dialogues. By providing immediate feedback and a low-anxiety environment, ChatGPT enables learners to practice fluency and pronunciation more comfortably than in traditional classroom settings (Noh, 2024). Empirical findings further indicate that ChatGPT-assisted speaking practice contributes to improvements in vocabulary use and pronunciation while simultaneously reducing speaking anxiety, suggesting its effectiveness in enhancing overall speaking proficiency (Cheon, 2023; Noh, 2024).
Song and Kim (2025) demonstrated that the degree of engagement plays a critical role in the effectiveness of ChatGPT-assisted speaking tasks. Although both high- and low-engagement learners showed improvement in speaking performance, the high-engagement group achieved significantly greater gains. Importantly, learner perceptions varied by engagement level: low-engagement learners primarily valued the convenience and emotional support provided by ChatGPT, whereas high-engagement learners reported more substantive linguistic benefits, such as vocabulary expansion, increased confidence, and reduced speaking anxiety. These findings suggest that ChatGPT can meaningfully enhance language learning outcomes when instructional tasks are deliberately designed to promote sustained and active participation.
Extending this AI-assisted perspective, Kim (2025) examined the overall effectiveness of ChatGPT-assisted practice for Korean university students (N = 51). Although both the ChatGPT-assisted experimental group (n = 26) and the traditionally instructed control group (n = 25) showed significant improvements in speaking and listening skills, no statistically significant differences were found between the groups, suggesting that ChatGPT-assisted practice did not lead to additional gains beyond traditional instruction in this context. Even so, the experimental group showed clear progress in ChatGPT literacy, with notable gains in technical proficiency, critical evaluation, communication skills, and creative application, though ethical competence remained relatively unchanged. Learners also expressed positive perceptions of ChatGPT use, highlighting increased confidence, improved interaction skills, and greater accessibility to practice opportunities, despite persistent challenges such as speech recognition errors and occasional unnatural responses.
In reading instruction, ChatGPT has been utilized as a scaffold to support comprehension and meaning-making processes. Studies have shown that ChatGPT assists learners in thematic reading-to-writing tasks by helping them synthesize information, organize ideas, and engage more actively with texts (Caruana et al., 2022). Additionally, the supportive and interactive nature of AI-mediated instruction has been found to reduce anxiety often associated with traditional reading tasks, thereby encouraging deeper cognitive engagement (Caruana et al., 2022).
Building on this line of research, Kim et al. (2025) examined the impact of integrating ChatGPT into EFL reading courses on learners’ ChatGPT literacy and perceptions. In a study with 37 undergraduate students enrolled in a general English course, learners engaged in structured discussions with ChatGPT based on reading materials, participating in activities such as applying key concepts, exploring interpretations, and checking comprehension. Analysis of pre- and post-course questionnaires and semi-structured interviews revealed significant gains across five core dimensions of ChatGPT literacy. Students reported generally trusting ChatGPT while remaining cautious about the accuracy of its output, and they employed strategies such as asking follow-up questions, refining prompts, and cross-checking information with reliable sources. These findings suggest that the effective integration of AI tools in EFL education requires the development of learners’ critical thinking, ethical awareness, and adaptive use strategies.
With regard to writing, research has demonstrated that ChatGPT facilitates instant feedback and text refinement. By offering personalized suggestions on vocabulary choice and textual organization, ChatGPT supports learners in developing more coherent and sophisticated written output (Choe, 2023; Jeong, 2024). Moreover, continuous individualized feedback provided by AI tools has been associated with reduced writing anxiety and increased learner confidence, contributing to more positive writing experiences (AlAfnan et al., 2023; Choe, 2023).
ChatGPT has also been found to support grammar and vocabulary development. AI-based tools are effective in identifying grammatical errors and providing corrective feedback, which enhances learners’ linguistic accuracy and overall language competence (Woo & Choi, 2021). Learners generally perceive such tools as accurate and helpful, and this perception has been linked to increased motivation and engagement in language learning tasks (Ai, 2017; Kao, 2020).
Despite the growing interest in ChatGPT-assisted language learning, previous studies have consistently reported several limitations that require careful pedagogical consideration. One of the most frequently discussed issues concerns the accuracy and reliability of AI-generated responses. Learners have noted that ChatGPT sometimes produces repetitive, inconsistent, or partially inaccurate information, which raises concerns about uncritical reliance on AI output (Cheon, 2023; Noh, 2024). Although ChatGPT is often perceived as efficient for idea generation and time management, problems related to factual accuracy and source credibility persist, highlighting the need for instructional support such as source comparison and critical evaluation strategies (Kim & Kim, 2025).
Closely related to information reliability are concerns regarding academic integrity and learner responsibility. Several studies have cautioned that excessive dependence on AI-generated content may reduce learners’ active engagement and increase the risk of plagiarism. However, research also suggests that these risks can be mitigated when learners are encouraged to critically evaluate, revise, and contextualize ChatGPT-generated responses rather than accepting them passively (Choe, 2023; Han, 2023; Kim & Kim, 2025).
Another limitation widely discussed in the literature involves the quality of interaction. While ChatGPT is capable of simulating conversational exchanges, AI-mediated interaction has been found to lack the emotional depth, spontaneity, and pragmatic richness inherent in human communication (Dizon, 2020; Park, 2023). As a result, opportunities for developing interactional competence—such as real-time turn-taking, pragmatic negotiation, and affective responsiveness—may be constrained.
Technical constraints further limit the effectiveness of ChatGPT-assisted interaction. Prior studies have reported difficulties related to speech recognition accuracy, particularly in noisy classroom environments or when learners produce non-standard pronunciation, which can disrupt conversational flow (Noh, 2024). In addition, limitations in contextual understanding may lead to responses that do not fully align with learners’ communicative intentions, thereby reducing coherence and naturalness in interaction (Lee & Park, 2024; Song & Kim, 2025).
Recent studies have increasingly examined how AI-driven tools support learners’ oral performance. For instance, several ChatGPT-assisted interventions have reported gains in speaking fluency, pronunciation, and overall oral performance among EFL learners (Mingyan et al., 2025; Muniandy & Selvanathan, 2025). These studies generally suggest that sustained interaction with AI conversation partners and AI-powered speaking apps can enhance learners’ motivation and provide frequent, low-anxiety opportunities for oral practice. However, most of this work has focused on short-term gains in specific subskills (e.g., pronunciation or fluency) and has rarely compared outcomes across different proficiency levels.
Beyond ChatGPT, there is also research employing other AI tools to examine learners’ speaking skills across proficiency levels. For example, Chen et al. (2023) investigated how college EFL students perceived the use of Google Assistant for language learning and found that students enjoyed interacting with the assistant and considered it useful for improving speaking and listening skills, partly because its pronunciation was natural and easy to understand. At the same time, proficiency-related differences emerged: higher-level learners communicated more effectively, whereas lower-level learners experienced difficulties when their mispronunciations led to breakdowns in interaction. Similar patterns have been reported in studies using other intelligent personal assistants and AI-powered mobile apps, where more proficient learners tend to benefit more from extended AI-mediated speaking practice than their lower-proficiency peers (Mingyan et al., 2025).
Despite this growing body of research, the current literature remains limited in several important respects. First, few studies explicitly address learners’ AI-related or ChatGPT-specific literacy, even though such literacy is increasingly recognized as a key 21st-century competence. Second, research on oral skills has tended to focus on speaking outcomes alone, with relatively little attention to the integrated development of listening and speaking or to learners’ perceptions of AI-based oral practice across proficiency levels. Consequently, further empirical work is needed to examine how learners at different proficiency levels engage in proficiency-adaptive ChatGPT use for oral communication, and to clarify how such practices can promote more equitable and effective language learning across diverse learner groups.
2. Learners’ Perceptions of ChatGPT in Language Learning
Beyond skill development, previous research has consistently reported positive learner perceptions of ChatGPT-assisted language learning. Learners view ChatGPT as a valuable learning resource that offers immediate feedback and personalized support, thereby enhancing the overall learning experience (Xiao & Zhi, 2023). The use of ChatGPT has also been shown to foster both intrinsic and extrinsic motivation by stimulating curiosity and sustaining interest in language learning activities (Aydın Yıldız, 2023; Kim & Kim, 2024). Furthermore, multiple studies have highlighted the role of AI tools in reducing language learning anxiety, particularly in speaking contexts, which contributes to increased learner confidence and willingness to communicate (Cheon, 2023; Noh, 2024).
Aydın Yıldız (2023) investigated the impact of integrating ChatGPT-generated dialogues into language teaching materials on learner motivation. Sixty second-year university students participated, and their motivational strategies were assessed. Results revealed significant differences across majors in motivation subcategories such as self-regulation, intrinsic values, and test anxiety. In another study, AlAfnan et al. (2023) identified both opportunities and challenges. The opportunities included offering students a platform to answer theory-based questions and generate ideas for application-based tasks, and enabling instructors to integrate technology into classrooms and workshops. The challenges included risks of unethical use by students, which may lead to reduced critical engagement, and difficulties for instructors in distinguishing between diligent and automation-dependent learners and in assessing learning outcomes.
From a cognitive perspective, engaging with ChatGPT can impose substantial cognitive demands on learners. Users are often required to simultaneously formulate prompts, interpret AI responses, evaluate their accuracy, and generate language output, which may overburden lower-proficiency learners in the absence of appropriate scaffolding (Kim, 2025; Pokrivčáková, 2019; Woo & Choi, 2021). Furthermore, when instructional goals are not clearly defined, excessive reliance on AI support may diminish opportunities for productive struggle and impede the development of learner autonomy (Han, 2023).
In addition, several scholars have argued that the pedagogical use of generative AI is still in its early stages. Empirical, classroom-based research validating the effectiveness of ChatGPT remains limited, and existing studies tend to focus on specific tools or general learning outcomes rather than on higher-order skills such as critical thinking, creative thinking, and AI literacy (Khoso et al., 2025; Pokrivčáková, 2019; Woo & Choi, 2021). Moreover, research examining how ChatGPT-supported learning operates across different learner proficiency levels is still scarce, making it difficult to determine how AI-assisted instruction can be optimally adapted to diverse learner needs (Liu & Ma, 2023; Yilin et al., 2023).
In light of this, AI literacy provides a broad conceptual framework that encompasses more specific forms of literacy, including ChatGPT literacy and the concrete skills and knowledge required for educational use (Liu, 2025; Ma et al., 2024). Aligning these literacies enables educators and learners to navigate the rapidly evolving landscape of AI technologies and to leverage ChatGPT’s educational potential in ethical and pedagogically meaningful ways (Ma et al., 2024). However, to fully realize this potential, it remains essential to investigate how ChatGPT can be systematically applied across diverse proficiency levels through differentiated instructional designs and assessment of learner outcomes.
Taken together, these findings suggest that although ChatGPT holds considerable pedagogical potential, its effectiveness depends largely on thoughtful instructional design, critical engagement, and ethical guidance. They also underscore the need for further research that explores proficiency-related variations and pedagogically grounded approaches to integrating ChatGPT into EFL classrooms.
III. METHOD
1. Participants
This study was conducted with first-year students enrolled in a required general English course at A university in the spring semester of 2024. The course met twice a week for 75 minutes per session, and students were placed into different class levels based on their mock TOEIC scores taken at entry. Among the classes taught by the researcher, one was assigned as the high-proficiency group and the other as the low-proficiency group.
A total of 48 students participated: the low-proficiency group consisted of 25 students (average TOEIC score approximately 310), while the high-proficiency group included 23 students (average TOEIC score approximately 550). The classes focused on communication activities emphasizing listening and speaking. For instructional materials, the low-proficiency group used Interchange Intro (Richards et al., 2021), which provided structured communicative activities appropriate for low-proficiency learners. The high-proficiency group used Skillful 2 Listening & Speaking (Macmillan, 2018) as the main course book, which offered structured speaking and listening tasks suited to high-proficiency learners. Each group engaged in vocabulary, grammar, listening, and speaking activities based on their respective textbooks, and all classes incorporated topic-based conversations as a common practice.
Table 1 summarizes the basic information of learners and their reasons for studying English according to proficiency level. In the low-proficiency group (n = 25), participants were between 19 and 24 years of age, representing a range of different majors, which are detailed in Table 1. Only 2 students reported overseas experience, while the remaining 23 had none. The primary reasons for learning English were academic performance (16), followed by career development (10), travel (8), recognition of English as a global language (6), interest (4), and self-improvement (2).
The high-proficiency group (n = 23) consisted of students aged 19 to 23, representing a range of different majors (Table 1). Three students reported overseas experience, while 20 had none. Their motivations for learning English were led by academic performance (16), followed by global language value (11), travel (9), career development (7), and interest (5).
Regarding prior experience with ChatGPT, only two students in the low-proficiency group had previously used it for English learning, while none of the high-proficiency students reported prior experience. When asked to select two preferred learning areas using ChatGPT, low-proficiency learners showed equal interest in grammar, speaking, and writing (10 each), followed by reading (6), listening (3), and vocabulary (3). In contrast, high-proficiency learners expressed the strongest interest in speaking (13), followed by grammar (8), vocabulary (7), reading (6), listening (4), and writing (4). Overall, both groups demonstrated a strong preference for using ChatGPT for speaking practice, along with grammar and vocabulary development.
2. Procedures
This study was conducted in a compulsory General English course offered over 15 weeks. Classes were held twice a week, each lasting 75 minutes, with a focus on listening and speaking activities. At the beginning of the semester, students took a mock TOEIC test administered by the university and were placed into groups according to their proficiency level. The researcher was responsible for one low‑proficiency group (26 students) and one high‑proficiency group (25 students). During the semester, three students withdrew, resulting in a final sample of 25 students in the low‑proficiency group and 23 students in the high‑proficiency group.
The two groups used different textbooks. The low‑proficiency group (Richards et al., 2021) studied everyday topics such as birth, places, time, occupations, food, sports, movies, and emotions, while the high‑proficiency group (Macmillan, 2018) focused on a broader range of themes including food, business, environment, movies, health, travel, emotions, and current trends. All classes were taught by the same instructor, and the use of ChatGPT, the duration of speaking activities, and the types of listening tasks were kept consistent across groups to ensure comparability. The lessons were organized by units, with pre‑listening activities consisting of brainstorming and speaking practice, and post‑listening activities requiring students to express their opinions on the given topics. ChatGPT‑assisted speaking practice was integrated both before and after the listening tasks, each lasting approximately 5–10 minutes. Both groups engaged in oral practice with ChatGPT during class sessions.
In this study, AI-based conversational activities were employed to enhance students’ English speaking proficiency. Specifically, students engaged in weekly conversations on assigned topics using ChatGPT 3.5. These activities were designed to resemble traditional pair-work tasks and were conducted in an interactive question-and-answer format.
The conversational activities were implemented in two main forms. First, students initiated interactions by posing questions directly to ChatGPT and receiving responses. Second, ChatGPT generated topic-related questions, to which students were required to respond. This dual structure encouraged students not only to produce answers but also to explore a range of ideas through topic-based brainstorming.
In addition, all students interacted individually with ChatGPT using their personal mobile phones. This setup allowed them to practice speaking freely without temporal or spatial constraints and facilitated repeated conversational experiences, which helped promote spontaneous language production and improve speaking fluency.
Accordingly, the methodology of this study was designed to enable students to practice speaking in an environment that closely resembles authentic communicative situations through question–answer–based interactions with ChatGPT. This approach was intended to increase learner engagement and enhance learning outcomes. Figure 1 presents a screenshot of a student from the low‑proficiency group interacting with ChatGPT, while Figure 2 shows a screenshot of a student from the high‑proficiency group engaging in a conversation.
To evaluate the effectiveness of the study, pre‑ and post‑tests in speaking and listening were administered. In addition, pre‑ and post‑surveys were conducted to examine changes in students’ ChatGPT literacy. The pre‑survey collected demographic information, while the post‑survey gathered open‑ended responses regarding students’ experiences with ChatGPT, including perceived benefits, drawbacks, and suggestions for improvement. The data collected were then used to analyze learners’ affective factors in the context of ChatGPT‑supported English instruction.
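For readers less familiar with the within-group comparison of pre- and post-test scores, the following is a minimal sketch of how a paired-samples t statistic can be computed. It is an illustration only: the function name `paired_t` and the score lists are hypothetical and are not data or code from this study, and the software actually used for the analyses is not stated here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on matched score lists."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    # Gain score for each learner: post-test minus pre-test
    diffs = [after - before for before, after in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the gains
    if sd == 0:
        raise ValueError("t is undefined when all gain scores are identical")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical listening scores out of 30 (illustrative, not study data)
pre_scores = [12, 15, 14, 10, 18, 16]
post_scores = [15, 17, 15, 13, 20, 19]
t_stat, df = paired_t(pre_scores, post_scores)
print(f"t({df}) = {t_stat:.2f}")
```

The statistic would then be compared against the t distribution with n - 1 degrees of freedom to obtain a p-value, a step omitted here to keep the sketch free of external libraries.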
3. Instruments
1) Pre- and Post-English Speaking and Listening Tests
Oral communication skills in this study encompass both speaking and listening abilities required for interactive communication. To operationalize this construct, learners’ English speaking and listening proficiency was assessed using a mock TOEIC Speaking and Listening test.
The speaking component was adapted from the TOEIC Speaking section of a commercially available preparation book and focused on the “Express an Opinion” task. Students were asked to respond within one minute to the following prompt: “Some people prefer living in a smaller home in the city, while others prefer a larger home in the countryside. Which do you prefer and why?” Responses were scored using the TOEIC Speaking rubric on a scale of 0–5 points.
The listening test consisted of 30 items divided into four parts: Part 1 (3 items), Part 2 (15 items), Part 3 (6 items), and Part 4 (6 items), with each item scored as one point. Identical pre- and post-tests were administered in Weeks 2 and 13, respectively. Given the substantial interval between the two administrations and the fact that neither test items nor answer keys were disclosed after the pre-test, potential practice effects or task familiarity were minimized. Accordingly, the use of identical tasks at pre- and post-test follows common practice in previous studies employing similar assessment designs (Kim et al., 2025; Song & Kim, 2025).
Two raters participated in the scoring process: the researcher and a second evaluator, a university instructor with more than 20 years of teaching experience. To ensure consistency, the raters first reviewed and discussed the assessment criteria, then jointly evaluated approximately five sample students, compared scores, and reached consensus before proceeding with the full evaluation. After this norming session, each rater independently assessed the remaining students. The resulting reliability coefficient was .93, indicating a high level of inter‑rater agreement.
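As an illustration of how agreement between two raters might be quantified, the following minimal sketch computes a Pearson correlation between two sets of rubric scores. The source does not state which reliability coefficient was used, so the choice of Pearson's r is an assumption, and the scores shown are hypothetical values on the 0–5 scale rather than study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical rubric scores (0-5) from two raters for six students
rater_a = [1, 2, 2, 3, 4, 3]
rater_b = [1, 2, 3, 3, 4, 3]
r = pearson_r(rater_a, rater_b)
print(f"inter-rater r = {r:.2f}")
```

A correlation of this kind reflects how consistently the raters rank students, not whether they assign identical scores; an agreement index such as a weighted kappa would be an alternative if exact agreement were of interest.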
2) Pre- and Post-Questionnaires
To obtain a comprehensive understanding of the participants’ background information, a pre-survey was administered prior to the intervention. The survey included items on participants’ major, age, overseas residence experience, English proficiency level based on the College Scholastic Ability Test (CSAT), and reasons for learning (up to two areas). In addition, participants were asked to identify the areas in which they hoped to use ChatGPT for learning purposes (up to two areas). The survey also asked whether participants had prior experience using ChatGPT for English learning purposes.
Following the initial survey, an additional ChatGPT literacy questionnaire was administered to both groups. The instrument comprised 34 items organized into five sub-factors, adapted from Lee and Park (2024). Technical Proficiency (TP, 8 items) assessed learners’ understanding of ChatGPT’s structure and functions, as well as their ability to troubleshoot and integrate the tool into learning tasks. Critical Evaluation (CE, 8 items) measured their capacity to judge the accuracy, reliability, and potential bias of ChatGPT outputs and to verify information using other sources. Communication Proficiency (CP, 7 items) focused on using appropriate language for different contexts, formulating effective prompts, and engaging in interactive communication for collaborative purposes. Creative Application (CA, 6 items) captured the creative use of ChatGPT for idea generation, storytelling, and enhancing productivity in pursuit of learning goals. Finally, Ethical Competence (EC, 5 items) evaluated learners’ awareness of ethical and legal issues, including data privacy, responsible use, and ethical decision-making when using AI tools. Internal consistency estimates for the five subscales were high, with reliability coefficients ranging from .91 to .95.
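The subscale structure described above can be sketched as a simple scoring routine. The item counts (8, 8, 7, 6, 5) come from the description of the instrument; the assumption that items appear in contiguous blocks in the order TP, CE, CP, CA, EC is hypothetical, as are the example responses.

```python
# Item counts per sub-factor as described in the text (total = 34)
SUBSCALES = {"TP": 8, "CE": 8, "CP": 7, "CA": 6, "EC": 5}


def subscale_means(responses):
    """Return the mean rating per sub-factor for one respondent.

    Assumes (hypothetically) that the 34 responses are ordered in
    contiguous blocks: TP, CE, CP, CA, EC.
    """
    total_items = sum(SUBSCALES.values())
    if len(responses) != total_items:
        raise ValueError(f"expected {total_items} responses")
    if any(not 1 <= r <= 6 for r in responses):
        raise ValueError("responses must be on the six-point scale (1-6)")
    means = {}
    start = 0
    for name, n_items in SUBSCALES.items():
        block = responses[start:start + n_items]
        means[name] = sum(block) / n_items
        start += n_items
    return means


# Hypothetical respondent: same rating within each sub-factor for clarity
example = [3] * 8 + [4] * 8 + [2] * 7 + [5] * 6 + [6] * 5
scores = subscale_means(example)
print(scores)
```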
All questionnaire items were assessed using a six-point Likert scale to capture participants’ perceptions in a structured manner. To complement these closed-ended items, the post-survey included open-ended questions aimed at eliciting more detailed reflections. The qualitative section consisted of six questions, asking students to describe: (1) the perceived advantages of using ChatGPT, (2) its disadvantages, (3) suggestions for improvement, (4) instructional preferences and AI use for English learning, (5) the optimal timing for its application in learning, and (6) perceived learning effectiveness across different domains. By combining quantitative and qualitative measures, the study was able to provide a comprehensive understanding of learners’ attitudes and strategies for integrating ChatGPT into EFL contexts.
4. Data Analysis
The instruments included the TOEIC Listening and Speaking tests, a ChatGPT literacy questionnaire, and open-ended survey items. Prior to conducting the t-tests, the assumption of normality was examined using the Shapiro–Wilk test, which is considered appropriate for small sample sizes. The results indicated that all variables met the normality assumption (p > .05). Therefore, the use of parametric statistical analyses was deemed appropriate.
To analyze changes in learning achievement within each group, paired-samples t-tests were conducted on pre- and post-test TOEIC listening and speaking scores. In addition, to compare post-test performance between the two groups, an analysis of covariance (ANCOVA) was conducted to examine group differences while controlling for variations in English proficiency. Pre-test scores were included as a covariate to account for initial proficiency differences. This approach enabled a more precise evaluation of the treatment effect by reducing the influence of prior language ability.
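The paired-samples t-test described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the scores are hypothetical, not study data. Differences are taken as pre minus post, an assumed convention under which a gain produces a negative t value, consistent with the sign of the t values reported in the Results.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired-samples t statistic and degrees of freedom.

    Differences are computed as pre - post, so an improvement from
    pre-test to post-test yields a negative t value.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD uses n - 1)
    se = stdev(diffs) / sqrt(n)
    return mean(diffs) / se, n - 1


# Hypothetical speaking scores (0-5 rubric) for five learners
pre_scores = [1, 0, 2, 1, 1]
post_scores = [2, 1, 3, 3, 2]
t_value, df = paired_t(pre_scores, post_scores)
print(f"t({df}) = {t_value:.3f}")
```

Obtaining the p value requires the t distribution with n - 1 degrees of freedom, which a statistics package would normally supply.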
Both groups also completed a ChatGPT literacy questionnaire before and after the experiment. The questionnaire consisted of closed-ended items measured on a six-point Likert scale (1 = strongly disagree, 6 = strongly agree). To examine within-group changes in literacy competence, paired-samples t-tests were conducted, while independent-samples t-tests were employed to compare differences between the two groups.
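For the between-group comparison, a minimal sketch of an independent-samples t statistic is given below. The source does not specify whether equal variances were assumed, so the pooled-variance (Student's) form used here is an assumption, and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's independent-samples t statistic with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 scores")
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df


# Hypothetical subscale means for two small groups
high_group = [3, 4, 5]
low_group = [1, 2, 3]
t_between, df_between = independent_t(high_group, low_group)
print(f"t({df_between}) = {t_between:.3f}")
```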
In addition to the quantitative measures, qualitative data were collected to capture learners’ perceptions of ChatGPT-assisted English learning. This component employed an open-ended questionnaire designed to elicit more in-depth responses. The analysis followed a systematic multi-step procedure. First, two independent researchers repeatedly read all responses to familiarize themselves with the data and generate initial codes. The researchers then engaged in a consensus-building process to examine coding consistency and merge conceptually similar codes into broader themes. To ensure the reliability of the coding process, inter-coder reliability was examined by comparing the independently coded data and calculating the agreement rate, and any discrepancies were resolved through discussion until consensus was reached. Finally, the finalized themes were organized into categories reflecting students’ perceived benefits, limitations, and suggestions regarding the use of ChatGPT. Within each category, response frequencies were calculated to identify salient patterns, allowing the qualitative findings to be presented in a systematic and reliable manner rather than as descriptive accounts alone.
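The agreement-rate step in the coding procedure can be illustrated with a simple percent-agreement calculation. The source does not give the formula used, so this is only one plausible reading; the theme labels and coded responses below are hypothetical.

```python
def agreement_rate(codes_a, codes_b):
    """Proportion of responses that two coders assigned to the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty, equal-length code lists")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical theme labels assigned independently by two coders
coder_1 = ["anxiety", "feedback", "fluency", "anxiety", "efficiency"]
coder_2 = ["anxiety", "feedback", "fluency", "feedback", "efficiency"]
rate = agreement_rate(coder_1, coder_2)
print(f"agreement = {rate:.0%}")

# Responses coded differently are the ones to resolve through discussion
disagreements = [i for i, (a, b) in enumerate(zip(coder_1, coder_2)) if a != b]
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would be a stricter alternative.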
IV. RESULTS
1. Effects of ChatGPT on Oral Communication Skills
1) Changes in Speaking and Listening Skills Within Each Proficiency Group
The purpose of this study was to examine the impact of using ChatGPT as a supplementary tool in English speaking and listening activities, with particular attention to learners’ proficiency levels. Specifically, the study aimed to investigate whether ChatGPT-assisted learning leads to improvements in students’ speaking and listening skills, whether it influences ChatGPT literacy depending on proficiency level, and whether differences in perception emerge between groups. To address these objectives, participants were divided into two groups according to their English proficiency, and each group engaged in interactive speaking activities supported by ChatGPT.
Table 2 presents the results of paired-samples t-tests for the low-proficiency group. The findings indicate significant improvements in both skills measured. For speaking, the mean score increased from .960 (SD = .763) in the pre-test to 2.140 (SD = .784) in the post-test, with the difference being statistically significant (t = –11.390, p < .01). Similarly, for listening, the mean score rose from 10.440 (SD = 3.318) to 12.200 (SD = 3.189), also showing a significant improvement (t = –3.689, p < .01). These results suggest that AI conversational practice contributed to notable gains in opinion expression and listening ability among low-proficiency learners.
Table 3 presents the results of paired sample t‑tests for the high proficiency group. The findings show significant improvements in both skills. For speaking, the mean score increased from 2.435 (SD = .529) in the pre‑test to 3.761 (SD = .541) in the post‑test, with the difference being statistically significant (t = –14.378, p < .01). Similarly, for listening, the mean score rose from 17.739 (SD = 3.805) to 20.391 (SD = 3.071), also showing a significant improvement (t = –4.507, p < .01). These results indicate that AI conversational practice contributed to notable gains in opinion expression and listening ability among high proficiency learners, reinforcing its effectiveness across different skill levels.
The findings of the present study align with previous research in demonstrating the positive effects of ChatGPT-assisted speaking practice on learners’ oral proficiency (Kim, 2025; Noh, 2024; Song & Kim, 2025). Regardless of proficiency level, both low- and high-proficiency learners showed improvement in speaking and listening abilities after the experiment. This suggests that interaction with ChatGPT can facilitate language development by increasing opportunities for meaningful output and providing accessible listening input.
Previous studies have similarly reported that sustained engagement in ChatGPT-assisted speaking activities leads to improvements in fluency, pronunciation, and vocabulary use, while also reducing speaking anxiety and creating a more comfortable learning environment (Cheon, 2023; Kim, 2025; Noh, 2024; Song & Kim, 2025). Taken together, these findings indicate that ChatGPT-assisted oral practice contributes to speaking and listening development across proficiency levels, supporting its pedagogical value as a supplementary tool in EFL instruction.
2) Comparative Analysis of Speaking and Listening Improvements Across Proficiency Levels
The primary research objective was to determine whether the integration of ChatGPT in classroom activities led to any significant differences in students’ language performance—specifically in the domains of listening and speaking. To examine differences in speaking and listening improvement between low‑ and high‑proficiency students, an ANCOVA was conducted.
Table 4 presents the results of ANCOVA for speaking and listening performance across proficiency groups. In speaking, the low-proficiency group improved from a pre-test mean of .960 (SD = .763) to a post-test mean of 2.140 (SD = .784), with an adjusted mean of 2.677 (SE = .118). The high-proficiency group increased from 2.435 (SD = .529) to 3.761 (SD = .541), with an adjusted mean of 3.177 (SE = .125). The difference was statistically significant (F = 6.134, p < .05). In listening, the low-proficiency group rose from 10.440 (SD = 3.318) to 12.200 (SD = 3.189), with an adjusted mean of 14.368 (SE = .555), while the high-proficiency group improved from 17.739 (SD = 3.805) to 20.391 (SD = 3.071), with an adjusted mean of 18.035 (SE = .588). This difference was also statistically significant (F = 15.195, p < .01).
The ANCOVA results indicated a significant effect of learners’ proficiency level on both speaking and listening performance. Although both lower- and higher-proficiency groups showed statistically significant improvements following ChatGPT-assisted speaking practice, higher-proficiency learners exhibited significantly higher adjusted post-test scores in both speaking and listening than lower-proficiency learners. These findings suggest that ChatGPT-assisted practice may differentially benefit learners depending on their proficiency level.
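To clarify how the adjusted means relate to the observed post-test means, the following sketch applies the standard ANCOVA adjustment, in which each group's post-test mean is shifted by the pooled within-group slope multiplied by the distance between that group's pre-test mean and the grand pre-test mean. Two inputs are assumptions and are not reported in the source: the group sizes (25 and 23, inferred from the response counts given for instructional preferences) and the slope of 0.76, which is back-calculated here for illustration.

```python
def adjusted_mean(post_mean, pre_mean, grand_pre_mean, slope):
    """ANCOVA-adjusted post-test mean for one group.

    Moves the observed post-test mean to the value expected if the
    group had started at the grand pre-test mean.
    """
    return post_mean - slope * (pre_mean - grand_pre_mean)


# Reported speaking means (pre, post) for each group
low_pre, low_post = 0.960, 2.140
high_pre, high_post = 2.435, 3.761

# Assumptions, not reported in the source: group sizes and pooled slope
n_low, n_high = 25, 23
assumed_slope = 0.76

grand_pre = (n_low * low_pre + n_high * high_pre) / (n_low + n_high)
low_adj = adjusted_mean(low_post, low_pre, grand_pre, assumed_slope)
high_adj = adjusted_mean(high_post, high_pre, grand_pre, assumed_slope)
print(f"low adjusted = {low_adj:.3f}, high adjusted = {high_adj:.3f}")
```

Under these assumptions the sketch gives approximately 2.677 and 3.177, the adjusted speaking means reported in Table 4. The adjustment raises the low-proficiency mean and lowers the high-proficiency mean because the two groups started far apart on the pre-test, which is why the adjusted gap is narrower than the observed post-test gap.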
Building on this proficiency-related pattern, the findings of this study are consistent with previous research in demonstrating proficiency-related differences in AI-assisted language learning. For instance, Chen et al. (2023) reported that, in English learning using Google Assistant, high-proficiency learners were able to communicate more smoothly. Similarly, in the present study, low-proficiency learners often encountered communication breakdowns when interacting with ChatGPT, as their speech was not accurately recognized or they struggled to fully comprehend the AI’s responses, resulting in shorter and less sustained interactions. In contrast, high-proficiency learners were able to engage in more natural and continuous interactions with ChatGPT, which appears to have positively contributed to the development of their listening and speaking skills. Taken together, these findings suggest that learners’ proficiency level is a critical factor influencing the effectiveness of AI-assisted conversational practice.
2. Development of ChatGPT Literacy by Proficiency Group
To examine how students perceived ChatGPT-assisted speaking activities as a substitute for face-to-face conversations during class, pre- and post-surveys were conducted. Table 5 presents the results of the analysis, highlighting any significant differences in mean scores across the sub-factors between the pre- and post-survey stages. Specifically, the low proficiency group responded to each item before and after using the program, and the mean scores along with standard deviations were measured. These results provide a systematic analysis of how students’ perceptions shifted after engaging in ChatGPT-assisted speaking activities.
The pre- and post-survey results for the low-proficiency group revealed significant improvements across all measured domains. Technical Proficiency showed a substantial increase from a pre-test mean of 2.725 to a post-test mean of 3.565 (t = -4.947, p < .001). Similarly, Critical Evaluation improved from 3.085 to 3.575 (t = -3.245, p = .003). Communication Proficiency also increased significantly, rising from 3.006 to 3.686 (t = -3.993, p = .001), while Creative Application showed a notable improvement from 3.160 to 3.880 (t = -3.679, p = .001). Ethical Competence likewise demonstrated a significant increase, improving from 3.400 to 3.784 (t = -2.222, p = .036).
These findings indicate that low-proficiency learners experienced overall enhancement across all dimensions of ChatGPT literacy. In particular, substantial gains were observed in Technical Proficiency and Creative Application, suggesting that engagement with ChatGPT effectively supported both functional skill development and creative language use among lower-level learners.
As seen in Table 6, the pre- and post-survey results for the high-proficiency group indicated significant improvements in most dimensions of ChatGPT literacy. Technical Proficiency increased substantially from a pre-test mean of 2.625 to a post-test mean of 3.679 (t = -7.399, p < .001). Similarly, Critical Evaluation improved from 3.223 to 3.815 (t = -3.246, p = .004). Communication Proficiency also showed a significant increase, rising from 3.261 to 3.857 (t = -3.928, p = .001), while Creative Application increased from 3.312 to 3.942 (t = -3.609, p = .002). All of these gains were statistically significant. In contrast, Ethical Competence increased from 3.678 to 3.930; however, this change did not reach statistical significance (t = -1.482, p = .153). These findings indicate that high-proficiency learners achieved notable development in technical, critical, communicative, and creative competencies through the use of ChatGPT, while ethical competence remained relatively stable.
These findings are consistent with Kim (2025), who similarly reported differential development across dimensions of ChatGPT literacy following AI-assisted language practice. Specifically, ChatGPT-assisted speaking practice led to significant improvements in four dimensions—technical proficiency, critical evaluation, communication skills, and creative application—whereas only minimal gains were observed in ethical competence.
This pattern suggests that ethical competence may not develop spontaneously through increased language proficiency or general AI use alone. Rather, learners may require explicit instructional scaffolding to critically reflect on ethical issues related to AI use, such as bias, responsibility, and appropriate reliance on AI-generated content. Accordingly, these findings underscore the importance of pedagogical interventions that intentionally integrate ethical dimensions into AI-assisted tasks. By embedding ethical reflection and discussion into task design, educators can promote more balanced development across all dimensions of ChatGPT literacy, ensuring that ethical awareness develops alongside technical and communicative competencies.
The pre-survey results indicated that although there were mean differences in ChatGPT literacy between the low- and high-proficiency groups, these differences were not statistically significant (Table 7). Across all dimensions, the high-proficiency group showed slightly higher mean scores than the low-proficiency group; however, no significant differences were found between the two groups at the p < .01 level.
According to the post-survey results, both the low- and high-proficiency groups demonstrated comparable levels of ChatGPT literacy overall (Table 8). Although the high-proficiency group showed slightly higher mean scores across all dimensions, which included Technical Proficiency, Critical Evaluation, Communication Proficiency, Creative Application, and Ethical Competence, no statistically significant differences were found between the two groups at the p < .01 level. These findings demonstrate that while learners’ English proficiency may account for some variation in ChatGPT literacy, such differences are not statistically reliable. This indicates that ChatGPT-related competencies may develop relatively independently of general language proficiency.
3. Perceptions of ChatGPT-Assisted Learning by Proficiency Group
Students’ perceptions of ChatGPT-assisted learning were systematically analyzed through open-ended questions. The qualitative data, which focused on advantages, disadvantages, suggestions for improvement, usage preferences, and optimal usage time, were examined by dividing respondents into low- and high-proficiency groups to identify differences in their perspectives.
As seen in Table 9, the low-proficiency group reported positive learning experiences across multiple dimensions. First, learners noted that questions and conversations were tailored to their proficiency level, which reduced misunderstandings and enabled smoother interaction. In addition, the ability to speak without fear of making mistakes lowered anxiety and allowed learners to communicate more comfortably than in face-to-face settings. Receiving immediate and personalized feedback further enhanced the learning experience, creating an environment similar to one-on-one tutoring and contributing to improved learning outcomes.
Moreover, learners reported improvements in their English speaking ability and increased confidence, which enabled them to communicate more freely in English than before. By practicing listening and speaking simultaneously, learners were able to develop integrated language skills. They also had opportunities to broaden their learning experiences by engaging with diverse cultural content and information. Finally, interacting in a manner that resembled conversations with native speakers allowed learners to experience authentic communicative situations.
As seen in Table 10, the high-proficiency learner group experienced positive learning outcomes across multiple dimensions. First, through extensive speaking practice, learners improved both fluency and accuracy, enabling them to communicate more quickly and precisely than in interactions with human partners. In addition, receiving level-appropriate questions and support allowed for personalized interaction, which enhanced learning motivation. Learners also reported reduced anxiety, as they could speak comfortably even without perfect language use, experiencing less stress compared to conversations with other people.
In terms of learning efficiency, participants noted that they were able to process a large number of questions and access specialized information within a short period of time, making their learning more time-efficient. Furthermore, exposure to diverse cultural content and information broadened their knowledge base and increased enjoyment in the learning process. By practicing listening and speaking simultaneously, learners improved their listening comprehension and developed more integrated language skills. Finally, the use of more advanced vocabulary and refined expressions contributed to greater linguistic precision.
Compared to the low-proficiency group, the high-proficiency group tended to perceive the effects of AI use from more cognitive and strategic perspectives. While the low-proficiency group responded positively to emotional stability, reduced anxiety in speaking, and level-appropriate interaction (Cheon, 2023; Xiao & Zhi, 2023), the high-proficiency group placed greater emphasis on improvements in speaking fluency (Noh, 2024), the use of advanced vocabulary, and learning efficiency (Cheon, 2023; Noh, 2024; Song & Kim, 2025). In addition, whereas the low-proficiency group highlighted psychological comfort and engagement through interaction with AI, the high-proficiency group more actively recognized the learning benefits associated with the quality of questions, personalized feedback (Xiao & Zhi, 2023), and expanded access to information (Caruana et al., 2022).
Although improvements in integrated listening and speaking skills were observed in both groups, the high-proficiency group perceived these gains as part of a more strategic learning process, whereas the low-proficiency group tended to emphasize the comfort and accessibility of the learning experience itself. These findings suggest that learners’ proficiency levels influence how they perceive the educational value of AI-assisted learning, with differing emphases placed on affective versus cognitive and strategic dimensions.
The low-proficiency group also identified several limitations in their use of AI during the learning process (Table 11). The most prominent issue was communication breakdowns, as learners reported that the AI sometimes failed to accurately understand their questions or repeatedly generated similar responses, which disrupted the flow of interaction. Concerns were also raised regarding the accuracy and reliability of responses, with some participants noting instances of repetitive or questionable information. In addition, some learners found the responses overly lengthy or lacking clarity, which made comprehension difficult. Difficulties related to speech recognition were also reported, occasionally hindering smooth communication. Furthermore, participants perceived limitations in maintaining a natural conversational flow, which restricted opportunities for spontaneous speaking practice. Challenges were also noted when vocabulary level or sentence complexity increased, making comprehension more demanding. Compared to interactions with real people, learners felt that emotional engagement and personal connection were limited when communicating with AI.
The high-proficiency group demonstrated a clear awareness of both the technical and cognitive limitations associated with the use of AI (Table 12). In particular, learners identified speech recognition issues, often caused by pronunciation differences or background noise, as one of the most significant challenges. They also reported instances in which the AI failed to accurately interpret the intent or context of their questions, resulting in responses that did not meet their expectations. Furthermore, participants perceived the level of natural interaction and immersion to be lower than in real human communication, noting a tendency for interactions to shift toward listening-focused activities rather than balanced speaking engagement. Some learners also expressed concerns regarding the reliability of information, citing instances in which inconsistent or inaccurate responses were provided to similar questions. As a result, learners perceived that these limitations reduced the applicability of ChatGPT-assisted interactions to authentic communicative situations. They also highlighted the cognitive burden of processing ChatGPT’s continuous responses and emphasized that learning effectiveness was further reduced when they did not actively engage in the interaction.
Overall, while high-proficiency learners recognized the pedagogical value of AI-assisted learning, they also demonstrated a clear awareness of its limitations, particularly in terms of interactional authenticity, contextual understanding, and engagement.
The two groups differed in how they perceived the limitations of ChatGPT-assisted learning. The low-proficiency group mainly reported practical difficulties, such as communication breakdowns, repetitive or unclear responses (Kim, 2025; Lee & Park, 2024; Song & Kim, 2025), speech recognition problems (Kim, 2025; Noh, 2024), and challenges in understanding complex or lengthy language, which limited spontaneous speaking practice. In contrast, the high-proficiency group showed a more critical awareness of AI-related constraints, emphasizing reduced interactional authenticity, contextual misunderstanding (Lee & Park, 2024; Song & Kim, 2025), information reliability issues, and increased cognitive load (Pokrivčáková, 2019; Woo & Choi, 2021). Overall, while both groups recognized the pedagogical value of ChatGPT, lower-level learners were more affected by accessibility and comprehension issues, whereas higher-level learners focused more on cognitive and qualitative limitations.
The low-proficiency group perceived that more structured question design and goal-oriented use of AI were necessary to enhance learning effectiveness (Table 13). They emphasized that clearer and more specific questions led to higher-quality responses, and suggested that proficiency-appropriate topics and speaking-focused activities would be more effective. Participants also noted that classroom noise hindered speech recognition, indicating that quieter or task-based learning environments would be more suitable. In addition, they stressed the importance of critically evaluating AI-generated responses and improving technical limitations. While recognizing constraints in achieving fully natural interaction, they suggested that more diverse question types and activities could further enhance learning outcomes.
The high-proficiency group emphasized the need for conversation-centered task design to enhance the effectiveness of AI use (Table 14). They preferred topic-based interactions over simple Q & A formats and highlighted the importance of guidance in question formulation. Participants also noted the need to adjust task difficulty to learners’ speaking levels and to incorporate peer discussion for idea expansion. In addition, they stressed the importance of feedback on language use, minimizing noise-related recognition issues, and clarifying whether AI use was intended for information retrieval or conversational practice. Overall, they emphasized the need for strategic and goal-oriented instructional design when using AI.
Both low- and high-proficiency groups recognized the need to improve AI-assisted learning, but their priorities differed. The low-proficiency group emphasized accessibility and stability, highlighting the need for structured questions, level-appropriate tasks, and supportive learning conditions. In contrast, the high-proficiency group focused on enhancing interaction quality through conversation-based tasks, strategic questioning, peer interaction, and feedback on language use. Overall, while low-proficiency learners emphasized structural support for participation, high-proficiency learners stressed the importance of instructional design to deepen learning (Kim, 2025).
In terms of instructional preferences, both proficiency groups showed a clear tendency to favor a blended approach that combines AI with face-to-face interaction. Specifically, in the low-proficiency group, 15 students preferred blended use, while 8 chose AI-only practice and 2 opted for face-to-face interaction. In the high-proficiency group, 19 students favored blended use, with only 3 preferring face-to-face interaction and 1 selecting AI alone. Willingness to use AI for future English learning was also high, with 23 low-proficiency and 21 high-proficiency learners expressing positive intentions.
These findings align with previous studies, which have demonstrated that the use of ChatGPT fosters both intrinsic and extrinsic motivation by stimulating curiosity and sustaining learners’ interest in language learning activities (Aydın Yıldız, 2023; Kim & Kim, 2024). Regarding optimal usage time, most learners selected 5–10 minutes (low: 21; high: 18), while fewer preferred 10–15 minutes (low: 3; high: 4) or 15–20 minutes (low: 10; high: 1).
With respect to perceived learning effectiveness, both groups identified speaking and listening as the most positively affected areas. In the low-proficiency group, speaking (11) and listening (7) were rated highest, followed by vocabulary (5), grammar (1), and writing (1). Similarly, the high-proficiency group reported the greatest benefits in speaking (12) and listening (11), with no notable responses for other skills. Overall, learners across proficiency levels viewed ChatGPT as a supportive tool rather than a primary instructional medium, particularly effective for short, focused, and communication-oriented learning activities.
V. CONCLUSION
As generative artificial intelligence is being rapidly adopted in language education, there is growing interest in how AI-mediated interaction can support the development of learners’ oral communicative competence and AI literacy. However, empirical evidence remains limited regarding the extent to which such learning environments are effective for learners at different proficiency levels and whether their impact extends beyond immediate linguistic performance to broader dimensions of AI literacy. To address this gap, the present study explored the pedagogical potential of ChatGPT-assisted conversational practice and offered empirical insights by examining its effects on learners’ speaking and listening skills, multidimensional ChatGPT literacy, and proficiency-specific perceptions. The findings are summarized according to the three research questions.
With respect to oral communication skills, the results indicate that ChatGPT-assisted practice had a positive impact on both low- and high-proficiency learners. Both groups demonstrated clear improvements in speaking and listening skills after participating in the ChatGPT-assisted activities. Low-proficiency learners experienced meaningful gains in expressing opinions and improving listening comprehension, while high-proficiency learners also showed significant enhancement in both domains. These findings suggest that the effectiveness of oral practice with ChatGPT extends across proficiency levels.
The analysis of improvement patterns indicated proficiency-related differences in learning outcomes. Higher-proficiency learners showed greater gains in speaking and listening performance. Although the extent and focus of improvement differed by proficiency level, the findings suggest that ChatGPT-assisted practice contributed to the development of oral communication skills, particularly among higher-proficiency learners (Chen et al., 2023; Mingyan et al., 2025).
Regarding ChatGPT literacy, the survey results revealed that both proficiency groups improved across multiple dimensions following the experiment. The low-proficiency group demonstrated significant gains in all areas, particularly in technical proficiency and creative application, suggesting that ChatGPT facilitated both functional competence and creative language use. The high-proficiency group also showed notable improvements in technical, critical, communicative, and creative competencies, while ethical competence remained relatively stable. These findings are largely consistent with those reported by Kim (2025), except for the dimension of ethical awareness, which exhibited a different pattern. This divergence highlights the need for educators to provide explicit instruction and guidance on ethical considerations when students engage with AI tools, underscoring the importance of integrating ethics-focused education into AI-supported learning environments.
Importantly, comparisons between groups revealed no statistically significant differences in overall ChatGPT literacy either before or after the intervention. After engaging in ChatGPT-assisted speaking practice, both groups reached comparable levels of ChatGPT literacy. These findings suggest that ChatGPT-related competencies can develop independently of learners’ general English proficiency, highlighting the broad applicability of ChatGPT as a learning-support tool.
Learners’ perceptions of ChatGPT-assisted learning were generally positive across proficiency levels, though the focus of their perceptions differed. Low-proficiency learners emphasized affective and accessibility-related benefits, such as reduced anxiety, level-appropriate questioning, immediate feedback, and increased willingness to participate. They also reported improved confidence and integrated listening–speaking ability, while identifying limitations such as communication breakdowns, repetitive or unclear responses, speech recognition errors, and limited conversational naturalness. Accordingly, they suggested structured questioning, proficiency-appropriate tasks, quiet learning environments, and critical evaluation of AI responses as key improvements.
In contrast, high-proficiency learners highlighted cognitive and strategic benefits, including improved fluency and accuracy, use of advanced vocabulary, learning efficiency, and personalized interaction. At the same time, they pointed out limitations related to speech recognition, contextual misunderstanding, reduced interactional authenticity, information reliability, and cognitive load. To enhance effectiveness, they emphasized conversation-centered task design, strategic questioning, peer discussion, feedback on language use, and task difficulty adjustment.
Analytically, these differences indicate that low-proficiency learners tend to prioritize emotional stability and accessibility, whereas high-proficiency learners place greater emphasis on cognitive depth and strategic learning processes. This suggests that learners’ perceptions of the educational value of ChatGPT-assisted learning vary according to proficiency level. In line with this, Kim and Park (2023) underscore the potential of integrating AI tools such as ChatGPT into English teaching while emphasizing the importance of differentiated approaches based on students’ linguistic abilities. Taken together, these findings highlight the necessity of tailoring AI-supported instruction to diverse learner profiles.
In conclusion, ChatGPT-assisted learning demonstrated positive effects on linguistic skills, ChatGPT literacy, and learner perceptions across proficiency levels. While both low- and high-proficiency learners benefited from oral practice with ChatGPT, the nature of these benefits differed. These findings underscore the importance of differentiated instructional design that aligns ChatGPT use with learners’ proficiency levels. When appropriately designed, ChatGPT can serve as an effective supplementary tool that supports communicative language development, digital literacy, and learner engagement in EFL contexts (Kim, 2025; Song & Kim, 2025).
Despite its contributions, this study has several limitations. First, the relatively small sample size limits the generalizability of the findings, and future research should involve larger and more diverse learner populations. Second, the short intervention period makes it difficult to evaluate the long-term effects of ChatGPT-mediated learning, highlighting the need for longitudinal follow-up studies. Third, the speaking assessment focused primarily on opinion-expression tasks, which constrains the reliability and scope of the measure and calls for the development of more comprehensive and validated assessment instruments. Finally, because all participants engaged in ChatGPT-assisted activities alongside their regular classroom instruction and no control group was included, it is difficult to isolate ChatGPT as an independent treatment or to attribute the observed learning gains solely to ChatGPT-assisted conversational practice rather than to regular instruction.
Building on these limitations, future research should adopt more rigorous experimental designs that incorporate control groups to better verify the effects of AI use. In addition, research needs to move beyond a sole emphasis on speaking practice by developing a broader range of AI-assisted tasks that foster diverse language skills, including vocabulary development, grammatical accuracy, and writing proficiency. Employing more in-depth analytic approaches will allow for a richer understanding of learners’ cognitive and communicative processes during interaction with AI. Ultimately, such work could inform the development of concrete, pedagogically grounded instructional frameworks for the effective integration of AI into foreign language education.