Oral Communication Skills and Literacy Gains Through AI Conversational Practice Across Proficiency Levels*

Article information

J Eng Teach Movie Media. 2026;27(1):16-37
Publication date (electronic) : 2026 February 28
doi : https://doi.org/10.16875/stem.2026.27.1.16
Associate Professor, College of General Education, Seoul Women’s University, 621 Hwarang-ro, Nowon-gu, Seoul, 01797, Korea
Corresponding Author, Associate Professor, College of General Education, Seoul Women’s University, 621 Hwarang-ro, Nowon-gu, Seoul, 01797, Korea (E-mail: shskim@swu.ac.kr)
* This work was supported by a research grant from Seoul Women’s University (2025-0135).
Received 2026 January 5; Revised 2026 February 2; Accepted 2026 February 21.

Abstract

This study investigated the effects of AI conversational practice on university students’ speaking and listening skills, ChatGPT literacy, and perceptions of ChatGPT use across different English proficiency levels. Participants were divided into low- and high-proficiency groups. Both groups participated in oral practice with ChatGPT, targeting speaking and listening skills. Quantitative data were collected through pre- and post-listening and speaking tests and a 20-item ChatGPT literacy questionnaire that measured perceptions of usefulness, ease of use, affective responses, and strategic use of ChatGPT. Paired-samples t-tests, independent-samples t-tests, and ANCOVA were conducted to examine within-group gains and between-group differences. Qualitative data were obtained from open-ended survey responses and were analyzed using a thematic analysis. The results showed that both proficiency groups demonstrated statistically significant improvements in speaking and listening skills, although high-proficiency learners exhibited larger gains in both areas. Both groups also showed statistically significant growth in ChatGPT literacy, and no statistically significant differences between proficiency levels were found. The findings highlight that learners at different proficiency levels derive distinct affective versus cognitive benefits from ChatGPT, underscoring the need for proficiency-sensitive instructional design and supporting its use as a supplementary tool to enhance oral proficiency and ChatGPT literacy.

Keywords: secondary; tertiary

I. INTRODUCTION

The rapid development of artificial intelligence (AI) has profoundly influenced all areas of contemporary society, with education undergoing a swift transition toward technology-based learning environments. In this context, the effective integration of AI tools and platforms into classrooms has become essential to provide learners with meaningful learning experiences, signaling a paradigm shift in pedagogy. Foreign language education, in particular, has increasingly embraced AI not merely as a supplementary tool but as a means of fostering critical thinking, creative problem-solving, and digital literacy, competencies regarded as central to 21st-century learning (Kim & Kim, 2025). Zhai et al. (2024) further emphasize that the ability to understand and evaluate AI technologies constitutes a vital skill in modern society.

Among the various generative AI applications, ChatGPT has emerged as one of the most influential tools in language education. Developed by OpenAI and based on large language models and generative pre-trained transformer (GPT) technology, ChatGPT has been widely adopted since its release in November 2022. Although not originally designed for translation, it has been applied across multiple language pairs (Khoshafah, 2023) and is now used for diverse tasks such as answering questions, generating texts, correcting grammar, summarizing, problem solving, and translation (Kim et al., 2025).

ChatGPT literacy plays a critical role in enabling language teachers to design effective instructional materials and provide pedagogical support that responds to learners’ diverse proficiency levels and individual needs (Liu, 2025). For instructors, such literacy supports the development of data-informed instructional practices and the provision of individualized feedback aligned with students’ linguistic readiness. For learners, ChatGPT literacy functions as a practical competence that allows them to engage meaningfully with adaptive AI technologies, benefit from personalized feedback, and interact strategically with ChatGPT as a learning resource.

The pedagogical potential of ChatGPT can be interpreted through Gass et al.’s (2020) model of second language acquisition, which views language learning as an interactive process involving input, interaction, and feedback. By providing accessible input, immediate feedback, and opportunities for meaningful interaction, ChatGPT supports both the cognitive and social dimensions of foreign language learning.

In educational contexts, ChatGPT enables teachers to design personalized learning environments (Baidoo-Anu & Ansah, 2023), supports text and dialogue generation (Crosthwaite & Baisa, 2023), and provides immediate feedback that enhances immersive language learning (AlAfnan et al., 2023; Choe, 2023; Im, 2023; Yu & Yoo, 2024). Moreover, studies report that ChatGPT-assisted instruction increases both intrinsic and extrinsic motivation (Ali et al., 2023; Aydın Yıldız, 2023; Kim et al., 2025) and sustains learners’ engagement and interest in language study (Song & Kim, 2025).

Empirical research has generally reported positive effects of AI integration across the four language skills—reading, writing, listening, and speaking—although findings remain mixed depending on factors such as task type, learner variables, and instructional design. While several studies highlight improvements in learner achievement (Han, 2023; Kim & Park, 2023; Yu & Yoo, 2024), others suggest that AI-based instruction does not always yield significant differences compared to traditional methods (Kim et al., 2025). Affectively, AI tools have been shown to expand opportunities for communication, enhance motivation, and provide interactive and personalized learning experiences (Im, 2023; Jeong, 2024). Learners also report that engaging with ChatGPT promotes self-directed learning and contributes to the development of critical and creative thinking by requiring them to analyze, evaluate, and creatively apply AI-generated information (Kim & Kim, 2025; Li et al., 2025). Nevertheless, limitations such as occasional inconsistencies in content, reduced accuracy, and the need for faster response times have been noted (Kim & Kim, 2025; Lee et al., 2024; Xiao & Zhi, 2023).

Despite these promising findings, the existing literature remains limited in several respects. Most studies to date have focused on short-term interventions with small samples and have often examined only one or two language skills or specific tasks, making it difficult to generalize the results. In addition, relatively few studies have investigated differential effects across proficiency levels or systematically examined learners’ AI-related literacy, indicating a need for further research that adopts more diverse samples and multi-dimensional outcome measures. Moreover, little attention has been paid to how ChatGPT-assisted speaking practice shapes learners’ oral communication skills and their development of ChatGPT literacy. To address these gaps, the present study examines the impact of oral practice with ChatGPT on students’ speaking and listening outcomes, their ChatGPT literacy, and their perceptions of ChatGPT use, focusing on differences across English proficiency levels.

Accordingly, the study addresses the following research questions:

1. To what extent does ChatGPT-assisted oral practice influence students’ speaking and listening skills across different English proficiency levels?

2. How does ChatGPT-assisted oral practice affect students’ ChatGPT literacy according to their proficiency level?

3. How do students’ perceptions of ChatGPT-assisted oral practice differ depending on their proficiency level?

II. LITERATURE REVIEW

1. Enhancing Language Skills via ChatGPT Use

Technologically, ChatGPT distinguishes itself from earlier AI tools through its foundation in advanced large language models and generative pre-trained transformer architectures. This enables it to surpass previous chatbots in versatility, depth of understanding, and applicability, offering creative outputs across diverse tasks and presenting new possibilities for foreign language education.

From a theoretical perspective, the use of ChatGPT aligns with Gass et al.’s (2020) model of second language acquisition, which conceptualizes language learning not as mere knowledge transmission but as a complex cognitive and social process. The model emphasizes the stages of input, comprehension, interaction, output, and feedback, illustrating how learners internalize language and develop communicative competence. Gass et al. particularly highlight the central role of interaction and feedback in accelerating language development, as learners actively engage in communication, correct errors, and repeatedly practice language use in meaningful contexts. In this regard, ChatGPT functions as an effective tool by providing rich linguistic input, real-time feedback, and opportunities for meaningful interaction, thereby creating a balanced environment that integrates cognitive approaches with social contexts. Ultimately, Gass et al.’s framework offers a theoretical justification for the integration of technology-based interactive learning, positioning ChatGPT as a pedagogically sound innovation in foreign language education.

Recent studies have examined the pedagogical effects of ChatGPT across multiple language skills, highlighting its potential to support oral proficiency as well as cognitive aspects of language learning. Within the field of second language acquisition, oral proficiency is commonly conceptualized as the ability to communicate verbally in a functional and accurate manner in the target language. A high level of oral proficiency further entails the ability to transfer linguistic knowledge to novel contexts and communicative situations (Omaggio, 1986). Building on this conceptualization, recent work has increasingly explored how AI-driven conversational agents such as ChatGPT can be used to foster learners’ oral skills.

In the area of speaking, ChatGPT has been shown to promote oral communication through interactive, simulated dialogues. By providing immediate feedback and a low-anxiety environment, ChatGPT enables learners to practice fluency and pronunciation more comfortably than in traditional classroom settings (Noh, 2024). Empirical findings further indicate that ChatGPT-assisted speaking practice contributes to improvements in vocabulary use and pronunciation while simultaneously reducing speaking anxiety, suggesting its effectiveness in enhancing overall speaking proficiency (Cheon, 2023; Noh, 2024).

Song and Kim (2025) demonstrated that the degree of engagement plays a critical role in the effectiveness of ChatGPT-assisted speaking tasks. Although both high- and low-engagement learners showed improvement in speaking performance, the high-engagement group achieved significantly greater gains. Importantly, learner perceptions varied by engagement level: low-engagement learners primarily valued the convenience and emotional support provided by ChatGPT, whereas high-engagement learners reported more substantive linguistic benefits, such as vocabulary expansion, increased confidence, and reduced speaking anxiety. These findings suggest that ChatGPT can meaningfully enhance language learning outcomes when instructional tasks are deliberately designed to promote sustained and active participation.

Extending this AI-assisted perspective, Kim (2025) examined the overall effectiveness of ChatGPT-assisted practice for Korean university students (N = 51). Although both the ChatGPT-assisted experimental group (n = 26) and the traditionally instructed control group (n = 25) showed significant improvements in speaking and listening skills, no statistically significant differences were found between the groups, suggesting that ChatGPT-assisted practice did not lead to additional gains beyond traditional instruction in this context. Even so, the experimental group showed clear progress in ChatGPT literacy, with notable gains in technical proficiency, critical evaluation, communication skills, and creative application, though ethical competence remained relatively unchanged. Learners also expressed positive perceptions of ChatGPT use, highlighting increased confidence, improved interaction skills, and greater accessibility to practice opportunities, despite persistent challenges such as speech recognition errors and occasional unnatural responses.

In reading instruction, ChatGPT has been utilized as a scaffold to support comprehension and meaning-making processes. Studies have shown that ChatGPT assists learners in thematic reading-to-writing tasks by helping them synthesize information, organize ideas, and engage more actively with texts (Caruana et al., 2022). Additionally, the supportive and interactive nature of AI-mediated instruction has been found to reduce anxiety often associated with traditional reading tasks, thereby encouraging deeper cognitive engagement (Caruana et al., 2022).

Building on this line of research, Kim et al. (2025) examined the impact of integrating ChatGPT into EFL reading courses on learners’ ChatGPT literacy and perceptions. In a study with 37 undergraduate students enrolled in a general English course, learners engaged in structured discussions with ChatGPT based on reading materials, participating in activities such as applying key concepts, exploring interpretations, and checking comprehension. Analysis of pre- and post-course questionnaires and semi-structured interviews revealed significant gains across five core dimensions of ChatGPT literacy. Students reported generally trusting ChatGPT while remaining cautious about the accuracy of its output, and they employed strategies such as asking follow-up questions, refining prompts, and cross-checking information with reliable sources. These findings suggest that the effective integration of AI tools in EFL education requires the development of learners’ critical thinking, ethical awareness, and adaptive use strategies.

With regard to writing, research has demonstrated that ChatGPT facilitates instant feedback and text refinement. By offering personalized suggestions on vocabulary choice and textual organization, ChatGPT supports learners in developing more coherent and sophisticated written output (Choe, 2023; Jeong, 2024). Moreover, continuous individualized feedback provided by AI tools has been associated with reduced writing anxiety and increased learner confidence, contributing to more positive writing experiences (AlAfnan et al., 2023; Choe, 2023).

ChatGPT has also been found to support grammar and vocabulary development. AI-based tools are effective in identifying grammatical errors and providing corrective feedback, which enhances learners’ linguistic accuracy and overall language competence (Woo & Choi, 2021). Learners generally perceive such tools as accurate and helpful, and this perception has been linked to increased motivation and engagement in language learning tasks (Ai, 2017; Kao, 2020).

Despite the growing interest in ChatGPT-assisted language learning, previous studies have consistently reported several limitations that require careful pedagogical consideration. One of the most frequently discussed issues concerns the accuracy and reliability of AI-generated responses. Learners have noted that ChatGPT sometimes produces repetitive, inconsistent, or partially inaccurate information, which raises concerns about uncritical reliance on AI output (Cheon, 2023; Noh, 2024). Although ChatGPT is often perceived as efficient for idea generation and time management, problems related to factual accuracy and source credibility persist, highlighting the need for instructional support such as source comparison and critical evaluation strategies (Kim & Kim, 2025).

Closely related to information reliability are concerns regarding academic integrity and learner responsibility. Several studies have cautioned that excessive dependence on AI-generated content may reduce learners’ active engagement and increase the risk of plagiarism. However, research also suggests that these risks can be mitigated when learners are encouraged to critically evaluate, revise, and contextualize ChatGPT-generated responses rather than accepting them passively (Choe, 2023; Han, 2023; Kim & Kim, 2025).

Another limitation widely discussed in the literature involves the quality of interaction. While ChatGPT is capable of simulating conversational exchanges, AI-mediated interaction has been found to lack the emotional depth, spontaneity, and pragmatic richness inherent in human communication (Dizon, 2020; Park, 2023). As a result, opportunities for developing interactional competence—such as real-time turn-taking, pragmatic negotiation, and affective responsiveness—may be constrained.

Technical constraints further limit the effectiveness of ChatGPT-assisted interaction. Prior studies have reported difficulties related to speech recognition accuracy, particularly in noisy classroom environments or when learners produce non-standard pronunciation, which can disrupt conversational flow (Noh, 2024). In addition, limitations in contextual understanding may lead to responses that do not fully align with learners’ communicative intentions, thereby reducing coherence and naturalness in interaction (Lee & Park, 2024; Song & Kim, 2025).

Recent studies have increasingly examined how AI-driven tools support learners’ oral performance. For instance, several ChatGPT-assisted interventions have reported gains in speaking fluency, pronunciation, and overall oral performance among EFL learners (Mingyan et al., 2025; Muniandy & Selvanathan, 2025). These studies generally suggest that sustained interaction with AI conversation partners and AI-powered speaking apps can enhance learners’ motivation and provide frequent, low-anxiety opportunities for oral practice. However, most of this work has focused on short-term gains in specific subskills (e.g., pronunciation or fluency) and has rarely compared outcomes across different proficiency levels.

Beyond ChatGPT, there is also research employing other AI tools to examine learners’ speaking skills across proficiency levels. For example, Chen et al. (2023) investigated how college EFL students perceived the use of Google Assistant for language learning and found that students enjoyed interacting with the assistant and considered it useful for improving speaking and listening skills, partly because its pronunciation was natural and easy to understand. At the same time, proficiency-related differences emerged: higher-level learners communicated more effectively, whereas lower-level learners experienced difficulties when their mispronunciations led to breakdowns in interaction. Similar patterns have been reported in studies using other intelligent personal assistants and AI-powered mobile apps, where more proficient learners tend to benefit more from extended AI-mediated speaking practice than their lower-proficiency peers (Mingyan et al., 2025).

Despite this growing body of research, the current literature remains limited in several important respects. First, few studies explicitly address learners’ AI-related or ChatGPT-specific literacy, even though such literacy is increasingly recognized as a key 21st-century competence. Second, research on oral skills has tended to focus on speaking outcomes alone, with relatively little attention to the integrated development of listening and speaking or to learners’ perceptions of AI-based oral practice across proficiency levels. Consequently, further empirical work is needed to examine how learners at different proficiency levels engage in proficiency-adaptive ChatGPT use for oral communication, and to clarify how such practices can promote more equitable and effective language learning across diverse learner groups.

2. Learners’ Perceptions of ChatGPT in Language Learning

Beyond skill development, previous research has consistently reported positive learner perceptions of ChatGPT-assisted language learning. Learners view ChatGPT as a valuable learning resource that offers immediate feedback and personalized support, thereby enhancing the overall learning experience (Xiao & Zhi, 2023). The use of ChatGPT has also been shown to foster both intrinsic and extrinsic motivation by stimulating curiosity and sustaining interest in language learning activities (Aydın Yıldız, 2023; Kim & Kim, 2024). Furthermore, multiple studies have highlighted the role of AI tools in reducing language learning anxiety, particularly in speaking contexts, which contributes to increased learner confidence and willingness to communicate (Cheon, 2023; Noh, 2024).

Aydın Yıldız (2023) investigated the impact of integrating ChatGPT-generated dialogues into language teaching materials on learner motivation. Sixty second-year university students participated, and their motivational strategies were assessed. Results revealed significant differences across majors in motivation subcategories such as self-regulation, intrinsic values, and test anxiety. Another study (AlAfnan et al., 2023) identified both opportunities and challenges. The opportunities include offering students a platform to answer theory-based questions and generate ideas for application-based tasks, and enabling instructors to integrate technology into classrooms and workshops. The challenges include the risk of unethical use by students, which may reduce critical engagement and make it difficult for instructors to distinguish between diligent and automation-dependent learners and to assess learning outcomes.

From a cognitive perspective, engaging with ChatGPT can impose substantial cognitive demands on learners. Users are often required to simultaneously formulate prompts, interpret AI responses, evaluate their accuracy, and generate language output, which may overburden lower-proficiency learners in the absence of appropriate scaffolding (Kim, 2025; Pokrivčáková, 2019; Woo & Choi, 2021). Furthermore, when instructional goals are not clearly defined, excessive reliance on AI support may diminish opportunities for productive struggle and impede the development of learner autonomy (Han, 2023).

In addition, several scholars have argued that the pedagogical use of generative AI is still in its early stages. Empirical, classroom-based research validating the effectiveness of ChatGPT remains limited, and existing studies tend to focus on specific tools or general learning outcomes rather than on higher-order skills such as critical thinking, creative thinking, and AI literacy (Khoso et al., 2025; Pokrivčáková, 2019; Woo & Choi, 2021). Moreover, research examining how ChatGPT-supported learning operates across different learner proficiency levels is still scarce, making it difficult to determine how AI-assisted instruction can be optimally adapted to diverse learner needs (Liu & Ma, 2023; Yilin et al., 2023).

In light of this, AI literacy provides a broad conceptual framework that encompasses more specific forms of literacy, including ChatGPT literacy and the concrete skills and knowledge required for educational use (Liu, 2025; Ma et al., 2024). Aligning these literacies enables educators and learners to navigate the rapidly evolving landscape of AI technologies and to leverage ChatGPT’s educational potential in ethical and pedagogically meaningful ways (Ma et al., 2024). However, to fully realize this potential, it remains essential to investigate how ChatGPT can be systematically applied across diverse proficiency levels through differentiated instructional designs and assessment of learner outcomes.

Taken together, these findings suggest that although ChatGPT holds considerable pedagogical potential, its effectiveness depends largely on thoughtful instructional design, critical engagement, and ethical guidance. They also underscore the need for further research that explores proficiency-related variations and pedagogically grounded approaches to integrating ChatGPT into EFL classrooms.

III. METHOD

1. Participants

This study was conducted with first-year students enrolled in a required general English course at A university in the spring semester of 2024. The course met twice a week for 75 minutes per session, and students were placed into different class levels based on their mock TOEIC scores taken at entry. Among the classes taught by the researcher, one was assigned as the high-proficiency group and the other as the low-proficiency group.

A total of 48 students participated: the low-proficiency group consisted of 25 students (average TOEIC score approximately 310), while the high-proficiency group included 23 students (average TOEIC score approximately 550). The classes focused on communication activities emphasizing listening and speaking. The low-proficiency group used Interchange Intro (Richards et al., 2021), which provided structured communicative activities appropriate for low-proficiency learners. The high-proficiency group used Skillful 2 Listening & Speaking (Macmillan, 2018) as the main course book, which offered structured speaking and listening tasks suited to high-proficiency learners. Each group engaged in vocabulary, grammar, listening, and speaking activities based on their respective textbooks, and all classes incorporated topic-based conversations as a common practice.

Table 1 summarizes the basic information of learners and their reasons for studying English according to proficiency level. In the low-proficiency group (n = 25), participants were between 19 and 24 years of age, representing a range of different majors, which are detailed in Table 1. Only 2 students reported overseas experience, while the remaining 23 had none. The primary reasons for learning English were academic performance (16), followed by career development (10), travel (8), recognition of English as a global language (6), interest (4), and self-improvement (2).

Table 1. Demographic Information

In contrast, the high-proficiency group (n = 23) consisted of students aged 19 to 23, representing a range of different majors (Table 1). In this group, three students reported overseas experience, while 20 had none. Their motivations for learning English were led by academic performance (16), followed by global language value (11), travel (9), career development (7), and interest (5).

Regarding prior experience with ChatGPT, only two students in the low-proficiency group had previously used it for English learning, while none of the high-proficiency students reported prior experience. When asked to select two preferred learning areas using ChatGPT, low-proficiency learners showed equal interest in grammar, speaking, and writing (10 each), followed by reading (6), listening (3), and vocabulary (3). In contrast, high-proficiency learners expressed the strongest interest in speaking (13), followed by grammar (8), vocabulary (7), reading (6), listening (4), and writing (4). Overall, both groups demonstrated a strong preference for using ChatGPT for speaking practice, along with grammar and vocabulary development.

2. Procedures

This study was conducted in a compulsory General English course offered over 15 weeks. Classes were held twice a week, each lasting 75 minutes, with a focus on listening and speaking activities. At the beginning of the semester, students took a mock TOEIC test administered by the university and were placed into groups according to their proficiency level. The researcher was responsible for one low‑proficiency group (26 students) and one high‑proficiency group (25 students). During the semester, three students withdrew, resulting in a final sample of 25 students in the low‑proficiency group and 23 students in the high‑proficiency group.

The two groups used different textbooks. The low‑proficiency group studied everyday topics such as birth, places, time, occupations, food, sports, movies, and emotions (Richards et al., 2021), while the high‑proficiency group focused on a broader range of themes including food, business, environment, movies, health, travel, emotions, and current trends (Macmillan, 2018). All classes were taught by the same instructor, and the use of ChatGPT, the duration of speaking activities, and the types of listening tasks were kept consistent across groups to ensure fairness in the study. The lessons were organized by units, with pre‑listening activities consisting of brainstorming and speaking practice, and post‑listening activities requiring students to express their opinions on the given topics. ChatGPT‑assisted speaking practice was integrated both before and after the listening tasks, each lasting approximately 5–10 minutes. Both groups engaged in oral practice with ChatGPT during class sessions.

In this study, AI-based conversational activities were employed to enhance students’ English speaking proficiency. Specifically, students engaged in weekly conversations on assigned topics using ChatGPT 3.5. These activities were designed to resemble traditional pair-work tasks and were conducted in an interactive question-and-answer format.

The conversational activities were implemented in two main forms. First, students initiated interactions by posing questions directly to ChatGPT and receiving responses. Second, ChatGPT generated topic-related questions, to which students were required to respond. This dual structure encouraged students not only to produce answers but also to explore a range of ideas through topic-based brainstorming.

In addition, all students interacted individually with ChatGPT using their personal mobile phones. This setup allowed them to practice speaking freely without temporal or spatial constraints and facilitated repeated conversational experiences, which helped promote spontaneous language production and improve speaking fluency.

Accordingly, the methodology of this study was designed to enable students to practice speaking in an environment that closely resembles authentic communicative situations through question–answer–based interactions with ChatGPT. This approach contributed to increased learner engagement and enhanced learning outcomes. Figure 1 presents a screenshot of a student from the low‑proficiency group interacting with ChatGPT, while Figure 2 shows a screenshot of a student from the high‑proficiency group engaging in a conversation.

Fig. 1. A Screenshot of Sample Conversation: Low Proficiency Group

Fig. 2. A Screenshot of Sample Conversation: High Proficiency Group

To evaluate the effectiveness of the study, pre‑ and post‑tests in speaking and listening were administered. In addition, pre‑ and post‑surveys were conducted to examine changes in students’ ChatGPT literacy. The pre‑survey collected demographic information, while the post‑survey gathered open‑ended responses regarding students’ experiences with ChatGPT, including perceived benefits, drawbacks, and suggestions for improvement. The data collected were then used to analyze learners’ affective factors in the context of ChatGPT‑supported English instruction.
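
The abstract names paired-samples t-tests as the analysis for within-group gains between pre- and post-tests. The following is a minimal sketch of that computation in Python, using only the standard library. The function name `paired_t` and the score lists are hypothetical illustrations, not the study's data or analysis script.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (mean_gain, t_statistic, degrees_of_freedom) for paired scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per learner")
    gains = [after - before for before, after in zip(pre, post)]
    n = len(gains)
    if n < 2:
        raise ValueError("at least two learners are required")
    mean_gain = statistics.mean(gains)
    sd_gain = statistics.stdev(gains)  # sample SD (n - 1 denominator)
    if sd_gain == 0:
        raise ValueError("all gains are identical; t is undefined")
    t_stat = mean_gain / (sd_gain / math.sqrt(n))
    return mean_gain, t_stat, n - 1


# Hypothetical listening scores (0-30) for five learners; NOT study data.
pre_scores = [12, 15, 9, 20, 14]
post_scores = [15, 18, 11, 22, 18]
mean_gain, t_stat, df = paired_t(pre_scores, post_scores)
print(f"mean gain = {mean_gain:.2f}, t({df}) = {t_stat:.2f}")
```

The sketch stops at the t statistic and its degrees of freedom. Obtaining a p-value requires the t distribution, which a statistics package would normally supply.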

3. Instruments

1) Pre- and Post-English Speaking and Listening Tests

Oral communication skills in this study encompass both speaking and listening abilities required for interactive communication. To operationalize this construct, learners’ English speaking and listening proficiency was assessed using a mock TOEIC Speaking and Listening test.

The speaking component was adapted from the TOEIC Speaking section of a commercially available preparation book and focused on the “Express an Opinion” task. Students were asked to respond within one minute to the following prompt: “Some people prefer living in a smaller home in the city, while others prefer a larger home in the countryside. Which do you prefer and why?” Responses were scored using the TOEIC Speaking rubric on a scale of 0–5 points.

The listening test consisted of 30 items divided into four parts: Part 1 (3 items), Part 2 (15 items), Part 3 (6 items), and Part 4 (6 items), with each item scored as one point. Identical pre- and post-tests were administered in Weeks 2 and 13, respectively. Given the substantial interval between the two administrations and the fact that neither test items nor answer keys were disclosed after the pre-test, potential practice effects or task familiarity were minimized. Accordingly, the use of identical tasks at pre- and post-test follows common practice in previous studies employing similar assessment designs (Kim et al., 2025; Song & Kim, 2025).

Two raters participated in the scoring process: the researcher and a second evaluator, a university instructor with more than 20 years of teaching experience. To ensure consistency, the raters first reviewed and discussed the assessment criteria, then jointly evaluated approximately five sample students, compared scores, and reached consensus before proceeding with the full evaluation. After this norming session, each rater independently assessed the remaining students. The resulting reliability coefficient was .93, indicating a high level of inter‑rater agreement.
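The reliability coefficient above is reported without naming the statistic used. As a rough illustration only, the sketch below computes a Pearson correlation between two raters' 0–5 rubric scores, which is one common way to summarize inter-rater consistency. The function name and the scores are hypothetical and are not the study's data or procedure.

```python
import math

def pearson_r(scores_a, scores_b):
    """Pearson correlation between two raters' scores for the same responses."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Both raters must score the same set of responses")
    n = len(scores_a)
    mean_a = sum(scores_a) / n
    mean_b = sum(scores_b) / n
    # Sums of squares and cross-products around each rater's mean.
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    sxx = sum((a - mean_a) ** 2 for a in scores_a)
    syy = sum((b - mean_b) ** 2 for b in scores_b)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 0-5 rubric scores for ten responses (illustrative only).
rater_a = [1, 2, 2, 3, 4, 3, 1, 5, 4, 2]
rater_b = [1, 2, 3, 3, 4, 3, 2, 5, 4, 2]

r = pearson_r(rater_a, rater_b)
print(f"inter-rater r = {r:.2f}")
```

A correlation captures consistency in rank ordering rather than exact agreement; if absolute agreement matters, an agreement-based index would be a more suitable choice.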

2) Pre- and Post-Questionnaires

To obtain a comprehensive understanding of the participants’ background information, a pre-survey was administered prior to the intervention. The survey included items on participants’ major, age, overseas residence experience, English proficiency level based on the College Scholastic Ability Test (CSAT), and reasons for learning (up to two areas). In addition, participants were asked to identify the areas in which they hoped to use ChatGPT for learning purposes (up to two areas). The survey also asked whether participants had prior experience using ChatGPT for English learning purposes.

Following the initial survey, an additional ChatGPT literacy questionnaire was administered to both groups. The instrument comprised 34 items organized into five sub-factors, adapted from Lee and Park (2024). Technical Proficiency (TP, 8 items) assessed learners’ understanding of ChatGPT’s structure and functions, as well as their ability to troubleshoot and integrate the tool into learning tasks. Critical Evaluation (CE, 8 items) measured their capacity to judge the accuracy, reliability, and potential bias of ChatGPT outputs and to verify information using other sources. Communication Proficiency (CP, 7 items) focused on using appropriate language for different contexts, formulating effective prompts, and engaging in interactive communication for collaborative purposes. Creative Application (CA, 6 items) captured the creative use of ChatGPT for idea generation, storytelling, and enhancing productivity in pursuit of learning goals. Finally, Ethical Competence (EC, 5 items) evaluated learners’ awareness of ethical and legal issues, including data privacy, responsible use, and ethical decision-making when using AI tools. Internal consistency estimates for the five subscales were high, with reliability coefficients ranging from .91 to .95.
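The internal consistency estimates above are reported without naming the coefficient. A minimal sketch, assuming Cronbach's alpha (a common choice for Likert subscales), is given below. The subscale sizes come from the description above; the response matrix and the function name are hypothetical.

```python
from statistics import variance

# Subscale sizes as described in the text (8 + 8 + 7 + 6 + 5 = 34 items).
SUBSCALE_ITEMS = {"TP": 8, "CE": 8, "CP": 7, "CA": 6, "EC": 5}

def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses: list of respondents, each a list of item scores of equal length.
    """
    k = len(responses[0])
    items = list(zip(*responses))  # transpose to per-item score columns
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical six-point Likert responses of six learners to the five EC items.
ec_responses = [
    [4, 4, 3, 4, 5],
    [2, 3, 2, 2, 3],
    [5, 5, 4, 5, 6],
    [3, 3, 3, 4, 3],
    [6, 5, 5, 6, 6],
    [1, 2, 2, 1, 2],
]

alpha = cronbach_alpha(ec_responses)
print(f"alpha (EC, illustrative data) = {alpha:.2f}")
```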

All questionnaire items were assessed using a six-point Likert scale to capture participants’ perceptions in a structured manner. To complement these closed-ended items, the post-survey included open-ended questions aimed at eliciting more detailed reflections. The qualitative section consisted of six questions, asking students to describe: (1) the perceived advantages of using ChatGPT, (2) its disadvantages, (3) suggestions for improvement, (4) instructional preferences and AI use for English learning, (5) the optimal timing for its application in learning, and (6) perceived learning effectiveness across different domains. By combining quantitative and qualitative measures, the study was able to provide a comprehensive understanding of learners’ attitudes and strategies for integrating ChatGPT into EFL contexts.

4. Data Analysis

The instruments included the TOEIC Listening and Speaking tests, a ChatGPT literacy questionnaire, and open-ended survey items. Prior to conducting the t-tests, the assumption of normality was examined using the Shapiro–Wilk test, which is considered appropriate for small sample sizes. The results indicated that all variables met the normality assumption (p > .05). Therefore, the use of parametric statistical analyses was deemed appropriate.

To analyze changes in learning achievement within each group, paired-samples t-tests were conducted on pre- and post-test TOEIC listening and speaking scores. In addition, to compare post-test performance between the two groups, an analysis of covariance (ANCOVA) was conducted to examine group differences while controlling for variations in English proficiency. Pre-test scores were included as a covariate to account for initial proficiency differences. This approach enabled a more precise evaluation of the treatment effect by reducing the influence of prior language ability.
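The ANCOVA logic described above can be sketched as follows. This is a minimal illustration for one covariate and a common within-group slope, not the software procedure used in the study. The function name and the (pre, post) pairs are hypothetical.

```python
def ancova_one_covariate(groups):
    """One-way ANCOVA with a single covariate.

    groups: dict mapping group name -> list of (pre, post) score pairs.
    Returns (adjusted_means, f_stat, df_between, df_error).
    """
    pairs_all = [pair for pairs in groups.values() for pair in pairs]
    n_total, k = len(pairs_all), len(groups)
    grand_pre = sum(x for x, _ in pairs_all) / n_total
    grand_post = sum(y for _, y in pairs_all) / n_total

    # Total sums of squares and cross-products (ignoring group membership).
    sxx_t = sum((x - grand_pre) ** 2 for x, _ in pairs_all)
    syy_t = sum((y - grand_post) ** 2 for _, y in pairs_all)
    sxy_t = sum((x - grand_pre) * (y - grand_post) for x, y in pairs_all)

    # Pooled within-group sums of squares and cross-products.
    sxx_w = syy_w = sxy_w = 0.0
    group_means = {}
    for name, pairs in groups.items():
        mean_pre = sum(x for x, _ in pairs) / len(pairs)
        mean_post = sum(y for _, y in pairs) / len(pairs)
        group_means[name] = (mean_pre, mean_post)
        sxx_w += sum((x - mean_pre) ** 2 for x, _ in pairs)
        syy_w += sum((y - mean_post) ** 2 for _, y in pairs)
        sxy_w += sum((x - mean_pre) * (y - mean_post) for x, y in pairs)

    slope = sxy_w / sxx_w  # common regression slope of post on pre
    adjusted = {
        name: mean_post - slope * (mean_pre - grand_pre)
        for name, (mean_pre, mean_post) in group_means.items()
    }

    # Residual SS with and without the group factor; their difference is the
    # covariate-adjusted between-group sum of squares.
    ss_res_full = syy_w - sxy_w ** 2 / sxx_w
    ss_res_reduced = syy_t - sxy_t ** 2 / sxx_t
    df_between, df_error = k - 1, n_total - k - 1
    f_stat = ((ss_res_reduced - ss_res_full) / df_between) / (ss_res_full / df_error)
    return adjusted, f_stat, df_between, df_error

# Hypothetical (pre, post) speaking scores, for illustration only.
demo_groups = {
    "low": [(1, 2), (0, 1), (2, 3), (1, 2), (1, 3)],
    "high": [(2, 4), (3, 4), (2, 3), (3, 5), (2, 4)],
}
adjusted_means, f_stat, df1, df2 = ancova_one_covariate(demo_groups)
print(adjusted_means, round(f_stat, 3), (df1, df2))
```

The adjusted mean moves each group's post-test mean to the value expected if that group had started at the grand pre-test mean, which is why a group with a low pre-test mean receives an upward adjustment.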

Later, both groups completed a ChatGPT literacy questionnaire before and after the experiment. The questionnaire consisted of closed-ended items measured on a six-point Likert scale (1 = strongly disagree, 6 = strongly agree). To examine within-group changes in literacy competence, paired-samples t-tests were conducted, while independent-samples t-tests were employed to compare differences between the two groups.
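The two t statistics named above can be sketched as follows. The paired version takes differences as pre minus post, so a gain yields a negative t, matching the sign convention of the values reported in the Results. The independent-samples version assumes equal variances (pooled estimate), which is an assumption here because the text does not state which variant was used. The function names and scores are hypothetical, and p-values are omitted because they require a t distribution from a statistics package.

```python
import math
from statistics import mean, stdev, variance

def paired_t(pre, post):
    """Paired-samples t statistic and degrees of freedom (differences = pre - post)."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def independent_t(group_a, group_b):
    """Independent-samples t statistic with a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2

# Hypothetical subscale means on the six-point scale (illustrative only).
pre_scores = [2.5, 3.0, 2.8, 3.1, 2.6]
post_scores = [3.4, 3.5, 3.6, 3.3, 3.2]
t_within, df_within = paired_t(pre_scores, post_scores)
print(f"paired t({df_within}) = {t_within:.3f}")
```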

In addition to the quantitative measures, qualitative data were collected to capture learners’ perceptions of ChatGPT-assisted English learning. This component employed an open-ended questionnaire designed to elicit more in-depth responses. The analysis followed a systematic multi-step procedure. First, two independent researchers repeatedly read all responses to familiarize themselves with the data and generate initial codes. The researchers then engaged in a consensus-building process to examine coding consistency and merge conceptually similar codes into broader themes. To ensure the reliability of the coding process, inter-coder reliability was examined by comparing the independently coded data and calculating the agreement rate, and any discrepancies were resolved through discussion until consensus was reached. Finally, the finalized themes were organized into categories reflecting students’ perceived benefits, limitations, and suggestions regarding the use of ChatGPT. Within each category, response frequencies were calculated to identify salient patterns, allowing the qualitative findings to be presented in a systematic and reliable manner rather than as descriptive accounts alone.
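The agreement-rate and frequency steps described above can be sketched as follows. The code labels and coded responses are hypothetical; the sketch shows only simple percent agreement between two coders and a frequency tally of the codes retained after discussion.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of responses to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same responses")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

# Hypothetical initial codes for six open-ended responses.
coder_1 = ["anxiety", "feedback", "anxiety", "recognition", "feedback", "fluency"]
coder_2 = ["anxiety", "feedback", "fluency", "recognition", "feedback", "fluency"]

agreement = percent_agreement(coder_1, coder_2)

# Codes after the one discrepancy (third response) is resolved by discussion;
# the resolution shown here is invented for the example.
final_codes = ["anxiety", "feedback", "anxiety", "recognition", "feedback", "fluency"]
theme_counts = Counter(final_codes)

print(f"agreement = {agreement:.0%}")
print(theme_counts.most_common())
```

Percent agreement does not correct for chance; a chance-corrected index could be reported alongside it when the number of codes is small.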

IV. RESULT

1. Effects of ChatGPT on Oral Communication Skills

1) Changes in Speaking and Listening Skills Within Each Proficiency Group

The purpose of this study was to examine the impact of using ChatGPT as a supplementary tool in English speaking and listening activities, with particular attention to learners’ proficiency levels. Specifically, the study aimed to investigate whether ChatGPT-assisted learning leads to improvements in students’ speaking and listening skills, whether it influences ChatGPT literacy depending on proficiency level, and whether differences in perception emerge between groups. To address these objectives, participants were divided into two groups according to their English proficiency, and each group engaged in interactive speaking activities supported by ChatGPT.

Table 2 presents the results of paired sample t-tests for the low proficiency group. The findings indicate significant improvements in both skills measured. For speaking, the mean score increased from .960 (SD = .763) in the pre-test to 2.140 (SD = .784) in the post-test, with the difference being statistically significant (t = –11.390, p < .01). Similarly, for listening, the mean score rose from 10.440 (SD = 3.318) to 12.200 (SD = 3.189), also showing a significant improvement (t = –3.689, p < .01). These results suggest that AI conversational practice contributed to notable gains in opinion expression and listening ability among low proficiency learners.

Result of Paired Sample t-Tests on Speaking & Listening: Low Proficiency Group

Table 3 presents the results of paired sample t‑tests for the high proficiency group. The findings show significant improvements in both skills. For speaking, the mean score increased from 2.435 (SD = .529) in the pre‑test to 3.761 (SD = .541) in the post‑test, with the difference being statistically significant (t = –14.378, p < .01). Similarly, for listening, the mean score rose from 17.739 (SD = 3.805) to 20.391 (SD = 3.071), also showing a significant improvement (t = –4.507, p < .01). These results indicate that AI conversational practice contributed to notable gains in opinion expression and listening ability among high proficiency learners, reinforcing its effectiveness across different skill levels.

Result of Paired Sample t-Tests on Speaking & Listening: High Proficiency Group

The findings of the present study align with previous research in demonstrating the positive effects of ChatGPT-assisted speaking practice on learners’ oral proficiency (Kim, 2025; Noh, 2024; Song & Kim, 2025). Regardless of proficiency level, both low- and high-proficiency learners showed improvement in speaking and listening abilities after the experiment. This suggests that interaction with ChatGPT can facilitate language development by increasing opportunities for meaningful output and providing accessible listening input.

Previous studies have similarly reported that sustained engagement in ChatGPT-assisted speaking activities leads to improvements in fluency, pronunciation, and vocabulary use, while also reducing speaking anxiety and creating a more comfortable learning environment (Cheon, 2023; Kim, 2025; Noh, 2024; Song & Kim, 2025). Taken together, these findings indicate that ChatGPT-assisted oral practice contributes to speaking and listening development across proficiency levels, supporting its pedagogical value as a supplementary tool in EFL instruction.

2) Comparative Analysis of Speaking and Listening Improvements Across Proficiency Levels

The primary research objective was to determine whether the integration of ChatGPT in classroom activities led to any significant differences in students’ language performance—specifically in the domains of listening and speaking. To examine differences in speaking and listening improvement between low‑ and high‑proficiency students, an ANCOVA was conducted.

Table 4 presents the results of ANCOVA for speaking and listening performance across proficiency groups. In speaking, the low proficiency group improved from a pre-test mean of .960 (SD = .763) to a post-test mean of 2.140 (SD = .784), with an adjusted mean of 2.677 (SE = .118). The high proficiency group increased from 2.435 (SD = .529) to 3.761 (SD = .541), with an adjusted mean of 3.177 (SE = .125). The difference between the adjusted speaking means was statistically significant (F = 6.134, p < .05). In listening, the low proficiency group rose from 10.440 (SD = 3.318) to 12.200 (SD = 3.189), with an adjusted mean of 14.368 (SE = .555), while the high proficiency group improved from 17.739 (SD = 3.805) to 20.391 (SD = 3.071), with an adjusted mean of 18.035 (SE = .588).

Results of ANCOVA for Speaking & Listening

The difference between the adjusted listening means was highly significant (F = 15.195, p < .01). Overall, the ANCOVA results indicated a significant effect of learners’ proficiency level on both speaking and listening performance. Although both lower- and higher-proficiency groups showed statistically significant improvements following ChatGPT-assisted speaking practice, higher-proficiency learners exhibited significantly higher adjusted post-test scores in both speaking and listening than lower-proficiency learners. These findings suggest that ChatGPT-assisted practice may differentially benefit learners depending on their proficiency level.
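The relation between the raw and adjusted means in Table 4 can be checked with a short calculation. Under the standard ANCOVA adjustment, adjusted mean = post-test mean − b × (group pre-test mean − grand pre-test mean), where b is the common within-group slope. The sketch below solves this for b in each group using the reported means. The group sizes (25 and 23) are an assumption inferred from the response counts reported in the perceptions subsection (15 + 8 + 2 and 19 + 3 + 1); they are not stated in this section.

```python
# ASSUMPTION: group sizes inferred from response counts, not stated here.
N_LOW, N_HIGH = 25, 23

# Reported (pre-test mean, post-test mean, adjusted mean) per skill and group.
reported = {
    "speaking": {"low": (0.960, 2.140, 2.677), "high": (2.435, 3.761, 3.177)},
    "listening": {"low": (10.440, 12.200, 14.368), "high": (17.739, 20.391, 18.035)},
}

def implied_slopes(skill):
    """Solve adjusted = post - b * (pre - grand_pre) for b in each group."""
    low_pre = reported[skill]["low"][0]
    high_pre = reported[skill]["high"][0]
    grand_pre = (N_LOW * low_pre + N_HIGH * high_pre) / (N_LOW + N_HIGH)
    return {
        group: (post - adjusted) / (pre - grand_pre)
        for group, (pre, post, adjusted) in reported[skill].items()
    }

for skill in reported:
    slopes = implied_slopes(skill)
    print(skill, {group: round(b, 3) for group, b in slopes.items()})
```

If the two groups yield nearly the same implied slope for a given skill, the reported adjusted means are consistent with a single common-slope adjustment under the assumed group sizes.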

Building on this proficiency-related pattern, the findings of this study are consistent with previous research in demonstrating proficiency-related differences in AI-assisted language learning. For instance, Chen et al. (2023) reported that, in English learning using Google Assistant, high-proficiency learners were able to communicate more smoothly. Similarly, in the present study, low-proficiency learners often encountered communication breakdowns when interacting with ChatGPT, as their speech was not accurately recognized or they struggled to fully comprehend the AI’s responses, resulting in shorter and less sustained interactions. In contrast, high-proficiency learners were able to engage in more natural and continuous interactions with ChatGPT, which appears to have positively contributed to the development of their listening and speaking skills. Taken together, these findings suggest that learners’ proficiency level is a critical factor influencing the effectiveness of AI-assisted conversational practice.

2. Development of ChatGPT Literacy by Proficiency Group

To examine how students perceived ChatGPT-assisted speaking activities as a substitute for face-to-face conversations during class, pre- and post-surveys were conducted. Table 5 presents the results of the analysis, highlighting any significant differences in mean scores across the sub-factors between the pre- and post-survey stages. Specifically, the low proficiency group responded to each item before and after using the program, and the mean scores along with standard deviations were measured. These results provide a systematic analysis of how students’ perceptions shifted after engaging in ChatGPT-assisted speaking activities.

Changes in ChatGPT Literacy: Pre- and Post-Test Results for Low Proficiency Group

The pre- and post-survey results for the low-proficiency group revealed significant improvements across all measured domains. Technical Proficiency showed a substantial increase from a pre-test mean of 2.725 to a post-test mean of 3.565 (t = -4.947, p < .001). Similarly, Critical Evaluation improved from 3.085 to 3.575 (t = -3.245, p = .003). Communication Proficiency also increased significantly, rising from 3.006 to 3.686 (t = -3.993, p = .001), while Creative Application showed a notable improvement from 3.160 to 3.880 (t = -3.679, p = .001). Ethical Competence likewise demonstrated a significant increase, improving from 3.400 to 3.784 (t = -2.222, p = .036).

These findings indicate that low-proficiency learners experienced overall enhancement across all dimensions of ChatGPT literacy. In particular, substantial gains were observed in Technical Proficiency and Creative Application, suggesting that engagement with ChatGPT effectively supported both functional skill development and creative language use among lower-level learners.

As seen in Table 6, the pre- and post-survey results for the high-proficiency group indicated significant improvements in most dimensions of ChatGPT literacy. Technical Proficiency increased substantially from a pre-test mean of 2.625 to a post-test mean of 3.679 (t = -7.399, p < .001). Similarly, Critical Evaluation improved from 3.223 to 3.815 (t = -3.246, p = .004). Communication Proficiency also showed a significant increase, rising from 3.261 to 3.857 (t = -3.928, p = .001), while Creative Application increased from 3.312 to 3.942 (t = -3.609, p = .002). All of these gains were statistically significant. In contrast, Ethical Competence increased from 3.678 to 3.930; however, this change did not reach statistical significance (t = -1.482, p = .153). These findings indicate that high-proficiency learners achieved notable development in technical, critical, communicative, and creative competencies through the use of ChatGPT, while ethical competence remained relatively stable.

Changes in ChatGPT Literacy: Pre- and Post-Test Results for High Proficiency Group

These findings are consistent with Kim (2025), who similarly reported differential development across dimensions of ChatGPT literacy following AI-assisted language practice. Specifically, ChatGPT-assisted speaking practice led to significant improvements in four dimensions—technical proficiency, critical evaluation, communication skills, and creative application—whereas only minimal gains were observed in ethical competence.

This pattern suggests that ethical competence may not develop spontaneously through increased language proficiency or general AI use alone. Rather, learners may require explicit instructional scaffolding to critically reflect on ethical issues related to AI use, such as bias, responsibility, and appropriate reliance on AI-generated content. Accordingly, these findings underscore the importance of pedagogical interventions that intentionally integrate ethical dimensions into AI-assisted tasks. By embedding ethical reflection and discussion into task design, educators can promote more balanced development across all dimensions of ChatGPT literacy, ensuring that ethical awareness develops alongside technical and communicative competencies.

The pre-survey results indicated that although there were mean differences in ChatGPT literacy between the low- and high-proficiency groups, these differences were not statistically significant (Table 7). Across all dimensions, the high-proficiency group showed slightly higher mean scores than the low-proficiency group; however, no significant differences were found between the two groups at the p < .01 level.

Pre-Survey Results on ChatGPT Literacy by Proficiency Level

According to the post-survey results, both the low- and high-proficiency groups demonstrated comparable levels of ChatGPT literacy overall (Table 8). Although the high-proficiency group showed slightly higher mean scores across all dimensions, which included Technical Proficiency, Critical Evaluation, Communication Proficiency, Creative Application, and Ethical Competence, no statistically significant differences were found between the two groups at the p < .01 level. These findings suggest that while learners’ English proficiency may account for some variation in ChatGPT literacy, such differences are not statistically reliable. ChatGPT-related competencies may therefore develop relatively independently of general language proficiency.

Post-Survey Results on ChatGPT Literacy by Proficiency Level

3. Perceptions of ChatGPT-Assisted Learning by Proficiency Group

Students’ perceptions of ChatGPT-assisted learning were systematically analyzed through open-ended questions. The qualitative data, which focused on advantages, disadvantages, suggestions for improvement, usage preferences, and optimal usage time, were examined by dividing respondents into low- and high-proficiency groups to identify differences in their perspectives.

As seen in Table 9, the low-proficiency group reported positive learning experiences across multiple dimensions. First, learners noted that questions and conversations were tailored to their proficiency level, which reduced misunderstandings and enabled smoother interaction. In addition, the ability to speak without fear of making mistakes lowered anxiety and allowed learners to communicate more comfortably than in face-to-face settings. Receiving immediate and personalized feedback further enhanced the learning experience, creating an environment similar to one-on-one tutoring and contributing to improved learning outcomes.

Students’ Responses on Advantages of Using ChatGPT: Low Proficiency Group

Moreover, learners reported improvements in their English speaking ability and increased confidence, which enabled them to communicate more freely in English than before. By practicing listening and speaking simultaneously, learners were able to develop integrated language skills. They also had opportunities to broaden their learning experiences by engaging with diverse cultural content and information. Finally, interacting in a manner that resembled conversations with native speakers allowed learners to experience authentic communicative situations.

As seen in Table 10, the high-proficiency learner group experienced positive learning outcomes across multiple dimensions. First, through extensive speaking practice, learners improved both fluency and accuracy, enabling them to communicate more quickly and precisely than in interactions with human partners. In addition, receiving level-appropriate questions and support allowed for personalized interaction, which enhanced learning motivation. Learners also reported reduced anxiety, as they could speak comfortably even without perfect language use, experiencing less stress compared to conversations with other people.

Students’ Responses on Advantages of Using ChatGPT: High Proficiency Group

In terms of learning efficiency, participants noted that they were able to process a large number of questions and access specialized information within a short period of time, making their learning more time-efficient. Furthermore, exposure to diverse cultural content and information broadened their knowledge base and increased enjoyment in the learning process. By practicing listening and speaking simultaneously, learners improved their listening comprehension and developed more integrated language skills. Finally, the use of more advanced vocabulary and refined expressions contributed to greater linguistic precision.

Compared to the low-proficiency group, the high-proficiency group tended to perceive the effects of AI use from more cognitive and strategic perspectives. While the low-proficiency group responded positively to emotional stability, reduced anxiety in speaking, and level-appropriate interaction (Cheon, 2023; Xiao & Zhi, 2023), the high-proficiency group placed greater emphasis on improvements in speaking fluency (Noh, 2024), the use of advanced vocabulary, and learning efficiency (Cheon, 2023; Noh, 2024; Song & Kim, 2025). In addition, whereas the low-proficiency group highlighted psychological comfort and engagement through interaction with AI, the high-proficiency group more actively recognized the learning benefits associated with the quality of questions, personalized feedback (Xiao & Zhi, 2023), and expanded access to information (Caruana et al., 2022).

Although improvements in integrated listening and speaking skills were observed in both groups, the high-proficiency group perceived these gains as part of a more strategic learning process, whereas the low-proficiency group tended to emphasize the comfort and accessibility of the learning experience itself. These findings suggest that learners’ proficiency levels influence how they perceive the educational value of AI-assisted learning, with differing emphases placed on affective versus cognitive and strategic dimensions.

The low-proficiency group also identified several limitations in their use of AI during the learning process (Table 11). The most prominent issue was communication breakdowns, as learners reported that the AI sometimes failed to accurately understand their questions or repeatedly generated similar responses, which disrupted the flow of interaction. Concerns were also raised regarding the accuracy and reliability of responses, with some participants noting instances of repetitive or questionable information. In addition, some learners found the responses overly lengthy or lacking clarity, which made comprehension difficult. Difficulties related to speech recognition were also reported, occasionally hindering smooth communication. Furthermore, participants perceived limitations in maintaining a natural conversational flow, which restricted opportunities for spontaneous speaking practice. Challenges were also noted when vocabulary level or sentence complexity increased, making comprehension more demanding. Compared to interactions with real people, learners felt that emotional engagement and personal connection were limited when communicating with AI.

Students’ Responses on Disadvantages of Using ChatGPT: Low Proficiency Group

The high-proficiency group demonstrated a clear awareness of both the technical and cognitive limitations associated with the use of AI (Table 12). In particular, learners identified speech recognition issues, often caused by pronunciation differences or background noise, as one of the most significant challenges. They also reported instances in which the AI failed to accurately interpret the intent or context of their questions, resulting in responses that did not meet their expectations. Furthermore, participants perceived the level of natural interaction and immersion to be lower compared to real human communication, noting a tendency for interactions to shift toward listening-focused activities rather than balanced speaking engagement. Some learners also expressed concerns regarding the reliability of information, citing instances in which inconsistent or inaccurate responses were provided to similar questions. As a result, learners perceived that these limitations reduced the applicability of ChatGPT-assisted interactions to authentic communicative situations. They also highlighted the cognitive burden of processing ChatGPT’s continuous responses and emphasized that learning effectiveness was further reduced when their active engagement in the interaction was lacking.

Students’ Responses on Disadvantages of Using ChatGPT: High Proficiency Group

Overall, while high-proficiency learners recognized the pedagogical value of AI-assisted learning, they also demonstrated a clear awareness of its limitations, particularly in terms of interactional authenticity, contextual understanding, and engagement.

The two groups differed in how they perceived the limitations of ChatGPT-assisted learning. The low-proficiency group mainly reported practical difficulties, such as communication breakdowns, repetitive or unclear responses (Kim, 2025; Lee & Park, 2024; Song & Kim, 2025), speech recognition problems (Kim, 2025; Noh, 2024), and challenges in understanding complex or lengthy language, which limited spontaneous speaking practice. In contrast, the high-proficiency group showed a more critical awareness of AI-related constraints, emphasizing reduced interactional authenticity, contextual misunderstanding (Lee & Park, 2024; Song & Kim, 2025), information reliability issues, and increased cognitive load (Pokrivčáková, 2019; Woo & Choi, 2021). Overall, while both groups recognized the pedagogical value of ChatGPT, lower-level learners were more affected by accessibility and comprehension issues, whereas higher-level learners focused more on cognitive and qualitative limitations.

The low-proficiency group perceived that more structured question design and goal-oriented use of AI were necessary to enhance learning effectiveness (Table 13). They emphasized that clearer and more specific questions led to higher-quality responses, and suggested that proficiency-appropriate topics and speaking-focused activities would be more effective. Participants also noted that classroom noise hindered speech recognition, indicating that quieter or task-based learning environments would be more suitable. In addition, they stressed the importance of critically evaluating AI-generated responses and improving technical limitations. While recognizing constraints in achieving fully natural interaction, they suggested that more diverse question types and activities could further enhance learning outcomes.

Students’ Suggestions for Using ChatGPT: Low Proficiency Group

The high-proficiency group emphasized the need for conversation-centered task design to enhance the effectiveness of AI use (Table 14). They preferred topic-based interactions over simple Q & A formats and highlighted the importance of guidance in question formulation. Participants also noted the need to adjust task difficulty to learners’ speaking levels and to incorporate peer discussion for idea expansion. In addition, they stressed the importance of feedback on language use, minimizing noise-related recognition issues, and clarifying whether AI use was intended for information retrieval or conversational practice. Overall, they emphasized the need for strategic and goal-oriented instructional design when using AI.

Students’ Suggestions for Using ChatGPT: High Proficiency Group

Both low- and high-proficiency groups recognized the need to improve AI-assisted learning, but their priorities differed. The low-proficiency group emphasized accessibility and stability, highlighting the need for structured questions, level-appropriate tasks, and supportive learning conditions. In contrast, the high-proficiency group focused on enhancing interaction quality through conversation-based tasks, strategic questioning, peer interaction, and feedback on language use. Overall, while low-proficiency learners emphasized structural support for participation, high-proficiency learners stressed the importance of instructional design to deepen learning (Kim, 2025).

In terms of instructional preferences, both proficiency groups showed a clear tendency to favor a blended approach that combines AI with face-to-face interaction. Specifically, in the low-proficiency group, 15 students preferred blended use, while 8 chose AI-only practice and 2 opted for face-to-face interaction. In the high-proficiency group, 19 students favored blended use, with only 3 preferring face-to-face interaction and 1 selecting AI alone. Willingness to use AI for future English learning was also high, with 23 low-proficiency and 21 high-proficiency learners expressing positive intentions.
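Because the two groups differ in size, the preference counts above are easier to compare as within-group percentages. The sketch below performs this conversion from the reported counts; the dictionary keys and the function name are illustrative labels, and the percentages are derived here rather than reported in the study.

```python
# Preference counts as reported in the text.
preferences = {
    "low": {"blended": 15, "ai_only": 8, "face_to_face": 2},
    "high": {"blended": 19, "face_to_face": 3, "ai_only": 1},
}

def shares(counts):
    """Convert raw counts to percentages of the group total (one decimal place)."""
    total = sum(counts.values())
    return {option: round(100 * n / total, 1) for option, n in counts.items()}

for group, counts in preferences.items():
    print(group, sum(counts.values()), shares(counts))
```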

These findings align with previous studies, which have demonstrated that the use of ChatGPT fosters both intrinsic and extrinsic motivation by stimulating curiosity and sustaining learners’ interest in language learning activities (Aydın Yıldız, 2023; Kim & Kim, 2024). Regarding optimal usage time, most learners selected 5–10 minutes (low: 21; high: 18), while fewer preferred 10–15 minutes (low: 3; high: 4) or 15–20 minutes (low: 1; high: 1).

With respect to perceived learning effectiveness, both groups identified speaking and listening as the most positively affected areas. In the low-proficiency group, speaking (11) and listening (7) were rated highest, followed by vocabulary (5), grammar (1), and writing (1). Similarly, the high-proficiency group reported the greatest benefits in speaking (12) and listening (11), with no notable responses for other skills. Overall, learners across proficiency levels viewed ChatGPT as a supportive tool rather than a primary instructional medium, particularly effective for short, focused, and communication-oriented learning activities.

V. CONCLUSION

As generative artificial intelligence is being rapidly adopted in language education, there is growing interest in how AI-mediated interaction can support the development of learners’ oral communicative competence and AI literacy. However, empirical evidence remains limited regarding the extent to which such learning environments are effective for learners at different proficiency levels and whether their impact extends beyond immediate linguistic performance to broader dimensions of AI literacy. To address this gap, the present study explored the pedagogical potential of ChatGPT-assisted conversational practice and offered empirical insights by examining its effects on learners’ speaking and listening skills, multidimensional ChatGPT literacy, and proficiency-specific perceptions. The findings are summarized according to the three research questions.

With respect to oral communication skills, the results indicate that ChatGPT-assisted practice had a positive impact on both low- and high-proficiency learners. Both groups demonstrated clear improvements in speaking and listening skills after participating in the ChatGPT-assisted activities. Low-proficiency learners experienced meaningful gains in expressing opinions and improving listening comprehension, while high-proficiency learners also showed significant enhancement in both domains. These findings suggest that the effectiveness of oral practice with ChatGPT extends across proficiency levels.

The analysis of improvement patterns indicated proficiency-related differences in learning outcomes. Higher-proficiency learners showed greater gains in speaking and listening performance, as reflected in the pre-test-adjusted post-test means (Table 4). Although the extent and focus of improvement differed by proficiency level, the findings suggest that ChatGPT-assisted practice contributed to the development of oral communication skills, particularly among higher-proficiency learners (Chen et al., 2023; Mingyan et al., 2025).

Regarding ChatGPT literacy, the survey results revealed that both proficiency groups improved across multiple dimensions following the intervention. The low-proficiency group demonstrated significant gains in all five areas, particularly in technical proficiency and creative application, suggesting that ChatGPT facilitated both functional competence and creative language use. The high-proficiency group also showed significant improvements in technical, critical, communicative, and creative competencies, while the change in ethical competence was not statistically significant (Table 6). These findings are largely consistent with those reported by Kim (2025), except for the dimension of ethical competence, which exhibited a different pattern. This divergence highlights the need for educators to provide explicit instruction and guidance on ethical considerations when students engage with AI tools, and it underscores the importance of integrating ethics-focused education into AI-supported learning environments.

Importantly, comparisons between groups revealed no statistically significant differences in overall ChatGPT literacy either before or after the intervention. After engaging in ChatGPT-assisted speaking practice, both groups reached comparable levels of ChatGPT literacy. These findings suggest that ChatGPT-related competencies can develop independently of learners’ general English proficiency, highlighting the broad applicability of ChatGPT as a learning-support tool.

Learners’ perceptions of ChatGPT-assisted learning were generally positive across proficiency levels, though the focus of their perceptions differed. Low-proficiency learners emphasized affective and accessibility-related benefits, such as reduced anxiety, level-appropriate questioning, immediate feedback, and increased willingness to participate. They also reported improved confidence and integrated listening–speaking ability, while identifying limitations such as communication breakdowns, repetitive or unclear responses, speech recognition errors, and limited conversational naturalness. Accordingly, they suggested structured questioning, proficiency-appropriate tasks, quiet learning environments, and critical evaluation of AI responses as key improvements.

In contrast, high-proficiency learners highlighted cognitive and strategic benefits, including improved fluency and accuracy, use of advanced vocabulary, learning efficiency, and personalized interaction. At the same time, they pointed out limitations related to speech recognition, contextual misunderstanding, reduced interactional authenticity, information reliability, and cognitive load. To enhance effectiveness, they emphasized conversation-centered task design, strategic questioning, peer discussion, feedback on language use, and task difficulty adjustment.

Taken analytically, these differences indicate that low-proficiency learners tend to prioritize emotional stability and accessibility, whereas high-proficiency learners place greater emphasis on cognitive depth and strategic learning processes. This suggests that learners’ perceptions of the educational value of ChatGPT-assisted learning vary according to proficiency level. In line with this, Kim and Park (2023) underscore the potential of integrating artificial intelligence tools such as ChatGPT into English teaching while emphasizing the importance of differentiated approaches based on students’ linguistic abilities. Together, these findings highlight the necessity of tailoring AI-supported instruction to diverse learner profiles.

In conclusion, ChatGPT-assisted learning demonstrated positive effects on linguistic skills, ChatGPT literacy, and learner perceptions across proficiency levels. While both low- and high-proficiency learners benefited from oral practice with ChatGPT, the nature of these benefits differed. These findings underscore the importance of differentiated instructional design that aligns ChatGPT use with learners’ proficiency levels. When appropriately designed, ChatGPT can serve as an effective supplementary tool that supports communicative language development, digital literacy, and learner engagement in EFL contexts (Kim, 2025; Song & Kim, 2025).

Despite its contributions, this study has several limitations. First, the relatively small sample size limits the generalizability of the findings, and future research should involve larger and more diverse learner populations. Second, the short intervention period makes it difficult to evaluate the long-term effects of ChatGPT-mediated learning, highlighting the need for longitudinal follow-up studies. Third, the speaking assessment focused primarily on opinion-expression tasks, which constrains the reliability and scope of the measure and calls for the development of more comprehensive and validated assessment instruments. Finally, because all participants engaged in ChatGPT-assisted activities alongside their regular classroom instruction and no control group was included, the effect of ChatGPT-assisted conversational practice cannot be isolated from that of regular instruction, and the observed learning gains cannot be attributed solely to ChatGPT use.

To address these limitations, future research should adopt more rigorous experimental designs that incorporate control groups to better isolate the effects of AI use. In addition, research needs to move beyond a sole emphasis on speaking practice by developing a broader range of AI-assisted tasks that foster diverse language skills, including vocabulary development, grammatical accuracy, and writing proficiency. Employing more in-depth analytic approaches will allow for a richer understanding of learners’ cognitive and communicative processes during interaction with AI. Ultimately, such work could inform the development of concrete, pedagogically grounded instructional frameworks for the effective integration of AI into foreign language education.

References

Ai H.. 2017;Providing graduated corrective feedback in an intelligent computer-assisted language learning environment. ReCALL 29(3):313–334. https://doi.org/10.1017/S0958344017000034.
AlAfnan M. A., Dishari S., Jovic M., Lomidze K.. 2023;ChatGPT as an educational tool: Opportunities, challenges, and recommendations for communication, business writing, and composition courses. Journal of Artificial Intelligence and Technology 3(2):60–68. https://doi.org/10.37965/jait.2023.0184.
Ali J. K. M., Shamsan M. A., Hezam T. A., Mohammed A. A. Q.. 2023;Impact of ChatGPT on learning motivation: Teachers and students’ voices. Journal of English Studies in Arabia Felix 2(1):41–49. https://doi.org/10.56540/jesaf.v2i1.51.
Aydın Yıldız T.. 2023;The impact of ChatGPT on language learners’ motivation. Journal of Teacher Education and Lifelong Learning 5(2):582–597. https://doi.org/10.51535/tell.1314355.
Baidoo-Anu D., Ansah L. O.. 2023;Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. Journal of AI 7(1):52–62. https://doi.org/10.61969/jai.1337500.
Caruana, N., Moffat, R., Miguel-Blanco, A., & Cross, E. S. (2022). Talk, listen and keep me company: A mixed methods analysis of children’s perspectives towards robot reading companions. In C. Bartneck & T. Kanda (Eds.), HAI ’22: Proceedings of the 10th International Conference on Human-Agent Interaction (pp. 239–241). Association for Computing Machinery. https://doi.org/10.1145/3527188.3563917.
Chen H. H.-J., Yang C. T.-Y., Lai K. K.-W.. 2023;Investigating college EFL learners’ perceptions toward the use of Google Assistant for foreign language learning. Interactive Learning Environments 31(3):1335–1350. https://doi.org/10.1080/10494820.2020.1836506.
Cheon S.-M.. 2023;Educational application of ChatGPT: Its impact on Korean university students’ English speaking skills. The Journal of Linguistic Science 107:469–496. https://doi.org/10.21296/jls.2023.12.107.469.
Choe Y.. 2023;Exploring ChatGPT’s impact on the English summary writing of pre-service English teachers. Multimedia-Assisted Language Learning 26(2):104–132. https://doi.org/10.15702/mall.2023.26.2.104.
Crosthwaite P., Baisa V.. 2023;Generative AI and the end of corpus-assisted data-driven learning? Not so fast! Applied Corpus Linguistics 3:Article 100066. https://doi.org/10.1016/j.acorp.2023.100066.
Dizon G.. 2020;Evaluating intelligent personal assistants for L2 listening and speaking development. Language Learning & Technology 24(1):16–26. https://doi.org/10125/44705.
Gass, S. M., Behney, J., & Plonsky, L. (2020). Second language acquisition: An introductory course (5th ed.). Routledge.
Han S. H.. 2023;Korean speaking study using conversational generative AI (artificial intelligence) ChatGPT: Based on role-playing, from using Talk-to-ChatGPT to utilizing AIPRM-for-ChatGPT. The Journal of Learner-Centered Curriculum and Instruction 23(18):651–674. https://doi.org/10.22251/jlcci.2023.23.18.651.
Im H.. 2023;A study on college students’ perspectives and attitudes toward the use of ChatGPT in English classes. Culture and Convergence 45(9):1335–1342. https://doi.org/10.33645/cnc.2023.09.45.09.1335.
Jeong N.-S.. 2024;Exploring the effects of ChatGPT on university students’ English writing skills and their perceptions. Multimedia-Assisted Language Learning 27(1):78–95. https://doi.org/10.15702/mall.2024.27.1.78.
Kao C.-W.. 2020;The effect of a digital game-based learning task on the acquisition of the English article system. System 95:Article 102373. https://doi.org/10.1016/j.system.2020.102373.
Khoshafah, F. (2023). ChatGPT for Arabic–English translation: Evaluating the accuracy. Research Square. https://doi.org/10.21203/rs.3.rs-2814154/v2.
Khoso A. K., Honggang W., Darazi M. A.. 2025;Empowering creativity and engagement: The impact of generative artificial intelligence usage on Chinese EFL students’ language learning experience. Computers in Human Behavior Reports 18:Article 100627. https://doi.org/10.1016/j.chbr.2025.100627.
Kim H.-S.. 2025;AI-Supported language learning for developing English listening and speaking skills and ChatGPT literacy. Korean Journal of English Language and Linguistics 25:1353–1377. https://doi.org/10.15738/kjell.25.202510.1353.
Kim H.-S., Kim N.-Y.. 2024;Exploring the effects of ChatGPT on video-making projects in an EFL course. Asia-Pacific Journal of Convergent Research Interchange 10(6):737–750. https://doi.org/10.47116/apjcri.2024.06.50.
Kim N.-Y., Cha Y., Kim H.-S.. 2025;Exploring ChatGPT literacy in EFL reading education. Multimedia-Assisted Language Learning 28(1):39–60. https://doi.org/10.15702/mall.2025.28.1.39.
Kim N.-Y., Kim H.-S.. 2025;The impact of AI-based English classes on students’ multimodal literacy, critical thinking, and creative thinking. Convergence Studies in English Language & Literature 10(2):57–85. https://doi.org/10.55986/cell.2025.10.2.57.
Kim S., Park S. H.. 2023;Young Korean EFL learners’ perception of role-playing scripts: ChatGPT vs. textbooks. Korean Journal of English Language and Linguistics 23:1136–1153. https://doi.org/10.15738/kjell.23.202312.1136.
Lee S., Park G.. 2024;Development and validation of ChatGPT literacy scale. Current Psychology 43:18992–19004. https://doi.org/10.1007/s12144-024-05000-x.
Lee U., Jeong Y., Koh J., Byun G., Lee Y., Hwang Y., Kim H., Lim C.. 2024;Can ChatGPT be a debate partner? Developing ChatGPT-based application “DEBO” for debate education: Findings and limitations. Educational Technology & Society 27(2):321–346. https://doi.org/10.30191/ETS.202404.
Li H., Xiao R., Nieu H., Tseng Y.-J., Liao G.. 2025;“From unseen needs to classroom solutions”: Exploring AI literacy challenges and opportunities with a project-based learning toolkit in K–12 education. Proceedings of the AAAI Conference on Artificial Intelligence 39(28):29145–29152. https://doi.org/10.1609/aaai.v39i28.35187.
Liu G., Ma C.. 2023;Measuring EFL learners’ use of ChatGPT in informal digital learning of English based on the technology acceptance model. Innovation in Language Learning and Teaching 18:125–138. http://doi.org/10.1080/17501229.2023.2240316.
Liu W.. 2025;Language teacher AI literacy: Insights from collaborations with ChatGPT. Journal of China Computer-Assisted Language Learning 5(2):287–316. https://doi.org/10.1515/jccall-2024-0030.
Ma Q., Crosthwaite P., Sun D., Zou D.. 2024;Exploring ChatGPT literacy in language education: A global perspective and comprehensive approach. Computers and Education: Artificial Intelligence 7:Article 100278. https://doi.org/10.1016/j.caeai.2024.100278.
Macmillan. (2018). Skillful 2: Listening & speaking student’s book pack. Macmillan Education.
Mingyan M., Noordin N., Razali A. B.. 2025;Improving EFL speaking performance among undergraduate students with an AI-powered mobile app in after-class assignments: An empirical investigation. Humanities and Social Sciences Communications 12:Article 370. https://doi.org/10.1057/s41599-025-04688-0.
Muniandy J., Selvanathan M.. 2025;ChatGPT, a partnering tool to improve ESL learners’ speaking skills: Case study in a public university, Malaysia. Language and Education 43(1):1–15. https://doi.org/10.1177/01447394241230152.
Noh Y.. 2024;The impact of ChatGPT-assisted learning on English speaking proficiency of Korean university students. The New Studies of English Language & Literature 89:89–114. https://doi.org/10.21087/nsell.2024.11.89.89.
Omaggio, A. (1986). Teaching language in context: Proficiency-oriented instruction. Heinle & Heinle.
Park H.-Y.. 2023;Application of ChatGPT for an English learning platform. Journal of English Teaching through Movies and Media 24(3):30–48. https://doi.org/10.16875/stem.2023.24.3.30.
Pokrivčáková S.. 2019;Preparing teachers for the application of AI-powered technologies in foreign language education. Journal of Language and Cultural Education 7(3):135–153. https://doi.org/10.2478/jolace-2019-0025.
Richards, J. C., Hull, J., & Proctor, S. (2021). Interchange intro (5th ed.). Cambridge University Press.
Song E., Kim H.-S.. 2025;Investigating the effects of ChatGPT-assisted English speaking practice based on student engagement levels. Research Institute of Curriculum & Instruction 29(5):298–313. https://doi.org/10.24231/rici.2025.29.5.298.
Woo J. H., Choi H.. 2021;Systematic review for AI-based language learning tools. Journal of Digital Contents Society 22(11):1783–1792. https://doi.org/10.9728/dcs.2021.22.11.1783.
Xiao Y., Zhi Y.. 2023;An exploratory study of EFL learners’ use of ChatGPT for language learning tasks: Experience and perceptions. Languages 8(3):Article 212. https://doi.org/10.3390/languages8030212.
Yilin W., Jishuang L., Yunus M. M.. 2023;Chances and challenges of EFL teaching powered by ChatGPT on developing students’ critical thinking. Academic Research in Business and Social Sciences 13(12):3922–3935. https://doi.org/10.6007/IJARBSS/v13-i12/20385.
Yu Y., Yoo H.. 2024;The learning effects of Korean writing through generative AI ChatGPT for Chinese learners. The Journal of Learner-Centered Curriculum and Instruction 24(9):361–377. https://doi.org/10.22251/jlcci.2024.24.9.361.
Zhai C., Wibowo S., Li L. D.. 2024;The effects of over-reliance on AI dialogue systems on students’ cognitive abilities: A systematic review. Smart Learning Environments 11:Article 28. https://doi.org/10.1186/s40561-024-00316-7.

Fig. 1.

A Screenshot of Sample Conversation: Low Proficiency Group

Fig. 2.

A Screenshot of Sample Conversation: High Proficiency Group

Table 1.

Demographic Information

| Group | Age | Major | Overseas Exp. | Reasons for Learning English |
| --- | --- | --- | --- | --- |
| Low Proficiency (n = 25) | 19-24 | College of Future Industry Convergence (13); College of Natural Sciences (7); College of Arts and Physical Education (5) | Yes (2); No (23) | Grade (16); Career (10); Travel (8); Global Language (6); Interest (4); Self-improvement (2) |
| High Proficiency (n = 23) | 19-23 | College of Natural Sciences (13); College of Future Industry Convergence (6); College of Arts and Physical Education (1); College of Humanities (1); Open Major (2) | Yes (3); No (20) | Grade (16); Global Language (11); Travel (9); Career (7); Interest (5) |

Table 2.

Result of Paired Sample t-Tests on Speaking & Listening: Low Proficiency Group

Skill Test M SD df t p
Speaking Pre .960 .763 24 -11.390 .000**
Post 2.140 .784
Listening Pre 10.440 3.318 24 -3.689 .001**
Post 12.200 3.189
** p < .01.

Table 3.

Result of Paired Sample t-Tests on Speaking & Listening: High Proficiency Group

Skill Test M SD df t p
Speaking Pre 2.435 .529 22 -14.378 .000**
Post 3.761 .541
Listening Pre 17.739 3.805 22 -4.507 .000**
Post 20.391 3.071
** p < .01.
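
Tables 2 and 3 report t and df for each paired comparison but no standardized effect size. The following is a minimal sketch of how one could be recovered from the reported values, assuming the standard paired-design relation d_z = |t| / sqrt(n) with n = df + 1. The function name and the choice of d_z are illustrative assumptions and are not part of the study's reported analysis.

```python
import math


def paired_effect_size(mean_pre, mean_post, t_value, n):
    """Recover gain, SD of the paired differences, and Cohen's d_z
    from the summary statistics of a paired-samples t-test."""
    gain = mean_post - mean_pre
    # t = gain / (sd_diff / sqrt(n))  =>  sd_diff = |gain| * sqrt(n) / |t|
    sd_diff = abs(gain) * math.sqrt(n) / abs(t_value)
    # Standardized mean difference for paired designs (assumed metric)
    d_z = abs(t_value) / math.sqrt(n)
    return gain, sd_diff, d_z


# (pre M, post M, t, n) copied from Table 2 (low, n = 25) and Table 3 (high, n = 23)
reported = {
    "Low / Speaking": (0.960, 2.140, -11.390, 25),
    "Low / Listening": (10.440, 12.200, -3.689, 25),
    "High / Speaking": (2.435, 3.761, -14.378, 23),
    "High / Listening": (17.739, 20.391, -4.507, 23),
}

for label, values in reported.items():
    gain, sd_diff, d_z = paired_effect_size(*values)
    print(f"{label}: gain={gain:.3f}, sd_diff={sd_diff:.3f}, d_z={d_z:.3f}")
```

Because d_z depends on the correlation between pre- and post-test scores, it should not be compared directly with a between-groups effect size.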

Table 4.

Results of ANCOVA for Speaking & Listening

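| Skill | Group | N | Pre-test M | Pre-test SD | Post-test M | Post-test SD | Adjusted post-test M | SE | F | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Speaking | Low | 25 | .960 | .763 | 2.140 | .784 | 2.677a | .118 | 6.134 | .017* |
| Speaking | High | 23 | 2.435 | .529 | 3.761 | .541 | 3.177a | .125 | | |
| Listening | Low | 25 | 10.440 | 3.318 | 12.200 | 3.189 | 14.368a | .555 | 15.195 | .000** |
| Listening | High | 23 | 17.739 | 3.805 | 20.391 | 3.071 | 18.035a | .588 | | |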

Note. ANCOVA was conducted with the pre-test score entered as a covariate.

* p < .05.

** p < .01.
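
The adjusted means in Table 4 follow the usual ANCOVA adjustment, in which each group's post-test mean is shifted by b × (group pre-test mean − grand pre-test mean), where b is the pooled within-group regression slope. The slope is not reported in the table, so the sketch below backs it out from the reported means as a consistency check. Group sizes (25 low, 23 high) are taken from Table 1, and all function and variable names are illustrative assumptions.

```python
def grand_mean(means, sizes):
    """Size-weighted mean of the group pre-test means."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


def implied_slope(pre, post, adjusted, grand_pre):
    """Solve adjusted = post - b * (pre - grand_pre) for b."""
    return (post - adjusted) / (pre - grand_pre)


def implied_slopes(skill, sizes):
    """Return the slope implied by each group's reported means."""
    g = grand_mean(skill["pre"], sizes)
    return tuple(
        implied_slope(pre, post, adj, g)
        for pre, post, adj in zip(skill["pre"], skill["post"], skill["adjusted"])
    )


sizes = (25, 23)  # (low, high), from Table 1
# Means copied from Table 4, ordered (low, high)
speaking = {"pre": (0.960, 2.435), "post": (2.140, 3.761), "adjusted": (2.677, 3.177)}
listening = {"pre": (10.440, 17.739), "post": (12.200, 20.391), "adjusted": (14.368, 18.035)}

for name, skill in (("Speaking", speaking), ("Listening", listening)):
    b_low, b_high = implied_slopes(skill, sizes)
    print(f"{name}: implied slope low={b_low:.3f}, high={b_high:.3f}")
```

If the two groups yield nearly the same slope within a skill, the reported adjusted means are internally consistent with a common-slope ANCOVA.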

Table 5.

Changes in ChatGPT Literacy: Pre- and Post-Test Results for Low Proficiency Group

Factor Survey M SD df t p
Factor 1 Pre 2.725 .705 24 -4.947 .000**
Post 3.565 .970
Factor 2 Pre 3.085 .689 24 -3.245 .003**
Post 3.575 .931
Factor 3 Pre 3.006 .716 24 -3.993 .001**
Post 3.686 .889
Factor 4 Pre 3.160 .711 24 -3.679 .001**
Post 3.880 .927
Factor 5 Pre 3.400 .633 24 -2.222 .036*
Post 3.784 .957

Note. Factor 1: Technical Proficiency (TP), Factor 2: Critical Evaluation (CE), Factor 3: Communication Proficiency (CP), Factor 4: Creative Application (CA), Factor 5: Ethical Competence (EC).

* p < .05.

** p < .01.

Table 6.

Changes in ChatGPT Literacy: Pre- and Post-Test Results for High Proficiency Group

Factor Survey M SD df t p
Factor 1 Pre 2.625 .831 22 -7.399 .000**
Post 3.679 .756
Factor 2 Pre 3.223 .961 22 -3.246 .004**
Post 3.815 .652
Factor 3 Pre 3.261 .899 22 -3.928 .001**
Post 3.857 .756
Factor 4 Pre 3.312 1.022 22 -3.609 .002**
Post 3.942 .708
Factor 5 Pre 3.678 .904 22 -1.482 .153
Post 3.930 .710

Note. Factor 1: Technical Proficiency (TP), Factor 2: Critical Evaluation (CE), Factor 3: Communication Proficiency (CP), Factor 4: Creative Application (CA), Factor 5: Ethical Competence (EC).

** p < .01.

Table 7.

Pre-Survey Results on ChatGPT Literacy by Proficiency Level

Factor Group M SD df t p
Factor 1 Low 2.725 .705 46 .451 .700
High 2.625 .831
Factor 2 Low 3.085 .689 46 -.575 .322
High 3.223 .961
Factor 3 Low 3.006 .716 46 -1.097 .381
High 3.261 .899
Factor 4 Low 3.160 .711 46 -.601 .197
High 3.312 1.022
Factor 5 Low 3.400 .633 46 -1.244 .320
High 3.678 .904

Note. Factor 1: Technical Proficiency (TP), Factor 2: Critical Evaluation (CE), Factor 3: Communication Proficiency (CP), Factor 4: Creative Application (CA), Factor 5: Ethical Competence (EC).
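
The t values in Table 7 can be approximately reproduced from the reported means and standard deviations. The following is a minimal sketch, assuming a Student's t-test with pooled variance (consistent with the reported df = 46) and the group sizes given in Table 1. The function name is illustrative, and small discrepancies are expected because the inputs are rounded to three decimals.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic (equal variances assumed)
    computed from summary statistics only."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# (low M, low SD, high M, high SD) copied from Table 7
pre_survey = {
    "Factor 1": (2.725, 0.705, 2.625, 0.831),
    "Factor 2": (3.085, 0.689, 3.223, 0.961),
    "Factor 3": (3.006, 0.716, 3.261, 0.899),
    "Factor 4": (3.160, 0.711, 3.312, 1.022),
    "Factor 5": (3.400, 0.633, 3.678, 0.904),
}

for factor, (m_low, sd_low, m_high, sd_high) in pre_survey.items():
    t, df = pooled_t(m_low, sd_low, 25, m_high, sd_high, 23)
    print(f"{factor}: t({df}) = {t:.3f}")
```

The same function applies to the post-survey comparison in Table 8 by substituting the post-survey means and standard deviations.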

Table 8.

Post-Survey Results on ChatGPT Literacy by Proficiency Level

Factor Group M SD df t p
Factor 1 Low 3.565 .970 46 -.453 .208
High 3.679 .756
Factor 2 Low 3.575 .931 46 -1.027 .060
High 3.815 .652
Factor 3 Low 3.686 .889 46 -.716 .610
High 3.857 .756
Factor 4 Low 3.880 .927 46 -.259 .476
High 3.942 .708
Factor 5 Low 3.784 .957 46 -.598 .100
High 3.930 .710

Note. Factor 1: Technical Proficiency (TP), Factor 2: Critical Evaluation (CE), Factor 3: Communication Proficiency (CP), Factor 4: Creative Application (CA), Factor 5: Ethical Competence (EC).

Table 9.

Students’ Responses on Advantages of Using ChatGPT: Low Proficiency Group

Category # Example Response
Level-Appropriate Interaction 6 “I can receive questions that match my English level.” “There were almost no misunderstandings due to differences in skill levels.”
Reduced Anxiety & Affective Support 6 “I liked that I could speak without worrying about making mistakes.” “I could have a conversation more comfortably than in person.”
Improved Speaking Fluency & Confidence 6 “My speaking skills improved, and I gained confidence.” “I was able to have English conversations freely, which I usually don’t do.”
Personalized Feedback & One-on-One Learning 5 “It felt like having a one-on-one teacher.” “I liked that unknown expressions were explained immediately.”
Listening & Integrated Language Skills 5 “I became more accustomed to listening to English.” “I was able to practice speaking and listening at the same time.”
Access to Information & Learning Expansion 5 “It was fun to learn about the culture of other countries.” “It became a new way to access various information.”
Authentic Communication Experience 4 “It felt like talking with a foreigner.” “The conversation flowed naturally.”

Table 10.

Students’ Responses on Advantages of Using ChatGPT: High Proficiency Group

Category # Example Response
Speaking Practice & Fluency, Accuracy Development 6 “I practiced speaking a lot.” “I was able to speak faster and more accurately than when talking with a partner.”
Personalized Interaction & Adaptive Support 6 “I was able to receive questions that matched my English level.” “It was refreshing that questions were generated tailored to the individual.”
Affective Comfort & Reduced Anxiety 6 “I could speak comfortably even if my English wasn’t perfect.” “It was less stressful than speaking with other people.”
Learning Efficiency & Time Effectiveness 5 “Time was saved, and I could quickly get professional information.” “I was able to handle many questions in a short time.”
Knowledge Expansion & Engagement 5 “It was fun to learn about the culture of other countries.” “The process of gaining new information was interesting.”
Listening Skill Improvement 4 “Focusing on listening seemed to improve my skills.” “I was able to improve speaking and listening at the same time.”
Integrated Language Skill Development 4 “I was able to learn English speaking and listening effectively.” “It helped me pick up new expressions and practice reading when using ChatGPT.”
Advanced Vocabulary & Linguistic Precision 4 “I was able to use more advanced vocabulary than when talking with people.”

Table 11.

Students’ Responses on Disadvantages of Using ChatGPT: Low Proficiency Group

Category # Example Response
Communication Breakdown 8 “There were many cases where it gave strange answers or ignored questions because it didn’t understand.” “The conversation gets interrupted when GPT doesn’t understand or repeats the same thing.”
Limited Accuracy & Reliability 6 “Sometimes the accuracy was low, so it was hard to fully trust it.” “The answers felt like they were repeated within a similar range.”
Overly Long or Unfocused Responses 6 “I wanted short answers, but it speaks too long.” “It would be better if it summarized the information.”
Speech Recognition Problems 6 “I spoke in English, but it was recognized as Korean or Japanese.” “Conversation was difficult because pronunciation wasn’t recognized well.”
Limited Interaction Flow 5 “If the answer is a little delayed, the conversation gets cut off.” “It was hard to move to the next step because I had to keep asking questions.”
Limited Speaking Development 4 “Since it only allowed short answers, there was a limit to improving speaking skills.” “It was difficult to freely express what I wanted to say.”
Vocabulary & Level Control Issues 4 “It was difficult to understand because there were many unknown words.” “It was hard to understand when the sentences were too long or technical.”
Limitations Compared to Human Interaction 3 “There are limits to sharing personal stories like with a person.” “I felt it was different from talking with a real person.”

Table 12.

Students’ Responses on Disadvantages of Using ChatGPT: High Proficiency Group

Category # Example Response
Speech Recognition Limitations 8 “It was frustrating because it couldn’t understand me properly due to my pronunciation or background noise.” “It didn’t recognize even slightly unclear pronunciation.”
Difficulty in Understanding Intent & Context 7 “Sometimes it gives unexpected answers because it doesn’t understand well.” “If the question isn’t clear, it’s hard to get the answer you want.”
Lack of Natural Interaction & Engagement 6 “It wasn’t fun because I was talking to a machine, not a person.” “I end up looking at my phone instead of using ChatGPT.”
Reduced Speaking Opportunities 5 “I spent more time listening than speaking.” “It was closer to listening practice than speaking practice.”
Inconsistency & Reliability Issues 5 “It gave different answers to the same question, which reduced trust.” “Sometimes it presents incorrect information as if it were true.”
Limited Real-Life Applicability 4 “Many answers were difficult to use in real life.” “It was different from actual conversations with people.”
Cognitive Load & Attention Issues 4 “It didn’t recognize well when multiple people spoke.” “It was difficult because I had to listen, interpret, and think at the same time.”
Dependence on Learner Initiative 3 “I don’t gain anything if I don’t participate actively.”
Lack of Perceived Uniqueness 3 “I didn’t feel much difference compared to conversations with a partner.”

Table 13.

Students’ Suggestions for Using ChatGPT: Low Proficiency Group

Category # Example Response
Need for Structured Question Design 7 “I think questions should be as specific as possible.” “The answers change depending on how you ask questions.”
Level-Appropriate & Purpose-Oriented Use 6 “It would be good to choose topics considering speaking skills.” “It would be best used mainly for conversation practice.”
Improvement of Learning Environment 6 “It doesn’t recognize well in class because it’s noisy.” “It would be better to do it quietly alone as homework.”
Need for Critical Evaluation of AI Output 5 “It would be good if I had the ability to evaluate whether information is correct.” “Sometimes it gives strange answers, so verification is needed.”
Emphasis on Speaking-Oriented Tasks 5 “It would be good to use it mainly for conversation.” “I hope it helps improve speaking skills.”
Technical Limitations & System Constraints 4 “It doesn’t recognize well because of surrounding noise.” “GPT-3.5 has many errors, so other models should be considered.”
Expansion of Task Variety 4 “It would be good if there were a variety of questions.”
Recognition of Practical Limits 3 “It’s difficult to have a conversation like with a real person.”

Table 14.

Students’ Suggestions for Using ChatGPT: High Proficiency Group

Category # Example Response
Conversation-Oriented Task Design 8 “It would be good to continue the conversation with a single question.” “It would be better if it proceeded in a conversational format rather than simple questions.”
Guidance on Question Design 7 “You can get better answers if you ask specific questions.” “It would be helpful to be guided on how to ask questions.”
Level-Appropriate Task Design 6 “Topics that consider speaking skills are needed.” “Difficult questions made it hard to continue the conversation.”
Integration of Peer Interaction 6 “It would be good to discuss in groups first before asking questions.” “There needs to be time to compare answers with friends.”
Need for Grammar & Expression Feedback 5 “It would be good to receive feedback on phrasing and grammar.”
Classroom Environment & Management 5 “It doesn’t recognize well because the classroom is noisy.” “I think I could concentrate better if I did it at home.”
Clarifying Purpose of AI Use 4 “It was unclear whether the purpose was to obtain information or to have a conversation.”
Diversification of Topics & Prompts 4 “It would be nice to be able to talk about a variety of topics.”
Critical Use of AI Information 3 “I think I need to judge for myself whether the answers given by AI are correct.”
Technical Limitations Awareness 3 “Sometimes, I have difficulty understanding the pronunciation.”