Mun: EFL Learners’ English Writing Feedback and Their Perception of Using ChatGPT

Abstract

This study investigates EFL college learners’ use of AI-assisted feedback in their English writing. Specifically, the research explores how writing assisted by ChatGPT improves students’ writing skills in comparison with peer feedback. Additionally, the study examines learners’ perceptions of using ChatGPT to edit English writing. Participants were tasked with submitting their presentation scripts before their speaking exams. They received an instructional session on how to use the tool effectively and were given useful prompts for ChatGPT. The collected writings were analyzed holistically by two experienced EFL instructors and analytically with Grammarly (2022), a free online grammar and spelling checker, to identify their characteristics. The analysis revealed a significant difference in holistic scores covering content and organization. The experimental group, which used the AI tool, produced significantly fewer grammatical and lexical errors. However, no significant difference between the groups was found in word count or in the use of vocabulary types. Additionally, participants’ reflections indicate a positive attitude toward the use of AI-assisted feedback in English writing, although there were some concerns about reliability and over-reliance. Based on these findings, the study suggests pedagogical implications for the effective integration of AI assistance in English writing.

I. INTRODUCTION

Rapidly developing technologies such as artificial intelligence (AI) have already affected society as a whole, and the convergence of AI and education has garnered significant attention for its numerous benefits for learners. Learners of the current digital era have grown up with technology in daily life; nearly all language learners use AI tools for language learning purposes, and they use them most frequently for L2 writing tasks (Valijärvi & Tarsoly, 2019).
Proficient writing skills empower learners to communicate their ideas effectively, articulate thoughts clearly, and achieve academic excellence across various professional domains (Yoon, 2011). However, English as a Foreign Language (EFL) writing is challenging for learners because they lack the lexical resources that would inspire meaningful writing. They also often face motivational constraints due to time limitations, which hinder their ability to allocate sufficient time and effort to improving their writing abilities.
AI-assisted writing tools offer automated feedback on various aspects of writing, including grammar, coherence, organization, and vocabulary, thereby facilitating more effective improvement in writing performance. Automated feedback provides immediate computer-generated quantitative assessments and qualitative comments on large numbers of submitted pieces of writing. The use of AI tools in foreign language writing has become increasingly active. Previous studies have reported that students were able to improve the quantity and quality of their L2 writing through AI tools (Li et al., 2015; Rahman et al., 2022; Wang et al., 2013) and have found that learners endorse the value of such feedback and that it reduces their cognitive barriers (Cheng, 2017; Gayed et al., 2022). However, some studies have pointed out the low quality of the resulting writing (Grimes & Warschauer, 2010), identified challenges associated with the accuracy and precision of the software (Chun et al., 2021), and concluded, from comparisons among peers, instructors, and automated writing evaluation tools, that automated feedback should supplement peer and teacher feedback rather than replace it.
The advent of ChatGPT in November 2022 revolutionized the field of AI by overcoming the limitations of prior AI technologies through the utilization of vast amounts of data. ChatGPT, an AI-powered chatbot developed by OpenAI, can be effectively utilized in diverse language learning courses to enhance learners’ writing abilities (Barrot, 2023). Equipped with comprehensive knowledge, it generates words and grammatically correct structures that facilitate the creation of coherent and cohesive written text. Moreover, the tool comprehends human queries and provides appropriate responses. It has been recognized for its potential to enhance writing performance because it provides learners with immediate feedback and grammatically correct alternative sentences (Song & Song, 2023).
The use of technology such as machine translators or AI-powered writing assistance tools has been a controversial issue among foreign language (FL) educators because it carries both benefits and drawbacks. Such tools can serve as an alternative to dictionaries and model more advanced foreign language use; however, overreliance on them can jeopardize students’ learning (Clifford et al., 2013; Jolley & Maimone, 2015). Regardless of whether foreign language instructors approve or disapprove of their usage, students commonly use these tools as part of their language learning as well as for many other purposes (Briggs, 2018).
Because ChatGPT was unveiled only recently, its versatile capabilities for learning have provoked significant interest in the education field. However, the number of studies on its use in language learning remains relatively limited. Little research has investigated whether ChatGPT can perform as a writing assistant tool in learners’ environments and whether it enhances their performance. Thus, the purpose of this study is to examine how Korean EFL college learners utilized ChatGPT while composing their presentation scripts. In addition, learners’ perspectives on using ChatGPT are examined. This research addresses two research questions:
1. Is there a significant difference in writing performance between participants who use ChatGPT as a writing feedback tool in an EFL learning context and those who do not?
2. How do the participants who use ChatGPT as a writing feedback tool perceive the impact of ChatGPT as a feedback tool on their presentation script writing?

II. LITERATURE REVIEW

1. AI-Based Automated Writing Feedback

Feedback is crucial for encouraging learning (Anderson, 1982), and its significance has been recognized in L2 writing. Among the different types of feedback, error correction is the most frequently given in the L2 writing process. According to Leki (1991), students believe that the grammar feedback teachers provide helps them improve their writing quality. While acknowledging the importance of feedback for effective L2 writing (Hyland & Hyland, 2006), research has shown that giving feedback on writing is time-consuming and that providing it to students on a regular basis can be problematic for teachers (Grimes & Warschauer, 2010). For L2 writing teachers, providing comprehensive corrective feedback, which is usually complex and time-consuming, is nevertheless inevitable. With the advent of technology, AI writing tools have offered a possible solution that alleviates teachers’ workload and time constraints and facilitates students’ writing. Current learners, characterized as digital natives, tend to prefer instant corrective feedback (O’Neill & Russell, 2022), yet teachers have difficulty allocating time effectively and prioritizing feedback.
The use of AI writing software has received extensive attention in L2 research, particularly when it comes to the grammatical and lexical accuracy of L2 writing. Research has provided empirical evidence on the advantages of using AI tools for L2 writing. Automated writing evaluation (AWE) has been studied for years as a way to provide timely evaluation of student writing and diminish the burden on educators to evaluate writing (Wilson & Roscoe, 2020). These systems typically use natural language processing and artificial intelligence to evaluate writing. Some studies have shown such systems can produce positive effects (Roscoe et al., 2017; Stevenson & Phakiti, 2014; Wilson & MacArthur, 2024; Zhai & Ma, 2022).
Studies have reported that using AI writing tools leads to more accurate foreign language composition than writing without them, primarily through the reduction of grammatical and lexical errors. Wang et al. (2013) examined the use of AWE software with college EFL students and its effect on writing accuracy. The results revealed a significant difference between the experimental group and the control group in writing accuracy following the adoption of AWE. The experimental group, which used the software, significantly reduced the number of grammatical errors in their writing from the pre- to the post-test and also outperformed the control group in grammatical accuracy after the treatment. Regarding the overall effect and students’ perceptions of the AWE software, students who used AWE displayed clear improvement in writing accuracy and in their awareness of learner autonomy.
Li et al. (2015) examined the use of AWE among ESL students at an American university and found that use of the system led to a significant decrease in the total number of errors (grammar, usage, mechanics, and style) from the rough to the final draft. The participants had a favorable opinion of using Criterion in the writing process and believed that its use improved the linguistic accuracy of their compositions.
In a similar study, Ranalli et al. (2017) focused on the accuracy of the feedback given by a popular AWE system as well as the ability of American college ESL students to make use of this feedback to correct errors. The researchers found that while there were issues related to the accuracy of the feedback given, the students were able to correct their mistakes according to the feedback by the program 55-65% of the time.
Cheng (2017) examined the use of automated feedback among 138 college EFL students in Hong Kong. The participants were divided into two groups: experimental (n = 82) and control (n = 56). The experimental group could access a web-based classification system that generated online automated feedback on their second and third reflective journals, while the control group could not. The results indicate that the online automated feedback group significantly outperformed the control group, demonstrating the positive impact the software can have on overall writing quality. Data collected from an online questionnaire survey and focus group interviews also generally supported the value of online automated feedback for writing. In another study, Liao (2016) examined how AWE could improve grammatical performance among 63 EFL writers, investigating participants’ grammatical performance in revised and subsequent new essays. The results revealed that the software did help improve grammatical accuracy; more specifically, although improvements were seen in each revision, significant improvements in new text did not appear until the third draft of the essay. In a recent study, Rahman et al. (2022) examined the role of an AI-assisted language learning tool in identifying and addressing grammatical errors and thereby developing EFL learners’ writing skills. The results indicated significant improvement in the learners’ writing skills, and the learners themselves expressed positive perceptions of the effects of AI-assisted language learning on their writing abilities. Gayed et al. (2022) developed an AI-assisted language learning program to address cognitive barriers and improve the writing performance of EFL students; the findings indicated that the tool effectively enhanced students’ writing performance and reduced the cognitive barriers they encountered during writing tasks.
While some AWE systems are as efficient and reliable as human raters when assigning scores to writing, they are typically less accurate, more generic and verbose, and sometimes confusing to feedback recipients (Grimes & Warschauer, 2010; Wilson & MacArthur, 2024). Additionally, preparing such tools for educational settings takes substantial time (Moore & MacArthur, 2016). Research has also identified challenges associated with the accuracy of the programs. Bai and Hu (2017) investigated the accuracy and precision of AWE software among 30 undergraduate learners in EFL writing classrooms. They revealed that learners were not blindly accepting AWE feedback but were making cautious decisions about feedback uptake. They claimed that automated feedback could supplement, rather than replace, peer and teacher feedback in EFL writing classrooms and called for future studies to examine how learners decide to incorporate or reject a suggestion. In a comparative study of online peer feedback and AWE software, Shang (2022) found that the former had a greater positive effect on the grammatical accuracy of Taiwanese L2 English writers. This finding demonstrates that human feedback, whether from a teacher or a peer, can still prove as useful as, or perhaps more beneficial than, automated feedback.

2. Writing Feedback Using ChatGPT

While ChatGPT can be useful as an L2 writing assistant, adopting it as a supplemental tool for writing essays is highly advisable rather than relying on it as a content creator. A helpful approach is to encourage students to write their original outputs first and then refine them using ChatGPT; this helps students develop their writing skills while using ChatGPT to further improve their written outputs (Barrot, 2023). Research on ChatGPT has emerged recently (Mizumoto & Eguchi, 2023; Rudolph et al., 2023; Wang & Guo, 2023). Mahapatra (2024) investigated the impact of ChatGPT as a formative feedback tool on the writing skills of undergraduate ESL students. Data were collected from tertiary-level ESL students, and the findings indicate a significant positive impact of ChatGPT on students’ academic writing skills; students’ perceptions of the impact were also overwhelmingly positive. The paper suggests that, with proper student training, ChatGPT can be a good feedback tool in large writing classes. Dai et al. (2023) investigated the feasibility of using ChatGPT to provide students with feedback that helps them learn better. The results show that ChatGPT is capable of providing feedback on the process by which students complete a task, which benefits students in developing learning skills. In addition, the study found that ChatGPT generated feedback that was more detailed than that of human instructors and that summarized learners’ performance fluently and coherently.
With the proliferation of AI-driven tools, it has become easier for students to obtain feedback on their writing, and these tools have advanced automated writing evaluation and feedback (Gayed et al., 2022). While the literature on AI writing tools as feedback tools in the writing classroom is well established and their positive impact on writing has been investigated empirically, the use of ChatGPT in writing classes is a relatively new area that requires further investigation (Barrot, 2023). Since AI-driven automated writing evaluation tools positively impact students’ writing, ChatGPT, a generative AI tool, can be expected to have an even more substantial positive impact. However, very little empirical evidence regarding the impact of ChatGPT on writing is available. In addition, there is a need to investigate shifts in learner perceptions in light of recent considerable improvements in AI learning tools. The purpose of the present study is to determine how writing with ChatGPT improves students’ writing skills and affective aspects by analyzing the effects of using the tool on writing products.

III. METHOD

1. Participants

The participants in this study were 43 Korean college freshmen attending a university in Seoul, Korea. All participants were taking a mandated general English course focused on listening and speaking. Their English proficiency was classified as high intermediate through a placement test, a TOEIC mock test on which their scores ranged between 550 and 800. Their ages spanned from 19 to 21 years, and all participants had studied English for a minimum of seven years at the time of the study. They knew of ChatGPT but had never used it to edit their English writing prior to participating in this study. One class, comprising 23 participants, utilized ChatGPT to revise scripts and constituted the experimental group (EG), while the other class, consisting of 20 participants, received instruction without ChatGPT and constituted the control group (CG). Of the 23 EG participants, three whose scripts were in inappropriate formats were excluded.
To ensure the initial equivalence of the two groups’ writing abilities, the researchers conducted an independent t-test to examine potential differences in pretest mean scores between them. The results revealed comparable mean scores between the control group (CG: M = 58.25, SD = 6.80) and the experimental group (EG: M = 57.20, SD = 6.78), with no statistically significant difference observed (F = 0.148, p = .70; see Table 1). This finding indicates that participants in both groups exhibited similar writing abilities at the pretest.
ChatGPT was introduced by OpenAI as a conversational artificial intelligence in November 2022. When the experiment was conducted in May 2023, it was therefore essential to assess the extent to which learners were aware of and using ChatGPT. In particular, there was a need to explore its application in English writing and to ask which AI tools, if any, learners already used for conventional English composition.
A survey was conducted, and all students responded that they had used online writing tools for English writing. When asked which AI writing tools they used frequently, the 40 participants primarily named Naver Dictionary (41%) and the translation tool Papago (41%), followed by Google Translate (4.5%), ChatGPT (4.5%), and others (9%; see Figure 1).
Areas of assistance included checking English expressions (35%), followed by checking vocabulary (26%), grammar (22%), and other writing errors (17%; see Figure 2). In Larson-Guenette’s (2013) study, most students responded that they used online writing tools daily for time efficiency, word searches, confirmation, and reference. Participants also use AI tools for writing despite being aware of their shortcomings (Valijärvi & Tarsoly, 2019).

2. Instruments

The instruments consisted of students’ pre-test/post-test presentation script writing and a survey that included closed and open-ended questions. For the pre-test, students wrote one paragraph on the topic covered in the first week. For the post-test, students wrote a presentation script on a given topic to prepare for their oral presentation in week 7. The presentation script consisted of three paragraphs: the first was a self-introduction, and the other two related to the topics dealt with in the textbook, “mother nature” and “on the move.” For the first topic, students prepared their views on the extinction of a species and what they could do to save endangered species. The second topic, “on the move,” concerned immigration; participants were asked to pick one city for a future internship program and discuss the push and pull factors that influenced their choice.
The survey explored students’ learning experiences with ChatGPT and their acceptance of the new technology. The questionnaire comprised 11 items: 10 items on a 5-point Likert scale and one open-ended question asking about the advantages and disadvantages of using ChatGPT and about plans for future use.

3. Procedure

Both the experimental and control groups received instruction from the same instructor, used identical course materials and syllabus, and underwent the same examinations. The students were required to take one speaking test during the semester. Participants were allocated 20 minutes in class to write a script expressing their perspectives on the given topic during both the pre- and post-tests, and both groups were instructed to post their presentation scripts on the school learning management system (LMS).
Participants in the experimental group were asked to proofread and revise their drafts individually using ChatGPT. They checked their drafts on the ChatGPT site (https://chat.openai.com), using the free version, GPT-3.5 (Generative Pre-trained Transformer 3.5), and then uploaded their second drafts. Example prompts were shown during class time, and the instructor illustrated their usage with a sample paragraph; the prompts included “revise my writing,” “polish,” and “correct my grammatical, lexical errors.” The EG participants submitted their second drafts after receiving feedback from the platform and revising their initial drafts. Conversely, those in the CG wrote their drafts and provided peer feedback during class; peer feedback covered both global (content and organization) and local (grammar and mechanics) aspects of writing. Finally, both groups submitted their presentation scripts to the instructor for evaluation and grading. After the EG students submitted their second drafts, their perceptions of and attitudes toward using the tool were collected through the LMS.
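Although participants in this study interacted with ChatGPT through its web interface, the same kind of revision prompt can be illustrated programmatically. The sketch below assumes the openai Python package (v1.x) and an API key in the environment; the model name and draft sentence are illustrative stand-ins, not part of the study procedure.

```python
# A minimal sketch of one class prompt ("correct my grammatical, lexical
# errors") issued via the OpenAI API. Assumptions: the `openai` package is
# installed and OPENAI_API_KEY is set; the draft below is invented for
# illustration. Participants actually used the chat.openai.com web UI.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

draft = "My hometown have many beautiful mountain and I goes hiking often."

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the study used the free web version of GPT-3.5
    messages=[
        {"role": "user",
         "content": f"Correct my grammatical, lexical errors:\n\n{draft}"},
    ],
)
print(response.choices[0].message.content)  # the model's revised draft
```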

4. Data Analysis

An independent t-test in SPSS version 25.0 was used to analyze the pre- and post-test writing scores of the two groups. Two experienced EFL raters, including the researcher, independently graded participants’ drafts to ensure objective scoring. Holistic scores covering content and organization were employed to evaluate the overall quality of students’ presentation drafts (Chang et al., 2021). Both raters hold doctoral degrees and have accumulated over a decade of teaching experience at the university level. To assess consistency between the raters, an inter-rater reliability test was conducted on the post-test script scores; the inter-rater Cronbach’s alpha coefficient was 0.82, significant at p < .05, indicating a high level of agreement between the two raters. A total of 43 presentation scripts were collected; however, three scripts from the experimental group with inappropriate formats were excluded from the analysis.
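For readers who wish to reproduce this style of analysis outside SPSS, the sketch below shows the two computations named here: an independent-samples t-test and a two-rater Cronbach’s alpha. All score arrays are invented placeholders, not the study’s data.

```python
# Sketch of the analyses described above. Scores are hypothetical.
import numpy as np
from scipy import stats

cg_scores = np.array([62, 55, 71, 68, 60, 74, 58, 66])  # placeholder CG post-test
eg_scores = np.array([78, 82, 75, 69, 80, 77, 85, 72])  # placeholder EG post-test

# Independent-samples t-test comparing the two groups' means
t_stat, p_val = stats.ttest_ind(eg_scores, cg_scores)
print(f"t = {t_stat:.2f}, p = {p_val:.3f}")

def cronbach_alpha(ratings: np.ndarray) -> float:
    """Cronbach's alpha for a (n_scripts, n_raters) matrix of scores."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                          # number of raters
    item_var = ratings.var(axis=0, ddof=1).sum()  # sum of per-rater variances
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of summed scores
    return k / (k - 1) * (1 - item_var / total_var)

# Two raters' holistic scores for the same five scripts (hypothetical)
ratings = np.array([[78, 80], [65, 68], [72, 70], [85, 88], [60, 63]])
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```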
For form-oriented feedback, Grammarly (2022) was employed to analyze the participants’ script writing, since the program makes it possible to analyze drafts numerically and mechanically (Dizon & Gayed, 2021). The program identifies mistakes such as spelling, punctuation, and grammar errors in students’ drafts. Additionally, it furnishes details such as overall word count, sentence count, and the approximate time required for reading and speaking in English. Moreover, the tool calculates a readability score from factors such as sentence and word length, employing the Flesch reading-ease test (Flesch, 1948). The EG students’ survey responses about their learning experiences with ChatGPT were first analyzed and then categorized.
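Grammarly’s exact implementation is proprietary, but the standard Flesch reading-ease formula (Flesch, 1948) underlying the readability score mentioned above is:

$$
\mathrm{RE} = 206.835 \;-\; 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) \;-\; 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)
$$

Higher values indicate easier text; scores in the 60-70 range correspond roughly to plain English.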

IV. RESULTS

1. L2 Writing Scores Using ChatGPT

To answer the first research question, the participants’ writing scores were compared. Paired sample t-tests were conducted to investigate whether there were significant differences between the pre- and post-test scores.
The results of the descriptive statistics and paired sample t-tests are presented in Table 2. The results showed that there was no statistically significant difference in mean scores between the two tests for the control group (p = .15).
For the EG, on the other hand, the findings revealed a statistically significant difference between the two tests, indicating that the participants in this group improved their writing. Table 3 shows the results of the paired t-tests for the EG. Specifically, the mean score on the pre-test was 62.20 (SD = 11.52), while that on the post-test was 78.25 (SD = 8.94). The within-group differences between the pre- and post-tests suggest that using ChatGPT in revision can be effective for writing.
The post-test scores were then compared with an independent-samples t-test to determine whether there was a significant difference in post-test means between the two groups. A significant difference was found (t = -3.72, p = .00 < .05) between the EG (M = 78.75, SD = 8.97) and the CG (M = 65.00, SD = 13.86), indicating that the EG students’ writing performance showed greater gains on the post-test than the CG students’ did. This result suggests that revising scripts using ChatGPT assisted participants’ writing significantly more than revising drafts based on peer feedback. The quantitative results are presented in Table 4.
Independent t-tests were also performed on the writing outputs, comparing the two groups’ revised drafts in terms of word count, vocabulary, and grammatical and lexical errors. Table 5 presents the descriptive statistics and independent t-test results. The CG participants’ revised drafts consisted of 385.55 words and 32.24 sentences on average, whereas the EG participants’ revised drafts consisted of 358.65 words and 26.15 sentences on average; the EG thus produced fewer words and sentences than the CG. A significant mean difference was found in the number of sentences between the two groups (t = 6.57, p < .01), while the difference in word count was not significant. Writing the presentation script with ChatGPT therefore did not help learners generate more text through the revision process in this study. This result could be related to the nature of the text: participants reviewed their writing before memorizing lines for the presentation, which may have led them to refrain from generating additional text or incorporating unique or uncommon words.
The numbers of “unique” and “rare” words were calculated using Grammarly. According to Grammarly’s (2022) definitions, unique words reflect how much the writer uses words in an unusual or different way and do not fall within the 200 most frequently used words, while rare words are an indicator of more advanced language use. On average, 51.95 unique words and 21.75 rare words were used in the CG, while 55.30 unique words and 22.40 rare words were found in the EG. Although the numbers of unique and rare words were higher in the group using ChatGPT, no significant difference was found between the two groups.
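Grammarly does not publish the word list behind this metric, but its stated definition can be approximated in a few lines: count the distinct words in a draft that fall outside the 200 most frequent English words. In the sketch below, TOP_200 is a small illustrative stub; in practice a full frequency list from a large corpus would be substituted.

```python
# Approximation of Grammarly's "unique words" metric as defined above:
# distinct tokens that are not among the 200 most frequent English words.
# TOP_200 is an illustrative stub, not Grammarly's actual list.
import re

TOP_200 = {"the", "of", "and", "a", "to", "in", "is", "i", "that", "it",
           "was", "for", "on", "are", "my", "have", "with", "they", "be", "at"}

def unique_word_count(text: str) -> int:
    tokens = re.findall(r"[a-z']+", text.lower())
    return len({t for t in tokens if t not in TOP_200})

draft = "Endangered species deserve urgent protection from habitat loss."
print(unique_word_count(draft))  # number of distinct non-frequent words
```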
The numbers of grammatical and lexical errors in the revised outputs are shown in Table 6. The mean number of errors in the CG was 14.60 for grammar and 3.70 for lexis; in the EG, the means were 6.05 and 0.35, respectively. Lexical errors encompass types such as improper word choice, unclear sentences, wordiness, inconsistencies in the text, and inappropriate colloquial expressions, while grammatical errors concern punctuation usage, the construction of compound or complex sentences, and the incorrect use of the passive voice, semicolons, and quotation marks. The findings revealed statistically significant differences between the two groups, indicating that the group using ChatGPT for writing revision made fewer grammatical and lexical errors (p < .01).
These results are consistent with previous studies in which scores improved significantly when AI tools were used for writing. Previous research has found that using AI tools resulted in higher writing scores (O’Neill, 2016) and advantages in lexical choice (Chen et al., 2015), and the present findings likewise show that feedback improved the accuracy of written communication. However, learners using ChatGPT were not helped to generate longer texts or more unique and rare vocabulary through the revision process in this study.

2. Perceptions of Using ChatGPT in English Presentation Script Writing

To investigate learners’ affective responses to using ChatGPT for revision, the questionnaire responses were analyzed. The questionnaire consisted of 10 closed items divided into three sections: usefulness, improvements in L2 writing skills, and affective variables. A total of 23 responses were collected from the experimental group. Participants indicated their level of agreement with each statement on a scale ranging from strongly disagree (SD) to strongly agree (SA).
As Table 7 shows, over 95% of EG participants expressed positive sentiment toward ChatGPT. More than 90% of respondents found ChatGPT easy to use, and a significant majority (87%) found it very convenient. None of the respondents expressed a negative opinion of the usefulness of ChatGPT as an AI tool in general. This suggests that ChatGPT is perceived as a user-friendly and convenient tool by the majority of users.
Table 8 presents the results of the survey questions about the effects of using ChatGPT on writing skill. Regarding whether the tool was helpful for their English writing, the majority of respondents (95.7%) found their experience with ChatGPT helpful or very helpful; one of the 23 respondents stated that it was not helpful. Over 90% found ChatGPT helpful or very helpful for grammar learning, and none of the participants disagreed with this statement. As for vocabulary, a majority of respondents (78.3%) acknowledged that the tool helped them find appropriate or better vocabulary for their writing, while four participants took a neutral stance and one disagreed. A significant majority (91.4%) acknowledged the tool’s effectiveness in organizing sentence structure and content, suggesting that users find ChatGPT beneficial for refining the structure and coherence of their written work. Users perceive ChatGPT as a valuable tool for language learning, particularly for improving grammar skills. This result is in line with the quantitative finding that the EG’s revisions contained fewer grammatical and lexical errors, and it suggests that the tool can help students enhance their writing and raise their confidence in writing.
Table 9 presents the distribution of responses for three affective variables. Most of the EG students thought ChatGPT was a beneficial learning tool for enhancing English writing performance. Regarding future interest in using the tool for English writing, over 78% expressed high interest, four students expressed moderate interest, and one strongly disagreed. More than 80% responded that ChatGPT lowered their anxiety about English writing and felt that their fear of English writing would be alleviated; more than 10% of respondents remained neutral, and one disagreed. These responses suggest a positive outlook for continued engagement in English writing with reduced apprehension. The overall sentiment toward ChatGPT is highly positive, with users perceiving it as a valuable and effective tool for English writing and language learning; its ease of use, convenience, and positive impact on grammar, vocabulary, and content organization contribute to this reception.
Table 10 shows the results of the open-ended question. Responses were grouped by their resemblance to a single item; if a participant provided three distinct opinions, each was counted and tallied individually. Many respondents mentioned that the tool was very helpful, informative, and convenient to use. In addition, many participants highlighted the advantages of using ChatGPT as a means of enhancing their writing abilities, especially in grammar, writing similar responses such as “I appreciate the aspect of quickly correcting my sentences for grammatical accuracy,” “It is useful for me to know my grammatical mistakes (such as the singular and plural of words, changing conjunctions and adjectives),” and “I felt the responses were enhanced with more advanced vocabulary.” Almost 20% of responses indicated that participants considered the tool very helpful for connecting and structuring sentences, which is consistent with the quantitative result that the EG’s revised scripts contained fewer sentences than the CG’s. The students also stated that the tool offers advanced alternative word choices and expressions. One respondent reported that the tool helped only partially, since it suggested academic expressions and vocabulary that she had to change into her own words. Because participants had to prepare for the oral examination and commit the script to memory, they refrained from promptly implementing the tool’s recommendations; as they worked on their presentation scripts, they sought simpler words rather than unique or uncommon vocabulary, even though the suggested words appeared more advanced. Additionally, some mentioned that the tool helped alleviate their fear of, or uncertainty about, English writing.
However, students also pointed out disadvantages of using ChatGPT in their presentation script writing (see Table 11). Some participants reported that ChatGPT sometimes gave wrong answers on their drafts and disapproved of its comments: for instance, “There were occasional cases where it completely changed the meaning of the text, so there was a need for modification,” and “It was very helpful, but the corrected sentences ended up being longer than the original ones, making them sound a bit unnatural.” There were also concerns about reliability and over-reliance on ChatGPT. One participant mentioned that relying on ChatGPT can be problematic since it never states that something is impossible. Another reported, “There is some doubt about its accuracy. It seems a promising program with increased confidence and reliability.” Another significant issue concerned effectiveness for future language learning: for instance, “It’s advisable to use it with a sound knowledge of grammar and vocabulary” and “If misused, it may not contribute significantly to the actual improvement of English proficiency.” These results are consistent with previous findings that foreign language learners tend to distrust the output of translation tools (Briggs, 2018), perceiving them as not beneficial for foreign language learning (Yoon, 2019).

V. CONCLUSION

This study investigated the efficacy of utilizing the AI tool ChatGPT in presentation script writing by comparing students in the EG, who received instruction with the tool, to those in the CG, who did not. Additionally, the study explored students’ perceptions of using the tool to revise their writing. The comparison of pre-test and post-test writing scores revealed improvement in both groups (EG: from 62.20 to 78.25; CG: from 58.55 to 65.00), although only the EG’s gain was statistically significant. An independent-samples t-test further revealed a significant difference in post-test writing scores between the EG and the CG (t = 3.72, p = .00), statistically significant at the p < .05 level.
Building upon prior research demonstrating the positive impact of integrating AI tools in EFL writing, this study offers further insights into the efficacy of ChatGPT among EFL students. By illustrating a significant improvement in writing performance within the experimental group compared to the control group, the study underscores the usefulness of ChatGPT. Notably, students in the experimental group exhibited enhanced post-test writing quality in both structural and linguistic aspects, surpassing their pre-test scores. These findings highlight the efficacy of ChatGPT intervention in fostering students’ linguistic proficiency through the application of AI feedback software.
The EG students’ superior writing quality relative to the CG aligns with findings from prior research (Cheng, 2017; Liao, 2016; Mahapatra, 2024; Wang et al., 2013). These studies have highlighted the benefits of AI tools in offering instantaneous grammar feedback, which proves particularly advantageous for EFL students during the drafting and proofreading stages. The EG students’ enhanced post-test scores can be attributed to their utilization of ChatGPT feedback; the group using ChatGPT for writing revision also made fewer grammatical and lexical errors. Incorporating the tool into the presentation script writing process provided ample opportunities for rectifying grammatical mistakes and fostered the construction of new knowledge through AI-based automated writing feedback. However, the findings of the current study diverge from those of a previous study (Shang, 2022), which suggested that online peer feedback is more beneficial than automated feedback; notably, participants in this study did not receive training on how to provide feedback.
The survey responses from students shed light on their attitudes toward ChatGPT. As a proofreading tool for EFL learners to review their presentation script writing, the students in the EG perceived it very positively. They mentioned its usefulness for getting feedback, especially for lexical and grammatical errors. By using ChatGPT’s feedback and revision suggestions, they developed lexical and grammar awareness while revising their drafts and benefited from the tool. They expect a positive outlook for continued engagement in English writing with reduced apprehension.
ChatGPT can be a valuable resource for students who struggle with writing. The rise of artificial intelligence has had a significant impact on student writing skills, and while there are potential benefits to using AI in writing instruction, there are also concerns about its negative effects on students’ ability to learn and develop their writing skills. Some worry that the use of AI will discourage students from learning how to write well: if students rely on AI to correct their mistakes, they may not learn to identify and correct those mistakes independently. Additionally, there is a risk that students will become overly reliant on AI and fail to develop critical thinking skills and creativity. Even though the tool can improve students’ L2 writing quality, it is difficult to ascertain whether students actually improve their writing ability or their knowledge of grammar and vocabulary. Ultimately, the goal should be to use AI as a tool that supplements and enhances writing instruction rather than substitutes for it. Frequent reliance on text generated by ChatGPT may hinder language learners’ own writing abilities, and using it without appropriate review and editing may lead to issues of plagiarism that should be carefully addressed (Song & Song, 2023). Therefore, instructors need to reflect on how, with the help of AI tools, they can help students lastingly improve their language knowledge and their motivation to write in the L2.
The present study differentiates itself from previous research on writing instruction by employing a contemporary artificial intelligence program, ChatGPT, and by implementing writing correction activities among university students and soliciting their feedback. However, it is essential to acknowledge its limitations, including the constrained sample size and the difficulty of generalizing beyond learners at a high-intermediate English proficiency level. The availability of ChatGPT presents both challenges and opportunities for L2 writing teachers to recalibrate their classroom practices: rather than banning ChatGPT outright, teachers can explore ways to work alongside such AI-based tools and capitalize on their potential. Research on ChatGPT is at an early stage of development and continues to expand; teachers therefore need to determine what to do with this rapidly emerging tool and how and when to adopt it.
Further research is warranted to conduct a comprehensive analysis of the experimental results, specifically examining significant differences between original and revised writings across grading domains through an in-depth investigation of writing scoring. Since ChatGPT is a large generative language model, its potential to help students with writing is immense: it is more student-friendly and can provide more needs-based assistance than other AWE tools, as suggested by Guo et al. (2022) and Rudolph et al. (2023), and it can support student writing by providing appropriate direction on content and organization as students write. Since it can automatically train itself and learn from previous conversations (Chan & Hu, 2023), further research could investigate how students can receive tailored feedback suited to individual needs.

FIGURE 1
Frequently Used Technology Tools
FIGURE 2
Areas of Assistance
Table 1.
Results of Independent t-Tests (Pre-test)
Group         M       SD      t       df    p
CG (n = 20)   58.25   6.80    .00     38    .70
EG (n = 20)   57.20   6.78

Table 2.
Results of Paired t-Tests: CG
Test                 M       SD      t       df    p
Pre-test (n = 20)    58.55   12.35   -1.49   19    .15
Post-test (n = 20)   65.00   13.86

Table 3.
Results of Paired t-Tests: EG
Test                 M       SD      t       df    p
Pre-test (n = 20)    62.20   11.52   -4.93   19    .00*
Post-test (n = 20)   78.25   8.94

Table 4.
Results of Independent t-Tests (Revised Draft)
Group         M       SD      t       df    p
CG (n = 20)   65.00   13.86   -3.72   38    .00*
EG (n = 20)   78.75   8.97
Table 5.
Independent t-Test Results for Revised Draft Outputs: Word Count and Vocabulary
                         Without ChatGPT (CG)   With ChatGPT (EG)
                         M        SD            M        SD         t      p
Word count   Words       385.55   79.23         358.65   70.45      .10    .83
             Sentences   32.24    8.20          26.15    4.93       6.57   .01*
Vocabulary   Unique      51.95    4.59          55.30    4.40       .22    .63
             Rare        21.75    3.14          22.40    3.84       1.29   .26

Table 6.
Independent t-Test Results for Revised Draft Outputs: Grammatical and Lexical Errors
                 Without ChatGPT (CG)   With ChatGPT (EG)
                 M       SD             M      SD         t      p
Grammar error    14.60   5.87           6.05   5.69       4.67   .00*
Lexical error    3.70    2.71           0.35   0.67       5.35   .00*
Table 7.
A Survey Result of Usefulness
Usefulness    SD       D        N           A            SA
Helpful       0 (0%)   0 (0%)   1 (4.3%)    12 (52.2%)   10 (43.5%)
Ease of use   0 (0%)   0 (0%)   2 (8.7%)    9 (39.1%)    12 (52.2%)
Convenience   0 (0%)   0 (0%)   3 (13.0%)   5 (21.7%)    15 (65.3%)

Note. SD = strongly disagree, D = disagree, N = neutral, A = agree, SA = strongly agree

Table 8.
Effects of Using ChatGPT on English Writing Skill
Effects of using ChatGPT on writing skill   SD       D          N           A            SA
Helpful in English writing                  0 (0%)   1 (4.3%)   0 (0%)      13 (56.6%)   9 (39.1%)
Helpful in grammar                          0 (0%)   0 (0%)     2 (8.6%)    14 (61.0%)   7 (30.4%)
Helpful in vocabulary learning              0 (0%)   1 (4.3%)   4 (17.4%)   10 (43.5%)   8 (34.8%)
Helpful in sentence structure and content   0 (0%)   1 (4.3%)   1 (4.3%)    15 (65.3%)   6 (26.1%)

Note. SD = strongly disagree, D = disagree, N = neutral, A = agree, SA = strongly agree

Table 9.
Affective Variables
Affective variables                  SD         D        N           A            SA
Necessary for writing                1 (4.3%)   0 (0%)   1 (4.3%)    10 (43.6%)   11 (47.8%)
Future interest in English writing   1 (4.3%)   0 (0%)   4 (17.4%)   11 (47.8%)   7 (30.5%)
Fear in English writing              1 (4.3%)   0 (0%)   3 (13.1%)   14 (60.9%)   5 (21.7%)

Note. SD = strongly disagree, D = disagree, N = neutral, A = agree, SA = strongly agree

Table 10.
Benefits of Using ChatGPT
Responses # of responses %
It seems very helpful and informative. 6 16.7%
I tried using it for the first time, it was very convenient to use. 5 13.9%
It has been very helpful in organically connecting and structuring sentences. 7 19.4%
I appreciated the aspect of quickly correcting my sentences for grammatical accuracy. 7 19.4%
This is a great writing tool to help me improve my writing quality. 4 11.1%
I felt that the responses were enhanced with more advanced vocabulary compared to the words I initially used. 4 11.1%
I believe it contributes to alleviating the fear or uncertainty associated with English composition. 2 5.6%
It seems like it suggests expressions used in academic conferences or when writing papers, so it might be helpful to refer to them partially when engaging in everyday conversations or preparing for speaking exams. 1 2.8%
Table 11.
Challenges of Using ChatGPT
Responses # of responses %
At first, the corrections were made with words that were too difficult, so there was a bit of inconvenience in having to request corrections for each sentence, asking for simpler and more usable words. 2 20%
There were instances where the corrections sometimes completely changed the meaning of the text, so there was a need for adjustments. 2 20%
It was very helpful, but the corrected sentences ended up being longer than the original ones, making them sound a bit unnatural. 1 10%
It was good that it automatically corrected sentences, but there is some doubt about its accuracy. With increased confidence and reliability, it seems like a promising program. 2 20%
I think relying solely on ChatGPT for learning, considering it doesn’t explicitly state impossibilities, can be problematic, as errors may be accepted naturally. 1 10%
If used without a solid understanding of grammar, it may be challenging to confirm whether ChatGPT’s suggestions are accurate. Therefore, it’s advisable to use it with a sound knowledge of basic grammar and vocabulary. 1 10%
Indeed, if misused, it may not contribute significantly to the actual improvement of English proficiency. 1 10%

REFERENCES

Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89(4), 369-406.
Bai, L., & Hu, G. (2017). In the face of fallible AWE feedback: How do students respond? Educational Psychology, 37(1), 67-81. https://doi.org/10.1080/01443410.2016.1223275
Barrot, J. S. (2023). Using ChatGPT for second language writing: Pitfalls and potentials. Assessing Writing, 57, 100745. https://doi.org/10.1016/j.asw.2023.100745
Briggs, N. (2018). Neural machine translation tools in the language learning classroom: Students’ use, perceptions, and analyses. The JALT CALL Journal, 14(1), 2-24. https://doi.org/10.29140/jaltcall.v14n1.221
Chan, C., & Hu, W. (2023). Students’ voices on generative AI: Perceptions, benefits, and challenges in higher education. International Journal of Educational Technology in Higher Education, 20(43), 1-18. https://doi.org/10.1186/s41239-023-00411-8
Chang, T., Li, Y., Huang, H., & Whitfield, B. (2021). Exploring EFL students’ writing performance and their acceptance of AI-based automated writing feedback. In Association for Computing Machinery (Ed.), Proceedings of the 2021 2nd international conference on education development and studies (pp. 31-35). Association for Computing Machinery. https://doi.org/10.1145/3459043.3459065
Chen, M., Huang, S., Chang, J., & Liou, H. (2015). Developing a corpus-based paraphrase tool to improve EFL learners’ writing skills. Computer Assisted Language Learning, 28(1), 22-40. https://doi.org/10.1080/09588221.2013.783873
Cheng, G. (2017). The impact of online automated feedback on students’ reflective journal writing in an EFL course. The Internet and Higher Education, 34, 18-27. https://doi.org/10.1016/j.iheduc.2017.04.002
Chun, H. L., Lee, S. M., & Park, I. E. (2021). A systematic review of AI technology use in English education. Multimedia-Assisted Language Learning, 24(1), 87-103. https://doi.org/10.15702/mall.2021.24.1.87
Clifford, J., Merschel, L., & Munné, J. (2013). Surveying the landscape: What is the role of machine translation in language learning? @tic. Revista d’Innovació Educativa, (10), 108-121.
Dai, W., Lin, J., Jin, F., Li, T., Tsai, Y., Gasevic, D., & Chen, G. (2023). Can large language models provide feedback to students? A case study on ChatGPT. https://doi.org/10.35542/osf.io/hcgzj
Dizon, G., & Gayed, J. M. (2021). Examining the impact of Grammarly on the quality of mobile L2 writing. The JALT CALL Journal, 17(2), 74-92. https://doi.org/10.29140/jaltcall.v17n2.336
Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221-233. https://doi.org/10.1037/h0057532
Gayed, J. M., Carlon, M. K. J., Oriola, A. M., & Cross, J. S. (2022). Exploring an AI-based writing assistant’s impact on English language learners. Computers and Education: Artificial Intelligence, 3, 100055.
Grammarly. (2022). Grammarly: Free online writing assistant. https://www.grammarly.com
Grimes, D., & Warschauer, M. (2010). Utility in a fallible tool: A multi-site case study of automated writing evaluation. Journal of Technology, Learning, and Assessment, 8(6), 4-43.
Guo, K., Wang, J., & Chu, S. K. W. (2022). Using chatbots to scaffold EFL students’ argumentative writing. Assessing Writing, 54, 100666. https://doi.org/10.1016/j.asw.2022.100666
Hyland, K., & Hyland, F. (2006). Feedback on second language students’ writing: State of the art. Language Teaching, 39(2), 83-101. https://doi.org/10.1017/S0261444806003399
Jolley, J. R., & Maimone, L. (2015). Free online machine translation: Use and perceptions by Spanish students and instructors. In A. J. Moeller (Ed.), Learn languages, explore cultures, transform lives (pp. 181-200). Central States Conference on the Teaching of Foreign Languages.
Larson-Guenette, J. (2013). “It’s just reflex now”: German language learners’ use of online resources. Die Unterrichtspraxis/Teaching German, 46(1). https://doi.org/10.1111/tger.10129
Leki, I. (1991). The preferences of ESL students for error correction in college-level writing classes. Foreign Language Annals, 24(3), 203-218.
Li, J., Link, S., & Hegelheimer, V. (2015). Rethinking the role of automated writing evaluation (AWE) feedback in ESL writing instruction. Journal of Second Language Writing, 27, 1-28. https://doi.org/10.1016/j.jslw.2014.10.004
Liao, H.-C. (2016). Enhancing the grammatical accuracy of EFL writing by using an AWE-assisted process approach. System, 62, 77-92. https://doi.org/10.1016/j.system.2016.02.007
Mahapatra, S. (2024). Impact of ChatGPT on ESL students’ academic writing skills: A mixed methods intervention study. Smart Learning Environments, 11(9). https://doi.org/10.1186/s40561-024-00295-9
Mizumoto, A., & Eguchi, M. (2023). Exploring the potential of using an AI language model for automated essay scoring. Research Methods in Applied Linguistics, 2(2), 100050. https://doi.org/10.1016/j.rmal.2023.100050
Moore, S., & MacArthur, C. A. (2016). Student use of automated essay evaluation technology during revision. Journal of Writing Research, 8(1), 149-175. https://doi.org/10.17239/jowr-2016.08.01.05
O’Neill, E. (2016). Measuring the impact of online translation on FL writing scores. The IALLT Journal, 46(2), 1-39. https://doi.org/10.17161/iallt.v46i2.8560
O’Neill, R., & Russell, A. (2022). Stop! Grammar time: University students’ perceptions of the automated feedback program Grammarly. Australasian Journal of Educational Technology, 35(1), 42-56. https://doi.org/10.14742/ajet.3795
Rahman, N., Zulkornain, H., & Hamzah, H. (2022). Exploring artificial intelligence using automated writing evaluation for writing skills. Environment-Behaviour Proceedings Journal, 7, 547-553. https://doi.org/10.21834/ebpj.v7iSI9.4304
Ranalli, J., Link, S., & Chukharev-Hudilainen, E. (2017). Automated writing evaluation for formative assessment of second language writing: Investigating the accuracy and usefulness of feedback as part of argument-based validation. Educational Psychology, 37(1), 8-25. https://doi.org/10.1080/01443410.2015.1136407
Roscoe, R. D., Wilson, J., & Johnson, A. C. (2017). Presentation, expectations, and experience: Sources of student perceptions of automated writing evaluation. Computers in Human Behavior, 70, 207-221.
Rudolph, J., Tan, S., & Tan, S. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning & Teaching, 6(1), 342-363. https://doi.org/10.37074/jalt.2023.6.1.9
Shang, H.-F. (2022). Exploring online peer feedback and automated corrective feedback on EFL writing performance. Interactive Learning Environments, 30(1), 4-16. https://doi.org/10.1080/10494820.2019.1629601
Song, C., & Song, Y. (2023). Enhancing academic writing skills and motivation: Assessing the efficacy of ChatGPT in AI-assisted language learning for EFL students. Frontiers in Psychology, 14, 1-14. https://doi.org/10.3389/fpsyg.2023.1260843
Stevenson, M., & Phakiti, A. (2014). The effects of computer-generated feedback on the quality of writing. Assessing Writing, 19, 51-65. https://doi.org/10.1016/j.asw.2013.11.007
Valijärvi, R.-L., & Tarsoly, E. (2019). Language students as critical users of Google Translate: Pitfalls and possibilities. Practitioner Research in Higher Education, 12(1), 61-74.
Wang, M., & Guo, W. (2023). The potential impact of ChatGPT on education: Using history as a rearview mirror. ECNU Review of Education. https://doi.org/10.1177/20965311231189826
Wang, Y.-J., Shang, H.-F., & Briody, P. (2013). Exploring the impact of using automated writing evaluation in English as a foreign language university students’ writing. Computer Assisted Language Learning, 26(3), 234-257. https://doi.org/10.1080/09588221.2012.655300
Wilson, J., & MacArthur, C. (2024). Exploring the role of automated writing evaluation as a formative assessment tool supporting self-regulated learning and writing. Routledge.
Wilson, J., & Roscoe, R. (2020). Automated writing evaluation and feedback: Multiple metrics of efficacy. Journal of Educational Computing Research, 58, 87-125. https://doi.org/10.1177/0735633119830764
Yoon, C. (2011). Concordancing in L2 writing class: An overview of research and issues. Journal of English for Academic Purposes, 10, 130-139. https://doi.org/10.1016/j.jeap.2011.03.003
Yoon, S. (2019). Student readiness for AI instruction: Perspectives on AI in university EFL classrooms. Multimedia-Assisted Language Learning, 22(4), 134-160. https://doi.org/10.15702/mall.2019.22.4.134
Zhai, N., & Ma, X. (2022). Automated writing evaluation (AWE) feedback: A systematic investigation of college students’ acceptance. Computer Assisted Language Learning, 35(9), 2817-2842. https://doi.org/10.1080/09588221.2021.1897019

