The second research question aimed to identify the specific areas of IELTS writing—such as task response, coherence and cohesion, lexical resource, and grammatical range—where ChatGPT’s feedback was perceived to have the greatest impact.
1) ChatGPT’s Evaluation Accuracy
The analysis, which focused on how the students perceived ChatGPT’s ability to evaluate essays based on the IELTS Writing assessment criteria (Question 7), revealed that a majority of students (73.1%, n = 19) believed ChatGPT provided accurate evaluations. Meanwhile, 11.5% (n = 3) expressed neutrality, and 15.4% (n = 4) voiced skepticism. This skepticism often stemmed from misconceptions about IELTS examiners being subjective or from doubts about AI’s ability to replicate human judgment effectively (see Table 2).
Among the criteria, Task Response was the most frequently highlighted, with the students (S5, S11, S15, S19, S25, S26) appreciating ChatGPT’s ability to identify weaknesses and suggest improvements. These students specifically mentioned that ChatGPT’s feedback helped refine their arguments and better address the task prompt.
However, the responses from S15 and S19 were more general in nature and did not explicitly reference Task Response; their comments instead reflected broader perceptions of ChatGPT’s assessment accuracy across multiple criteria. These responses were therefore reclassified to avoid misrepresentation.
However, other criteria, such as Coherence and Cohesion (S5, S6, S15, S19), Lexical Resource (S5, S15, S19), and Grammatical Accuracy (S5, S15, S19, S24), were also valued. For example, S5 commented, “ChatGPT can provide a full assessment along with the weaknesses in the essay if a proper prompt of assessment criteria is provided.” It is also worth noting that a few comments (S5, S15, S19) were wide-ranging, covering multiple aspects of IELTS Opinion Essay writing rather than being limited to a single criterion.
To illustrate this point, Figure 1 shows an excerpt from Student 5’s essay along with ChatGPT’s feedback. While the feedback provides balanced insights across all four IELTS criteria, it particularly underscores areas for improvement under Task Response. Specifically, it calls for further development of the proposed solutions and of the underlying reasons, noting that “the prompt is adequately addressed” rather than fully. Addressing these issues would be necessary to raise the Task Response score. In contrast, Coherence and Cohesion and Lexical Resource were assessed more positively, with only minor mistakes pointed out, reflecting strengths in organization and vocabulary use.
This comparison demonstrates ChatGPT’s close alignment with the IELTS assessment criteria, offering constructive feedback tailored to the essay’s strengths and weaknesses. Such feedback not only supports the student’s reflections on Task Response but also reinforces the tool’s utility in providing actionable insights across other criteria, such as Coherence and Cohesion and Lexical Resource.
However, while Task Response was frequently cited as a strength, the students’ overall satisfaction with ChatGPT’s assessment varied. Responses to Question 10 (M = 2.77, SD = 1.18) indicate a low, though not overwhelmingly negative, level of satisfaction with ChatGPT’s evaluation accuracy. Some students expressed dissatisfaction, while others provided neutral responses, reflecting mixed perceptions of ChatGPT’s effectiveness in assessment accuracy. This variability highlights the need to further explore the factors influencing students’ views on AI-generated feedback, including expectations, prior experiences with human feedback, and individual preferences. Notably, when ChatGPT’s overall assessment was compared with the ratings of other satisfaction factors, its rating was significantly lower: although the students found certain aspects of ChatGPT’s feedback useful (such as grammar correction or coherence-related suggestions), they were more critical of its overall assessment accuracy.
This discrepancy can be examined in three main ways. Firstly, while the students found specific aspects of ChatGPT’s feedback highly satisfactory (such as grammar correction or vocabulary enhancement), they might have been more critical when considering its overall effectiveness. Secondly, the students may have had higher expectations for ChatGPT’s overall performance, leading to a more stringent evaluation. Finally, the complexity of IELTS writing tasks might have influenced the students’ perceptions of ChatGPT’s overall usefulness, despite finding it helpful in specific areas. Therefore, this finding highlights the importance of considering both specific and overall evaluations when assessing the effectiveness of AI tools in educational contexts.
2) Perceived Strengths in Feedback
The analysis, which focused on identifying the areas of IELTS writing in which the students believed ChatGPT was most effective (Question 8), revealed that Task Response was particularly emphasized.
Table 3 highlights these perceptions, illustrating the aspects of IELTS essay evaluation where ChatGPT’s feedback was considered most impactful.
The findings indicate that Task Response was perceived as ChatGPT’s strongest area, with 53.8% (n = 14) of the students acknowledging its effectiveness in addressing essay content and relevance to the prompt. This aligns with earlier findings from the students’ qualitative responses (see Table 2), where many valued ChatGPT’s ability to highlight weaknesses and suggest improvements in this criterion.
Additionally, Coherence and Cohesion was identified by 26.9% (n = 7) of the students, indicating the tool’s utility in guiding logical structuring and the organization of ideas. Although Grammatical Range (11.5%, n = 3) and Lexical Resource (7.7%, n = 2) were mentioned less frequently, the results highlight ChatGPT’s broad applicability in providing feedback across diverse writing elements.
3) Follow-Up Questions and Their Themes
The analysis of Questions 4.1 and 6 indicates that while most students did not ask follow-up questions after receiving ChatGPT’s feedback, this does not necessarily imply that they found the feedback clear or comprehensive. A substantial minority (38.5%, n = 10) sought additional clarification or guidance, primarily on grammar-related or task-related issues (see Table 4). This suggests that while the pre-designed prompt effectively guided ChatGPT to deliver structured feedback, some students still desired more specific explanations in certain areas.
The follow-up questions (Question 4.2) mainly concerned grammar and task clarity. For example, one student (S14) asked ChatGPT to explain a specific grammar rule relevant to their essay, while another (S11) sought clarification on how to better meet the task requirements of the IELTS Opinion Essay.
In contrast, 61.5% (n = 16) of students did not ask follow-up questions (Question 6). The majority of these students explicitly stated ChatGPT’s feedback met their needs, as evidenced by responses such as: “It pointed out my exact weaknesses and offered clear fixes, so I didn’t have any more questions” (S2), “I am satisfied with ChatGPT response” (S7), “Everything is quite clear and there is no need to ask extra questions” (S14), and “Explanation was clear and fair” (S16). Additionally, two students (S19, S21) indicated that their limited English proficiency prevented them from asking further queries.
Among the 10 students who posed follow-up questions, the largest group (n = 4) found inquiries related to essay structure and writing skill improvement to be the most helpful. Others highlighted the usefulness of ChatGPT’s feedback on grammatical accuracy (n = 3) or valued its general learning recommendations (n = 1). Additionally, two students appreciated all types of feedback equally, reflecting ChatGPT’s ability to cater to diverse learning needs (see Table 5).
These observations align with Koraishi’s (2023) findings, which emphasize the significance of structured prompts in optimizing AI-generated feedback. Nevertheless, the present findings do not conclusively establish that the clarity of ChatGPT’s feedback alone minimized follow-up questions. While some students explicitly stated that the feedback was sufficient, others may have had different reasons for not engaging further, such as a preference for independent learning or difficulty in formulating follow-up questions.
4) Perceptions of Writing Improvement
When the students were asked whether they believed ChatGPT could improve their IELTS writing skills (Question 11.1), responses were overwhelmingly positive (see Table 1). To explore this perception further, the students were invited to elaborate on how ChatGPT could support their writing development (Question 11.2). Table 6 summarizes their comments, with 19 of the 26 students (73.1%) providing detailed responses. The most frequently cited benefit was Feedback and Error Identification (n = 7), followed by Vocabulary and Grammar Improvement (n = 5) and Idea Generation and Development (n = 4). Fewer students highlighted Writing Structure and Style Guidance (n = 2) and General Practice and Learning (n = 1). This diversity of responses indicates that EFL learners perceive ChatGPT as a multifaceted resource capable of enhancing both the linguistic and conceptual dimensions of their writing.
The comments gathered reflect how Uzbek EFL students view ChatGPT as a reliable tool for addressing their writing challenges. Its strengths in providing clear feedback on task response and coherence were particularly valued, alongside its role in supporting grammar and vocabulary improvements. These perceptions align with Shin et al. (2023), who highlight the importance of targeted feedback in fostering writing proficiency.
However, some students noted ChatGPT’s limitations in identifying advanced grammatical errors. For instance, Student 6 commented, “ChatGPT feedback is sometimes not very accurate. While it helps with basic grammar mistakes, it does not always recognize more complex sentence structures or subtle grammar mistakes that a human examiner might notice.” These concerns align with prior research (Kim et al., 2023) that highlights the limitations of AI in detecting complex linguistic nuances. Despite these concerns, the overwhelmingly positive responses underscore ChatGPT’s potential as a valuable resource for foundational writing skill development.
Additionally, the ease and clarity of ChatGPT’s feedback suggest that structured prompts may enhance the feedback process, particularly for EFL learners preparing for high-stakes assessments. However, further research is needed to directly examine the role of structured prompts in optimizing AI-generated feedback.