기술 수용 모델에 기반하여 ChatGPT를 활용한 한국 대학생의 영어 학습에 관한 연구
Applying the Technology Acceptance Model to Understand the Use and Application of ChatGPT for English Language Learning by South Korean University Students
Abstract
본 연구에서는 대학 필수 영어 과목을 수강하는 한국 대학생들이 영어 학습을 위해 ChatGPT를 사용하는 데 영향을 미치는 요인을 알아보고자 했다. 본 연구는 외부 요인, 인지된 이용 용이성, 인지된 유용성, 이용 태도, 행동적 이용 의사, 실제 이용을 포함한 6개의 관련 변수를 활용하는 기술 수용 모델(TAM)을 사용하였다. 총 122명의 한국 대학생이 온라인 설문에 참여했고, 설문 결과 분석에 구조 방정식 모델링이 사용되었다. 연구 모델을 기반으로 한 8가지 가설을 테스트했고, 통계적 유의성을 검증했다. 특히, 5가지 요인 모두 실제 기술(ChatGPT) 이용에 직접적으로 긍정적인 영향을 미치고 있는 것으로 나타났다. 본 연구는 한국 학생의 관점을 분석하여 영어 학습을 위한 기술 수용 행보에 대한 이해를 높이고, 이러한 연구 결과가 향후 대학 수준에서 외국어로서의 영어(EFL) 교육 발전에 미치는 영향에 대해 논의한다.
Trans Abstract
This study delves into the determinants influencing South Korean university students studying required college English courses to adopt the use of ChatGPT for their English language learning. A hypothesized technology acceptance model (TAM) utilizing six pertinent variables, including external factors, perceived ease of use, perceived usefulness, attitudes toward the use of technology, behavioral intention, and actual use, was adopted. A total of 122 college students from one university in South Korea participated in the study, providing data through an online questionnaire. Structural equation modeling was employed for the data analysis. The study tested eight hypotheses based on the study model, all of which demonstrated statistical significance. Notably, all five of the factors exhibited direct positive effects on the actual use of the technology. This research contributes to our understanding of the trajectory toward technology acceptance by exploring the perspectives of students in South Korea. It also discusses the implications these findings have on the future evolution of English as a foreign language (EFL) education at the university level.
1. Introduction
ChatGPT is a form of generative AI that has become an industry leader in a very brief period of time. Its ability to produce a wide variety of output has made it an indispensable tool for today’s workforce. While ChatGPT has been embraced rapidly in many professional contexts, its acceptance in academic circles has been mixed. The discourse frequently revolves around its impact on learning outcomes. From the perspective of an English as a foreign language (EFL) educator, the challenge is how much of the new technology, if any, can be included in the curriculum while still addressing the class’s learning objectives. Traditionally, language learning classrooms are teacher-student centered, with both parties working together to achieve a learning outcome. However, with the advent of text-generative AI, students can now, more than ever before, dramatically shift their educational setting toward a self-directed learning environment. Some initial reactions worldwide have suggested this is the future, while others point to the need for human oversight of the output as evidence that the technology is not as helpful for learning English as first suggested. The same argument took place over earlier advances in technology, such as machine translators (Jones et al., 2019).
While many studies have examined the inclusion of earlier forms of digital technology, such as language translators or online learning, in the language classroom (Baber, 2021; Cha & Kwon, 2018; Copeland & Franzese, 2021; Inozu et al., 2010; Jones et al., 2019; Kim, 2016; Kim et al., 2019; Kim & Lee, 2016; Lee et al., 2022; McLoughlin & Lee, 2010; Park et al., 2011), research on the perception and role of ChatGPT in English language learning is still evolving. In the South Korean context, Kim et al. (2023) found positive results when they evaluated ChatGPT as a language-learning tool. Jeon & Lee (2023) explored the potential roles for ChatGPT in the classroom as a collaborator with instructors. These studies were conducted from the perspective of the educator. However, in China, a study (Liu & Ma, 2023) using Davis’ (1989) Technology Acceptance Model (TAM), which assesses how key factors such as perceived ease of use and perceived usefulness affect the decision to adopt new technology, looked at the level of acceptance and use of ChatGPT for autonomous English language learning by students. Liu & Ma (2023) found an overall positive attitude towards using ChatGPT for autonomous English language learning, indicating a need for educators and policymakers to embrace this new technology. Similar research is lacking in South Korea for students engaged in both classroom and autonomous learning. This study aims to add to our understanding of this topic. Specifically, this research examines how South Korean university EFL students perceive ChatGPT for learning English, their intentions to use it for this purpose, and their actual use of it for both autonomous and classroom English language learning. Ultimately, by understanding how students feel about its use for English language learning, educators can better comprehend how to address the inclusion of this new resource in their teaching toolbox.
2. Literature Review
2.1. Technology and English Language Learning
Digital devices and many other forms of new technology are accessible on campuses across the globe. They are powerful tools that can enhance or disrupt the learning process. Research has demonstrated that students have a positive attitude toward using technology for learning (Inozu et al., 2010; McLoughlin & Lee, 2010). Students in South Korea are generally willing to adopt most forms of technology, such as mobile learning, e-learning, and chatbots (Baber, 2021; Kim, 2016; Kim et al., 2019; Park et al., 2011). Im’s 2017 study of Korean EFL university students’ perceptions of using machine translators to complete English writing tasks found that while students were generally positive about using translators, they did not trust them fully due to frequent mistranslations (as cited in Jones et al., 2019). Im’s findings were confirmed in a 2018 study by Briggs. While the output produced by ChatGPT is commendable, it too is imperfect. It is the responsibility of the instructor to ensure students are aware of its shortcomings and understand how to properly vet the output for credibility (Cai, 2023).
Other studies have shown (Inozu et al., 2010; McLoughlin & Lee, 2010) that students also require some scaffolding support from teachers if they are to include technology in their learning process. Such scaffolding greatly motivates students to use a given technology, particularly for self-directed learning (Pan, 2020).
When it comes to text-generators such as ChatGPT, there are surprisingly few papers that deal exclusively with the use of this technology from a student’s perspective (Shaikh et al., 2023; Xiao & Zhi, 2023), and none could be found dealing with the South Korean student context specifically. This is the gap in the literature this research attempts to address.
2.2. Technology Acceptance Model
The Technology Acceptance Model (TAM), created by Fred Davis in 1989, attempts to explain how users accept and use technology. The model is based on two key constructs: perceived usefulness (PU), the degree to which a user believes a new technology can enhance their productivity, and perceived ease of use (PEU), the degree to which a person believes the technology is easy to use. Since its inception, this model has been criticized for oversimplifying the “decision to adopt” process. As a result, it has been modified several times for different studies to account for varying conditions affecting the decision-making process of the targeted end user. Venkatesh and Davis (2000) suggested in TAM2 that both PU and PEU can be affected by external factors (EF), such as social influences. In the TAM2 model, accepting new technology involves three stages. In the first stage, EF can include a variety of influences such as subjective norms, future or present jobs, image, and so on. These factors are predicted to initiate cognitive responses in the second stage, which includes PEU and PU. In the third stage, these cognitive responses lead to affective responses, such as the attitude (AT) towards using a particular technology or the behavioral intention (BI) to use it, ultimately influencing actual usage (AU) behavior (Marikyan & Papagiannidis, 2022).
In a 2023 study using the TAM, Shaengchart found that PU and PEU significantly impacted a student’s decision to adopt ChatGPT technology. That study pertained to students’ studies in general and was not specifically language related. However, using a simplified version of the TAM2 model, Liu & Ma (2023) conducted a study examining the degree to which autonomous learners of English in China accept and use ChatGPT to achieve their educational objectives. Their model eliminated consideration of EF in the decision outcome, instead focusing on how PU and PEU shape the relationships among AT, BI, and AU. They found that PEU alone failed to predict students’ AT toward using ChatGPT for autonomous English learning. However, PEU and PU taken together had a significant impact on AT. Overall, their findings suggest that ChatGPT has the potential to be a “powerful language learning tool that EFL learners should utilize” (Liu & Ma, 2023, p. 2). At the time of writing, there are no other TAM studies that relate to the acceptance and use of ChatGPT for language learning in South Korea.
2.3. Model and Hypotheses
Based on the existing TAM literature (Venkatesh & Davis, 2000; Marikyan & Papagiannidis, 2022), this study considered the impact of EF important for the model (see Figure 1). The EF category considered five items that could easily be broken into three separate factors, including social influences, image, and future career/job, but for the sake of simplicity, they were kept as one group. Overall, the model adopts the TAM2 staged approach to acceptance.
The model hypothesizes inter-factor relationships between the TAM components (Table 1) and will attempt to prove the following hypotheses:
H1. EF positively affects PU.
H2. EF positively affects PEU.
H3. PEU positively affects PU.
H4. PU positively affects AT.
H5. PEU positively affects AT.
H6. AT positively affects BI.
H7. PU positively affects BI.
H8. BI positively affects AU.
3. Methodology
3.1. Participants
A total of 167 students studying college English at a private university in South Korea were asked to voluntarily complete a survey about their perceptions of using ChatGPT for English language learning. The survey was designed to collect quantitative data using a modified TAM based on a design by Venkatesh and Davis (2000). The questionnaire was posted online over two weeks during the 2023 fall semester. In all, 122 university students in the EFL courses responded, for a response rate of 73%.
Of the respondents, 39% were female and 59% were male, while the remainder preferred not to say. Seventy percent of the respondents were between 18 and 21 years old, and roughly 29% were between 21 and 25. Only one respondent was between 26 and 30 years old. Most participants (88%) were in their first year of studies, and participants came from a variety of majors (see Table 2).
The students were predominantly designated intermediate level in English. However, 13% of the respondents were advanced, and one respondent was a beginner (Table 2). Upon admission to the university, students’ English language proficiency is tested using an exam similar to the TOEIC, administered by an agency outside the university prior to the start of their first semester. The university then sets class levels based on those results. Scores between 401 and 700 points were designated intermediate level, while beginner-level students scored 400 and below and advanced students scored above 700.
3.2. Data Collection
Students were given a link to the Google Forms survey posted on the e-learning class portal. Students were advised of the topic of the survey, that their participation was voluntary, and that their responses would remain anonymous. Participants responded to five questions related to their demographics and 27 questions addressing the six factors identified in the model in Figure 1. Questions used a 5-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree).
3.3. Data Analysis
The data were then downloaded for analysis using IBM SPSS Statistics v.27. Several tests were performed. First, descriptive statistics were calculated for each item and the factor loadings were assessed. Then, internal reliability and the feasibility of factor reduction were examined. Finally, standard multiple regression tests were conducted to determine how the factors predicted usage within the framework of the model.
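Although the analysis here was performed in SPSS, the internal reliability step is straightforward to reproduce. The following is a minimal, hypothetical sketch in Python (not the study’s actual code or data) showing how Cronbach’s alpha is computed from a respondents-by-items matrix of Likert scores:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of Likert scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: four perfectly consistent items yield alpha = 1.0
scores = np.array([[5, 5, 5, 5],
                   [3, 3, 3, 3],
                   [1, 1, 1, 1],
                   [4, 4, 4, 4]], dtype=float)
print(round(cronbach_alpha(scores), 3))  # 1.0
```

As the results section notes, a very high alpha on a long instrument can reflect the sheer number of items as much as true internal consistency, which is why alpha is also reported per factor.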
4. Results
The first section will review the results of the descriptive statistics of the items covered in the survey. The test assumptions are then defined. This section concludes with the results of a standard multiple regression analysis for all stages of the model. The first stage explains the impact EF has on PEU and PU. The next stage assesses the impact PEU and PU have on AT. This is followed by the impact PU and AT have on BI and finishes with how BI impacts AU.
4.1. Descriptive Statistics
The survey covered six factors (Table 3). The first was external factors (EF), with five questions; the mean responses ranged from 2.52 to 3.49. The next factor was perceived ease of use (PEU), with four questions; the mean responses ranged from 3.39 to 3.61, the highest among the six factors. Perceived usefulness (PU) was the third factor, with five questions and mean responses between 2.66 and 3.36. Attitude (AT) was the fourth factor, with five questions and mean responses from 3.00 to 3.47. Next was behavioral intention (BI), with four questions and mean responses from 3.25 to 3.31. The final factor was actual use (AU), with four questions and mean responses from 2.50 to 3.42. AU was also the factor with items returning the largest standard deviations, implying a wider spread of responses. Overall, one-third (9 of 27) of the survey items returned an SD slightly greater than one, meaning that approximately 68% of scores fall within one scale point of the mean, which, on this survey’s scale, spans the difference between agree and disagree.
4.2. Assumptions
According to Beavers et al. (2013), the Kaiser-Meyer-Olkin (KMO) measure and Bartlett’s Test of Sphericity (BTS) should be completed to determine whether the sample is appropriate for factorial analysis. The KMO was determined to be .939, which, according to Dziuban and Shirkey (1974), falls in the excellent range. The BTS was significant at χ²(351) = 2984.161, p < .004. Factorial analysis was permissible as both tests passed.
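For readers who want to check such values outside SPSS, Bartlett’s statistic has a closed form: χ² = −(n − 1 − (2p + 5)/6) · ln det(R), with p(p − 1)/2 degrees of freedom, where R is the p × p item correlation matrix. The sketch below uses made-up data and is illustrative only, not the study’s code:

```python
import numpy as np

def bartlett_sphericity(data: np.ndarray):
    """Bartlett's Test of Sphericity for a (respondents x items) matrix.

    Returns the chi-square statistic and its degrees of freedom; the p-value
    is then read from the chi-square distribution with df degrees of freedom.
    """
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)  # p x p item correlation matrix
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    df = p * (p - 1) // 2
    return stat, df

# Hypothetical data: 122 respondents answering 5 Likert items
rng = np.random.default_rng(0)
data = rng.integers(1, 6, size=(122, 5)).astype(float)
stat, df = bartlett_sphericity(data)
print(df)  # 10
```

A significant result rejects the hypothesis that the correlation matrix is an identity matrix, i.e., that the items are unrelated and unsuitable for factor analysis.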
The study adopted a standard of eigenvalues greater than one, which explained 71% of the total variance in the survey. The survey consisted of 27 items with factor loadings ranging from .56 to .82 (Table 3). Next, Cronbach’s alpha was used to determine internal consistency. The entire survey had a Cronbach’s alpha of .968, which is considered excellent (>.9) but slightly high because it exceeds .95; this is likely due to the number of items in the survey. However, the Cronbach’s alpha scores for each of the six factors ranged from .784 to .939, between the very good and excellent ranges (Hair, 2003). The External Factors (EF) variable had a Cronbach’s alpha of .784. Perceived Ease of Use (PEU) had a score of .861, and Perceived Usefulness (PU) had a score of .923. Attitude (AT) had a Cronbach’s alpha of .896. Behavioral Intention (BI) had a Cronbach’s alpha of .939, while Actual Use (AU) had a score of .858 (Table 3).
4.3. Statistical Tests
A standard multiple regression analysis was conducted to determine the degree to which the different factors in this study’s TAM predicted students’ usage of ChatGPT to improve their English language learning.
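For context, the standardized coefficients (β) reported below are simply ordinary least squares coefficients fitted after z-scoring each predictor and the outcome. A minimal, hypothetical sketch of that computation (not the study’s SPSS procedure or data):

```python
import numpy as np

def standardized_betas(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """OLS coefficients after z-scoring predictors and outcome (standardized betas)."""
    z = lambda a: (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)
    # No intercept is needed: z-scored variables all have mean zero.
    beta, *_ = np.linalg.lstsq(z(X), z(y), rcond=None)
    return beta

# Hypothetical sanity check: if y equals the first predictor exactly,
# its standardized beta is 1 and the other predictor's is 0.
rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
betas = standardized_betas(np.column_stack([x1, x2]), x1)
print(betas)  # first beta is ~1, second is ~0
```

Because all variables are on the same (z-score) scale, betas from the same regression can be compared directly, which is how the relative strengths of predictors are interpreted in the discussion below.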
4.3.1. Prediction of PU from EF and PEU
According to the study model, PU was determined by EF and PEU. The item-to-total correlations were statistically significant for PU at the p < .004 level, and both predictors showed large, statistically significant correlations of over .70 (Table 4). The inter-factor relationships were also significant.
The overall regression with the two predictors was statistically significant, adjusted R² = .694, F(2, 119) = 138.13, p < .004, meaning the model can explain 69% of the variance in PU. Both predictors were statistically significant, indicating a positive relationship between the predictors and PU. PEU had a standardized coefficient β = .47, t(121) = 7.304, p < .004, and EF had a standardized coefficient β = .458, t(121) = 7.117, p < .004 (Table 4).
4.3.2. Prediction of AT from PU and PEU
The second stage of the model examines how PU and PEU affect attitude (AT). The item-to-total and inter-item correlations were statistically significant for AT at the p < .004 level (Table 5). The adjusted R² = .761, F(2, 119) = 193.29, p < .004, meaning the model can explain 76% of the variance in AT.
Both predictors were statistically significant, indicating a positive relationship between the predictors and AT. PU had a standardized coefficient β = .524, t(121) = 7.719, p < .004, and PEU had a standardized coefficient β = .408, t(121) = 6.016, p < .004 (Table 5).
4.3.3. Prediction of BI from PU and AT
The third stage of the model addresses how PU and AT predict BI. The item-to-total correlations were statistically significant for BI at the p < .004 level (Table 6). The inter-item correlation was also statistically significant. The adjusted R² = .773, F(2, 119) = 207.48, p < .004, meaning the model can explain 77% of the variance in BI.
Both predictors were statistically significant, indicating a positive relationship between the predictors and BI. PU had a standardized coefficient β = .260, t(121) = 3.324, p < .004, and AT had a standardized coefficient β = .654, t(121) = 8.368, p < .004 (Table 6).
4.3.4. Prediction AU from BI
A final standardized regression was run to determine how BI affects actual use (AU). The inter-item correlation showed a statistically significant relationship at the p < .004 level. The adjusted R² = .462, F(1, 120) = 111.756, p < .004, meaning the model can explain 46% of the variance in AU. BI had a standardized coefficient β = .683, t(121) = 10.428, p < .004.
5. Discussion
5.1. Hypothesis Results
All eight of the hypotheses tested in the conceptual model were supported (Table 7). EF predicted both PU and PEU, although the relationship was stronger between EF and PEU. The paths from PU to AT and from PEU to AT were also supported, as were the paths from PU and AT to BI. Finally, BI positively affected students’ AU.
5.2. The Model
Figure 2 illustrates student acceptance of using ChatGPT for English language learning. From the TAM illustrated below, it is evident that students’ perceived usefulness (PU) is affected by both external factors (EF) and perceived ease of use (PEU); as these variables increase, so does PU. Students’ PEU and PU were also both positively linked to attitude (AT). At the same time, PU and AT showed strong relationships with BI. Finally, based on the results, it is clear that students’ actual use (AU) increases as behavioral intention (BI) increases.
5.2.1. External Factors (EF)
Two items in the EF variable had low mean responses, and both dealt with peer pressure: “I feel I will fall behind my classmates if I do not use ChatGPT for my English language learning” and “Everyone around me uses ChatGPT to learn English, so I believe I should too.” Their mean responses were 2.52 and 2.60, respectively (Table 3). This indicates that peer pressure is less of a factor influencing students’ decision to use ChatGPT to learn English than the perspectives of people who are important to them (3.03) (Table 3).
The item “Using ChatGPT for English language learning will help me in my future career” had the highest response mean at 3.49, indicating that most students believe using ChatGPT to learn English is important for the future.
Overall, EF had a greater effect on PEU than on PU, although both effects were still significant. This is likely because initial reactions to ChatGPT were positive, with many people professing the relative simplicity of the AI tool in achieving results. However, given that this model deals with its use for language learning, university students might be showing a slight preference for learning English without the aid of AI. In fact, the mean response to the question “Using ChatGPT significantly enhances my English language learning” was the lowest in the PU variable at 2.66.
5.2.2. Perceived Ease of Use
This factor had the highest overall positive responses to each item. The mean responses ranged from 3.39 to 3.61, indicating a strong perception of the simplicity of using ChatGPT for English language learning.
5.2.3. Perceived Usefulness
The lowest mean response to an item in this factor was 2.66, for the statement “Using ChatGPT significantly enhances my English language learning.” Interestingly, the item “Using ChatGPT contributes positively to my English language learning” had the highest mean value of the factor at 3.36; responses to it were slightly above neutral. However, the SD for this item was 1.021, indicating a roughly one-point spread in responses, which is considerable on a five-point scale. Overall, this means that while students were more positive about the contributions ChatGPT could make to their English language learning, they did not see this contribution as significant.
5.2.4. Attitude
AT impacted BI roughly two and a half times more than PU did. This shows that students’ attitudes towards using ChatGPT for English language learning significantly affect their intention to use it. Overall, students had a positive attitude towards using ChatGPT to learn English, with the highest mean score of 3.47 applying to the item “I have a strong positive attitude towards using ChatGPT for my English language learning.” However, this item also had a large SD of 1.054, indicating a single-point range between responses despite favoring the positive. The SD for AT item number four, “I am highly interested in incorporating ChatGPT into my English language learning,” was 1.093, with a mean response of 3.14. This again demonstrates that responses varied across the neutral range by approximately one point, between agree and disagree.
5.2.5. Behavioral Intention
BI also significantly impacts students’ actual use of ChatGPT to learn English. From the model, we can see that the BI of most students was slightly favorable on all items tested. Mean scores ranged from 3.00 to 3.30, with SDs ranging from .903 to .944. Overall, students are satisfied with their experiences of using ChatGPT to learn English.
5.2.6. Actual Use
According to the model predictions, BI should impact actual use. While BI was generally favorable, AU fell below the mean response in a few categories. The item “I frequently use ChatGPT in my English language learning” had the lowest mean response of the survey, at 2.50 with an SD of 1.187. This clearly indicates a divide between those who actively use it and those who do not.
However, the item “In the past year, I have used ChatGPT to help with work in my English language classes” returned a more neutral response of 3.02 with an SD of 1.256. This indicates that some students have used it, but not often. This could be because it was included in classroom activities or because students were motivated to use it for classwork to improve their grades.
The item “In the past year, I used ChatGPT for my autonomous English language learning” had a low mean response of 2.73 and an SD of 1.150. This could mean that students rarely focused on learning English outside the classroom, and when they did, at least for some, ChatGPT was not their preferred tool for language learning. In a study of first-year Turkish students of English conducted by Inozu et al. (2010), those who made efforts to improve their English language skills outside the classroom said they did so through the Internet. The content they preferred included “materials most useful in improving receptive language skills rather than productive ones” (Inozu et al., 2010, p. 18). In other words, student autonomous learning focuses on vocabulary development rather than speaking or writing skills. While ChatGPT is accessible through the Internet, it is text-generative in nature and does not focus on receptive language skills development. This could explain why students in this study remain lukewarm overall, or divided, on its use for autonomous English language learning. Another possible explanation is that students require some oversight of their interactions with ChatGPT. Studies have shown that student acceptance of new technology for autonomous learning requires some form of scaffolding for users to readily adopt it. “The challenge for educators, therefore, is to enable self-direction, knowledge building, and autonomy by providing options and choice while still supplying the necessary structure and scaffolding” (McLoughlin & Lee, 2010, p. 33).
For the item “Based on my experience, I will continue to use ChatGPT in the future to learn English,” students indicated agreement, with a mean response of 3.42 and an SD of 1.120. In other words, most of the student respondents are willing to use it to continue their English language learning in the future. It is clear that students understand ChatGPT can assist their English language learning, but they have not yet fully included it in their toolbox.
6. Conclusion and Suggestions
While ChatGPT has been publicly available since November 2022, the reaction of students in this study towards using it for their English language learning has been overall positive yet mixed. The model used in this research was supported, with each proposed hypothesis being accepted. However, it is important to note that students were aware that their participation was voluntary and that the questions related to their experiences using ChatGPT for autonomous and academic language learning. Therefore, the assumption is that they responded only if they had actually used it, or were still using it, for this purpose. Some may have tried it and felt that it did not suit their language-learning needs. This would explain why the mean responses for all items fall somewhere between slight disagreement and slight agreement.
It is clear from the results that students are well aware of ChatGPT’s capabilities and ease of use, and most have reported positive experiences using it; however, they seem somewhat more cautious about actually adopting it for their language learning. The use of ChatGPT for autonomous English language learning was quite low, with a less-than-neutral mean response of 2.73. While some students may value autonomous English language learning, as previously mentioned, studies have shown they focus primarily on receptive skills during their self-directed learning rather than productive ones. ChatGPT focuses more on productive skills such as writing, which could be the reason for its low reported mean use. In addition, students often consider their autonomous learning an extension of classroom studies and often seek guidance from teachers for their autonomous learning activities. As a result, it is imperative that teachers train students in more expansive methods of using ChatGPT for more effective self-directed learning. In the context of this study, students can benefit from being taught how to use ChatGPT to improve their listening, speaking, reading, and writing skills. This would promote the use of the tool beyond the seminar room for continued language learning. In South Korean universities, students are often required to complete only a minimal number of English language courses to graduate, and in most instances, their English language learning ends when those requirements are met. If professors teach their students how to use the tool across a broad range of language skills for future autonomous learning, their knowledge will continue to develop beyond the classroom, better preparing them for a competitive job market.
While favoring the positive slightly, student attitudes were lowest regarding the idea that ChatGPT offers quality assistance; respondents were divided on whether they agreed or disagreed with this remark. Some educators may agree with this shortcoming and take it as proof that the technology cannot replace the role of the teacher. However, in doing so, they deny their students a learning opportunity that will help them well into the future. While it is true that the responses generated by ChatGPT are sometimes flawed or incorrect, it is the responsibility of educators to offer guidance to their students on how to spot weaknesses and get more out of the tool by using it more effectively. They need to teach students how to elicit the most effective educational feedback through better prompting. This is a skill set that takes time to develop; with more exposure to proven ideas and strategies, student impressions of the value of ChatGPT’s assistance for English language learning will increase, and teacher confidence in the tool as a partner educator, rather than a replacement, will grow too. Overall, professors interested in arming their students with 21st-century skills will need to spend more time developing their own and their students’ knowledge of and confidence in using ChatGPT to achieve this objective.
7. Limitations and Future Research
This study has some limitations. First, it focused on students at one private university in South Korea; it would be interesting to see how the results would vary if the sample size and scope were widened. Second, the study investigated the use of ChatGPT for English language learning in general and did not identify specific language-related skills. Future studies could examine whether students were more or less inclined to use it for, say, writing assistance rather than speaking or pronunciation skills. Finally, this study was conducted in the first year of ChatGPT’s availability to the public. There have been several iterations of the AI-powered program since its launch in November 2022, including a more advanced paid version. This study did not differentiate between the versions used by the students, and given the reported differences in capabilities, this might have an impact on their views and adoption of the technology.