Ria Goyal
Iris Kim
Logan Hosoda
University of Washington, Seattle
01 March 2024
HCDE 496: AI Directed Research Group
Executive Summary
Introduction
Course Overview
Purpose
Research/Literature Review
Design Question
Research Questions
Methodology
Interview Questions
Survey Questions
Participants
Test Environment
Session Format
Tasks
Data Collection/Analysis Method
Results
Quantitative Summary
Qualitative Summary
Findings
Finding 1
Finding 2
Finding 3
Next Steps
Limitations
Evaluations
Appendix
Images
Transcripts of Interviews
Questionnaire
Bibliography
Over the course of ten weeks, our Directed Research Group in the Human Centered Design and Engineering department discussed and researched generative Artificial Intelligence (AI). Our team of three students decided to conduct a user test on the topic that most interested us: AI-generated images. AI image generation has risen in popularity over the past few years, raising concerns about plagiarism and about whether it can truly be considered an emerging art form. We wanted to explore what understanding people have of images created by AI, and this report details what our study found at the University of Washington.
For our study, we administered survey questions to learn what people thought of each generated image and which characteristics were common in their perceptions of AI-generated images. We also conducted interviews to gather qualitative information about what each participant thought of AI as a whole and of generative image AI specifically.
This usability test report documents our testing process and study, starting with the motivation and purpose of our research and our design and research questions. It then describes the methodology of how we conducted the research and presents the results with qualitative and quantitative data. We conclude the report with an evaluation of the test and its limitations.
This study was conducted by three students in the Human Centered Design and Engineering department at the University of Washington. We were part of a Directed Research Group focused on Artificial Intelligence (AI) in academia, under the supervision of Professor Alan Marks. The research group focused on understanding the impacts of AI not only on academia but on other sectors of the world as well. Every week, students delved into readings on different topics in AI, including how Large Language Models (LLMs) function, the ethics of AI, the impacts of AI in different industries, and AI-generated art. These readings informed students’ independent research. For more information regarding the research group, please visit https://www.hcde.washington.edu/research/marks.
The purpose of this study is to explore how humans perceive visual art, as it applies to both AI-generated and human-created art. Art has been collectively regarded as a medium for the expression of emotions, experiences, and social commentary. Art has traditionally been seen as a distinctly human pursuit, but the evolution of AI, specifically generative AI capable of creating artwork almost indistinguishable from human-created art, has brought about a significant shift in this model. This shift prompts a reevaluation of the notion that art is solely a human endeavor. As AI-generated art gains prominence and the lines between it and human-created art become blurred, this study aims to answer whether humans can genuinely distinguish between art created by AI and by humans. The research seeks to move beyond the traditional understanding of art and investigate shifts in aesthetic appreciation as technology plays a growing role in shaping the artistic landscape.
Article 1: Art, Creativity, and the Potential of Artificial Intelligence
This article explores the intersection of AI and art, focusing on the development of AI art and on AICAN in particular. The authors advocate for regarding AICAN’s works as art and discuss the challenges and implications of AI creativity. They emphasize the importance of understanding the history of human art-making and argue that human and machine creativity can work in partnership rather than in opposition. The AI can serve as a tool in the overall artistic process, using algorithms that learn aesthetics from a curated set of images. The article includes examples in which the AI-generated images fail to fully imitate human faces, producing deformations. However, the authors argue that AI can still serve the creative process through pre- and post-generation curatorial actions, and that AI art can be treated as conceptual art. AICAN’s training simulates how artists digest prior art and break out of established styles, fostering a more inherently creative process than traditional generative art AI. There are concerns that AI could replace human artists rather than partner with them, but the authors draw a parallel with the art world’s historical resistance to photography. The authors encourage conceptualizing AI algorithms as a new artistic medium, both tool and creative partner, and acknowledging the ways AI can contribute to artistic endeavors.
Article 2: Humans versus AI: whether and why we prefer human-created compared to AI-created artwork
This article systematically explores the preference of individuals for human-created art over AI-generated art, assessing sentiments across factors such as emotionality, narrativity, perceived effort, personal meaning, and perceived time. To establish a standardized baseline, participants were only presented with AI-generated artworks, despite being informed about the potential inclusion of human-made pieces.
The interplay between emotion, story, meaningfulness, and effort is explored, highlighting the intricate relationship between the perceived creator and aesthetic judgment. Participants exhibited a preference for AI-created paintings when they were enriched by compelling narratives, while expressing a preference for human-created art when the perceived effort in creation was higher. A detailed examination of the Liking and Beauty models uncovers significant interactions related to Story and Effort, indicating a tendency for participants to favor AI-labeled art with engaging narratives and to prefer human-labeled art when perceived effort is higher.
Article 3: Algorithmic Images: Artificial Intelligence and Visual Culture - Antonio Somaini
The use of deep learning for digital image recognition began in the 2010s and has since had profound impacts on current AI image generation, most notably exposing the internal biases of the internet. With the recent influx of AI image generation, the focus has shifted toward the relationship between algorithmic images and the translation of various inputs such as text or original artwork. In light of these shifts in visual culture, the author expresses concern for its future, anticipating that prompt engineering and intellectual integrity will become greater issues. Additionally, large companies are given the power to filter what can or cannot be produced by image generators, leaving individuals with limited ability to explore the full potential of these algorithms. These digital tools lead our visual culture into a landscape that is controlled both by the innovators of AI technology and by the capabilities of our words.
What characteristics play a key role in participants’ ability to differentiate between artworks believed to be human made and those produced by AI?
Figure 1: Here are some of the images we presented to our participants. All 12 images will be in the appendix.
Inclusion Criteria:
Exclusion Criteria:
General AI Experience Level:
AI Art Experience Level:
Participant Demographic Information:

| Participant | Age | Year | Gender | Major | Experience Level with AI | Experience Level with AI Art |
| --- | --- | --- | --- | --- | --- | --- |
| P1 | 21 | Senior | M | Business | 5 | 2 |
| P2 | 21 | Junior | F | International Studies | 3 | 1 |
| P3 | 22 | Senior | F | Human Centered Design & Engineering | 4 | 2 |
| P4 | 21 | Senior | F | Communications | 3 | 2 |
| P5 | 20 | Junior | F | Education, Communities, & Organizations | 3 | 1 |
| P6 | 21 | Junior | M | Industrial Design | 5 | 4 |
| P7 | 20 | Junior | F | Informatics | 3 | 1 |
| P8 | 20 | Sophomore | F | Informatics | 2 | 2 |
| P9 | 19 | Junior | M | Biology | 2 | 1 |
| P10 | 22 | Senior | F | Human Centered Design & Engineering | 4 | 3 |
| P11 | 21 | Junior | M | Real Estate Development | 4 | 2 |
| P12 | 21 | Junior | M | Computer Science | 5 | 4 |
The user tests were conducted in-person, face-to-face, providing a direct and immediate interaction between us and the participants. Each student conducted four user tests, totaling 12 participants for the study. Despite the variations in physical locations, a standardized testing protocol was maintained to ensure consistency in data collection. The in-person format allowed us to observe participants’ reactions firsthand, address questions promptly, and maintain a personalized testing experience.
Test Materials
Participants engaged with the user test using their personal devices (phones and laptops) to complete the survey for each image presented. To conduct the user test, we used a laptop to display images to participants and read from a script to ensure standardization across all user tests. We also used a phone to audio-record each session, capturing all insights from participants, and later transcribed the recordings.
The user test consisted of three segments designed to investigate participants’ perceptions of artwork. In the initial phase, participants were presented with 12 individual artworks spanning diverse styles and mediums and were prompted to complete a survey for each image, evaluating their emotional connection, authenticity judgements, and the effort they perceived went into creating each artwork. In the second phase, participants were shown the same images in groups of four. Their task was to identify which images they believed were generated by AI, allowing us to assess their ability to discern AI-created art when it is presented alongside pieces of different mediums and styles. The third and final phase was an interview. Questions covered participants’ overall perceptions of art, the legitimacy they attributed to AI-generated art, their familiarity with AI platforms, and their concerns about AI tools.
Participants were given three overarching tasks over the course of the user test.
Participants were presented with 12 pieces of artwork encompassing various artistic styles and mediums. For each image, participants were asked to fill out a survey that measured the four aspects: emotional connection, detail perception, authenticity, and perceived time in creation. This task aimed to capture nuanced reactions, preferences, and initial impressions.
The same set of 12 images were then presented in groups of four. Participants were tasked with identifying which image(s) they believed were generated by AI within each set. This task aimed to investigate participants’ ability to discern AI-generated art when presented with different styles of artwork.
Participants engaged in an interview session where they answered questions surrounding their beliefs about art and AI. Topics included participants’ beliefs regarding the definition of art, their perspectives on the legitimacy of AI-generated art, their personal experiences with AI tools, and their opinions on the future role of AI in the art domain and in general. This task sought to gather qualitative data, offering a more comprehensive understanding of participants’ beliefs and attitudes towards art and AI.
The interview portion of our study was conducted independently by each of the three of us. We each conducted four sessions over the course of three weeks. All of these took place face-to-face; the interviewers used their own devices to present the images, and the interviewees used their personal devices to fill out the survey questions. Every interview was audio recorded and then transcribed into a text document. (Appendix: Transcripts of Interviews)
All quantitative data were collected on a scale from 1 to 5 via the five survey questions answered for each of the 12 individual AI-generated images. The scales for each question were as follows:
Participant demographic data such as age, general AI experience, and general AI art experience values were also collected at the beginning of the experiment.
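As a minimal, illustrative sketch of how per-image averages on this 1-5 scale could be computed, the snippet below assumes the responses were exported to a hypothetical `survey_responses.csv` file with `participant`, `image`, `question`, and `rating` columns; this is an assumption for illustration, not the exact format or tooling we used.

```python
# Illustrative analysis sketch (assumed file and column names, not our exact pipeline):
# compute the average 1-5 rating each image received for a single survey question.
import pandas as pd

# Each row records one answer: participant ID, image number (1-12),
# survey question number, and the 1-5 rating given.
responses = pd.read_csv("survey_responses.csv")

# Average rating per image for question 1 (the emotional/personal connection question).
per_image = (
    responses[responses["question"] == 1]
    .groupby("image")["rating"]
    .mean()
)

# Overall average across all 12 images for that question.
overall = per_image.mean()

print(per_image.round(2))
print(f"Overall average: {overall:.2f}")
```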
Participants were asked about their thoughts and opinions on AI in general and on its use in artistic settings. Additionally, participants were asked to talk through their thought process while viewing the images one by one, often mentioning particular features of an image, what they liked or disliked, ‘flaws’ in the images, and so on. These data were used to determine how previous AI experience shaped performance in the experiment and initial perceptions of AI art.
Below are the data visualizations based on the individual experiments that we conducted:
Graph 1. Survey Question #1
Graph 2. Survey Question #2
Graph 3. Survey Question #3
Graph 4. Survey Question #4
Responses to this question averaged 2.4 across all images. Participants reported feeling the most personal connection with image 10 (avg: 3), followed by image 3 (avg: 2.83), and the least connection with images 1 (avg: 1.5) and 2 (avg: 1.75). Images featuring more realistic scenery or tangible objects were easier for participants to connect with, while those depicting abstract or intangible subjects averaged less than 2.
Responses to this question were approximately normally distributed, varied greatly by image, and had an overall average of 3.24. Images such as 11 (avg: 4.17) and 6 (avg: 4) were believed to have high levels of detail, while images like 3 (avg: 2.5), 4 (avg: 2.67), and 8 (avg: 2.5) were considered to have low levels of detail. Images more often mistaken for being completely human made were perceived to have higher levels of detail, while those that appeared to be created with digital art tools or with the assistance of AI received lower averages.
The uncertainty that many participants expressed when attempting to distinguish AI-generated from human-generated art caused our data to vary greatly from image to image and from person to person. While some participants believed there was an even distribution of each type of image, others leaned toward the notion that the pieces were human generated, as we asked them to assume an image was human generated if they were unsure or had no reason to believe otherwise. The overall average of responses across all images was approximately 2.94, with the most common responses being the two extremes, 1 (hand drawn/made) and 5 (AI generated).
The majority of participants found these images to be quite intricate, with a high level of detail. The overall average of responses across all images was approximately 3.93. In general, images with greater depth in their foreground and background tended to have higher averages; indicators of such depth included highlights, shadows, texture, and color complexity.
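For reference, a bar chart in the spirit of Graphs 1-4 could be produced from those per-image means with a short script like the one below. The file and column names are the same illustrative assumptions as in the earlier sketch, not our actual plotting code.

```python
# Illustrative plotting sketch (assumed file/column names): plot per-image average
# ratings for one survey question as a bar chart, similar in spirit to Graphs 1-4.
import pandas as pd
import matplotlib.pyplot as plt

responses = pd.read_csv("survey_responses.csv")  # columns: participant, image, question, rating
per_image = (
    responses[responses["question"] == 1]
    .groupby("image")["rating"]
    .mean()
)

fig, ax = plt.subplots(figsize=(8, 4))
per_image.plot.bar(ax=ax)
ax.axhline(per_image.mean(), linestyle="--", color="gray", label="Overall average")
ax.set_xlabel("Image")
ax.set_ylabel("Average rating (1-5)")
ax.set_ylim(0, 5)
ax.set_title("Average rating per image (one survey question)")
ax.legend()
plt.tight_layout()
plt.show()
```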
During the interview portion of our sessions, we gave participants the opportunity to share their opinions of AI in general and in the context of the art landscape. All of our participants expressed hesitance toward using AI image generation in an artistic manner, explaining that it has the potential to copy from previous artworks and cannot yet generate images that are truly original. There was also unanimous worry about the broader potential of AI, especially in the contexts of data ethics, job security, and future dependency on the technology. Where participants’ responses differed was in their approaches to the fundamental question of what defines artistic expression and whether AI image generation can be considered art. Some participants held that art is a creative process in which a human individual physically expresses themselves in a medium. Those who considered AI image generation a form of artistic expression argued that the act of a human writing a prompt means they were involved in the creative process. The overall consensus was that AI should be kept as an assistive tool rather than a replacement for art itself.
Participants’ previous exposure to AI had minimal impact on their ability to distinguish whether the artworks were AI generated or not. Contrary to expectations, participants with varying degrees of AI familiarity displayed similar proficiency in judging the origin of the artwork. This suggests that prior experience with AI, whether with chatbots or text-to-image generation tools, had a limited impact on participants’ effectiveness in distinguishing AI-generated from human-created art. The consistency in performance across diverse participant backgrounds prompts a reevaluation of the factors contributing to perceptual distinctions between AI- and human-created art.
Participants consistently associated images with a distinctly digital-art appearance, such as Image 1 and Image 5, with being created using AI. The perceptual link between digital art characteristics and the attribution of AI creation was evident in the responses. These images were noted to have clean lines, precision, and a level of abstraction. These qualities, combined with the belief that a human hand would often leave traces of imperfection, led participants to conclude that AI played a role in their creation. The association between a discernible digital art style and the perceived involvement of AI highlights the ways in which visual elements shape participants’ beliefs about the origin of artistic creations. These findings contribute to a deeper understanding of the perceptual cues participants rely on when making authenticity judgements, emphasizing the role of specific aesthetic features in guiding their attributions of AI creation.
Participants exhibited a connection between high emotional engagement with certain images and a reduced likelihood of perceiving them as AI-generated. This connection was particularly evident for Image 3 and Image 10, where participants reported higher average scores for emotional connection; these same images received lower ratings for being believed AI-generated. This implies a link between the subjective emotional response participants felt toward certain images and their cognitive judgements about the origin of the artwork, steering them away from immediate AI attributions. These findings suggest that the emotional impact of an artwork may play a role in shaping participants’ beliefs about its origin, especially in the context of AI-generated art.
Because many University of Washington students are already familiar with AI, our data are limited by both the sample size and the demographics of the participants. Due to time and resource constraints, we recruited friends and peers, making our pool of subjects a convenience sample. Recruiting participants who vary more in age, profession, and knowledge of AI would help us learn more about how people perceive the characteristics of generative AI. For example, none of the participants had expert knowledge of AI or used it for work or school, so including someone with that background could yield additional insights, especially in the interview portion.
After discussing our findings and results with each other, we identified next steps to refine the study. We could recruit a larger sample of participants from different universities and age groups. It would also be interesting to experiment with different AI image generators, not just Copilot’s image generation, to see whether there are clear distinctions between generators.
*P3 requested no record/documentation of interview
Bellaiche, L., Shahi, R., Turpin, M. H., et al. (2023). Humans versus AI: Whether and why we prefer human-created compared to AI-created artwork. Cognitive Research: Principles and Implications, 8, 42. https://doi.org/10.1186/s41235-023-00499-6
Mazzone, M., & Elgammal, A. (2019). Art, Creativity, and the Potential of Artificial Intelligence. Arts, 8(1), 26. https://doi.org/10.3390/arts8010026
Somaini, A. (2023). Algorithmic Images: Artificial Intelligence and Visual Culture. Grey Room, 93(93), 74–115. https://doi.org/10.1162/grey_a_00383