In a highly technical article on AI art, I-Feng this week goes into detail about some of the models available. The writer also explores the complicated issues around copyright and the future of ‘real’ artists.
Last week, the highly anticipated Midjourney V5 AI art generator was officially released, once again changing the world of AI-driven art creation. It offers significantly better image quality, more varied output, a wider stylistic range, support for seamless textures, wider aspect ratios, improved image prompting and a greater dynamic range.
The picture below shows images generated by Midjourney V4 and Midjourney V5 respectively, using the prompt "Elon Musk's Introduction to Tesla, Commercial Advertising in the 1990s".
What meets people's expectations this time is that Midjourney V5 delivers more realistic images, more expressive angles and scene overviews and, at last, correct hands. A joke once widely circulated in AI art circles was, "Never ask a woman's age or an AI model why she hid her hands."
This is because AI art generators are "awkward painters": they can master visual patterns, but they cannot master the underlying biological logic. In other words, an AI art generator can work out that a hand has fingers, but it struggles to know that a hand should normally have only five of them, or that the fingers stand in a fixed relationship to one another.
Over the past year, the inability of AI art generators to render hands correctly has become a running cultural joke. The hand problem is partly related to how AI art generators infer information from the large image datasets they have been trained on. It is worth noting that Midjourney V5 handles this well: most of the time, the hands are correct.
Evolution of AI art
In 2018, the first AI-generated portrait, Edmond de Belamy, was created using a generative adversarial network (GAN). It eventually sold for $432,500 at a Christie's art auction.
In 2022, Jason Allen's AI creation "Théâtre D'opéra Spatial" won first place in the annual art competition at the Colorado State Fair.
DeepDream generates images based on the representations learned by a neural network. Given an input image, it runs a trained convolutional neural network (CNN) in reverse: it uses gradient ascent to adjust the image itself so that the activations of a chosen layer are maximized. The following figure (left) shows the original input image and its DeepDream output.
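The gradient ascent idea behind DeepDream can be illustrated with a toy example. This is a minimal sketch under loud assumptions, not the actual DeepDream code: the "layer" is a single hand-picked 3x3 filter followed by ReLU (a hypothetical stand-in for the learned filters of a pretrained CNN), the image is an 8x8 grid of random values, and the names `KERNEL`, `objective` and `dream` are invented for illustration. What it shows is the key point: the network weights stay fixed and the pixels are updated to increase the layer's activations.

```python
import random

# Hypothetical stand-in for one layer of a trained CNN: a single fixed
# 3x3 edge-like filter followed by ReLU. A real DeepDream run would use
# the learned filters of a pretrained network instead.
KERNEL = [[-1.0, 0.0, 1.0],
          [-2.0, 0.0, 2.0],
          [-1.0, 0.0, 1.0]]


def layer_activations(img):
    """Valid 3x3 convolution (cross-correlation) followed by ReLU."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h - 2):
        row = []
        for j in range(w - 2):
            z = sum(KERNEL[a][b] * img[i + a][j + b]
                    for a in range(3) for b in range(3))
            row.append(max(z, 0.0))
        out.append(row)
    return out


def objective(img):
    """Quantity to maximize: half the sum of squared layer activations."""
    return 0.5 * sum(v * v for row in layer_activations(img) for v in row)


def gradient(img):
    """Gradient of the objective with respect to every input pixel."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for i, row in enumerate(layer_activations(img)):
        for j, v in enumerate(row):
            if v > 0.0:  # ReLU passes gradient only where it is active
                for a in range(3):
                    for b in range(3):
                        grad[i + a][j + b] += v * KERNEL[a][b]
    return grad


def dream(img, steps=10, lr=0.1):
    """Gradient ascent on the pixels; the filter weights never change."""
    img = [row[:] for row in img]  # work on a copy
    n_pixels = len(img) * len(img[0])
    for _ in range(steps):
        grad = gradient(img)
        # Normalize by the mean absolute gradient so the step size is stable.
        scale = sum(abs(g) for row in grad for g in row) / n_pixels
        for i, row in enumerate(grad):
            for j, g in enumerate(row):
                img[i][j] += lr * g / (scale + 1e-8)
    return img


random.seed(0)
image = [[random.random() for _ in range(8)] for _ in range(8)]
dreamed = dream(image)
print("objective before:", objective(image))
print("objective after: ", objective(dreamed))
```

Running it prints a larger objective after the update steps than before, which is the toy equivalent of a layer's favourite patterns being amplified in the image.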
Neural Style Transfer is a deep learning technique that combines the content of one image with the style of another, as shown in the figure above (right), where the style of Van Gogh's Starry Night is applied to the target image.
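How "content" and "style" are separated is not spelled out here, so the following is an assumption: one widely used formulation (from Gatys and colleagues) compares raw feature maps for content and Gram matrices of those feature maps for style. The sketch below uses tiny hand-written feature maps rather than real CNN features, and the function names are hypothetical. It is meant only to show why a Gram matrix captures style: it records which channels fire together, and ignores where in the image they fire.

```python
def gram_matrix(features):
    """features: list of C channels, each a flat list of N activations.
    Returns the C x C matrix of channel-to-channel inner products over N."""
    n = len(features[0])
    return [[sum(x * y for x, y in zip(fi, fj)) / n for fj in features]
            for fi in features]


def content_loss(generated, content):
    """Squared difference of the feature maps, position by position."""
    return sum((g - c) ** 2
               for g_ch, c_ch in zip(generated, content)
               for g, c in zip(g_ch, c_ch))


def style_loss(generated, style):
    """Squared difference of the Gram matrices; spatial layout is ignored."""
    g_gen, g_sty = gram_matrix(generated), gram_matrix(style)
    return sum((a - b) ** 2
               for row_a, row_b in zip(g_gen, g_sty)
               for a, b in zip(row_a, row_b))


def total_loss(generated, content, style, alpha=1.0, beta=10.0):
    """Weighted sum that a style-transfer optimizer would try to minimize.
    alpha and beta are illustrative weights, not recommended values."""
    return (alpha * content_loss(generated, content)
            + beta * style_loss(generated, style))


# Toy feature maps: 2 channels, 4 spatial positions each (made-up numbers).
style = [[1.0, 0.0, 2.0, 0.0],
         [0.0, 1.0, 0.0, 2.0]]
# Same activations in reversed spatial order: same "style", other "content".
rearranged = [channel[::-1] for channel in style]

print("style loss:  ", style_loss(rearranged, style))
print("content loss:", content_loss(rearranged, style))
```

The rearranged features give a style loss of zero but a positive content loss. In a full system, an optimizer would adjust the generated image until both terms are small at once.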
Soon after, Google released a text-to-image model called Imagen. This model can render text inside images more accurately, which is a difficult problem for the OpenAI model to solve.
Deep learning and its image-processing applications are now at a completely different stage from even a few years ago. At the beginning of the last decade, it was ground-breaking that deep neural networks were able to classify natural images. Nowadays, these milestone models can generate highly realistic and complex images from simple text prompts.
"Threat" or "symbiosis": where will human painters go?
Since its emergence, AI art has been controversial. Copyright disputes, erroneous output and algorithmic bias have once again put it at the centre of a storm. For example, in January this year, three artists filed a lawsuit claiming that AI companies had violated the rights of "millions of artists" by training AI models on 5 billion images scraped from the Internet "without the consent of original artists".
Many artists fear that they will be replaced by machines and lose their livelihoods because AI models can imitate their unique styles. In December last year, hundreds of artists uploaded images to ArtStation, one of the largest art communities on the Internet, saying "no to AI-generated images." At the same time, some artists are pessimistic: "We are watching the death of art unfold." The copyright status of images used in training data is still in dispute.
Of course, there are also artists who actively embrace AI and regard it as a painting assistant that takes over repetitive and boring work. Others use AI as an "engine" of imagination. In software and communities such as Midjourney, users interact and spark ideas off one another, generating new and interesting aesthetics that then spill over into the real world. As Midjourney puts it:
AI is not a reproduction of the real world, but an extension of human imagination.
At present, regulators are catching up with AI art. Recently, the U.S. Copyright Office said in a letter that images in a graphic novel created using the AI system Midjourney should not be protected by copyright. The decision was one of the first by a U.S. court or agency on the scope of copyright protection for works created with AI.