A new AI image generation method, InstantID, has been introduced by the InstantX team in Beijing. According to the team’s recent paper, InstantID can quickly recognize an individual’s identity and generate new images based on a single reference image. This groundbreaking technique has been hailed as the “new state-of-the-art” by Reuven Cohen, an enterprise AI consultant for Fortune 500 companies.
The Downside of InstantID: A Flood of Deepfakes
Cohen, who has closely followed InstantID’s development, cautions that this innovative method comes with a significant concern. He believes that InstantID will inadvertently foster a flood of deepfake audio, image, and video tools, coinciding with the 2024 election. Cohen highlights the ease with which deepfakes can be created using tools like InstantID, stating:
“The use of tools like InstantID for deepfakes raises significant concerns due to the ease of creation and consistency of output with no training or fine-tuning required. InstantID’s ability to efficiently generate identity-preserving content can lead to the creation of highly realistic and convincing deep fakes with no GPU and little CPU resources required.”
Despite the potential drawbacks, InstantID surpasses previous methods like LoRA, which relied on small, fine-tuned models trained on specific characters or styles. LoRA gained popularity through its association with platforms like Civitai, where AI enthusiasts shared a range of creations, including AI-generated fan fiction, anime characters, photorealism, and even fashion. However, LoRA also became infamous for contributing to the proliferation of porn and deepfakes.
Cohen poked fun at LoRA’s decline in popularity, declaring “So long, LoRA,” and describing InstantID as “deep fakes on steroids” in a recent LinkedIn post.
InstantID: Zero-shot Identity-Preserving Generation in Seconds
The InstantX team’s paper, titled “InstantID: Zero-shot Identity-Preserving Generation in Seconds,” emphasizes InstantID’s superiority over techniques like LoRA and QLoRA. While LoRA and QLoRA come with high storage demands, lengthy fine-tuning processes, and a need for multiple reference images, InstantID offers a simpler solution. It introduces a plug-and-play module that handles image personalization in various styles using just a single facial image while ensuring high fidelity.
Cohen explains that InstantID focuses on zero-shot identity-preserving generation, setting it apart from LoRA and QLoRA. These existing methods primarily aim at fine-tuning models by updating model parameters or applying quantization for efficiency. In contrast, InstantID prioritizes quickly and efficiently generating outputs that preserve the identity characteristics of the input image. Cohen illustrates this with an example:
“Think consistency in things like the identity of an individual. Donald Trump always looks like Donald Trump.”
Cohen stresses the simplicity and accessibility of InstantID, sounding a note of caution. He warns that it has never been easier to quickly engineer a deepfake, claiming that deployment or replication on platforms like Hugging Face requires just one click.