Word-As-Image for Semantic Typography
Shir Iluz*,1, Yael Vinker*,1, Amir Hertz1, Daniel Berio2, Daniel Cohen-Or1, Ariel Shamir3
1Tel Aviv University, 2Goldsmiths University, 3Reichman University
*Denotes equal contribution
SIGGRAPH 2023 - Honorable Mention Award
A few examples of our Word-As-Image illustrations in various fonts and for different textual concepts. The semantically adjusted letters are created completely automatically using our method, and can then be used for further creative design, as we illustrate here.
Abstract
A word-as-image is a semantic typography technique where a word illustration presents a visualization of the meaning of the word, while also preserving its readability. We present a method to create word-as-image illustrations automatically. This task is highly challenging as it requires semantic understanding of the word and a creative idea of where and how to depict these semantics in a visually pleasing and legible manner. We rely on the remarkable ability of recent large pretrained language-vision models to distill textual concepts visually. We target simple, concise, black-and-white designs that convey the semantics clearly. We deliberately do not change the color or texture of the letters and do not use embellishments. Our method optimizes the outline of each letter to convey the desired concept, guided by a pretrained Stable Diffusion model. We incorporate additional loss terms to ensure the legibility of the text and the preservation of the style of the font. We show high quality and engaging results on numerous examples and compare to alternative techniques.
Our method can handle a large variety of semantic concepts and use any font,
while preserving the legibility of the text and the font's style.
Note how styles of different fonts are preserved by the semantic modification:
How does it work?
Our word-as-image illustrations concentrate on changing only the geometry of the letters to convey the meaning. We deliberately do not change color or texture and do not use embellishments. This allows simple, concise, black-and-white designs that convey the semantics clearly.
We rely on the prior of a pretrained Stable Diffusion model to connect text and images, and use the Score Distillation Sampling approach to encourage the appearance of the letter to reflect the provided textual concept.
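For orientation, the gradient used by Score Distillation Sampling is commonly written in the form below. This is a sketch in the general notation of the SDS literature, adapted to a latent diffusion model; the symbols are assumptions introduced here and do not appear on this page: x is the rasterized letter, z its latent encoding, y the text condition, ε Gaussian noise, ε_φ the frozen denoiser, w(t) a timestep weighting, and α_t, σ_t the noise schedule.

```latex
\nabla_{P}\,\mathcal{L}_{sds}
  = \mathbb{E}_{t,\epsilon}\left[
      w(t)\,\bigl(\epsilon_{\phi}(z_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial z}{\partial x}\,
      \frac{\partial x}{\partial P}
    \right],
\qquad
z_t = \alpha_t\, z + \sigma_t\, \epsilon
```

Read this way, the denoiser's prediction error acts as a direction in image space that is pushed back through the rasterizer to the letter's control points P, without backpropagating through the diffusion model itself.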
Given an input word, our method is applied separately for each letter.
We represent each letter as a closed vectorized shape.
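To make the idea of a closed vectorized shape concrete, here is a minimal sketch that assumes the outline is a chain of cubic Bézier segments sharing endpoints. The segment type, the flat control-point layout, and the helper names `cubic_bezier` and `sample_closed_outline` are illustrative assumptions, not taken from the method's code.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate one cubic Bezier segment at parameter t in [0, 1]."""
    s = 1.0 - t
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_closed_outline(control_points, samples_per_segment=8):
    """Sample points along a closed outline.

    Assumed layout (hypothetical): every 3 consecutive points start a new
    segment, and each segment ends at the first point of the next one.
    The last segment wraps around to the first point, closing the shape.
    """
    n = len(control_points)
    if n == 0 or n % 3 != 0:
        raise ValueError("expected a non-empty multiple of 3 control points")
    outline = []
    for start in range(0, n, 3):
        p0, p1, p2 = control_points[start:start + 3]
        p3 = control_points[(start + 3) % n]  # wrap-around closes the path
        for step in range(samples_per_segment):
            outline.append(cubic_bezier(p0, p1, p2, p3, step / samples_per_segment))
    return outline


# A unit square written as 4 cubic segments (12 control points).
square = [
    (0.0, 0.0), (1 / 3, 0.0), (2 / 3, 0.0),
    (1.0, 0.0), (1.0, 1 / 3), (1.0, 2 / 3),
    (1.0, 1.0), (2 / 3, 1.0), (1 / 3, 1.0),
    (0.0, 1.0), (0.0, 2 / 3), (0.0, 1 / 3),
]
outline = sample_closed_outline(square, samples_per_segment=4)

# Moving a single control point bends the bottom edge but keeps its endpoints.
deformed = list(square)
deformed[1] = (1 / 3, -0.3)
bulged = sample_closed_outline(deformed, samples_per_segment=4)
```

In this representation, deforming a letter amounts to moving entries of the control-point list, which is what the optimization adjusts.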
Given an input letter represented by a set of control points P, and a concept (shown in purple), our goal is to optimize its parameters to reflect the meaning of the word, while still preserving its original style and design.
We optimize the new positions P̂ of the deformed letter iteratively. At each iteration, we use a differentiable rasterizer (DiffVG, marked in blue) that allows gradients to be backpropagated from a raster-based loss to the shape's parameters. We then augment the rasterized deformed letter and pass it into a pretrained, frozen Stable Diffusion model, which drives the letter shape to convey the semantic concept using the L_sds loss (1). To preserve the shape of the original letter and ensure the legibility of the word, we use two additional loss functions. The first loss preserves the local tone and structure of the letter by comparing the low-pass filtered (LPF, marked in yellow) rasterization of the resulting letter to that of the original, which gives L_tone (2). The second loss regulates the shape modification by constraining the deformation to be as-conformal-as-possible over a triangulation of the letter's shape (D, marked in green), which defines L_acap (3).
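The as-conformal-as-possible constraint can be illustrated with a small sketch. It assumes the loss is the mean squared change of the interior angles of each triangle, and it takes a hand-written triangle list in place of a computed triangulation. The function names and the exact normalization are assumptions made for illustration, and plain Python is used only to show the quantity, not to differentiate it.

```python
import math


def triangle_angles(a, b, c):
    """Interior angles (radians) at vertices a, b, c of a 2D triangle."""
    def angle_at(p, q, r):
        # Angle at p between the rays p->q and p->r.
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.atan2(abs(cross), dot)
    return (angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b))


def acap_loss(points, deformed_points, triangles):
    """Mean squared difference of triangle angles before and after deformation."""
    total, count = 0.0, 0
    for i, j, k in triangles:
        before = triangle_angles(points[i], points[j], points[k])
        after = triangle_angles(deformed_points[i], deformed_points[j], deformed_points[k])
        for angle_before, angle_after in zip(before, after):
            total += (angle_before - angle_after) ** 2
            count += 1
    return total / count


# Unit square split into two triangles (hand-written, hypothetical triangulation).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]

# Rotation + uniform scale + translation preserves every angle.
theta, scale = math.radians(30.0), 2.0
similar = [
    (scale * (x * math.cos(theta) - y * math.sin(theta)) + 5.0,
     scale * (x * math.sin(theta) + y * math.cos(theta)) - 1.0)
    for x, y in square
]

# A horizontal shear changes the angles and is therefore penalized.
sheared = [(x + 0.5 * y, y) for x, y in square]

loss_similar = acap_loss(square, similar, triangles)
loss_sheared = acap_loss(square, sheared, triangles)
```

Here `loss_similar` is zero up to floating-point error while `loss_sheared` is clearly positive, which shows why such a term permits scaling and rotating parts of a letter but discourages distortions that change its local proportions.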
The same word in a variety of fonts.
Additional Editing
Word-as-image applied to Chinese characters. In Chinese, a whole word can be represented by one character. From left: bird, rabbit, cat, and surfing (the last two characters together).
Using Depth-to-image in Stable Diffusion 2 as a post-processing step on our method's results to incorporate color and texture.
Results