
Computer Vision: Visual representation with Sketches (Part 1)

Our physical reality, and our products, increasingly have a digital counterpart. The production lead times of sports retail make it challenging to have product samples when we need to encode visual product information for our advanced analytics use cases, as we need to make decisions seasons before the product is in the store.

One of the safest sources of information that we have at this point in time is the product sketch, a realistic abstraction of how the product will look when produced. Textures and some details may be lost, but many attributes such as colors, silhouette, design elements, patterns and technologies can be picked up by deep learning computer vision networks. It is therefore essential to be able to extract visual embeddings of the highest quality from this digital source. On our own data, this improves coverage for future seasons by 77% (around 14k products could have visual embeddings).

In the following posts, I review what we understood to be the most promising approaches to generating robust visual representations of our products, independently of the availability of images, 3D renderings or sketches.


Can I reuse the same encoder that I built for my images?

The answer, as with most complex topics, is: it depends. In our case, we first tried with a carefully curated data set in which the sketches were high resolution, shared the same orientation, and had no background.




For those cases we compared the cosine similarity between ResNet50- and ResNet34-encoded representations on a sample of 60 pairs of images and their sketches. For this sample we achieved ~70% top-1 accuracy and ~85% when finding, for each sketch, the actual image as the most similar one. That was quite strong, but when we scaled to the whole data set of more than 70K sketches we got much weaker results.
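The retrieval check described above can be sketched as follows. This is a minimal illustration, not the code we ran: it assumes the sketch and image embeddings have already been extracted (for example with a ResNet encoder) into two arrays where row i of one is paired with row i of the other. The function names `l2_normalize` and `top_k_accuracy` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def top_k_accuracy(sketch_emb, image_emb, k=1):
    """Share of sketches whose paired image is among the k most similar images.

    Assumption: row i of sketch_emb and row i of image_emb describe the
    same product.
    """
    s = l2_normalize(np.asarray(sketch_emb, dtype=np.float64))
    m = l2_normalize(np.asarray(image_emb, dtype=np.float64))
    sim = s @ m.T  # cosine similarity, shape (n_sketches, n_images)
    # Indices of the k most similar images for each sketch.
    top_k = np.argsort(-sim, axis=1)[:, :k]
    targets = np.arange(sim.shape[0])[:, None]
    hits = (top_k == targets).any(axis=1)
    return float(hits.mean())
```

Calling `top_k_accuracy(sketch_emb, image_emb, k=1)` gives the top-1 figure; a larger `k` gives a more lenient score. The similarity matrix is built in memory in one go, which is fine for 60 pairs but would need batching or an approximate nearest-neighbour index for a data set of more than 70K sketches.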

The reasons are that the sketches are too heterogeneous (20% of them contain other objects, are rotated, show different levels of detail...) and that many are quite far from the actual image, suggesting that we need to train a model that generalizes well to a more complex set of sketch types. While preprocessing the sketches did help, it was not sufficient to reach the results obtained with the curated data set. We decided to investigate modelling approaches specific to this type of problem.


Finding a good sketch-to-image translator

The task at hand is therefore to find a good encoder that makes every asset type comparable, or ready to be translated into another domain. One way to do that is to learn from existing sketch and image pairs: we train a sketch encoder (the generator) that produces realistic images, while a discriminator tries to tell them apart from real ones. This allows us to translate any sketch into a realistic image, from which we can extract embeddings that can be used together with embeddings extracted directly from images.




Our first attempt is the pix2pix model, which follows the architecture shown above. As we are dealing with high-resolution images, it is possible that pix2pixHD will work better for us.
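For reference, the training objective from the pix2pix paper can be written as below. In our setting we read x as the sketch, y as the paired product image, and z as the noise input of the generator G; that mapping is our interpretation, not notation from the paper. D is the discriminator and λ weights the L1 reconstruction term.

```latex
\begin{aligned}
\mathcal{L}_{cGAN}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big] \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
\end{aligned}
```

The adversarial term pushes generated images to look realistic given the sketch, while the L1 term keeps them close to the actual paired image.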

The papers and open source code for those models can be found below:




In the following post, we will dive deep into the details of the method, as well as give a brief introduction to GANs.








