Know about diffusion models

What Is Generative Models? A generative model can be seen as a way to model the conditional probability of the observed $X$ given a target $y$ (e.g., given a target ‘dog’, generate a picture of the dog). Once trained, we can easily sample a stance of $X$. While training a generative model is significantly more challenging than a discriminative model (e.g., it is more difficult to generate an image of a dog than to identify a dog in a picture), it offers the ability to create entirely new data....

3 min · Loong

Visual LLMs

techniques have improved on not only text data but also computer vision recently. Here we focus on Visual Language Model (VLM) based on transformers. In the begining, some researchers try to extend BERT to process visual data and make a success. For example, visual-BERT and ViL-BERT achive strong performances on many visual tasks by training on two different objectives: 1) masked modeling task that aims to predict the missing part of a given input; and 2) a match task that aims to predict if the text and the image content are matched....

1 min · Loong