update model card

Files changed (4)

README.md +59 -3
clusters.png +0 -0
logs/{train → train_clustering}/events.out.tfevents.1655108829.7c9e25180606.72.1.v2 +0 -0
logs/{train → train_encoder}/events.out.tfevents.1655106318.7c9e25180606.72.0.v2 +0 -0

README.md CHANGED

@@ -2,20 +2,76 @@
 library_name: keras
 tags:
 - clustering
 ---
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
  ## Model Plot
 <details>

 library_name: keras
 tags:
 - clustering
+datasets:
+- CIFAR-10
 ---
 ## Model description
+This is an image clustering model trained with the [**Semantic Clustering by Adopting Nearest neighbors (SCAN)**](https://arxiv.org/abs/2005.12320) algorithm (Van Gansbeke et al., 2020).
+Training follows the example on <a href='https://keras.io/examples/vision/semantic_image_clustering/' target='_blank'>**keras.io**</a> by [Khalid Salama](https://www.linkedin.com/in/khalid-salama-24403144/).
+The algorithm consists of two phases:
+1. Self-supervised visual representation learning of images, for which we use the SimCLR technique.
+2. Clustering of the learned visual representation vectors to maximize the agreement between the cluster assignments of neighboring vectors.
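The second phase can be illustrated with a minimal sketch of the consistency term from the SCAN paper, which rewards an image and its nearest neighbours for receiving similar cluster probabilities. This is a plain-Python illustration, not the training code of this model; the function names `softmax` and `consistency_loss` are hypothetical, and the entropy regularizer that the paper adds to avoid collapsing into a single cluster is left out.

```python
import math


def softmax(logits):
    """Turn raw cluster scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(anchor_probs, neighbour_probs):
    """Mean negative log dot product between each anchor and its neighbours.

    anchor_probs:    one cluster distribution per image
    neighbour_probs: for each image, the distributions of its k neighbours
    (hypothetical helper; the entropy term of the SCAN objective is omitted)
    """
    losses = []
    for p, neighbours in zip(anchor_probs, neighbour_probs):
        for q in neighbours:
            agreement = sum(a * b for a, b in zip(p, q))
            losses.append(-math.log(agreement))
    return sum(losses) / len(losses)


# Toy example with 3 clusters and one neighbour per image.
anchor = [softmax([4.0, 0.0, 0.0])]
agreeing = [[softmax([4.0, 0.0, 0.0])]]
disagreeing = [[softmax([0.0, 4.0, 0.0])]]

print(consistency_loss(anchor, agreeing))
print(consistency_loss(anchor, disagreeing))
```

The loss is small when neighbours share the anchor's dominant cluster and grows when they are assigned elsewhere, which is the "agreement" that the clustering phase maximizes.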
 ## Intended uses & limitations
+The model is intended to show the effective use of self-supervised learning combined with nearest neighbours for (semantic) image clustering.
+You can use these clusters to retrieve images of the same class.
+### Limitations
+This model is not meant to outperform supervised image classification; it is a proof of concept showing that unsupervised learning can cluster similar images together without any labels.
+### Possible Improvements:
+As suggested by the author of the keras.io example, the following steps can improve the accuracy further:
+1) increase the number of epochs in the representation learning and the clustering phases;
+2) allow the encoder weights to be tuned during the clustering phase;
+3) perform a final fine-tuning step through self-labeling, as described in the original SCAN paper.
 ## Training and evaluation data
+### Training Data
+The model was trained using the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html). For training, the images were scaled to (32, 32, 3).
+### Hyperparameters
+For training, the following parameters were used:
+- Feature Vector Dimension: 512
+- Projection Units of Head: 128
+- Number of Clusters: 20
+- K-Neighbours: 5
+The encoder was not tuned during clustering.
+### Evaluation
+#### Visualization of highest confidence cluster picks
+![Visualization of highest confidence cluster picks](clusters.png)
+#### Clusters and their respective labels, accuracy and size
+| Cluster | Label | Accuracy | Size |
+|:---------|:-------------:|-----:| -----:|
+|cluster 0| frog  | 31.6 %|3582|
+|cluster 1| frog  | 19.76 %|2348|
+|cluster 2| horse  | 26.82 %|2983|
+|cluster 3| bird  | 29.7 %|1532|
+|cluster 4| airplane  | 39.16 %|3575|
+|cluster 5| ship  | 22.38 %|2207|
+|cluster 6| automobile  | 26.41 %|4365|
+|cluster 7| dog  | 21.09 %|5049|
+|cluster 8| automobile  | 21.94 %|4093|
+|cluster 9| truck  | 29.66 %|4639|
+|cluster 10| bird  | 23.02 %|1455|
+|cluster 11| truck  | 17.78 %|3937|
+|cluster 12| deer  | 30.36 %|2635|
+|cluster 13| dog  | 22.62 %|1950|
+|cluster 14| frog  | 22.64 %|4391|
+|cluster 15| airplane  | 26.89 %|2838|
+|cluster 16| ship  | 34.7 %|2213|
+|cluster 17| ship  | 17.59 %|1785|
+|cluster 18| cat  | 16.57 %|1997|
+|cluster 19| deer  | 27.25 %|2426|
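The model card does not state how the columns are computed. A plausible reading, assumed here, is that "Label" is the most frequent ground-truth class among the images assigned to a cluster, "Accuracy" is that class's share of the cluster, and "Size" is the number of assigned images. The sketch below shows this computation on toy data; `summarize_clusters` is a hypothetical helper, not a function from the example or from Keras.

```python
from collections import Counter, defaultdict


def summarize_clusters(cluster_ids, true_labels):
    """Return {cluster: (majority_label, accuracy_percent, size)}.

    Assumes accuracy means the share of the majority label in the cluster.
    """
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)

    summary = {}
    for cluster, labels in sorted(members.items()):
        label, count = Counter(labels).most_common(1)[0]
        summary[cluster] = (label, 100.0 * count / len(labels), len(labels))
    return summary


# Toy assignments: five images, two clusters.
cluster_ids = [0, 0, 0, 1, 1]
true_labels = ["frog", "frog", "cat", "ship", "ship"]

for cluster, (label, acc, size) in summarize_clusters(cluster_ids, true_labels).items():
    print(f"cluster {cluster}: {label}, {acc:.2f} %, size {size}")
```

Under this reading, several clusters can share a label (as with the three "frog" clusters above), because the number of clusters (20) exceeds the number of CIFAR-10 classes (10).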
  ## Model Plot
 <details>

clusters.png ADDED

logs/{train → train_clustering}/events.out.tfevents.1655108829.7c9e25180606.72.1.v2 RENAMED

File without changes

logs/{train → train_encoder}/events.out.tfevents.1655106318.7c9e25180606.72.0.v2 RENAMED

File without changes