In-Browser Object Detection with SSD Mobilenet

I built this website on top of the TensorFlow.js COCO-SSD demo. The model has been ported to TensorFlow.js and runs entirely in your browser, with no backend server component. Pretty cool!
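Here is a rough sketch of what you can do with the model's output once it has run. The `DetectedObject` shape mirrors what the coco-ssd package's `detect()` returns (a bounding box, a class name, and a score). The `summarizeDetections` helper, its `minScore` default of 0.5, and the sample predictions are my own assumptions for illustration, not part of the library or of this demo's source.

```typescript
// Shape of one prediction as returned by the coco-ssd package's model.detect().
// In the browser you would get these from the real library, roughly:
//   const model = await cocoSsd.load();
//   const predictions = await model.detect(imgElement);
interface DetectedObject {
  bbox: [number, number, number, number]; // [x, y, width, height] in pixels
  class: string;                          // one of the COCO class names
  score: number;                          // confidence between 0 and 1
}

// Hypothetical helper (not part of the library): turn raw predictions into
// display lines, dropping anything below minScore and sorting best-first.
function summarizeDetections(predictions: DetectedObject[], minScore = 0.5): string[] {
  const kept = predictions
    .filter((p) => p.score >= minScore)
    .sort((a, b) => b.score - a.score);
  if (kept.length === 0) {
    return ["no objects detected"];
  }
  return kept.map((p) => {
    const [x, y, w, h] = p.bbox;
    const pct = Math.round(p.score * 100);
    return `${p.class} (${pct}%) at x=${Math.round(x)}, y=${Math.round(y)}, ${Math.round(w)}x${Math.round(h)}`;
  });
}

// Made-up predictions for illustration only.
const sample: DetectedObject[] = [
  { bbox: [10.4, 20.6, 100, 50], class: "dog", score: 0.62 },
  { bbox: [0, 0, 30, 30], class: "person", score: 0.91 },
  { bbox: [5, 5, 8, 8], class: "kite", score: 0.2 },
];
console.log(summarizeDetections(sample).join("\n"));
```

Raising `minScore` hides shaky guesses; lowering it is a quick way to see what else the model was considering.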

Why does the model [suck in some way]?

As with all machine learning models, this one is only as good as the data it was trained on. It was trained on the COCO dataset and will not perform as well on images that look different from that data. Hilariously, SSD Lite MobileNet V2 thinks the food image is a refrigerator.


COCO-SSD is an object detection model trained on the Common Objects in Context (COCO) dataset. SSD stands for Single Shot MultiBox Detector. It generates default boxes over different aspect ratios and scales, adjusts those boxes at prediction time, and combines predictions from multiple feature maps to handle objects of various sizes. SSD encapsulates all computation in a single feed-forward convolutional neural network, which eliminates proposal generation and resampling and makes the model easier to train and deploy (full paper).
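To make "default boxes over different aspect ratios and scales" a bit more concrete, here is a minimal sketch based on the formulas in the SSD paper: scales are spaced linearly between a minimum and a maximum across the feature maps, and each aspect ratio a gives a box of width s·sqrt(a) and height s/sqrt(a). The grid sizes and ratios below are assumptions I picked for illustration, not the configuration of the MobileNet model used in this demo, and the paper's extra box for aspect ratio 1 is left out to keep things short.

```typescript
// A default box in normalized image coordinates (0..1): center, width, height.
interface DefaultBox {
  cx: number;
  cy: number;
  w: number;
  h: number;
}

// Scale for feature map k (0-based) out of m maps, linearly spaced between
// sMin and sMax. The 0.2 and 0.9 defaults are the values used in the SSD paper.
function scaleForLayer(k: number, m: number, sMin = 0.2, sMax = 0.9): number {
  if (m === 1) return sMin;
  return sMin + ((sMax - sMin) * k) / (m - 1);
}

// One box per (cell, aspect ratio): same area for every ratio, different shape.
function defaultBoxes(gridSize: number, scale: number, aspectRatios: number[]): DefaultBox[] {
  const boxes: DefaultBox[] = [];
  for (let row = 0; row < gridSize; row++) {
    for (let col = 0; col < gridSize; col++) {
      for (const ar of aspectRatios) {
        boxes.push({
          cx: (col + 0.5) / gridSize, // box is centered on its grid cell
          cy: (row + 0.5) / gridSize,
          w: scale * Math.sqrt(ar),
          h: scale / Math.sqrt(ar),
        });
      }
    }
  }
  return boxes;
}

// Illustrative settings (assumed, not the demo model's configuration).
const ratios = [1, 2, 0.5];
const coarse = defaultBoxes(3, scaleForLayer(5, 6), ratios); // few, large boxes
const fine = defaultBoxes(19, scaleForLayer(0, 6), ratios);  // many, small boxes
console.log(coarse.length, fine.length); // 27 1083
```

The coarse grid with large boxes is what catches big objects, and the fine grid with small boxes catches little ones, which is how a single forward pass handles various object sizes.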

Currently the models support these 80 classes:

  • airplane
  • apple
  • backpack
  • banana
  • baseball bat
  • baseball glove
  • bear
  • bed
  • bench
  • bicycle
  • bird
  • boat
  • book
  • bottle
  • bowl
  • broccoli
  • bus
  • cake
  • car
  • carrot
  • cat
  • cell phone
  • chair
  • clock
  • couch
  • cow
  • cup
  • dining table
  • dog
  • donut
  • elephant
  • fire hydrant
  • fork
  • frisbee
  • giraffe
  • hair drier
  • handbag
  • horse
  • hot dog
  • keyboard
  • kite
  • knife
  • laptop
  • microwave
  • motorcycle
  • mouse
  • orange
  • oven
  • parking meter
  • person
  • pizza
  • potted plant
  • refrigerator
  • remote
  • sandwich
  • scissors
  • sheep
  • sink
  • skateboard
  • skis
  • snowboard
  • spoon
  • sports ball
  • stop sign
  • suitcase
  • surfboard
  • teddy bear
  • tennis racket
  • tie
  • toaster
  • toilet
  • toothbrush
  • traffic light
  • train
  • truck
  • tv
  • umbrella
  • vase
  • wine glass
  • zebra

TensorFlow? JavaScript? Is this for real??

I could hardly believe it myself, which is why I made this test website. I guess there's no running away from JavaScript.

Quoted from the docs: "TensorFlow.js executes operations on the GPU by running WebGL shader programs. These shaders are assembled and compiled lazily when the user asks to execute an operation".
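If "compiled lazily" sounds abstract, here is a toy sketch of the general idea: compile a program the first time an operation is requested, then reuse it. This is not TensorFlow.js internals. `LazyProgramCache`, `compileRelu`, and the "program" type are all hypothetical names I made up, and a plain function stands in for a real WebGL shader.

```typescript
// Toy stand-in for a compiled shader program: takes numbers, returns numbers.
type Program = (input: number[]) => number[];

class LazyProgramCache {
  private programs = new Map<string, Program>();
  compileCount = 0;

  // Return the cached program for `op`, compiling it only on first use.
  get(op: string, compile: () => Program): Program {
    let program = this.programs.get(op);
    if (program === undefined) {
      program = compile(); // the expensive step, paid once per operation
      this.compileCount++;
      this.programs.set(op, program);
    }
    return program;
  }
}

const cache = new LazyProgramCache();
// Pretend "compiler" for a ReLU operation (clamps negatives to zero).
const compileRelu = (): Program => (xs) => xs.map((x) => Math.max(0, x));

const first = cache.get("relu", compileRelu)([-1, 2]);  // compiles, then runs
const second = cache.get("relu", compileRelu)([3, -4]); // reuses the program
console.log(first, second, cache.compileCount); // compileCount is 1
```

The practical upshot, as far as I understand the docs, is that the first prediction tends to be slower than later ones, because the compile cost is paid up front and cached afterwards.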

Say what you will about JavaScript; it's powerful enough to run client-side ML! (Some people even claim that you can do all the training in JS, but I haven't gotten there yet.)

Advantages of client-side Machine Learning

Running ML models in a browser or even in a React Native mobile app can greatly simplify the application architecture. Instead of building a beefy server that can handle many prediction requests, you can simply cache the model in a Content Delivery Network and make the user's device do all the work for you!

Of course, one of the downsides of client-side ML is that your model is published for everyone to see, which is not good if you have some secret sauce. Also, extremely large models can eat up network bandwidth and the user's processing power, and not all client devices have GPU hardware.
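One way to cope with devices that lack a GPU is to try backends in order of preference and fall back to a slower one. The sketch below shows only the selection logic. `pickBackend` and the `isSupported` probe are hypothetical names of mine; in a real TensorFlow.js app you would instead call `tf.setBackend(name)`, which resolves to a boolean telling you whether the backend could be initialized. The backend names `webgl`, `wasm`, and `cpu` are real TensorFlow.js backends, but the simulated devices are made up.

```typescript
// Pick the first backend in `preferred` that the device supports.
// `isSupported` is a hypothetical probe supplied by the caller.
function pickBackend(preferred: string[], isSupported: (backend: string) => boolean): string {
  for (const backend of preferred) {
    if (isSupported(backend)) return backend;
  }
  throw new Error(`none of the backends are available: ${preferred.join(", ")}`);
}

// Simulated devices for illustration only.
const order = ["webgl", "wasm", "cpu"];
const withGpu = pickBackend(order, () => true);              // everything works
const withoutGpu = pickBackend(order, (b) => b !== "webgl"); // no usable GPU
console.log(withGpu, withoutGpu); // webgl wasm
```

Falling back keeps the page working everywhere, at the cost of slower predictions on the devices that need it.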