Advanced Media Workflows with Cloudinary Add-Ons

Why It Matters

Multiple image tagging add-ons (Amazon Rekognition, Google, Imagga) can be enabled simultaneously through the Cloudinary SDKs for more comprehensive tagging results.
Image tags from Google or Amazon can be automatically translated into multiple languages by combining their tagging add-ons with the Google Translation add-on.
Video transcriptions can be automatically translated into multiple languages by combining a video transcription add-on with the Google Translation add-on.

Several Cloudinary Add-ons can be combined to create powerful media workflows. In this post, we will look at a few of these add-ons and see how they can be used together in interesting ways. If you’re not familiar with Cloudinary Add-ons yet, they’re utilities that add enhanced capabilities to your Cloudinary-powered media. For example, some of the popular add-ons are AI-powered enhancements like image background removal and video transcription.

Getting Started With Add-ons

Before getting started, make sure to sign up for free if you don’t already have a Cloudinary account. The Add-ons page in your account dashboard shows which add-ons are available on your account. If the ones you need aren’t available, see the Cloudinary documentation to learn how to register for add-ons. We’ll start with a simple example using image tagging add-ons.

Enable Multiple Image Tagging Add-ons

Tagging assets has a broad set of practical uses. It’s great for digital asset management (DAM), categorization, moderation, and more. Cloudinary offers a few automatic image tagging add-ons: Amazon Rekognition Auto Tagging, Google Auto Tagging, and Imagga Auto Tagging. Each service provider analyzes and tags content using different AI models, so the tagging results vary subtly from one provider to the next. If we want the most comprehensive range of tags for our images, it makes sense to enable more than one.

Enable Multiple Tagging Add-ons With Cloudinary Node SDK

Using Cloudinary SDKs, you can provide a comma-separated list of tagging add-ons using the categorization option on the upload method.

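const cloudinary = require("cloudinary");

// Request tags from all three providers in a single upload call.
cloudinary.v2.uploader
.upload("ice_skating.jpg", 
  { categorization: "aws_rek_tagging,google_tagging,imagga_tagging" })
.then(result=>console.log(result));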

This will return tagging results from all three providers:

{
  "info": {
    "categorization": {
      "imagga_tagging": {
        "status": "complete",
        "data": [
          { "tag": "person", "confidence": 1.0 },
          ...
        ]
      },
      "google_tagging": {
        "status": "complete",
        "data": [
          { "tag": "skating", "confidence": 0.9689 },
          ...
        ]
      },
      "aws_rek_tagging": {
        "status": "complete",
        "data": [
          { "tag": "Human", "confidence": 0.9922 },
          ...
        ]
      }
    }
  }
}

Enabling all three tagging add-ons will give you a wider range of tags for the images that you upload.
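Once results come back from several providers, you will probably want a single tag list. The following is a minimal sketch of one way to do that, assuming the response shape shown above. The helper name mergeProviderTags, the minConfidence parameter, and the extra "Person" entry in the sample data are illustrative assumptions, not part of the Cloudinary SDK or of a real response.

```javascript
// Hypothetical helper (not part of the Cloudinary SDK): merges the tags
// found under result.info.categorization into one de-duplicated list.
// Tags are compared case-insensitively and keep their highest confidence.
function mergeProviderTags(result, minConfidence = 0) {
  const categorization = (result.info && result.info.categorization) || {};
  const merged = new Map(); // lowercased tag -> { tag, confidence, providers }

  for (const [provider, payload] of Object.entries(categorization)) {
    // Skip providers that have not finished or returned no data array.
    if (payload.status !== "complete" || !Array.isArray(payload.data)) continue;

    for (const { tag, confidence } of payload.data) {
      if (typeof tag !== "string" || confidence < minConfidence) continue;
      const key = tag.toLowerCase();
      const entry = merged.get(key) || { tag: key, confidence: 0, providers: [] };
      entry.confidence = Math.max(entry.confidence, confidence);
      if (!entry.providers.includes(provider)) entry.providers.push(provider);
      merged.set(key, entry);
    }
  }

  // Highest-confidence tags first.
  return [...merged.values()].sort((a, b) => b.confidence - a.confidence);
}

// Sample data modeled on the response above. The "Person" entry is an
// invented example, added only to demonstrate de-duplication.
const sampleTaggingResult = {
  info: {
    categorization: {
      imagga_tagging: { status: "complete", data: [{ tag: "person", confidence: 1.0 }] },
      google_tagging: { status: "complete", data: [{ tag: "skating", confidence: 0.9689 }] },
      aws_rek_tagging: {
        status: "complete",
        data: [
          { tag: "Human", confidence: 0.9922 },
          { tag: "Person", confidence: 0.98 },
        ],
      },
    },
  },
};

const mergedTags = mergeProviderTags(sampleTaggingResult, 0.9);
console.log(mergedTags);
```

Each merged entry also records which providers reported the tag, which can be a useful signal: a tag that several providers agree on is likely more reliable than one reported by a single model.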

Combine Tagging and Translation Add-ons

A good use case for combining add-ons is to have your image tags automatically translated into additional languages. Imagga Auto Tagging is the only tagging add-on with this feature built in. If you’re using Google or Amazon tagging, you’ll need to enable the Google Translation add-on to translate your tags. Let’s take a look at how we can combine Google Auto Tagging with Google Translation to get multi-language image tagging.

cloudinary.v2.uploader
.upload("windmill_day.jpg", 
  { categorization: "google_tagging:en:fr:es", 
    auto_tagging: 0.6 })
.then(result=>console.log(result));

With the Google Auto Tagging and Google Translation add-ons enabled, we just need to append the target languages to the categorization option on the upload call. Each language code is separated by a colon: google_tagging:en:fr:es. In this example, the tags are returned in English, French, and Spanish. The auto_tagging value of 0.6 additionally assigns to the asset any detected tag with a confidence score of at least 0.6. In the results, each tag is a map from language code to the translated tag:

  "info": {
    "categorization": {
      "google_tagging": {
        "status": "complete",
        "data": [
          { "tag": {
              "en": "windmill",
              "fr": "moulin à vent",
              "es": "molino" },
            "confidence": 0.9753 },

The Google Translation add-on also works with the Google Automatic Video Tagging Add-on if you’re working with video assets.
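The colon-separated categorization value and the per-language tag map described in this section can be handled with two small helpers. This is a sketch under the assumption that translated results have the shape shown above. Both function names (buildCategorization and tagsForLanguage) are hypothetical and not part of the Cloudinary SDK.

```javascript
// Hypothetical helper: builds a categorization value such as
// "google_tagging:en:fr:es" from an add-on name and language codes.
function buildCategorization(addOn, languages = []) {
  return [addOn, ...languages].join(":");
}

// Hypothetical helper: extracts the tags for one language from a
// translated tagging result. Entries without a translation for the
// requested language are dropped.
function tagsForLanguage(result, addOn, language) {
  const payload = result.info?.categorization?.[addOn];
  if (!payload || payload.status !== "complete" || !Array.isArray(payload.data)) {
    return [];
  }
  return payload.data
    .map(({ tag, confidence }) => ({
      // Translated tags are maps of language code -> translated tag.
      tag: tag && typeof tag === "object" ? tag[language] : undefined,
      confidence,
    }))
    .filter((entry) => typeof entry.tag === "string");
}

// Sample data copied from the response excerpt above.
const sampleTranslatedResult = {
  info: {
    categorization: {
      google_tagging: {
        status: "complete",
        data: [
          {
            tag: { en: "windmill", fr: "moulin à vent", es: "molino" },
            confidence: 0.9753,
          },
        ],
      },
    },
  },
};

const categorizationValue = buildCategorization("google_tagging", ["en", "fr", "es"]);
const frenchTags = tagsForLanguage(sampleTranslatedResult, "google_tagging", "fr");
console.log(categorizationValue, frenchTags);
```

The string returned by buildCategorization can be passed as the categorization option on the upload call, and tagsForLanguage can feed a localized search index or UI with only the language it needs.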

Combine Video Transcription and Translation Add-ons

Another great use case is translating video transcriptions. Video Transcription automatically creates transcriptions that can be used for displaying subtitles on your videos. When combined with the Google Translation Add-on, it can automatically detect the language used in the audio track and generate the transcript for the subtitles in the correct language. Configuring transcription translations on video upload is as easy as in the previous image tagging example. Here’s how we can do it using the Node.js SDK:

cloudinary.v2.uploader
.upload("my-video.mp4",
  { resource_type: "video",
    auto_transcription: {
      "translate": ["fr-FR", "es-ES", "de-DE"]
    }
  })
.then(result=>console.log(result));

We’ll use the upload method again to upload a video. Pass an object to auto_transcription to configure the translations we want to include. In this example, the transcript will be translated into French, Spanish, and German. Unlike with our image tagging translations, the video transcription is not returned immediately in the result. Instead, the response reports a pending status:

"info": {   
    "auto_transcription": {
        "status": "pending"
    }
 }

The transcription and translations will be completed in the background and associated with the video when the transcription file is ready. After the process is finished, you can play the video in the Cloudinary Video Player and it will provide subtitles for languages that you configured for translation. You can also combine the Google AI Video Transcription Add-on with Google Translation to achieve a similar result.
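Because the transcription finishes in the background, application code usually needs to build the upload options and then check the reported status. The sketch below assumes the option and response shapes shown in this section. The helper names buildTranscriptionUploadOptions and transcriptionStatus are hypothetical, not SDK functions.

```javascript
// Hypothetical helper: builds the upload options for a video whose
// transcript should be translated into the given languages.
function buildTranscriptionUploadOptions(languages) {
  return {
    resource_type: "video",
    auto_transcription: { translate: [...languages] },
  };
}

// Hypothetical helper: reads the transcription status from a response,
// returning "unknown" when the response has no auto_transcription info.
function transcriptionStatus(result) {
  return result?.info?.auto_transcription?.status ?? "unknown";
}

const transcriptionOptions = buildTranscriptionUploadOptions(["fr-FR", "es-ES", "de-DE"]);

// Sample response copied from the excerpt above.
const pendingUploadResult = { info: { auto_transcription: { status: "pending" } } };

console.log(transcriptionOptions);
console.log(transcriptionStatus(pendingUploadResult));
```

The options object can be passed as the second argument to the upload method, in place of the inline object used in the earlier example.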

Summary

By enabling multiple Cloudinary Add-ons together, you can create powerful workflows, such as comprehensive tagging with multiple providers, translating image tags into different languages, or generating multilingual subtitles for videos. Contact us today to learn more about how Cloudinary can help simplify your content development workflows.
