Perform text translation using Vertex AI, Gemini, and NodeJS

Reading Time: 5 minutes

Introduction

Internationalization (i18n) is an important aspect of commercial websites because owners want to sell their products to customers worldwide. Even though English is one of the most popular languages in the world, not everyone reads and writes it fluently. Therefore, websites normally provide additional languages, such as Spanish and Chinese, so that visitors can change the language of the web pages. Before generative AI became popular, companies hired agencies to perform text translations. Nowadays, software engineers can write code that uses Vertex AI and the Gemini 1.5 Pro model to translate texts and persist the results in a database. Client applications can then retrieve these translations and update the texts based on the selected language.

In this blog post, I will describe how I use NodeJS, Vertex AI, and the Gemini model to translate English phrases into Spanish.

Let's go!

Create a Google Cloud Project

Navigate to the Google Cloud Console, https://console.cloud.google.com/, to create a Google Cloud project. Then, add a billing account to the project because the Gemini 1.5 Pro model consumes tokens to translate texts. Fortunately, tokens are cheap, and the cost of the text translations in this demo is low.

Create a new NodeJS Project

mkdir nodejs-gemini-translation
cd ./nodejs-gemini-translation
touch model.ts
touch index.ts

I created a folder and added two new TypeScript files. model.ts defines the generative AI model, and index.ts defines the main program that uses the LLM to perform the text translations.

Install dependencies

npm i --save-exact @google-cloud/vertexai
npm i --save-exact --save-dev @types/node ts-node

Add Script in package.json

"scripts": {
    "start": "TARGET=es node -r ts-node/register --env-file=.env index.ts"
  }

This program runs on Node 20; therefore, I can use the built-in --env-file flag to load the environment variables from the .env file.
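If you are stuck on a Node version older than 20 that lacks the --env-file flag, you can parse the .env file yourself (or use the dotenv package). The parseEnv helper below is a hypothetical sketch of my own, not part of the project:

```typescript
// Hypothetical fallback for Node versions without --env-file.
// Parses simple KEY=VALUE lines; comments and blank lines are skipped.
function parseEnv(content: string): Record<string, string> {
    const vars: Record<string, string> = {};
    for (const rawLine of content.split('\n')) {
        const line = rawLine.trim();
        // Skip blank lines and comments.
        if (!line || line.startsWith('#')) {
            continue;
        }
        const separatorIndex = line.indexOf('=');
        if (separatorIndex < 0) {
            continue;
        }
        const key = line.slice(0, separatorIndex).trim();
        const value = line.slice(separatorIndex + 1).trim();
        vars[key] = value;
    }
    return vars;
}
```

In index.ts you could then call Object.assign(process.env, parseEnv(readFileSync('.env', 'utf8'))) before importing the model.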

Define Google Cloud variables

// .env.example

GOOGLE_PROJECT_ID=<google project id>
GOOGLE_LOCATION=asia-east2
GOOGLE_MODEL=gemini-1.5-pro-001

Copy .env.example to .env in the folder. Replace GOOGLE_PROJECT_ID, GOOGLE_LOCATION and GOOGLE_MODEL with the project, location, and model, respectively.

  • GOOGLE_PROJECT_ID – Google Cloud Project Id
  • GOOGLE_LOCATION – Location of the Google Cloud resources. The default value is asia-east2, which is Hong Kong.
  • GOOGLE_MODEL – Large language model; the default value is gemini-1.5-pro-001.

Add .env to the .gitignore file to prevent accidentally committing the project ID to the GitHub repo.

// .gitignore

node_modules
.env

Set up Application Default Credentials

gcloud auth application-default login 

Set up Application Default Credentials (ADC) for the Vertex AI SDK to use in the local development environment.

gcloud auth application-default revoke 

When you no longer need the Application Default Credentials, execute the above command in a terminal to revoke them.

Next, I call the Vertex AI SDK to create a Gemini model to generate translations between two languages.

Create Gemini 1.5 Pro Large Language Model

// model.ts

import { HarmBlockThreshold, HarmCategory, VertexAI } from '@google-cloud/vertexai';

const project = process.env.GOOGLE_PROJECT_ID || '';
const location = process.env.GOOGLE_LOCATION || 'asia-east2';
const model = process.env.GOOGLE_MODEL || 'gemini-1.5-pro-001';

const vertexAi = new VertexAI({ project, location });
export const generativeModel = vertexAi.getGenerativeModel({
    model,
    generationConfig: {
        candidateCount: 1,
        maxOutputTokens: 1024,
        temperature: 0,
        topP: 0.5,
        topK: 10,
    },
    safetySettings: [
        {
            category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
        {
            category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
        {
            category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
        {
            category: HarmCategory.HARM_CATEGORY_HARASSMENT,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        }
    ],
});

The Vertex AI client requires a valid project, Google Cloud location, and model name. Fortunately, the .env environment file provides these values.

generativeModel is the Gemini 1.5 Pro model, configured with candidateCount, maxOutputTokens, temperature, topP, topK, and safetySettings.

Invoke model to perform text translations

// index.ts

import { generativeModel } from './model';

const source = 'en';
const target = process.env.TARGET || 'ja';

async function main() {
    const samples = [
        'Good morning',
        'Hello, how are you today?',
        'How much does 3 apples, 5 pineapples and 4 oranges cost?',
        'My favorite hobby is riding bicycle.',
    ]

    const arrTranslations: Record<string, string>[] = [];

    for (let str of samples) {
        const generatedContents = await generativeModel.generateContent({
            systemInstruction: `You are a translation expert that can translate between source and target languages.
                If you don't know the translation, then return "UNKNOWN TRANSLATION".
                For numbers, please use words to represent instead of the arabic values.
                The response is target_language_code|translation.
            `,
            contents: [
                {
                    role: 'user',
                    parts: [{ text: `${source}:${str} ${target}:` }],
                }
            ],
        });

        const candidates = generatedContents.response.candidates || [];
        for (const candidate of candidates) {
            const translations: Record<string, string> = {
                [source]: str
            };

            (candidate.content.parts || []).forEach((part) => {
                console.log('part', part);
                const line = (part.text || '').trim();
                const [targetLanguage, text = ''] = line.split('|');    
                translations[targetLanguage] = text;
            });

            arrTranslations.push(translations);
        }
    }

    console.log(arrTranslations);
}

main();

The index.ts file imports generativeModel to generate content, which contains the Spanish translations. The generateContent method accepts a system instruction that describes the role of the model.

systemInstruction: `You are a translation expert that can translate between source and target languages. 
If you don't know the translation, then return "UNKNOWN TRANSLATION". 
For numbers, please use words to represent instead of the arabic values.
The response is target_language_code|translation.`

The model is a translation expert who translates texts. If the expert does not have the answer, it outputs “UNKNOWN TRANSLATION”. For fun, I force the model to output words instead of Arabic numerals; when the text is “3 apples”, the output is “tres manzanas”. I also changed the separator from “:” to “|” by including “The response is target_language_code|translation.” in the instruction.

contents: [
      {
              role: 'user',
              parts: [{ text: `${source}:${str} ${target}:` }],
      }
],

The role is user, and the part consists of the source language code, source text, and the target language code. For example, parts: [{ text: 'en:Good Morning es:' }].
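The prompt string can be factored into a small helper to make the format explicit. This buildPrompt function is an illustrative sketch of my own, not part of the repo:

```typescript
// Hypothetical helper that builds the text sent in the user part.
// The model sees the source language code, the source text, and the
// target language code, and is expected to complete the translation.
function buildPrompt(source: string, target: string, text: string): string {
    return `${source}:${text} ${target}:`;
}

console.log(buildPrompt('en', 'es', 'Good morning'));
// en:Good morning es:
```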

const candidates = generatedContents.response.candidates || [];
 for (const candidate of candidates) {
        const translations: Record<string, string> = {
                [source]: str
         };

         (candidate.content.parts || []).forEach((part) => {
                console.log('part', part);
                const line = (part.text || '').trim();
                const [targetLanguage, text = ''] = line.split('|');    
                translations[targetLanguage] = text;
         });

        arrTranslations.push(translations);
 }

The candidates store the translations in parts.

part { text: 'es|Buenos días \n' }
part { text: 'es|Hola, ¿cómo estás hoy? \n' }
part {
  text: 'es|¿Cuánto cuestan tres manzanas, cinco piñas y cuatro naranjas?'
}
part { text: 'es|Mi pasatiempo favorito es andar en bicicleta. \n' }

The function splits each part by ‘|’ to obtain the target language code and the target text, which become the key and the value of the translations map. Finally, the translations map is appended to the arrTranslations array. The loop repeats the same steps for all the source texts and then terminates.
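The parsing step can be isolated into a pure function that is easy to unit test. This parsePart helper is a sketch of my own, assuming the model honors the target_language_code|translation format from the system instruction:

```typescript
// Hypothetical helper that parses one model part such as
// 'es|Buenos días \n' into a language code and the translated text.
// Falls back to an empty text when the separator is missing.
function parsePart(rawText: string): { language: string; text: string } {
    const line = rawText.trim();
    const separatorIndex = line.indexOf('|');
    if (separatorIndex < 0) {
        return { language: line, text: '' };
    }
    return {
        language: line.slice(0, separatorIndex),
        text: line.slice(separatorIndex + 1).trim(),
    };
}
```

Splitting on the first ‘|’ only, instead of split('|'), keeps any ‘|’ characters inside the translated text intact.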

Test the translations

Run the start script in package.json in the terminal to generate the Spanish translations.

npm start

The terminal should output an array of JSON objects with the source and the target texts.

[
    {
        "en": "Good morning",
        "es": "Buenos días"
    },
    {
        "en": "Hello, how are you today?",
        "es": "Hola, ¿cómo estás hoy?"
    },
    {
        "en": "How much does 3 apples, 5 pineapples and 4 oranges cost?",
        "es": "¿Cuánto cuestan tres manzanas, cinco piñas y cuatro naranjas?"
    },
    {
        "en": "My favorite hobby is riding bicycle.",
        "es": "Mi pasatiempo favorito es andar en bicicleta."
    }
]

This concludes my blog post about using Vertex AI, Gemini, and NodeJS to perform text translations. I only scratched the surface of Vertex AI because it offers many large language models and services to build interesting projects and solve real-world problems in different domains. I hope you like the content and continue to follow my learning experience in Angular, NestJS, and other technologies.

Resources:

  1. Github Repo: https://github.com/railsstudent/nodejs-gemini-translation
  2. Vertex AI Documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/translate/translate-text#tt-api
  3. Vertex AI Generative AI Quotas – https://cloud.google.com/vertex-ai/generative-ai/docs/quotas