Create a generative AI application with Angular and Gemini REST API

Introduction

In this blog post, I show how to create a generative AI application that uses Angular and the Gemini REST API. Because the browser calls the REST API directly, it is feasible to build a basic generative AI application without a backend.

The application demonstrates 2 use cases:

  • Generate text from a text input
  • Generate text from multimodal input (text + image)

Let's go.

Generate Gemini API Key

Go to https://aistudio.google.com/app/apikey to generate an API key for a new or an existing Google Cloud project.

Install dependency

npm i --save-exact ngx-markdown

Generate configuration file

It is a bad practice to include the Gemini API key in the source code and commit it to a GitHub repo. Anyone who finds the key can use it to make requests to the Gemini REST API, and the Google Cloud bill can become very expensive.

My solution is to write a shell script that generates a configuration file containing the API key.

// generate-config-file.sh

if [ $# -lt 1 ]; then
    echo "Usage: $0 <api key>"
    exit 1
fi

apiConfig='{ 
    "apiKey": "'"$1"'"
}'

outputFile='src/assets/config.json'
# Quote the variables so the JSON keeps its line breaks and the path is not word-split.
echo "$apiConfig" > "$outputFile"

This script accepts an API key and writes a JSON object to src/assets/config.json.

./generate-config-file.sh some-api-key

The command creates the following JSON object in the configuration file:

{
    "apiKey": "some-api-key"
}

Add src/assets/config.json to the .gitignore file to ensure we don’t accidentally commit the Gemini API Key to the GitHub repo.

// .gitignore

src/assets/config.json

Then, the Angular application can import the JSON file and provide the API key through a custom EnvironmentProviders function in the next step.
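Importing a JSON file in TypeScript generally needs the resolveJsonModule compiler option, and nothing checks that the file actually holds a usable key. As a minimal sketch, a small guard could validate the imported object before the application uses it. The readApiKey helper and the ApiConfig interface are hypothetical names and are not part of the demo repo.

```typescript
// Hypothetical helper: validates the object imported from src/assets/config.json.
interface ApiConfig {
  apiKey: string;
}

function readApiKey(config: unknown): string {
  // Treat the input as possibly malformed and read the key defensively.
  const apiKey = (config as Partial<ApiConfig> | null)?.apiKey;
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    throw new Error('config.json must contain a non-empty "apiKey" string');
  }
  return apiKey.trim();
}

// Usage: the JSON text matches what generate-config-file.sh writes.
const parsed: unknown = JSON.parse('{ "apiKey": "some-api-key" }');
console.log(readApiKey(parsed)); // some-api-key
```

Failing fast here gives a clearer error than a rejected HTTP request later on.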

Define custom Gemini Provider

// harm-category.enum.ts

export enum HARM_CATEGORY {
    HARM_CATEGORY_UNSPECIFIED="HARM_CATEGORY_UNSPECIFIED",
    HARM_CATEGORY_DEROGATORY="HARM_CATEGORY_DEROGATORY",
    HARM_CATEGORY_TOXICITY="HARM_CATEGORY_TOXICITY",
    HARM_CATEGORY_VIOLENCE="HARM_CATEGORY_VIOLENCE",
    HARM_CATEGORY_SEXUAL="HARM_CATEGORY_SEXUAL",
    HARM_CATEGORY_MEDICAL="HARM_CATEGORY_MEDICAL",
    HARM_CATEGORY_DANGEROUS="HARM_CATEGORY_DANGEROUS",
    HARM_CATEGORY_HARASSMENT="HARM_CATEGORY_HARASSMENT",
    HARM_CATEGORY_HATE_SPEECH="HARM_CATEGORY_HATE_SPEECH",
    HARM_CATEGORY_SEXUALLY_EXPLICIT="HARM_CATEGORY_SEXUALLY_EXPLICIT",
    HARM_CATEGORY_DANGEROUS_CONTENT="HARM_CATEGORY_DANGEROUS_CONTENT"
}

// threshold.enum.ts

export enum THRESHOLD {
    HARM_BLOCK_THRESHOLD_UNSPECIFIED = "HARM_BLOCK_THRESHOLD_UNSPECIFIED",
    BLOCK_LOW_AND_ABOVE = "BLOCK_LOW_AND_ABOVE",
    BLOCK_MEDIUM_AND_ABOVE = "BLOCK_MEDIUM_AND_ABOVE",
    BLOCK_ONLY_HIGH = "BLOCK_ONLY_HIGH",
    BLOCK_NONE = "BLOCK_NONE"
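}

// gemini.interface.ts

import { HARM_CATEGORY } from '../enums/harm-category.enum';
import { THRESHOLD } from '../enums/threshold.enum';

export interface GeminiConfig {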
    maxOutputTokens: number,
    temperature: number,
    topP: number,
    topK: number
};

export interface GeminiSafetySetting {
    category: HARM_CATEGORY,
    threshold: THRESHOLD
}

// gemini.constant.ts

import { InjectionToken } from '@angular/core';
import { GeminiConfig, GeminiSafetySetting } from './interfaces/genmini.interface';

export const GEMINI_API_KEY = new InjectionToken<string>('API_KEY');
export const GEMINI_PRO_URL = new InjectionToken<string>('GEMINI_PRO_URL');
export const GEMINI_PRO_VISION_URL = new InjectionToken<string>('GEMINI_PRO_VISION_URL');

export const GEMINI_GENERATION_CONFIG = new InjectionToken<GeminiConfig>('GEMINI_GENERATION_CONFIG');
export const GEMINI_SAFETY_SETTINGS = new InjectionToken<GeminiSafetySetting[]>('GEMINI_SAFETY_SETTINGS');

// gemini.provider.ts

import { EnvironmentProviders, inject, makeEnvironmentProviders } from '@angular/core';
import config from '../../assets/config.json';
import { CORE_GUARD } from '../core/core.constant';
import { HARM_CATEGORY } from './enums/harm-category.enum';
import { THRESHOLD } from './enums/threshold.enum';
import { GEMINI_API_KEY, GEMINI_GENERATION_CONFIG, GEMINI_PRO_URL, GEMINI_PRO_VISION_URL, GEMINI_SAFETY_SETTINGS } from './gemini.constant';

export function provideGeminiApi(): EnvironmentProviders {
    const genAIBase = 'https://generativelanguage.googleapis.com/v1beta/models';

    return makeEnvironmentProviders([
        {
            provide: GEMINI_API_KEY,
            useValue: config.apiKey,
        },
        {
            provide: GEMINI_GENERATION_CONFIG,
            useValue: {
                "maxOutputTokens": 1024,
                "temperature": 0.2,
                "topP": 0.5,
                "topK": 3
            },
        },
        {
            provide: GEMINI_SAFETY_SETTINGS,
            useValue: [
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_HATE_SPEECH,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                },
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_DANGEROUS_CONTENT,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                },
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                },
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_HARASSMENT,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                }
            ],
        },
        {
            provide: GEMINI_PRO_URL,
            useFactory: () => {
                const coreGuard = inject(CORE_GUARD, { self: true, optional: true });
                if (coreGuard) {
                    throw new TypeError('provideGeminiApi cannot load more than once');
                }

                const apiKey = inject(GEMINI_API_KEY);
                return `${genAIBase}/gemini-pro:generateContent?key=${apiKey}`;
            }
        },
        {
            provide: GEMINI_PRO_VISION_URL,
            useFactory: () => {
                const apiKey = inject(GEMINI_API_KEY);
                return `${genAIBase}/gemini-pro-vision:generateContent?key=${apiKey}`;
            }
        },
    ]);
}

In gemini.provider.ts, I provide the API key, Gemini generation configuration, Gemini safety settings, Gemini Pro URL, and Gemini Pro Vision URL through injection tokens.

Gemini Pro URL – the Gemini endpoint that generates a text response from text input.

Gemini Pro Vision URL – the Gemini endpoint that generates a text response from multimodal input. Here, multimodal input means text plus an image supplied by the user.
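Both URLs follow the same pattern, so the string building in the two factories can be sketched as one pure function. The base URL and model names come from provideGeminiApi above. The buildGenerateContentUrl helper is a hypothetical name, and encoding the key is an extra precaution that the original factories do not take.

```typescript
// Base URL copied from provideGeminiApi; the helper name is hypothetical.
const GEN_AI_BASE = 'https://generativelanguage.googleapis.com/v1beta/models';

type GeminiModel = 'gemini-pro' | 'gemini-pro-vision';

function buildGenerateContentUrl(model: GeminiModel, apiKey: string): string {
  // encodeURIComponent guards against characters that are unsafe in a query string.
  return `${GEN_AI_BASE}/${model}:generateContent?key=${encodeURIComponent(apiKey)}`;
}

console.log(buildGenerateContentUrl('gemini-pro', 'some-api-key'));
// https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=some-api-key
```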

Bootstrap Application

Next, I register the provideGeminiApi provider in appConfig, which main.ts passes to bootstrapApplication.

// app.config.ts

import { provideHttpClient } from '@angular/common/http';
import { provideRouter, withComponentInputBinding } from '@angular/router';
import { provideMarkdown } from 'ngx-markdown';
import { routes } from './app.routes';
import { provideGeminiApi } from './gemini/gemini.provider';

export const appConfig = {
    providers: [
      // ... other providers ...
      provideGeminiApi(),
    ]
};

// main.ts

import { bootstrapApplication } from '@angular/platform-browser';
import { appConfig } from '~app/app.config';
import { AppComponent } from './app/app.component';

bootstrapApplication(AppComponent, appConfig)
  .catch(err => console.error(err));

I have bootstrapped the Angular application, and the next step is to create a Gemini service that receives prompts and generates text.

Create GeminiService

// generate-text.operator.ts

import { Observable, catchError, map, of, retry, tap } from 'rxjs';

export function generateText(numRetries: number) {
    return function(source: Observable<any>) {
        return source.pipe(
            // Resubscribe up to numRetries times when the request fails.
            retry(numRetries),
            tap((response) => console.log(response)),
            // Fall back to 'No response' when the reply carries no text.
            map((response) => response.candidates?.[0]?.content?.parts?.[0]?.text || 'No response'),
            catchError((err) => {
                console.error(err);
                return of('Error occurs');
            })
        );
    };
}

// gemini.service.ts

import { HttpClient } from '@angular/common/http';
import { Injectable, inject } from '@angular/core';
import { Observable } from 'rxjs';
import { GEMINI_GENERATION_CONFIG, GEMINI_PRO_URL, GEMINI_PRO_VISION_URL, GEMINI_SAFETY_SETTINGS } from '../gemini.constant';
import { GeminiResponse } from '../interfaces/generate-response.interface';
import { MultimodalInquiry } from '../interfaces/genmini.interface';
// Assumed location of the custom operator above; adjust the path to match your project.
import { generateText } from '../operators/generate-text.operator';

@Injectable({
  providedIn: 'root'
})
export class GeminiService {
  private readonly geminiProUrl = inject(GEMINI_PRO_URL);
  private readonly geminiProVisionUrl = inject(GEMINI_PRO_VISION_URL);
  private readonly generationConfig = inject(GEMINI_GENERATION_CONFIG);
  private readonly safetySetting = inject(GEMINI_SAFETY_SETTINGS);
  private httpClient = inject(HttpClient);

  generateText(prompt: string): Observable<string> {
    return this.httpClient.post<GeminiResponse>(this.geminiProUrl, {
      "contents": [
        {
            "role": "user",
            "parts": [
              {
                "text": prompt
              }
          ]
        }
      ],
      "generation_config": this.generationConfig,
      "safetySettings": this.safetySetting
    }, {
      headers: {
        "Content-Type": "application/json"
      }
    })
    .pipe(generateText(3));
  }
  
  generateTextFromMultimodal({ prompt, mimeType, base64Data }: MultimodalInquiry): Observable<string> {
    return this.httpClient.post<GeminiResponse>(this.geminiProVisionUrl, {
      "contents": [
        {
            "role": "user",
            "parts": [
              {
                "text": prompt
              },
              {
                "inline_data": {
                  "mime_type": mimeType,
                  "data": base64Data
                }
              }
          ]
        }
      ],
      "generation_config": this.generationConfig,
      "safetySettings": this.safetySetting
    }, {
      headers: {
        "Content-Type": "application/json"
      }
    })
    .pipe(generateText(3));
  }
}

generateText – this method receives a prompt and generates the text. A failed HTTP request is retried up to 3 times; if it still fails, the method returns “Error occurs”. When the response contains no text, it returns “No response”.
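The “No response” fallback can be illustrated without RxJS. This is a rough sketch of the mapping step only. The GeminiResponseSketch interface is an assumed, minimal shape, because the post imports GeminiResponse without showing it, and extractText is a hypothetical helper.

```typescript
// Minimal, assumed response shape; only the fields read by the mapping step are listed.
interface GeminiResponseSketch {
  candidates?: {
    content?: {
      parts?: { text?: string }[];
    };
  }[];
}

function extractText(response: GeminiResponseSketch): string {
  // Optional chaining at every level, so a missing candidate or part yields the fallback.
  return response.candidates?.[0]?.content?.parts?.[0]?.text || 'No response';
}

console.log(extractText({ candidates: [{ content: { parts: [{ text: 'Hello' }] } }] })); // Hello
console.log(extractText({ candidates: [] })); // No response
```

Keeping the extraction in a pure function makes the fallback easy to unit test without mocking HttpClient.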

generateTextFromMultimodal – this method receives a prompt, the MIME type, and the inline Base64 data of the image. The retry and fallback behavior is the same as in generateText.
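To make the multimodal request body easier to see, here is a small sketch that builds only the contents array from the three inputs. The field names mirror the request in GeminiService. The buildMultimodalContents helper and the MultimodalInquirySketch interface are hypothetical, and the interface assumes all three fields are strings.

```typescript
// Assumed shape of the inquiry, based on the fields destructured in GeminiService.
interface MultimodalInquirySketch {
  prompt: string;
  mimeType: string;
  base64Data: string;
}

function buildMultimodalContents({ prompt, mimeType, base64Data }: MultimodalInquirySketch) {
  return [
    {
      role: 'user',
      parts: [
        // The first part carries the text prompt.
        { text: prompt },
        // The second part carries the image as inline Base64 data.
        { inline_data: { mime_type: mimeType, data: base64Data } },
      ],
    },
  ];
}

const contents = buildMultimodalContents({
  prompt: 'Describe this image',
  mimeType: 'image/png',
  base64Data: 'iVBORw0KGgo=',
});
console.log(JSON.stringify(contents, null, 2));
```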

Build shared components for the user interfaces

Create an Angular component where the user inputs a prompt and submits the request by clicking the “Ask me anything” button.

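// prompt-box.component.ts

import { ChangeDetectionStrategy, Component, computed, input, model, output } from '@angular/core';
import { FormsModule } from '@angular/forms';

@Component({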
  selector: 'app-prompt-box',
  standalone: true,
  imports: [FormsModule],
  template: `
    <div>
      <textarea rows="3" [(ngModel)]="prompt"></textarea>
      <button (click)="askMe.emit()" [disabled]="vm.isLoading">{{ vm.buttonText }}</button>
    </div>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class PromptBoxComponent {
  prompt = model.required<string>();
  loading = input.required<boolean>();

  viewModel = computed(() => ({
    isLoading: this.loading(),
    buttonText: this.loading() ? 'Processing' : 'Ask me anything',
  }));

  get vm() {
    return this.viewModel();
  }

  askMe = output();
}

An Image Preview component allows users to select an image (jpg, jpeg, png) from a file dialog and preview it.

// gemini.interface.ts

export interface ImageInfo {
    base64DataURL: string;
    base64Data: string;
    mimeType: string;
    filename: string;
}

// image-preview.component.ts

import { ChangeDetectionStrategy, Component, model } from '@angular/core';

@Component({
  selector: 'app-image-preview',
  standalone: true,
  template: `
    <div>
      <label for="fileInput">Select an image:</label>
      <input id="fileInput" name="fileInput" (change)="fileChange($event)"
        alt="image input" type="file" accept=".jpg,.jpeg,.png" />
    </div>
    @if(imageInfo(); as imageInfo) {
      <img [src]="imageInfo.base64DataURL" [alt]="imageInfo.filename" width="250" height="250" />
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class ImagePreviewComponent {
  imageInfo = model.required<ImageInfo | null>();
  
  fileChange(event: any) {
    const imageFile: File | undefined = event.target.files?.[0];
    if (!imageFile) {
      return;
    }

    const reader = new FileReader();
    reader.readAsDataURL(imageFile);
    reader.onloadend = () => {
      const fileResult = reader.result;
      if (fileResult && typeof fileResult === 'string') {
        const data = fileResult.substring(`data:${imageFile.type};base64,`.length);
        this.imageInfo.set({
          base64DataURL: fileResult,
          base64Data: data,
          mimeType: imageFile.type,
          filename: imageFile.name
        });
      }
    }
  }
}

When users select a file, fileChange creates a FileReader and reads the image file as a data URL. When the read completes, I store the data URL, the Base64 image data, the MIME type, and the file name in the imageInfo model input.
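The substring call above assumes that the data URL prefix matches imageFile.type exactly. As an alternative sketch, the prefix can be located by searching for the ";base64," marker, so the split does not depend on the file's reported type. The parseDataUrl helper is hypothetical, and it assumes a simple data URL of the form data:<mime type>;base64,<data> with no extra parameters.

```typescript
// Hypothetical helper: splits a Base64 data URL into its MIME type and payload.
interface ParsedDataUrl {
  mimeType: string;
  base64Data: string;
}

function parseDataUrl(dataUrl: string): ParsedDataUrl | null {
  const scheme = 'data:';
  const marker = ';base64,';
  const markerIndex = dataUrl.indexOf(marker);
  if (!dataUrl.startsWith(scheme) || markerIndex === -1) {
    // Not a Base64 data URL, so there is nothing to split.
    return null;
  }
  return {
    mimeType: dataUrl.slice(scheme.length, markerIndex),
    base64Data: dataUrl.slice(markerIndex + marker.length),
  };
}

console.log(parseDataUrl('data:image/png;base64,iVBORw0KGgo='));
```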

// chat-history.component.ts

import { ChangeDetectionStrategy, Component, input } from '@angular/core';
import { MarkdownComponent } from 'ngx-markdown';
import { HistoryItem } from '../interfaces/history-item.interface';
import { LineBreakPipe } from '../pipes/line-break.pipe';

@Component({
  selector: 'app-chat-history',
  standalone: true,
  imports: [MarkdownComponent, LineBreakPipe],
  template: `
    <h3>Chat History</h3>
    @if (chatHistory().length > 0) {
      <div class="scrollable-list">
        <ol>
          @for (history of chatHistory(); track history) {
            <li>
              <p>{{ history.prompt }}</p>
              <markdown [data]="lineBreakPipe.transform(history.response)" />
            </li>
          }
        </ol>
      </div>
    } @else {
      <p>No history</p>
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class ChatHistoryComponent {
  chatHistory = input.required<HistoryItem[]>();
  lineBreakPipe = new LineBreakPipe();
}

ChatHistoryComponent lists all the prompts and the generated texts from the earliest to the latest.
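The components later in this post build this list with the RxJS scan operator, which appends each new item to the previous array. The accumulator can be sketched as a plain function. The HistoryItemSketch interface assumes that a history item holds only a prompt and a response, both strings, and appendHistory is a hypothetical name.

```typescript
// Assumed shape of a history item: the prompt and the generated text.
interface HistoryItemSketch {
  prompt: string;
  response: string;
}

function appendHistory(history: HistoryItemSketch[], item: HistoryItemSketch): HistoryItemSketch[] {
  // concat returns a new array, so the previous history is left untouched
  // and the input signal receives a new reference on every update.
  return history.concat(item);
}

const first = appendHistory([], { prompt: 'What is Angular?', response: 'A web framework.' });
const second = appendHistory(first, { prompt: 'What is RxJS?', response: 'A reactive library.' });
console.log(first.length);  // 1
console.log(second.length); // 2
```

Because items are appended, the earliest prompt stays at the top of the ordered list.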

Building the user interfaces for the application

GenerateTextComponent is a component that displays a prompt box for a user to input a prompt and generate the text.

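// generate-text.component.ts

import { AsyncPipe } from '@angular/common';
import { ChangeDetectionStrategy, Component, OnInit, inject, signal, viewChild } from '@angular/core';
import { outputToObservable } from '@angular/core/rxjs-interop';
import { FormsModule } from '@angular/forms';
import { Observable, filter, finalize, scan, startWith, switchMap, tap } from 'rxjs';

@Component({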
  selector: 'app-generate-text',
  standalone: true,
  imports: [FormsModule, ChatHistoryComponent, PromptBoxComponent, AsyncPipe],
  template: `
    <h3>Input a prompt to receive an answer from the Google Gemini AI</h3>
    <app-prompt-box [loading]="loading()" [(prompt)]="prompt" />
    @if (chatHistory$ | async; as chatHistory) {
      <app-chat-history [chatHistory]="chatHistory" />
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class GenerateTextComponent implements OnInit {
  promptBox = viewChild.required(PromptBoxComponent);

  geminiService = inject(GeminiService);
  prompt = signal('');
  loading = signal(false);

  chatHistory$!: Observable<HistoryItem[]>;

  ngOnInit(): void {
    this.chatHistory$ = outputToObservable(this.promptBox().askMe)
      .pipe(
        filter(() => this.prompt() !== ''),
        tap(() => this.loading.set(true)),
        switchMap(() => 
          this.geminiService.generateText(this.prompt()).pipe(finalize(() => this.loading.set(false)))
        ),
        scan((acc, response) => acc.concat({ prompt: this.prompt(), response }), [] as HistoryItem[]),
        startWith([] as HistoryItem[])
      );
  }
}

GenerateTextMultimodalComponent is a component that displays an image selector and a prompt box. A user selects an image, inputs a prompt, and clicks the button to generate text.

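// generate-text-multimodal.component.ts

import { AsyncPipe } from '@angular/common';
import { ChangeDetectionStrategy, Component, OnInit, computed, inject, signal, viewChild } from '@angular/core';
import { outputToObservable } from '@angular/core/rxjs-interop';
import { FormsModule } from '@angular/forms';
import { Observable, filter, finalize, scan, startWith, switchMap, tap } from 'rxjs';

@Component({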
  selector: 'app-generate-text-multimodal',
  standalone: true,
  imports: [
    FormsModule, 
    ChatHistoryComponent, 
    ImagePreviewComponent, 
    PromptBoxComponent, 
    AsyncPipe
  ],
  template: `
    <h3>Input a prompt and select an image to receive an answer from the Google Gemini AI</h3>
    <div class="container">
      <app-image-preview class="image-preview" [(imageInfo)]="imageInfo" />
      <app-prompt-box [loading]="loading()" [(prompt)]="prompt" />
    </div>
    @if (chatHistory$ | async; as chatHistory) {
      <app-chat-history [chatHistory]="chatHistory" />
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class GenerateTextMultimodalComponent implements OnInit {
  promptBox = viewChild.required(PromptBoxComponent);
  
  geminiService = inject(GeminiService);
  prompt = signal('');
  loading = signal(false);
  imageInfo = signal<ImageInfo | null>(null);

  viewModel = computed(() => ({
    isLoading: this.loading(),
    base64Data: this.imageInfo()?.base64Data,
    mimeType: this.imageInfo()?.mimeType,
    prompt: this.prompt(),   
  }));

  chatHistory$!: Observable<HistoryItem[]>;

  get vm() {
    return this.viewModel();
  }

  ngOnInit(): void {
    this.chatHistory$ = outputToObservable(this.promptBox().askMe)
      .pipe(
        filter(() => this.vm.prompt !== '' && !!this.vm.base64Data),
        tap(() => this.loading.set(true)),
        switchMap(() => {
          const { isLoading, ...rest } = this.vm;
          return this.geminiService.generateTextFromMultimodal(rest as MultimodalInquiry)
            .pipe(finalize(() => this.loading.set(false)))
        }),
        scan((acc, response) => acc.concat({ prompt: this.prompt(), response }), [] as HistoryItem[]),
        startWith([] as HistoryItem[])
      );
  }
}
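The filter and the "rest as MultimodalInquiry" cast in the component above can also be expressed as one function that either returns a complete inquiry or null. This is only a sketch. The toInquiry helper, the Draft type, and the MultimodalInquirySketch interface are hypothetical, and trimming the prompt and checking mimeType are additions that the original filter does not make.

```typescript
// Assumed shape of the inquiry, based on the fields destructured in GeminiService.
interface MultimodalInquirySketch {
  prompt: string;
  mimeType: string;
  base64Data: string;
}

// The view model before validation: the image fields may still be missing.
type Draft = { prompt: string; mimeType?: string; base64Data?: string };

function toInquiry(draft: Draft): MultimodalInquirySketch | null {
  const prompt = draft.prompt.trim();
  if (prompt === '' || !draft.mimeType || !draft.base64Data) {
    // Nothing to send until both a prompt and an image are present.
    return null;
  }
  return { prompt, mimeType: draft.mimeType, base64Data: draft.base64Data };
}

console.log(toInquiry({ prompt: 'Describe this image' })); // null
console.log(toInquiry({ prompt: 'Describe this image', mimeType: 'image/png', base64Data: 'iVBORw0KGgo=' }));
```

Returning null instead of casting lets the compiler confirm that every field is set before the service is called.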

Add routes to navigate to the user interfaces

Create a navigation bar to route to either GenerateTextComponent or GenerateTextMultimodalComponent.

// app.routes.ts

import { Route } from '@angular/router';

export const routes: Route[] = [
    {
        path: '',
        pathMatch: 'full',
        loadComponent: () => import('./gemini/generate-text/generate-text.component')
            .then((m) => m.GenerateTextComponent)
    },
    {
        path: 'text-multimodal',
        loadComponent: () => import('./gemini/generate-text-multimodal/generate-text-multimodal.component')
            .then((m) => m.GenerateTextMultimodalComponent)
    },
    {
        path: '**',
        redirectTo: '',
    }
];

// app-menu.component.ts

import { ChangeDetectionStrategy, Component } from '@angular/core';
import { RouterLink } from '@angular/router';

@Component({
  selector: 'app-app-menu',
  standalone: true,
  imports: [RouterLink],
  template: `
    <div class="menu-container">
      <ul class="menu">
        <li><a routerLink="/">Generate Text from Text Input</a></li>
        <li><a routerLink="/text-multimodal">Generate Text from Text and Image Inputs</a></li>
      </ul>
    </div>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class AppMenuComponent {}

In appConfig, provideRouter(routes) registers the routes that navigate to different components.

// app.config.ts

import { provideRouter, withComponentInputBinding } from '@angular/router';
import { routes } from './app.routes';

export const appConfig = {
    providers: [
      // ... other providers ...
      provideRouter(routes),
    ]
};

Add the navigation bar and router outlet to the AppComponent.

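// app.component.ts

import { ChangeDetectionStrategy, Component } from '@angular/core';
import { RouterOutlet } from '@angular/router';

@Component({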
  selector: 'app-root',
  standalone: true,
  imports: [RouterOutlet, AppMenuComponent],
  template: `
    <div>
      <app-app-menu />
       <h2>{{ title }}</h2>
      <router-outlet />
    </div>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class AppComponent {
  title = 'Gemini AI Generate Text Demo';
}

That is it. I have created a generative AI application with only Angular and the Gemini REST API. The application can grow into a full-stack application by replacing the Gemini REST API calls with calls to backend endpoints.

This is the end of the blog post. I hope you like the content and continue to follow my learning experience in Angular, NestJS, and other technologies.

Resources:

  1. GitHub Repo: https://github.com/railsstudent/ng-ai-google-demo
  2. Gemini REST API tutorials: https://ai.google.dev/tutorials/rest_quickstart