Generating replies using Groq and Gemma in NestJS




In this blog post, I demonstrate generating replies with the Groq SDK and the Gemma 7B model. On auction sites such as eBay, buyers can provide ratings and comments on sales transactions. When the feedback is negative, the seller must reply promptly to resolve the dispute. This demo aims to generate responses in the same language as the buyer, according to the tone (positive, neutral or negative) and topics of the feedback. The chatbot and the user engage in a multi-turn conversation to obtain the language, sentiment, and topics of the feedback. Finally, the model generates the final reply to keep customers happy.

Let's go.

Generate Groq API Key

Log in to Groq Cloud and navigate to the API keys page to generate an API key.

Create a new NestJS Project

nest new nestjs-groq-customer-feedback

Install dependencies

npm i --save-exact @nestjs/swagger @nestjs/throttler dotenv compression helmet class-validator class-transformer groq-sdk

Generate a Feedback Module

nest g mo advisoryFeedback
nest g co advisoryFeedback/presenters/http/advisoryFeedback --flat
nest g s advisoryFeedback/application/advisoryFeedback --flat
nest g s advisoryFeedback/application/advisoryFeedbackPromptChainingService --flat

These commands create an AdvisoryFeedbackModule, a controller, a service for the API, and another service to build chained prompts.

Define GROQ environment variables

// .env.example

PORT=3001
GROQ_API_KEY=<groq api key>
GROQ_MODEL=<groq model name>

Copy .env.example to .env, and replace the GROQ_API_KEY and GROQ_MODEL placeholders with the actual API key and the Gemma model name, respectively.

  • PORT – port number of the NestJS application
  • GROQ_API_KEY – API key generated in Groq Cloud
  • GROQ_MODEL – name of the Groq model; I used Gemma 7B in this demo

Add .env to the .gitignore file to prevent accidentally committing the Groq API Key to the GitHub repo.

Add configuration files

The project has three configuration files. validate.config.ts validates the request payload before the request is routed to the controller.

// validate.config.ts

import { ValidationPipe } from '@nestjs/common';

export const validateConfig = new ValidationPipe({
  whitelist: true,
  stopAtFirstError: true,
  forbidUnknownValues: false,
});
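The whitelist option tells ValidationPipe to strip properties that have no validation decorator in the DTO. The hand-rolled sketch below only imitates that effect on a plain object; stripUnknownProperties is a hypothetical helper for illustration and is not how class-validator implements the option.

```typescript
// Hypothetical helper: keeps only the keys that the DTO is assumed to declare.
function stripUnknownProperties(
  payload: Record<string, unknown>,
  allowedKeys: readonly string[],
): Record<string, unknown> {
  const cleaned: Record<string, unknown> = {};
  for (const key of allowedKeys) {
    if (Object.prototype.hasOwnProperty.call(payload, key)) {
      cleaned[key] = payload[key];
    }
  }
  return cleaned;
}

// The DTO in this post only declares "prompt", so "isAdmin" is dropped.
const cleanedPayload = stripUnknownProperties({ prompt: 'Great service', isAdmin: true }, ['prompt']);
// cleanedPayload is { prompt: 'Great service' }
```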

env.config.ts extracts the environment variables from process.env and stores the values in the env object.

// env.config.ts

import dotenv from 'dotenv';

// Load the variables defined in .env into process.env
dotenv.config();

export const env = {
  PORT: parseInt(process.env.PORT || '3001', 10),
  GROQ: {
    API_KEY: process.env.GROQ_API_KEY || '',
    MODEL_NAME: process.env.GROQ_MODEL || 'llama3-8b-8192',
  },
};
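Because env.config.ts falls back to an empty string when GROQ_API_KEY is missing, a misconfigured deployment would only fail on the first Groq call. A fail-fast check could look like the minimal sketch below. readGroqEnv and its error messages are hypothetical and not part of the original project; the function takes the environment as a parameter so it can be tried without a real .env file.

```typescript
// Hypothetical helper: validates the environment variables before the app starts.
type GroqEnv = {
  PORT: number;
  GROQ: { API_KEY: string; MODEL_NAME: string };
};

function readGroqEnv(source: Record<string, string | undefined>): GroqEnv {
  const apiKey = (source['GROQ_API_KEY'] ?? '').trim();
  if (apiKey === '') {
    // Fail fast instead of sending unauthenticated requests later.
    throw new Error('GROQ_API_KEY is not set');
  }

  const port = Number.parseInt(source['PORT'] ?? '3001', 10);
  if (Number.isNaN(port)) {
    throw new Error(`PORT is not a number: ${source['PORT']}`);
  }

  return {
    PORT: port,
    // Same default model name as in env.config.ts.
    GROQ: { API_KEY: apiKey, MODEL_NAME: source['GROQ_MODEL'] || 'llama3-8b-8192' },
  };
}
```

In a Node.js application, the call would be readGroqEnv(process.env) after dotenv has loaded the .env file.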

throttler.config.ts defines the rate limit of the API.

// throttler.config.ts

import { ThrottlerModule } from '@nestjs/throttler';

export const throttlerConfig = ThrottlerModule.forRoot([
  {
    ttl: 60000,
    limit: 10,
  },
]);

Each route allows ten requests in 60,000 milliseconds or 1 minute.
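To make the ttl and limit settings concrete, here is a minimal in-memory sketch of a fixed-window counter. It only illustrates the idea of "at most limit requests per ttl" and is not how @nestjs/throttler is implemented; FixedWindowLimiter and tryAcquire are hypothetical names.

```typescript
class FixedWindowLimiter {
  private windows = new Map<string, { startedAt: number; count: number }>();

  constructor(private readonly ttlMs: number, private readonly limit: number) {}

  // Returns true when the request is allowed, false when the caller is over the limit.
  tryAcquire(key: string, now: number): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.startedAt >= this.ttlMs) {
      // First request, or the previous window has expired: start a new window.
      this.windows.set(key, { startedAt: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) {
      return false;
    }
    current.count += 1;
    return true;
  }
}

// Same numbers as throttler.config.ts: 10 requests per 60,000 ms.
const limiter = new FixedWindowLimiter(60000, 10);
```

With these numbers, the eleventh call to limiter.tryAcquire for the same key inside one minute returns false, and the counter resets once the window has elapsed.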

Bootstrap the application

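// bootstrap.ts

// Omit the import statements to save space

export class Bootstrap {
  private app: NestExpressApplication;

  async initApp() {
    this.app = await NestFactory.create(AppModule);
  }

  enableCors() {
    this.app.enableCors();
  }

  setupMiddleware() {
    this.app.use(express.json({ limit: '1000kb' }));
    this.app.use(express.urlencoded({ extended: false }));
    this.app.use(compression());
    this.app.use(helmet());
  }

  setupGlobalPipe() {
    this.app.useGlobalPipes(validateConfig);
  }

  async startApp() {
    await this.app.listen(env.PORT);
  }

  setupSwagger() {
    const config = new DocumentBuilder()
      .setTitle('ESG Advisory Feedback with Groq and Gemma')
      .setDescription('Integrate with Groq to improve ESG advisory feedback by prompt chaining')
      .addTag('Groq, Gemma, Prompt Chaining')
      .build();
    const document = SwaggerModule.createDocument(this.app, config);
    SwaggerModule.setup('api', this.app, document);
  }
}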

I added a Bootstrap class to set up Swagger, middleware, global validation, and CORS, and finally to start the application.

// main.ts

import { env } from '~configs/env.config';
import { Bootstrap } from '~core/bootstrap';

async function bootstrap() {
  const bootstrap = new Bootstrap();
  await bootstrap.initApp();
  bootstrap.enableCors();
  bootstrap.setupMiddleware();
  bootstrap.setupSwagger();
  bootstrap.setupGlobalPipe();
  await bootstrap.startApp();
}

bootstrap()
  .then(() => console.log(`The application starts successfully at port ${env.PORT}`))
  .catch((error) => console.error(error));

The bootstrap function enabled CORS, registered middleware to the application, set up Swagger documentation, and used a global pipe to validate payloads.

I have laid the groundwork, and the next step is to add an endpoint that receives a payload and generates a reply with prompt chaining.

Define Feedback DTO

// feedback.dto.ts

import { IsNotEmpty, IsString } from 'class-validator';

export class FeedbackDto {
  @IsString()
  @IsNotEmpty()
  prompt: string;
}

FeedbackDto accepts a prompt that is the customer feedback.

Construct Gemma Model

// groq.constant.ts

// Injection token for the Groq chat API (the string value shown here is an assumption)
export const GROQ_CHAT_MODEL = 'GROQ_CHAT_MODEL';

// groq.provider.ts

import { Provider } from '@nestjs/common';
import { GROQ_CHAT_MODEL } from '../constants/groq.constant';
import Groq from 'groq-sdk';
import { env } from '~configs/env.config';

export const GroqChatModelProvider: Provider<Groq.Chat> = {
  provide: GROQ_CHAT_MODEL,
  useFactory: () => new Groq({ apiKey: env.GROQ.API_KEY }).chat,
};

GroqChatModelProvider provides the Groq chat API. The service uses it to ask the Gemma model to write a short reply in the same language as the feedback.

Implement Reply Service

// groq.config.ts

import { ChatCompletionCreateParamsNonStreaming } from 'groq-sdk/resources/chat/completions';
import { env } from '~configs/env.config';

export const MODEL_CONFIG: Omit<ChatCompletionCreateParamsNonStreaming, 'messages'> = {
  model: env.GROQ.MODEL_NAME,
  temperature: 0.5,
  max_tokens: 1024,
  top_p: 0.5,
  stream: false,
};

// sentiment-analysis.type.ts

export type SentimentAnalysis = {
  sentiment: 'POSITIVE' | 'NEUTRAL' | 'NEGATIVE';
  topic: string;
};

// advisory-feedback-prompt-chaining.service.ts

// Omit the import statements

@Injectable()
export class AdvisoryFeedbackPromptChainingService {
  private readonly logger = new Logger(AdvisoryFeedbackPromptChainingService.name);
  private chatbot = this.groq.completions;

  constructor(@Inject(GROQ_CHAT_MODEL) private groq: Groq.Chat) {}

  async generateReply(feedback: string): Promise<string> {
    try {
      const instruction = `You are a professional ESG advisor who can reply in the same language as the customer's feedback. 
    The reply is short and should also address the sentiment and topics of the feedback.`;

      // Turn 1: ask the model to identify the language of the feedback
      const messages: ChatCompletionMessageParam[] = [
        {
          role: 'system',
          content: instruction,
        },
        {
          role: 'user',
          content: `Please identify the language used in the feedback. Give me the language name, and nothing else.
        If the language is Chinese, please specify Traditional Chinese or Simplified Chinese. 
        If you do not know the language, give 'Unknown'.
        Feedback: ${feedback}`,
        },
      ];

      const response = await this.chatbot.create({
        ...MODEL_CONFIG,
        messages,
      });
      const language = response.choices?.[0]?.message?.content || '';

      // Turn 2: record the answer, then ask for the sentiment and topic as JSON
      messages.push(
        { role: 'assistant', content: language },
        {
          role: 'user',
          content: `Identify the sentiment and topic of feedback and return the JSON output { "sentiment": 'POSITIVE' | 'NEUTRAL' | 'NEGATIVE', "topic": string }.`,
        },
      );

      const analysis = await this.chatbot.create({
        ...MODEL_CONFIG,
        messages,
      });
      const jsonAnalysis = JSON.parse(analysis.choices?.[0]?.message?.content || '') as SentimentAnalysis;
      const { sentiment, topic } = jsonAnalysis;
      this.logger.log(`sentiment -> ${sentiment}, topic -> ${topic}`);

      // Turn 3: record the analysis, then ask for the final reply
      const chainedPrompt = `The customer wrote a ${sentiment} feedback about ${topic} in ${language}. Please give a short reply.`;
      messages.push(
        { role: 'assistant', content: `The sentiment is ${sentiment} and the topics are ${topic}` },
        { role: 'user', content: chainedPrompt },
      );

      const result = await this.chatbot.create({
        ...MODEL_CONFIG,
        messages,
      });

      const text = result.choices[0]?.message?.content || '';
      this.logger.log(`text -> ${text}`);
      return text;
    } catch (ex) {
      throw ex;
    }
  }
}

AdvisoryFeedbackPromptChainingService injects the Groq chat API in the constructor.

  • groq – The chat API through which an assistant answers the queries of the user.
  • generateReply – In this method, the user asked the chat model about the language, sentiment and topics of the feedback. Then, the assistant gave the answers according to the instructions in the prompts. Next, I manually appended the queries and answers to the messages array to update the chat history. This was important because the chatbot referred to the previous turns to form the correct context for answering later questions. Finally, the chatbot generated a reply in the same language based on the sentiment and topics.

const response = await this.chatbot.create({
  ...MODEL_CONFIG,
  messages,
});
const language = response.choices?.[0]?.message?.content || '';

messages.push(
  { role: 'assistant', content: language },
  {
    role: 'user',
    content: `Identify the sentiment and topic of feedback and return the JSON output { "sentiment": 'POSITIVE' | 'NEUTRAL' | 'NEGATIVE', "topic": string }.`,
  },
);

After this.chatbot.create returned the language, I appended the value and the next user query to the messages array.
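The append-and-ask-again pattern can be isolated from the Groq SDK. The sketch below is a minimal illustration under assumptions: ChatMessage, AskModel and runPromptChain are hypothetical names, and AskModel stands in for a call such as this.chatbot.create so that the flow can be exercised with a fake model.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Stand-in for the real chat completion call; it receives the full history each time.
type AskModel = (history: ChatMessage[]) => Promise<string>;

async function runPromptChain(
  askModel: AskModel,
  systemInstruction: string,
  userPrompts: string[],
): Promise<{ answers: string[]; history: ChatMessage[] }> {
  const history: ChatMessage[] = [{ role: 'system', content: systemInstruction }];
  const answers: string[] = [];

  for (const prompt of userPrompts) {
    history.push({ role: 'user', content: prompt });
    // Pass a copy so the model function cannot mutate the running history.
    const answer = await askModel([...history]);
    // Record the answer so the next question is asked with the full context.
    history.push({ role: 'assistant', content: answer });
    answers.push(answer);
  }

  return { answers, history };
}
```

In the service, the later prompts depend on earlier answers (the final prompt embeds the sentiment, topic and language), so a real implementation would build each prompt from the previous answers rather than from a fixed list.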

The process for generating replies ended by producing the text output from generateReply. The method asked questions iteratively and wrote a descriptive prompt for the LLM to draft a reply that was polite and addressed the needs of the customer.
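One fragile step in the chain is JSON.parse on the model's output, because chat models sometimes wrap JSON in extra text or a Markdown code fence. A defensive parser might look like the sketch below. parseSentimentAnalysis is a hypothetical helper that is not part of the original service, and falling back to NEUTRAL with an "unknown" topic is an assumed design choice.

```typescript
type SentimentAnalysis = {
  sentiment: 'POSITIVE' | 'NEUTRAL' | 'NEGATIVE';
  topic: string;
};

const SENTIMENTS: ReadonlyArray<SentimentAnalysis['sentiment']> = ['POSITIVE', 'NEUTRAL', 'NEGATIVE'];

function parseSentimentAnalysis(raw: string): SentimentAnalysis {
  // Assumed fallback when the output cannot be interpreted.
  const fallback: SentimentAnalysis = { sentiment: 'NEUTRAL', topic: 'unknown' };

  // Keep only the outermost {...} so surrounding prose or code fences are ignored.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) {
    return fallback;
  }

  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
    if (typeof parsed !== 'object' || parsed === null) {
      return fallback;
    }
    const candidate = parsed as Record<string, unknown>;
    const sentiment = String(candidate['sentiment'] ?? '').toUpperCase();
    const matched = SENTIMENTS.find((value) => value === sentiment);
    const topicValue = candidate['topic'];
    const topic = typeof topicValue === 'string' ? topicValue : fallback.topic;
    return { sentiment: matched ?? fallback.sentiment, topic };
  } catch {
    // JSON.parse throws on malformed input.
    return fallback;
  }
}
```

Whether a silent fallback or a thrown error is better depends on the application; a seller-facing tool may prefer to surface the failure instead of replying in a neutral tone.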

// advisory-feedback.service.ts

// Omit the import statements to save space

@Injectable()
export class AdvisoryFeedbackService {
  constructor(private promptChainingService: AdvisoryFeedbackPromptChainingService) {}

  generateReply(prompt: string): Promise<string> {
    return this.promptChainingService.generateReply(prompt);
  }
}

AdvisoryFeedbackService injects AdvisoryFeedbackPromptChainingService and delegates to it, so that the chained prompts ask the chat model to generate a reply.

Implement Advisory Feedback Controller

// advisory-feedback.controller.ts

// Omit the import statements to save space

@Controller('esg-advisory-feedback')
export class AdvisoryFeedbackController {
  constructor(private service: AdvisoryFeedbackService) {}

  @Post()
  generateReply(@Body() dto: FeedbackDto): Promise<string> {
    return this.service.generateReply(dto.prompt);
  }
}

The AdvisoryFeedbackController injects AdvisoryFeedbackService, which uses the Groq SDK and the Gemma 7B model. The endpoint invokes the service method to generate a reply from the prompt.

  • /esg-advisory-feedback – generate a reply from a prompt
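The endpoint can also be called from TypeScript. This is a sketch under assumptions: requestReply and HttpPost are hypothetical names, the HTTP function is injected so that the example does not depend on a particular runtime (the global fetch in recent Node.js versions or in a browser fits this shape), and the response is read as plain text because the controller returns a string.

```typescript
// Minimal shape of the HTTP function that the sketch needs.
type HttpPost = (
  url: string,
  init: { method: 'POST'; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

async function requestReply(httpPost: HttpPost, baseUrl: string, feedback: string): Promise<string> {
  const response = await httpPost(`${baseUrl}/esg-advisory-feedback`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // FeedbackDto expects the feedback under the "prompt" key.
    body: JSON.stringify({ prompt: feedback }),
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}
```

A call such as requestReply(fetch, 'http://localhost:3001', 'The report was clear and helpful.') would then resolve to the generated reply once the application is running.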

Module Registration

The AdvisoryFeedbackModule provides AdvisoryFeedbackPromptChainingService, AdvisoryFeedbackService and GroqChatModelProvider. The module has one controller that is AdvisoryFeedbackController.

// advisory-feedback.module.ts

// Omit the import statements for brevity

@Module({
  controllers: [AdvisoryFeedbackController],
  providers: [GroqChatModelProvider, AdvisoryFeedbackPromptChainingService, AdvisoryFeedbackService],
})
export class AdvisoryFeedbackModule {}

Import AdvisoryFeedbackModule into AppModule.

// app.module.ts

@Module({
  imports: [throttlerConfig, AdvisoryFeedbackModule],
  controllers: [AppController],
  providers: [
    {
      provide: APP_GUARD,
      useClass: ThrottlerGuard,
    },
  ],
})
export class AppModule {}

Test the endpoints

I can test the endpoints with cURL, Postman or Swagger documentation after launching the application.

npm run start:dev

The URL of the Swagger documentation is http://localhost:3001/api.


curl --location 'http://localhost:3001/esg-advisory-feedback' \
--header 'Content-Type: application/json' \
--data '{
    "prompt": "Looking ahead, the needs of our customers will increasingly be defined by sustainable choices. ESG reporting through diginex has brought us uniformity, transparency and direction. It provides us with a framework to be able to demonstrate to all stakeholders - customers, employees, and investors - what we are doing and to be open and transparent."
}'

Dockerize the application

// .dockerignore


Create a .dockerignore file for Docker to ignore some files and directories.

// Dockerfile

# Use an official Node.js runtime as the base image
FROM node:20-alpine

# Set the working directory in the container (the /app path is an assumption)
WORKDIR /app

# Copy package.json and package-lock.json to the working directory
COPY package*.json ./

# Install the dependencies
RUN npm install

# Copy the rest of the application code to the working directory
COPY . .

# Expose a port (if your application listens on a specific port)
EXPOSE 3001

# Define the command to run your application
CMD [ "npm", "run", "start:dev"]

I added a Dockerfile that installs the dependencies and starts the NestJS application in development mode on port 3001.

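// docker-compose.yaml

version: '3.8'

services:
  # The service name "backend" is an assumption
  backend:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - PORT=${PORT}
    ports:
      - "${PORT}:${PORT}"
    networks:
      - ai
    restart: unless-stopped

networks:
  ai: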

I added the docker-compose.yaml in the current folder, which was responsible for creating the NestJS application container.

Launch the Docker application

docker-compose up

Navigate to http://localhost:3001/api to read and execute the API.

This concludes my blog post about using the Groq SDK and the Gemma 7B model to generate replies regardless of the language the feedback is written in. Generating replies with Generative AI reduces the effort that a writer needs to compose a polite reply to any customer. I hope you like the content and continue to follow my learning experience in Angular, NestJS, Generative AI, and other technologies.

