Building Dynamic Audio with Emotion & Pace: Gemini 3.1 Flash TTS, Angular & Firebase Cloud Functions

Reading Time: 6 minutes

Google released the Gemini 3.1 Flash TTS Preview model for AI audio generation in the Gemini API, Gemini in Vertex AI, and Google AI Studio. The model introduces audio tags, a new feature for expressing human emotion, pace, and style. This application uses Firebase AI Logic to analyze an uploaded image and generate recommendations, …

Extending a Video with Angular, Veo 3.1 Lite, Firebase Cloud Functions, and Firebase Cloud Storage [GDE]

Reading Time: 13 minutes

Google released the Veo 3.1 Lite model for AI video generation in the Gemini API, Gemini in Vertex AI, and Google AI Studio. This model solves a common developer pain point: generating high-quality videos quickly and at a lower cost. …

Observability at Scale: Mastering ADK Callbacks for Cost, Latency, and Auditability

Reading Time: 11 minutes

AI orchestrators receive significant attention; however, when deployments become slow and costly, developers often overlook a critical capability: ADK callback hooks. Callback design patterns and best practices let developers move logic out of agents and into callback hooks to add observability, reduce …

Stop Wasting Tokens: Building Deterministic Custom Agents with Google ADK

Reading Time: 8 minutes

In the world of AI orchestration, it’s tempting to use a Large Language Model (LLM) for every step of a workflow. However, as applications scale, the “LLM-first” approach can introduce unnecessary latency, costs, and unpredictability. The Google Agent Development Kit (ADK) provides a powerful alternative: the BaseAgent. This post explores how to create a custom, …

Automating Technical Blog Localization with Gemini CLI Agent Skills

Reading Time: 12 minutes

The Use Case: Scaling Content to Global Tech Communities

When content writers publish blog posts in English, they often want to distribute them to different tech communities to share knowledge and increase their impact and reach. However, many readers in these communities do not read, write, or speak English. When the blog post is posted in these localized, …

Building a Digital Docent: Master Agentic Vision with Gemini 3

Reading Time: 10 minutes

The Pain Point: The Invisible World of the “Granular” Masterpiece

When a postcard miniaturizes a traditional Chinese painting, the reduced scale obscures details. Consider a standard postcard depicting workers and horses pulling wagons in a crowded street during the Qing dynasty. Adults and children walk on a busy bridge while shops on both sides sell …

How to Refactor a Complex Blog Review Prompt into Reusable AI Agents

Reading Time: 14 minutes

Technical writers and ESL professionals often rely on complex prompts to polish their work. However, as prompts grow into monolithic walls of text, they become brittle, hard to debug, and expensive to run. Modularity addresses this challenge. In this post, I transition from a single Custom Command to a modular, portable Gemini CLI extension implemented …

Fetching Live Sports Data with Gemini 3: A Guide to Grounded, Structured JSON

Reading Time: 13 minutes

Retrieving accurate, up-to-date sports statistics with LLMs is notoriously difficult due to hallucinations and outdated training data. In this post, I explore retrieving Premier League 2025/2026 player statistics using the Gemini 3 Flash Preview model, URL Context, Grounding with Google Search, and structured output. I will describe lessons learned and how I extracted …

Building an AI-Powered Alt Text Generator with Angular, Firebase AI Logic, and Gemini 3

Reading Time: 5 minutes

In this project, I stepped out of my comfort zone to upgrade from Gemini 2.5 to Gemini 3 in Vertex AI. The goal was to build an intelligent Image Alt Text Generator that goes beyond simple descriptions and hashtags. By leveraging Gemini 3.0 (Pro Preview), this application analyzes an image to generate alternative text, hashtags, …