feat(cli): add terminal image display support using ink-picture #13883

ali-aljufairi · 2025-11-27T04:52:53Z

Add the ability to display images directly in the terminal when MCP tools return image data. This enables extensions like nanobanana to show generated images inline without requiring external viewers.

Changes:

Add ink-picture dependency for cross-terminal image rendering
Create ImageDisplay component with self-contained TerminalInfoProvider
Add ImageData and ImageResult types to core tools module
Update MCP tool handler to detect and return image blocks
Integrate image display into ToolResultDisplay component

Supported terminal protocols (auto-detected):

Kitty graphics protocol
iTerm2 inline images
Sixel graphics
Half-block/Braille/ASCII fallbacks

BREAKING CHANGE: None

⚠️ This is a Proof of Concept (POC) - NOT ready for merge

Note: This is an experimental feature exploring terminal image display capabilities. We acknowledge that not all terminals support inline images, and there are known limitations.

Details

What this PR does:

Adds ink-picture dependency for cross-terminal image rendering
Creates ImageDisplay component with self-contained TerminalInfoProvider
Adds ImageData and ImageResult types to core tools module
Updates MCP tool handler to detect and return image blocks from extensions
Integrates image display into ToolResultDisplay component
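A minimal sketch of what the new types could look like, inferred only from the fields the `ImageDisplay` diff destructures (`base64`, `filePath`, `mimeType`, `alt`) and the `image` property used in `mcp-tool.ts`. The optionality of each field and the `isImageResult` helper are assumptions for illustration, not the definitions in this PR.

```typescript
// Hypothetical shapes inferred from the fields used in this PR's diff;
// the real definitions in the core tools module may differ.
interface ImageData {
  base64?: string;
  filePath?: string;
  mimeType?: string;
  alt?: string;
}

interface ImageResult {
  image: ImageData;
}

// Assumed helper (not in the PR): lets a display layer branch on image results.
// A result only counts as displayable if it carries base64 data or a file path.
function isImageResult(value: unknown): value is ImageResult {
  if (typeof value !== 'object' || value === null) return false;
  const image = (value as { image?: unknown }).image;
  if (typeof image !== 'object' || image === null) return false;
  const { base64, filePath } = image as ImageData;
  return typeof base64 === 'string' || typeof filePath === 'string';
}
```

A guard like this would let `ToolResultDisplay` fall through to its existing text rendering whenever the result is not an image.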

Terminal Protocol Support (via ink-picture):

✅ Kitty graphics protocol
✅ iTerm2 inline images
✅ Sixel graphics
⚠️ Half-block/Braille/ASCII fallbacks (lower quality)
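For readers unfamiliar with how auto-detection can work, here is a rough sketch of an environment-variable heuristic. It is not ink-picture's actual detection logic, which this PR does not show; `guessProtocol` and `GraphicsProtocol` are hypothetical names. It relies only on variables that Kitty (`KITTY_WINDOW_ID`, `TERM=xterm-kitty`) and iTerm2 (`TERM_PROGRAM=iTerm.app`) are known to set.

```typescript
type GraphicsProtocol = 'kitty' | 'iterm2' | 'sixel' | 'fallback';

// Hypothetical heuristic: guess a graphics protocol from environment variables.
// Sixel support is normally discovered by querying the terminal (Device
// Attributes), which this sketch omits, so it never returns 'sixel'.
function guessProtocol(
  env: Record<string, string | undefined>,
): GraphicsProtocol {
  const term = env['TERM'] ?? '';
  if (env['KITTY_WINDOW_ID'] !== undefined || term.includes('kitty')) {
    return 'kitty';
  }
  if (env['TERM_PROGRAM'] === 'iTerm.app') {
    return 'iterm2';
  }
  // Unknown terminal: use the half-block/Braille/ASCII path.
  return 'fallback';
}
```

In a Node process this would be called as `guessProtocol(process.env)`; taking the environment as a parameter keeps the function easy to test.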

Known Limitations & Issues:

Not all terminals support inline images - This is a fundamental limitation. Terminals without graphics support will show ASCII/text fallbacks
Flickering - There is a known flickering issue during image rendering. The cause is not yet known and needs investigation
Image sizing - Currently fixed at 128x128 terminal cells
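One possible way to relax the fixed size is to clamp it to the current terminal dimensions. The sketch below is an assumption about how that could be done, not code from this PR; `fitToTerminal`, `CellSize`, and the two-cell margin are hypothetical.

```typescript
interface CellSize {
  width: number;
  height: number;
}

// Hypothetical helper: clamp the requested image size (in terminal cells)
// to the visible terminal, keeping the PR's 128x128 as the upper bound.
function fitToTerminal(
  columns: number,
  rows: number,
  max: CellSize = { width: 128, height: 128 },
): CellSize {
  const margin = 2; // assumed margin so the image does not touch the edges
  return {
    width: Math.max(1, Math.min(max.width, columns - margin)),
    height: Math.max(1, Math.min(max.height, rows - margin)),
  };
}
```

In a Node CLI the inputs would typically come from `process.stdout.columns` and `process.stdout.rows`, and the result could be passed as the `width` and `height` props of `ImageDisplay`.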

Changes required in nanobanana extension (not in this PR):

To test this feature, the following modifications were made to the nanobanana MCP server locally:

types.ts - Added imageData and imageMimeType fields to ImageGenerationResponse
imageGenerator.ts - Capture first image base64 data during generation
index.ts - Include image content block in MCP response:
```typescript
content.push({
  type: 'image',
  data: result.imageData,
  mimeType: result.imageMimeType || 'image/png',
});
```

Demo

2025-11-27_13-37-40.mov

Related Issues

How to Validate

Prerequisites:

A terminal that supports inline images (tested on iTerm2 only)
nanobanana extension with image data modifications (see above)

Steps:

Build the packages: npm run build
Run gemini CLI in iTerm2
Use nanobanana to generate an image: /nanobanana generate image of a sunset
Observe the image displayed inline in the terminal

Expected Results:

Image should render inline in the terminal output
On unsupported terminals, ASCII/text fallback should display

Demo:

Success Criteria for Future Development

Before this can be considered for merge:

Resolve or document flickering issue
Test on multiple terminals (Kitty, WezTerm, VS Code integrated terminal)
Add configuration option to enable/disable image display
Consider adding protocol selection setting for users
Document terminal compatibility matrix
Upstream nanobanana extension changes (if desired)
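As a starting point for the configuration items above, here is a sketch of what an enable/disable flag and a protocol selection setting might look like. Every name here (`ImageDisplaySettings`, `ProtocolSetting`, `resolveImageSettings`) and the disabled-by-default choice are assumptions; the PR does not define any settings yet.

```typescript
// Hypothetical settings shape; no such settings exist in this PR.
type ProtocolSetting = 'auto' | 'kitty' | 'iterm2' | 'sixel' | 'ascii';

interface ImageDisplaySettings {
  enabled: boolean;
  protocol: ProtocolSetting;
}

// Assumed defaults: off until the feature leaves POC, protocol auto-detected.
const DEFAULT_IMAGE_SETTINGS: ImageDisplaySettings = {
  enabled: false,
  protocol: 'auto',
};

// Merge user-provided values over the defaults, ignoring undefined fields.
function resolveImageSettings(
  user: Partial<ImageDisplaySettings> = {},
): ImageDisplaySettings {
  return {
    enabled: user.enabled ?? DEFAULT_IMAGE_SETTINGS.enabled,
    protocol: user.protocol ?? DEFAULT_IMAGE_SETTINGS.protocol,
  };
}
```

Defaulting to disabled would keep the flickering issue away from users who have not opted in.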

Pre-Merge Checklist

⚠️ POC - Not all items completed

gemini-code-assist · 2025-11-27T04:53:09Z

Summary of Changes

Hello @ali-aljufairi, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces an experimental feature to render images directly in the terminal output of the CLI. By integrating the ink-picture library and modifying the tool result display, the CLI can now interpret and show image data returned by extensions, providing a richer interactive experience. This is currently a Proof of Concept, with identified limitations and a clear path for future development.

Highlights

Terminal Image Display: Added the capability to display images directly within the terminal when MCP tools return image data, enhancing the CLI experience for extensions like nanobanana.
New Component and Types: Introduced an ImageDisplay React component for rendering images and defined ImageData and ImageResult types to handle image data from tools.
MCP Tool Integration: Updated the MCP tool handler to detect and process image blocks from extension responses, converting them into ImageResult objects for display.
Cross-Terminal Support: Leverages the ink-picture library to support various terminal graphics protocols (Kitty, iTerm2, Sixel) with fallbacks to half-block/Braille/ASCII for wider compatibility.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces an exciting proof-of-concept for displaying images directly in the terminal. The implementation using ink-picture is a great start. I've identified a few areas for improvement to enhance robustness and correctness before this feature is ready for merging. My main concerns are around handling of mixed-content responses from tools, which could lead to information loss in the UI, and a bug in the new ImageDisplay component where a prop is not being used. Addressing these points will make the feature more reliable.

gemini-code-assist · 2025-11-27T04:55:14Z

packages/cli/src/ui/components/ImageDisplay.tsx

+export const ImageDisplay: React.FC<ImageDisplayProps> = ({
+  image,
+  width,
+  height,
+}) => {
+  const { base64, filePath, mimeType, alt } = image;
+
+  // Determine the image source
+  let src: string | undefined;
+
+  if (base64) {
+    // Convert base64 to data URL
+    const mime = mimeType || 'image/png';
+    src = `data:${mime};base64,${base64}`;
+  } else if (filePath) {
+    src = filePath;
+  }
+
+  if (!src) {
+    return (
+      <Box>
+        <Text color="yellow">[Image: Unable to display - no valid source]</Text>
+      </Box>
+    );
+  }
+
+  return (
+    <TerminalInfoProvider>
+      <Box flexDirection="column">
+        <Image
+          src={src}
+          width={width}
+          height={height}
+          alt={alt || 'Generated image'}
+        />
+        {alt && (
+          <Box marginTop={1}>
+            <Text dimColor>{alt}</Text>
+          </Box>
+        )}
+      </Box>
+    </TerminalInfoProvider>
+  );
+};


The preserveAspectRatio prop is defined in ImageDisplayProps but is not being destructured from the props, and therefore not passed to the underlying Image component. This makes the prop non-functional, as it will always use the default behavior of ink-picture.

export const ImageDisplay: React.FC<ImageDisplayProps> = ({
  image,
  width,
  height,
  preserveAspectRatio,
}) => {
  const { base64, filePath, mimeType, alt } = image;

  // Determine the image source
  let src: string | undefined;

  if (base64) {
    // Convert base64 to data URL
    const mime = mimeType || 'image/png';
    src = `data:${mime};base64,${base64}`;
  } else if (filePath) {
    src = filePath;
  }

  if (!src) {
    return (
      <Box>
        <Text color="yellow">[Image: Unable to display - no valid source]</Text>
      </Box>
    );
  }

  return (
    <TerminalInfoProvider>
      <Box flexDirection="column">
        <Image
          src={src}
          width={width}
          height={height}
          preserveAspectRatio={preserveAspectRatio}
          alt={alt || 'Generated image'}
        />
        {alt && (
          <Box marginTop={1}>
            <Text dimColor>{alt}</Text>
          </Box>
        )}
      </Box>
    </TerminalInfoProvider>
  );
};

gemini-code-assist · 2025-11-27T04:55:14Z

packages/core/src/tools/mcp-tool.ts

+  if (imageBlock) {
+    // Collect any text content
+    const textParts = mcpContent
+      .filter((block): block is McpTextBlock => block.type === 'text')
+      .map((block) => block.text)
+      .join('\n');
+
+    // Try to extract file path from text content (e.g., "• /path/to/image.png")
+    const filePathMatch = textParts.match(
+      /[•\-*]\s*([^\n]+\.(png|jpg|jpeg|gif|webp))/i,
+    );
+    const filePath = filePathMatch ? filePathMatch[1].trim() : undefined;
+
+    const imageResult: ImageResult = {
+      image: {
+        // Prefer file path over base64 for better compatibility with ink-picture
+        filePath,
+        base64: filePath ? undefined : imageBlock.data,
+        mimeType: imageBlock.mimeType,
+        alt: textParts || undefined,
+      },
+    };
+    return imageResult;
+  }


The current logic for displaying images finds the first image block and immediately returns, causing any other non-text content blocks in the response to be ignored by the UI. This can lead to information loss if a tool returns an image alongside other important content like resource links.

To prevent this, consider processing the image only if it's the sole non-text content type. If other content types are present, you could fall back to the existing string-based display for all blocks to ensure no information is hidden from the user.
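A sketch of the check suggested above, under the assumption that the response content is a simplified union of text, image, and resource link blocks. `McpBlock` and `canRenderAsImage` are hypothetical names, not the types used in `mcp-tool.ts`. This version is slightly stricter than the suggestion: it also rejects responses with more than one image, since the current code displays only the first.

```typescript
// Simplified, assumed block union; the real MCP content types are richer.
type McpBlock =
  | { type: 'text'; text: string }
  | { type: 'image'; data: string; mimeType: string }
  | { type: 'resource_link'; uri: string };

// Render as an image only when exactly one non-text block exists and it is
// an image. Anything else should fall back to the string-based display so
// that no block is hidden from the user.
function canRenderAsImage(blocks: McpBlock[]): boolean {
  const nonText = blocks.filter((block) => block.type !== 'text');
  return nonText.length === 1 && nonText[0]?.type === 'image';
}
```

The image branch in the diff could then be guarded with `if (imageBlock && canRenderAsImage(mcpContent))`, leaving mixed-content responses on the existing path.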

gemini-code-assist · 2025-11-27T04:55:14Z

packages/core/src/tools/mcp-tool.ts

+    // Try to extract file path from text content (e.g., "• /path/to/image.png")
+    const filePathMatch = textParts.match(
+      /[•\-*]\s*([^\n]+\.(png|jpg|jpeg|gif|webp))/i,
+    );
+    const filePath = filePathMatch ? filePathMatch[1].trim() : undefined;


Extracting the file path from the text content using a regular expression is brittle as it relies on a specific text format. If the tool's text output for file paths changes, this logic will break.

A more robust approach would be for the MCP tool to provide the file path in a structured field within the image content block itself, avoiding the need for string parsing. While the server-side changes might be out of scope for this PR, this approach should be considered for a more resilient implementation.
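A sketch of how the client side could prefer a structured field while keeping the current regex as a fallback. The `filePath` field on the image block is hypothetical: it is not part of the MCP image content block and would need the server-side change described above. `resolveFilePath` and `ImageBlockWithPath` are illustrative names.

```typescript
// Hypothetical image block carrying an optional structured file path.
interface ImageBlockWithPath {
  type: 'image';
  data: string;
  mimeType: string;
  filePath?: string; // assumed extension field, not in the MCP spec
}

// Same pattern as in the PR diff: a bullet followed by a path to an image file.
const PATH_PATTERN = /[•\-*]\s*([^\n]+\.(png|jpg|jpeg|gif|webp))/i;

// Prefer the structured field; only parse the text when it is absent.
function resolveFilePath(
  block: ImageBlockWithPath,
  text: string,
): string | undefined {
  if (block.filePath) return block.filePath;
  const match = text.match(PATH_PATTERN);
  return match?.[1]?.trim();
}
```

With this shape, servers that adopt the structured field no longer depend on the text format, while older servers keep working through the regex.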

ali-aljufairi requested review from a team as code owners November 27, 2025 04:52

ali-aljufairi marked this pull request as draft November 27, 2025 04:53

gemini-code-assist bot reviewed Nov 27, 2025
