gemini-imagegen by everyinc/compound-engineering-plugin
npx skills add https://github.com/everyinc/compound-engineering-plugin --skill gemini-imagegen
Generate and edit images using Google's Gemini API. The environment variable GEMINI_API_KEY must be set.
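Because every call depends on GEMINI_API_KEY, it can help to fail early with a readable message when the variable is missing. The following is a minimal sketch using only the standard library; the helper name `require_api_key` is hypothetical and is not part of the skill or the SDK.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment or raise a readable error.

    Hypothetical helper for illustration; not part of the google-genai SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before generating images")
    return key
```

The returned value could then be passed as `api_key` when constructing the client.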
| Model | Resolution | Best For |
|---|---|---|
| gemini-3-pro-image-preview | 1K-4K | All image generation (default) |
Note: Always use this Pro model. Only use a different model if explicitly requested.
Default model: gemini-3-pro-image-preview
Aspect ratios: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Resolutions: 1K (default), 2K, 4K
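A typo in an aspect ratio or resolution is easier to catch locally than from an API error. The sketch below validates both options against the values listed above; `image_options` is a hypothetical helper name, and the sets simply copy this document's lists rather than querying the API.

```python
# Values copied from the supported lists in this document.
ASPECT_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
IMAGE_SIZES = {"1K", "2K", "4K"}


def image_options(aspect_ratio="1:1", image_size="1K"):
    """Validate the options and return them as keyword arguments.

    Hypothetical helper for illustration; not part of the google-genai SDK.
    """
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if image_size not in IMAGE_SIZES:
        raise ValueError(f"unsupported image size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": image_size}
```

The resulting dictionary can be unpacked into the image configuration, for example `types.ImageConfig(**image_options("16:9", "2K"))`.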
import os

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Basic generation (1K, 1:1 - defaults)
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Your prompt here"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

# A response can mix text parts and image parts
for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        image.save("output.jpg")  # Gemini returns JPEG by default, so use .jpg
from google.genai import types

prompt = "Your prompt here"

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[prompt],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # Wide format
            image_size="2K",  # Higher resolution
        ),
    ),
)
# 1K (default) - Fast, good for previews
image_config=types.ImageConfig(image_size="1K")
# 2K - Balanced quality/speed
image_config=types.ImageConfig(image_size="2K")
# 4K - Maximum quality, slower
image_config=types.ImageConfig(image_size="4K")
# Square (default)
image_config=types.ImageConfig(aspect_ratio="1:1")
# Landscape wide
image_config=types.ImageConfig(aspect_ratio="16:9")
# Ultra-wide panoramic
image_config=types.ImageConfig(aspect_ratio="21:9")
# Portrait
image_config=types.ImageConfig(aspect_ratio="9:16")
# Photo standard
image_config=types.ImageConfig(aspect_ratio="4:3")
Pass existing images with text prompts:
from PIL import Image
img = Image.open("input.png")
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Add a sunset to this scene", img],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)
Use chat for iterative editing:
from google.genai import types
chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(response_modalities=['TEXT', 'IMAGE']),
)
response = chat.send_message("Create a logo for 'Acme Corp'")
# Save first image...
response = chat.send_message("Make the text bolder and add a blue gradient")
# Save refined image...
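The "save" steps in the chat example are left as comments. One way to fill them in is sketched below, relying only on the attributes already used in this document (`response.parts`, `part.inline_data`, `part.as_image()` and `image.save(path)`); the function name `save_images` and the file naming scheme are assumptions made for illustration.

```python
def save_images(response, stem):
    """Save every image part of `response` as <stem>_<n>.jpg and return the paths.

    Hypothetical helper; it only relies on the response attributes
    shown in the examples in this document.
    """
    paths = []
    for part in response.parts:
        if part.inline_data:
            # .jpg because the API returns JPEG data by default
            path = f"{stem}_{len(paths) + 1}.jpg"
            part.as_image().save(path)
            paths.append(path)
    return paths
```

Calling `save_images(response, "logo_v1")` after the first message and `save_images(response, "logo_v2")` after the second keeps each iteration on disk for comparison.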
Include camera details: lens type, lighting, angle, mood.
"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"
Specify style explicitly:
"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"
Be explicit about font style and placement:
"Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"
Describe lighting setup and surface:
"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"
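The prompt tips above all follow the same shape: a subject followed by a few concrete details. A small sketch of composing such prompts programmatically is shown here; `build_prompt` is a hypothetical helper, not something the skill or the SDK provides.

```python
def build_prompt(subject, *details):
    """Join a subject with comma-separated detail phrases, skipping empty ones.

    Hypothetical helper for illustration only.
    """
    parts = [subject.strip()]
    parts.extend(d.strip() for d in details if d and d.strip())
    return ", ".join(parts)


# Rebuilds the photorealistic example from the tips above.
portrait = build_prompt(
    "A photorealistic close-up portrait",
    "85mm lens",
    "soft golden hour light",
    "shallow depth of field",
)
```

Keeping the details as separate arguments makes it easy to swap one element, such as the lighting, while leaving the rest of the prompt unchanged.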
Generate images based on real-time data:
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Visualize today's weather in Tokyo as an infographic"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}],
    ),
)
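The notes at the end of this skill say that image-only output (responseModalities: ["IMAGE"]) won't work with Google Search grounding. A pre-flight check along these lines could catch that combination before a request is sent; `check_grounding_config` is a hypothetical name, and the check assumes tools are passed as plain dictionaries as in the example above.

```python
def check_grounding_config(response_modalities, tools=None):
    """Raise ValueError if image-only output is combined with Google Search.

    Hypothetical pre-flight check; not part of the google-genai SDK.
    Assumes tools are given as dicts such as {"google_search": {}}.
    """
    modalities = {m.upper() for m in response_modalities}
    uses_search = any(
        isinstance(tool, dict) and "google_search" in tool for tool in (tools or [])
    )
    if uses_search and modalities == {"IMAGE"}:
        raise ValueError(
            "Image-only output does not work with Google Search grounding; "
            "use response_modalities=['TEXT', 'IMAGE']"
        )
    return True
```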
Combine elements from multiple sources:
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        "Create a group photo of these people in an office",
        Image.open("person1.png"),
        Image.open("person2.png"),
        Image.open("person3.png"),
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)
CRITICAL: The Gemini API returns images in JPEG format by default. When saving, always use the .jpg extension to avoid media type mismatches.
# CORRECT - Use .jpg extension (Gemini returns JPEG)
image.save("output.jpg")
# WRONG - Will cause "Image does not match media type" errors
image.save("output.png") # Creates JPEG with PNG extension!
If you specifically need PNG format:
from PIL import Image
# Generate with Gemini
for part in response.parts:
    if part.inline_data:
        img = part.as_image()
        # Convert to PNG by saving with an explicit format
        # (assumes the returned image accepts a `format` argument, as PIL images do)
        img.save("output.png", format="PNG")
Check actual format vs extension with the file command:
file image.png
# If output shows "JPEG image data" - rename to .jpg!
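Where the `file` command is not available, the same check can be approximated in Python by reading the file signature: JPEG data starts with the bytes FF D8 FF, and PNG data starts with the eight-byte PNG signature. This is a standard-library sketch; the function names `detect_image_extension` and `fix_extension` are hypothetical.

```python
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def detect_image_extension(data):
    """Return ".jpg" or ".png" based on the leading bytes, or None if unknown."""
    if data.startswith(JPEG_MAGIC):
        return ".jpg"
    if data.startswith(PNG_MAGIC):
        return ".png"
    return None


def fix_extension(path):
    """Rename the file so its extension matches its actual format.

    Returns the (possibly new) path. Unknown formats are left untouched.
    """
    path = Path(path)
    with path.open("rb") as handle:
        ext = detect_image_extension(handle.read(8))
    suffix = path.suffix.lower()
    if ext is None or suffix == ext or (ext == ".jpg" and suffix == ".jpeg"):
        return path
    target = path.with_suffix(ext)
    path.rename(target)
    return target
```

For example, `fix_extension("image.png")` would rename a file that actually contains JPEG data to `image.jpg`.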
Save images with the .jpg extension.
Image-only output (responseModalities: ["IMAGE"]) won't work with Google Search grounding.
Weekly installs: 340
GitHub stars: 10.9K
First seen: Jan 21, 2026
Security audits: Gen Agent Trust Hub (Fail), Socket (Pass), Snyk (Warn)
Installed on: opencode (306), codex (300), gemini-cli (299), claude-code (280), github-copilot (280), cursor (274)