pydicom by k-dense-ai/claude-scientific-skills
npx skills add https://github.com/k-dense-ai/claude-scientific-skills --skill pydicom
Pydicom is a pure Python package for working with DICOM files, the standard format for medical imaging data. This skill provides guidance on reading, writing, and manipulating DICOM files, including working with pixel data, metadata, and various compression formats.
Use this skill when working with:
Install pydicom and common dependencies:
uv pip install pydicom
uv pip install pillow # For image format conversion
uv pip install numpy # For pixel array manipulation
uv pip install matplotlib # For visualization
For handling compressed DICOM files, additional packages may be needed:
uv pip install pylibjpeg pylibjpeg-libjpeg pylibjpeg-openjpeg # JPEG compression
uv pip install python-gdcm # Alternative compression handler
Read a DICOM file using pydicom.dcmread():
import pydicom
# Read a DICOM file
ds = pydicom.dcmread('path/to/file.dcm')
# Access metadata
print(f"Patient Name: {ds.PatientName}")
print(f"Study Date: {ds.StudyDate}")
print(f"Modality: {ds.Modality}")
# Display all elements
print(ds)
Key points:
- dcmread() returns a Dataset object
- Access data elements using attribute notation (e.g., ds.PatientName) or tag notation (e.g., ds[0x0010, 0x0010])
- Use ds.file_meta to access file metadata like Transfer Syntax UID
- Handle missing attributes with getattr(ds, 'AttributeName', default_value) or hasattr(ds, 'AttributeName')
Extract and manipulate image data from DICOM files:
import pydicom
import numpy as np
import matplotlib.pyplot as plt
# Read DICOM file
ds = pydicom.dcmread('image.dcm')
# Get pixel array (requires numpy)
pixel_array = ds.pixel_array
# Image information
print(f"Shape: {pixel_array.shape}")
print(f"Data type: {pixel_array.dtype}")
print(f"Rows: {ds.Rows}, Columns: {ds.Columns}")
# Apply windowing for display (CT/MRI)
if hasattr(ds, 'WindowCenter') and hasattr(ds, 'WindowWidth'):
    from pydicom.pixel_data_handlers.util import apply_voi_lut
    windowed_image = apply_voi_lut(pixel_array, ds)
else:
    windowed_image = pixel_array
# Display image
plt.imshow(windowed_image, cmap='gray')
plt.title(f"{ds.Modality} - {ds.get('StudyDescription', 'N/A')}")  # StudyDescription is optional
plt.axis('off')
plt.show()
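The windowing step above relies on `apply_voi_lut`. For intuition, the linear window it is based on can be sketched in plain Python. This is a simplified form (the DICOM standard's exact formula includes small half-pixel offsets), and `apply_window` is a hypothetical helper for illustration, not a pydicom function.

```python
def apply_window(value, center, width, out_min=0, out_max=255):
    """Map one stored value into [out_min, out_max] using a linear window.

    Simplified sketch: values at or below the window clamp to out_min,
    values at or above it clamp to out_max, values inside scale linearly.
    """
    lower = center - width / 2
    upper = center + width / 2
    if value <= lower:
        return out_min
    if value >= upper:
        return out_max
    fraction = (value - lower) / (upper - lower)
    return round(out_min + fraction * (out_max - out_min))


# Illustrative window: center=40, width=400
row = [-1000, -160, 40, 240, 3000]
windowed = [apply_window(v, center=40, width=400) for v in row]
print(windowed)
```

A narrow width stretches a small range of stored values across the full display range, which is why the same image can look too dark or too bright under a different window.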
Working with color images:
# RGB images have shape (rows, columns, 3)
if ds.PhotometricInterpretation == 'RGB':
    rgb_image = ds.pixel_array
    plt.imshow(rgb_image)
elif ds.PhotometricInterpretation == 'YBR_FULL':
    from pydicom.pixel_data_handlers.util import convert_color_space
    rgb_image = convert_color_space(ds.pixel_array, 'YBR_FULL', 'RGB')
    plt.imshow(rgb_image)
Multi-frame images (videos/series):
# For multi-frame DICOM files
if hasattr(ds, 'NumberOfFrames') and ds.NumberOfFrames > 1:
    frames = ds.pixel_array  # Shape: (num_frames, rows, columns)
    print(f"Number of frames: {frames.shape[0]}")
    # Display specific frame
    plt.imshow(frames[0], cmap='gray')
Use the provided dicom_to_image.py script or convert manually:
from PIL import Image
import pydicom
import numpy as np
ds = pydicom.dcmread('input.dcm')
pixel_array = ds.pixel_array
# Normalize to 0-255 range
if pixel_array.dtype != np.uint8:
    pixel_array = ((pixel_array - pixel_array.min()) /
                   (pixel_array.max() - pixel_array.min()) * 255).astype(np.uint8)
# Save as PNG
image = Image.fromarray(pixel_array)
image.save('output.png')
Use the script: python scripts/dicom_to_image.py input.dcm output.png
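The manual normalization above divides by `max - min`, which is zero for a constant image. A minimal sketch of the same min-max scaling with that case guarded, written in plain Python on a flat list so it runs without NumPy; `normalize_to_uint8` is a hypothetical helper, not part of the skill's scripts.

```python
def normalize_to_uint8(values):
    """Min-max scale a flat list of numbers to integers in 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant image: avoid dividing by zero and return all zeros
        return [0 for _ in values]
    # Same order of operations as the NumPy version: scale to 0..1, then to 0..255
    return [int((v - lo) / (hi - lo) * 255) for v in values]


print(normalize_to_uint8([100, 150, 200]))
print(normalize_to_uint8([7, 7, 7]))
```

Truncating with `int()` mirrors what `.astype(np.uint8)` does to non-negative floats.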
Modify DICOM data elements:
import pydicom
from datetime import datetime
ds = pydicom.dcmread('input.dcm')
# Modify existing elements
ds.PatientName = "Doe^John"
ds.StudyDate = datetime.now().strftime('%Y%m%d')
ds.StudyDescription = "Modified Study"
# Add new elements
ds.SeriesNumber = 1
ds.SeriesDescription = "New Series"
# Remove elements
if hasattr(ds, 'PatientComments'):
    delattr(ds, 'PatientComments')
# Or using del
if 'PatientComments' in ds:
    del ds.PatientComments
# Save modified file
ds.save_as('modified.dcm')
Remove or replace patient identifiable information:
import pydicom
from datetime import datetime
ds = pydicom.dcmread('input.dcm')
# Tags commonly containing PHI (Protected Health Information)
tags_to_anonymize = [
    'PatientName', 'PatientID', 'PatientBirthDate',
    'PatientSex', 'PatientAge', 'PatientAddress',
    'InstitutionName', 'InstitutionAddress',
    'ReferringPhysicianName', 'PerformingPhysicianName',
    'OperatorsName', 'StudyDescription', 'SeriesDescription',
]
# Remove or replace sensitive data
for tag in tags_to_anonymize:
    if hasattr(ds, tag):
        if tag in ['PatientName', 'PatientID']:
            setattr(ds, tag, 'ANONYMOUS')
        elif tag == 'PatientBirthDate':
            setattr(ds, tag, '19000101')
        else:
            delattr(ds, tag)
# Replace the study date with a fixed placeholder
if hasattr(ds, 'StudyDate'):
    # A fixed value discards temporal relationships; shift every date by
    # the same offset instead if intervals between studies must be kept
    ds.StudyDate = '20000101'
# Keep pixel data intact
ds.save_as('anonymized.dcm')
Use the provided script: python scripts/anonymize_dicom.py input.dcm output.dcm
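The example above overwrites StudyDate with a fixed value. If intervals between studies need to survive anonymization, one option is to shift every date for a patient by the same offset. The sketch below derives that offset from a hash of a secret key plus the patient identifier. `shift_dicom_date`, the key format, and `max_days` are all assumptions for illustration; this is not what `anonymize_dicom.py` is known to do.

```python
import hashlib
from datetime import datetime, timedelta


def shift_dicom_date(da_value, patient_key, max_days=365):
    """Shift a DICOM DA string (YYYYMMDD) back by a per-patient offset.

    The offset comes from a hash of patient_key, so every date for the
    same patient moves by the same number of days and the gaps between
    that patient's studies are preserved.
    """
    digest = hashlib.sha256(patient_key.encode("utf-8")).digest()
    offset_days = 1 + int.from_bytes(digest[:4], "big") % max_days
    original = datetime.strptime(da_value, "%Y%m%d")
    shifted = original - timedelta(days=offset_days)
    return shifted.strftime("%Y%m%d")


# Two studies ten days apart stay ten days apart after shifting
first = shift_dicom_date("20240110", patient_key="secret-salt:PAT001")
second = shift_dicom_date("20240120", patient_key="secret-salt:PAT001")
print(first, second)
```

The offset is only as private as the key: anyone who knows the salt and the patient ID can reverse the shift, so the salt has to be kept out of the output files.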
Create DICOM files from scratch:
import pydicom
from pydicom.dataset import Dataset, FileDataset
from datetime import datetime
import numpy as np
# Create file meta information
file_meta = Dataset()
file_meta.MediaStorageSOPClassUID = pydicom.uid.CTImageStorage  # must match ds.SOPClassUID set below
file_meta.MediaStorageSOPInstanceUID = pydicom.uid.generate_uid()
file_meta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian
# Create the FileDataset instance
ds = FileDataset('new_dicom.dcm', {}, file_meta=file_meta, preamble=b"\0" * 128)
# Add required DICOM elements
ds.PatientName = "Test^Patient"
ds.PatientID = "123456"
ds.Modality = "CT"
ds.StudyDate = datetime.now().strftime('%Y%m%d')
ds.StudyTime = datetime.now().strftime('%H%M%S')
ds.ContentDate = ds.StudyDate
ds.ContentTime = ds.StudyTime
# Add image-specific elements
ds.SamplesPerPixel = 1
ds.PhotometricInterpretation = "MONOCHROME2"
ds.Rows = 512
ds.Columns = 512
ds.BitsAllocated = 16
ds.BitsStored = 16
ds.HighBit = 15
ds.PixelRepresentation = 0
# Create pixel data
pixel_array = np.random.randint(0, 4096, (512, 512), dtype=np.uint16)
ds.PixelData = pixel_array.tobytes()
# Add required UIDs
ds.SOPClassUID = pydicom.uid.CTImageStorage
ds.SOPInstanceUID = file_meta.MediaStorageSOPInstanceUID
ds.SeriesInstanceUID = pydicom.uid.generate_uid()
ds.StudyInstanceUID = pydicom.uid.generate_uid()
# Save the file
ds.save_as('new_dicom.dcm')
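When building Pixel Data by hand as above, the byte length has to agree with Rows, Columns, BitsAllocated, and SamplesPerPixel. A small sanity check can be sketched as follows. `expected_pixel_bytes` is a hypothetical helper, and it assumes uncompressed data with BitsAllocated a multiple of 8.

```python
def expected_pixel_bytes(rows, columns, bits_allocated,
                         samples_per_pixel=1, frames=1):
    """Return the byte length uncompressed Pixel Data should have."""
    bytes_per_sample = bits_allocated // 8
    length = rows * columns * samples_per_pixel * frames * bytes_per_sample
    # DICOM element values have even length, so an odd size gets one pad byte
    return length + (length % 2)


# The 512x512, 16-bit, single-sample image from the example above
print(expected_pixel_bytes(512, 512, 16))
# A 3x3, 8-bit image has 9 bytes of samples plus one pad byte
print(expected_pixel_bytes(3, 3, 8))
```

Comparing this value with `len(ds.PixelData)` before saving catches a mismatched dtype (for example a `uint8` array stored with `BitsAllocated = 16`).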
Handle compressed DICOM files:
import pydicom
# Read compressed DICOM file
ds = pydicom.dcmread('compressed.dcm')
# Check transfer syntax
print(f"Transfer Syntax: {ds.file_meta.TransferSyntaxUID}")
print(f"Transfer Syntax Name: {ds.file_meta.TransferSyntaxUID.name}")
# Decompress and save as uncompressed
ds.decompress()
ds.save_as('uncompressed.dcm', write_like_original=False)
# Or compress before saving (requires an encoder for the target syntax;
# RLE Lossless is used here because pydicom ships an encoder for it)
ds_uncompressed = pydicom.dcmread('uncompressed.dcm')
ds_uncompressed.compress(pydicom.uid.RLELossless)
ds_uncompressed.save_as('compressed_rle.dcm')
Common transfer syntaxes:
- ExplicitVRLittleEndian - Uncompressed, most common
- JPEGBaseline8Bit - JPEG lossy compression
- JPEGLossless - JPEG lossless compression
- JPEG2000Lossless - JPEG 2000 lossless
- RLELossless - Run-Length Encoding lossless
See references/transfer_syntaxes.md for complete list.
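Each of the names listed above is a constant for a UID string stored in the file meta information. The lookup below sketches that mapping in plain Python for the five syntaxes listed; `TRANSFER_SYNTAXES` and `describe_transfer_syntax` are hypothetical names for illustration. With pydicom itself the same information is available from the UID object's `name` and `is_compressed` properties.

```python
# UID string -> (pydicom constant name, whether pixel data is compressed)
TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2.1": ("ExplicitVRLittleEndian", False),
    "1.2.840.10008.1.2.4.50": ("JPEGBaseline8Bit", True),
    "1.2.840.10008.1.2.4.57": ("JPEGLossless", True),
    "1.2.840.10008.1.2.4.90": ("JPEG2000Lossless", True),
    "1.2.840.10008.1.2.5": ("RLELossless", True),
}


def describe_transfer_syntax(uid):
    """Return (name, is_compressed); unknown UIDs give ('unknown', None)."""
    return TRANSFER_SYNTAXES.get(uid, ("unknown", None))


print(describe_transfer_syntax("1.2.840.10008.1.2.5"))
print(describe_transfer_syntax("1.2.3.4"))
```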
Handle nested data structures:
import pydicom
ds = pydicom.dcmread('file.dcm')
# Access sequences
if 'ReferencedStudySequence' in ds:
    for item in ds.ReferencedStudySequence:
        print(f"Referenced SOP Instance UID: {item.ReferencedSOPInstanceUID}")
# Create a sequence
from pydicom.dataset import Dataset
from pydicom.sequence import Sequence
sequence_item = Dataset()
sequence_item.ReferencedSOPClassUID = pydicom.uid.CTImageStorage
sequence_item.ReferencedSOPInstanceUID = pydicom.uid.generate_uid()
ds.ReferencedImageSequence = Sequence([sequence_item])
Work with multiple related DICOM files:
import pydicom
import numpy as np
from pathlib import Path
# Read all DICOM files in a directory
dicom_dir = Path('dicom_series/')
slices = []
for file_path in dicom_dir.glob('*.dcm'):
    ds = pydicom.dcmread(file_path)
    slices.append(ds)
# Sort by slice location or instance number
slices.sort(key=lambda x: float(x.ImagePositionPatient[2]))
# Or: slices.sort(key=lambda x: int(x.InstanceNumber))
# Create 3D volume
volume = np.stack([s.pixel_array for s in slices])
print(f"Volume shape: {volume.shape}") # (num_slices, rows, columns)
# Get spacing information for proper scaling
pixel_spacing = slices[0].PixelSpacing # [row_spacing, col_spacing]
slice_thickness = slices[0].SliceThickness
print(f"Voxel size: {pixel_spacing[0]}x{pixel_spacing[1]}x{slice_thickness} mm")
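The sort above fails if a file lacks ImagePositionPatient, and SliceThickness does not always equal the distance between slices. The sketch below picks the sort key based on what the whole series provides and measures the actual gaps. `sort_slices` and `slice_gaps` are hypothetical helpers, the `SimpleNamespace` objects stand in for pydicom datasets, and the z-coordinate ordering assumes an axial series.

```python
from types import SimpleNamespace


def sort_slices(slices):
    """Sort by z position when every slice has one, else by InstanceNumber."""
    if all(getattr(s, "ImagePositionPatient", None) is not None for s in slices):
        return sorted(slices, key=lambda s: float(s.ImagePositionPatient[2]))
    return sorted(slices, key=lambda s: int(s.InstanceNumber))


def slice_gaps(sorted_slices):
    """Distances between consecutive slice z positions, in file units (mm)."""
    z = [float(s.ImagePositionPatient[2]) for s in sorted_slices]
    return [round(b - a, 6) for a, b in zip(z, z[1:])]


# Stand-in datasets: (InstanceNumber, z position), deliberately out of order
demo = [
    SimpleNamespace(ImagePositionPatient=[0.0, 0.0, z], InstanceNumber=n)
    for n, z in [(1, 5.0), (2, 0.0), (3, 2.5)]
]
ordered = sort_slices(demo)
print([s.InstanceNumber for s in ordered])
print(slice_gaps(ordered))
```

Unequal values in the gap list point to missing or duplicated slices, which is worth checking before treating the stacked array as a regular volume.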
This skill includes utility scripts in the scripts/ directory:
Anonymize DICOM files by removing or replacing Protected Health Information (PHI).
python scripts/anonymize_dicom.py input.dcm output.dcm
Convert DICOM files to common image formats (PNG, JPEG, TIFF).
python scripts/dicom_to_image.py input.dcm output.png
python scripts/dicom_to_image.py input.dcm output.jpg --format JPEG
Extract and display DICOM metadata in a readable format.
python scripts/extract_metadata.py file.dcm
python scripts/extract_metadata.py file.dcm --output metadata.txt
Detailed reference information is available in the references/ directory:
Issue: "Unable to decode pixel data"
Solution: install the additional decoding packages: uv pip install pylibjpeg pylibjpeg-libjpeg python-gdcm
Issue: "AttributeError" when accessing tags
Solution: check that the attribute exists with hasattr(ds, 'AttributeName') or use ds.get('AttributeName', default)
Issue: Incorrect image display (too dark/bright)
Solution: apply windowing with apply_voi_lut(pixel_array, ds) or manually adjust with WindowCenter and WindowWidth
Issue: Memory issues with large series
- Guard against missing attributes with hasattr() or get()
- Use save_as() with write_like_original=True to write a dataset back as it was read
Official pydicom documentation: https://pydicom.github.io/pydicom/dev/
Weekly Installs: 55
Repository GitHub Stars: 17.3K
First Seen: Jan 20, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Pass)
Installed on: opencode (48), codex (47), gemini-cli (47), cursor (45), claude-code (44), github-copilot (43)