administering-linux by ancoleman/ai-design-components
npx skills add https://github.com/ancoleman/ai-design-components --skill administering-linux
Comprehensive Linux system administration for managing servers, deploying applications, and troubleshooting production issues in modern cloud-native environments.
This skill teaches fundamental and intermediate Linux administration for DevOps engineers, SREs, backend developers, and platform engineers. It focuses on systemd-based distributions (Ubuntu, RHEL, Debian, Fedora) and covers service management, process monitoring, filesystem operations, user administration, performance tuning, log analysis, and network configuration.
Modern infrastructure requires solid Linux fundamentals even with containerization. Container hosts run Linux, Kubernetes nodes need optimization, and troubleshooting production issues requires understanding systemd, processes, and logs.
Not Covered:
network-architecture skill
security-hardening skill
configuration-management skill
kubernetes-operations skill
Use when deploying custom applications, troubleshooting slow systems, investigating service failures, optimizing workloads, managing users, configuring SSH, monitoring disk space, scheduling tasks, diagnosing network issues, or applying performance tuning.
Service Management:
systemctl start nginx # Start service
systemctl stop nginx # Stop service
systemctl restart nginx # Restart service
systemctl status nginx # Check status
systemctl enable nginx # Enable at boot
journalctl -u nginx -f # Follow service logs
Process Monitoring:
top # Interactive process monitor
htop # Enhanced process monitor
ps aux | grep process_name # Find specific process
kill -15 PID # Graceful shutdown (SIGTERM)
kill -9 PID # Force kill (SIGKILL)
Disk Usage:
df -h # Filesystem usage
du -sh /path/to/dir # Directory size
ncdu /path # Interactive disk analyzer
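For routine disk-space monitoring, the `df` output can be filtered by a usage threshold. The following is a minimal sketch rather than part of the skill: the function name `filesystems_over` and the default of 90 percent are hypothetical, and it assumes mount points without spaces.

```shell
#!/usr/bin/env bash
# Print mount points whose usage is at or above a threshold (percent).
# Reads `df -P` formatted text on stdin so it can be tried on sample input.
filesystems_over() {
    local threshold="${1:-90}"   # hypothetical default; choose what fits your alerting
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header row
            use = $5
            sub(/%/, "", use)         # "91%" -> "91"
            if (use + 0 >= t) print $6 " " use "%"
        }'
}

# Usage: df -P | filesystems_over 80
```

Reading from stdin keeps the parsing separate from the `df` call, so the same function works on saved output from another host.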
Log Analysis:
journalctl -f # Follow all logs
journalctl -u service -f # Follow service logs
journalctl --since "1 hour ago" # Filter by time
journalctl -p err # Show errors only
User Management:
useradd -m -s /bin/bash username # Create user with home dir
passwd username # Set password
usermod -aG sudo username # Add to sudo group (the group is wheel on RHEL/Fedora)
userdel -r username # Delete user and home dir
Systemd is the standard init system and service manager. Systemd units define services, timers, targets, and other system resources.
Unit File Locations (priority order):
/etc/systemd/system/ - Custom units (highest priority)
/run/systemd/system/ - Runtime units (transient)
/lib/systemd/system/ - System-provided units (don't modify)
Key Unit Types: .service (services), .timer (scheduled tasks), .target (unit groups), .socket (socket-activated)
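The priority order above means a unit in /etc/systemd/system/ shadows a unit of the same name in the lower-priority directories. The sketch below illustrates that lookup order only; `find_unit_file` and the `UNIT_DIRS` override are hypothetical names introduced here, and real systemd searches additional paths and also merges drop-in files.

```shell
#!/usr/bin/env bash
# Print the first path where a unit file exists, checking directories in the
# priority order listed above. UNIT_DIRS is a hypothetical override so the
# function can be pointed at scratch directories.
find_unit_file() {
    local unit="$1"
    local dir
    # Unquoted on purpose: the directory list is split on whitespace.
    for dir in ${UNIT_DIRS:-/etc/systemd/system /run/systemd/system /lib/systemd/system}; do
        if [ -e "$dir/$unit" ]; then
            printf '%s\n' "$dir/$unit"
            return 0
        fi
    done
    return 1
}

# Usage: find_unit_file nginx.service
```

On a live system, `systemctl show -p FragmentPath nginx.service` reports the file systemd actually loaded, which is the authoritative answer.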
Essential systemctl Commands:
systemctl daemon-reload # Reload unit files after changes
systemctl list-units --type=service # List loaded service units
systemctl list-timers # Show all timers
systemctl cat nginx.service # Show unit file content
systemctl edit nginx.service # Create override file
For detailed systemd reference, see references/systemd-guide.md.
Processes are running programs with unique PIDs. Understanding process states, signals, and resource usage is essential for troubleshooting.
Process States: R (running), S (sleeping), D (uninterruptible sleep/I/O), Z (zombie), T (stopped)
Common Signals: SIGTERM (15) graceful, SIGKILL (9) force, SIGHUP (1) reload config
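The usual way to combine these signals is to send SIGTERM first and fall back to SIGKILL only if the process does not exit. A minimal sketch of that pattern follows; the function name `stop_gracefully` and the 10-second default timeout are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Ask a process to exit with SIGTERM, wait up to a timeout, then fall back
# to SIGKILL. Returns 0 if the process was gone before the fallback,
# 1 if SIGKILL was sent, 2 if the initial signal could not be delivered.
stop_gracefully() {
    local pid="$1"
    local timeout="${2:-10}"   # seconds; hypothetical default
    local waited=0

    kill -TERM "$pid" 2>/dev/null || return 2   # no such process or not permitted

    while [ "$waited" -lt "$timeout" ]; do
        # kill -0 sends no signal; it only tests whether the PID can be signalled.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    kill -0 "$pid" 2>/dev/null || return 0
    kill -KILL "$pid" 2>/dev/null
    return 1
}

# Usage: stop_gracefully 12345 30
```

For services managed by systemd, `systemctl stop` already applies the same escalation, so a helper like this is mainly useful for processes started outside systemd.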
Process Priority:
nice -n 10 command # Start with lower priority
renice -n 5 -p PID # Change priority of running process
Essential directories: / (root), /etc/ (config), /var/ (variable data), /opt/ (optional software), /usr/ (user programs), /home/ (user directories), /tmp/ (temporary), /boot/ (boot loader)
Filesystem Types Quick Reference:
For filesystem management details including LVM and RAID, see references/filesystem-management.md.
Ubuntu/Debian (apt):
apt update && apt upgrade # Update system
apt install package # Install package
apt remove package # Remove package
apt search keyword # Search packages
RHEL/CentOS/Fedora (dnf):
dnf update # Update all packages
dnf install package # Install package
dnf remove package # Remove package
dnf search keyword # Search packages
Use native package managers for system services; snap/flatpak for desktop apps and cross-distro compatibility.
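Scripts that must run on several distributions often pick the package manager from the `ID` field in /etc/os-release. The mapping below is a small sketch of that idea; the function name `pkg_manager_for` is hypothetical, and it ignores derivative distributions and older releases that still use yum.

```shell
#!/usr/bin/env bash
# Map an os-release ID to the native package manager named in this skill.
pkg_manager_for() {
    case "$1" in
        ubuntu|debian)      echo apt ;;
        rhel|centos|fedora) echo dnf ;;
        arch)               echo pacman ;;
        *)                  echo unknown; return 1 ;;
    esac
}

# Usage:
#   . /etc/os-release && pkg_manager_for "$ID"
```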
Investigation Workflow:
Identify bottleneck:
top # Quick overview
uptime # Load averages
CPU Issues (usage >80%):
top # Press Shift+P to sort by CPU
ps aux --sort=-%cpu | head # Top CPU consumers
Memory Issues (swap used):
free -h # Memory usage
top # Press Shift+M to sort by memory
Disk I/O Issues (high wa%):
iostat -x 1 # Disk statistics
iotop # I/O by process
Network Issues:
ss -tunap # Active connections
iftop # Bandwidth monitor
For comprehensive troubleshooting, see references/troubleshooting-guide.md.
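The "swap used" check in the workflow above can be scripted from /proc/meminfo instead of being read off `free -h`. This is a sketch under the assumption that the input has the usual `SwapTotal:` and `SwapFree:` lines; the function name `swap_used_kib` is hypothetical.

```shell
#!/usr/bin/env bash
# Print swap currently in use, in KiB, from /proc/meminfo-style text on stdin.
swap_used_kib() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free  = $2 }
        END { print total - free }'
}

# Usage: swap_used_kib < /proc/meminfo
```

A non-zero result only shows that pages have been swapped out at some point; sustained swap-in and swap-out activity is the stronger sign of memory pressure.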
Quick Decision:
Step 1: Create unit file
sudo nano /etc/systemd/system/myapp.service
Step 2: Unit file content
[Unit]
Description=My Web Application
After=network.target postgresql.service
Requires=postgresql.service
[Service]
Type=simple
User=myapp
Group=myapp
WorkingDirectory=/opt/myapp
Environment="PORT=8080"
ExecStart=/opt/myapp/bin/server
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5s
StandardOutput=journal
# Security hardening
PrivateTmp=true
NoNewPrivileges=true
ProtectSystem=strict
ReadWritePaths=/var/lib/myapp
[Install]
WantedBy=multi-user.target
Step 3: Deploy and start
sudo useradd -r -s /bin/false myapp
sudo mkdir -p /var/lib/myapp
sudo chown myapp:myapp /var/lib/myapp
sudo systemctl daemon-reload
sudo systemctl enable myapp.service
sudo systemctl start myapp.service
sudo systemctl status myapp.service
For complete examples, see examples/systemd-units/.
Create a service unit and a timer unit for each scheduled task. The timer unit specifies the schedule with OnCalendar= and sets Persistent=true so that a run missed while the machine was off is started at the next boot. The service unit uses Type=oneshot. See examples/systemd-units/backup.timer and backup.service for complete examples.
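As a rough illustration of the service and timer pairing described above, the sketch below writes both units into a directory of your choice. It is not the content of the skill's examples/ files: the function name `write_backup_units`, the script path /usr/local/bin/backup.sh, and the 02:00 schedule are placeholders.

```shell
#!/usr/bin/env bash
# Write a paired oneshot service and timer unit into a target directory.
# The ExecStart path and the schedule are placeholders, not the skill's files.
write_backup_units() {
    local dir="${1:?usage: write_backup_units DIR}"

    cat > "$dir/backup.service" <<'EOF'
[Unit]
Description=Nightly backup job

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
EOF

    cat > "$dir/backup.timer" <<'EOF'
[Unit]
Description=Run backup.service every day at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Usage (as root):
#   write_backup_units /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable --now backup.timer
```

The timer activates the service with the same base name by default, which is why only the timer needs to be enabled.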
Generate SSH key:
ssh-keygen -t ed25519 -C "admin@example.com"
ssh-copy-id admin@server
Harden sshd_config:
sudo nano /etc/ssh/sshd_config
Key settings:
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
AllowUsers admin deploy
X11Forwarding no
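# Optional: listen on a non-default port (keep comments on their own line; not every sshd version accepts trailing comments)
Port 2222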
Apply changes:
sudo sshd -t # Test
sudo systemctl restart sshd # Apply (keep backup session!); the unit is named ssh on Debian/Ubuntu
For complete SSH configuration, see examples/configs/sshd_config.hardened and references/security-hardening.md.
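After editing sshd_config, it can help to verify what sshd will actually apply. `sshd -T` prints the effective configuration as lowercase "keyword value" lines, and the sketch below compares that output with a few of the settings listed above. The function name `check_sshd_hardening` and the choice of which keys to check are assumptions.

```shell
#!/usr/bin/env bash
# Compare `sshd -T` style output on stdin with a few hardened settings.
# Prints each mismatch or missing key and returns non-zero if any is found.
check_sshd_hardening() {
    awk '
        BEGIN {
            want["permitrootlogin"]        = "no"
            want["passwordauthentication"] = "no"
            want["pubkeyauthentication"]   = "yes"
            want["x11forwarding"]          = "no"
        }
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) { print "MISMATCH " $1 "=" $2; bad = 1 }
        }
        END {
            for (k in want) if (!(k in seen)) { print "MISSING " k; bad = 1 }
            exit bad
        }'
}

# Usage (as root): sshd -T | check_sshd_hardening
```

Checking the effective configuration rather than the file catches settings overridden by included snippets or Match blocks.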
Configure sysctl parameters in /etc/sysctl.d/99-custom.conf for network tuning (TCP buffers, BBR congestion control), memory management (swappiness, cache pressure), and file descriptors. Set ulimits in /etc/security/limits.conf for nofile and nproc. Note that limits.conf applies to PAM login sessions; systemd services take their limits from unit directives such as LimitNOFILE=. Configure I/O schedulers and CPU governors. For comprehensive tuning, see references/performance-tuning.md; templates are in examples/configs/.
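To make the sysctl categories above concrete, here is a sketch that writes a drop-in file covering each of them. Every value is an illustrative placeholder rather than a recommendation from the skill, and the function name `write_sysctl_tuning` is hypothetical; appropriate numbers depend on the workload and kernel.

```shell
#!/usr/bin/env bash
# Write an example sysctl drop-in. All values are illustrative placeholders.
write_sysctl_tuning() {
    local file="${1:?usage: write_sysctl_tuning FILE}"
    cat > "$file" <<'EOF'
# Network: BBR congestion control (needs the tcp_bbr module) with the fq qdisc
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
# Network: larger maximum socket buffers (bytes)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Memory: swap less eagerly and keep dentry/inode caches longer
vm.swappiness = 10
vm.vfs_cache_pressure = 50
# File descriptors: system-wide limit
fs.file-max = 2097152
EOF
}

# Usage (as root):
#   write_sysctl_tuning /etc/sysctl.d/99-custom.conf
#   sysctl --system
```

`sysctl --system` reloads every sysctl configuration file, so the new values take effect without a reboot.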
Use systemctl status myapp and journalctl -u myapp to investigate issues. Filter logs by time (--since), by severity (-p err), or by pattern with grep. Correlate what you find with system metrics from top, df -h, and free -h. Check for OOM kills with journalctl -k | grep -i oom. For detailed workflows, see references/troubleshooting-guide.md.
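The OOM check above can be turned into a count that is easy to alert on. The sketch below reads kernel log text on stdin; the function name `count_oom_events` is hypothetical, and the two patterns are a heuristic, since kernel message wording varies between versions.

```shell
#!/usr/bin/env bash
# Count log lines that look like OOM-killer activity (heuristic patterns).
count_oom_events() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable under "set -e" while still printing the count.
    grep -Eic 'out of memory|oom-kill' || true
}

# Usage: journalctl -k --since "1 hour ago" | count_oom_events
```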
Interface Management:
ip addr show # Show all interfaces
ip link set eth0 up # Bring interface up
ip addr add 192.168.1.100/24 dev eth0
Routing:
ip route show # Show routing table
ip route get 8.8.8.8 # Show route to IP
ip route add 10.0.0.0/24 via 192.168.1.1
Socket Statistics:
ss -tunap # All TCP/UDP connections
ss -tlnp # Listening TCP ports
ss -ulnp # Listening UDP ports
ss -tnp state established # Established connections
Ubuntu (ufw):
sudo ufw status
sudo ufw default deny incoming
sudo ufw allow 22/tcp # Allow SSH (add this before enabling to avoid a lockout)
sudo ufw allow 80/tcp # Allow HTTP
sudo ufw allow from 192.168.1.0/24 # Allow from subnet
sudo ufw enable
RHEL/CentOS (firewalld):
firewall-cmd --state
firewall-cmd --list-all
firewall-cmd --add-service=http --permanent
firewall-cmd --add-port=8080/tcp --permanent
firewall-cmd --reload
For complete network configuration including netplan, NetworkManager, and DNS, see references/network-configuration.md.
crontab -e # Edit user crontab
# Format: minute hour day month weekday command
0 2 * * * /usr/local/bin/backup.sh # Daily at 2:00 AM
*/5 * * * * /usr/local/bin/check-health.sh # Every 5 minutes
0 3 * * 0 /usr/local/bin/weekly-cleanup.sh # Weekly Sunday 3 AM
@reboot /usr/local/bin/startup-script.sh # Run at boot
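Systemd timer equivalents (the annotations are for reference only; unit files do not accept trailing comments):
OnCalendar=daily # Every day at midnight
OnCalendar=*-*-* 02:00:00 # Daily at 2:00 AM
OnCalendar=Mon *-*-* 09:00:00 # Every Monday at 9 AM
OnCalendar=*-*-01 00:00:00 # 1st of every month
OnBootSec=5min # 5 minutes after boot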
top, htop - Real-time process monitor
ps - Report process status
pgrep/pkill - Find/kill by name
journalctl - Query systemd journal
grep - Search text patterns
tail -f - Follow log files
df - Disk space usage
du - Directory space usage
lsblk - List block devices
ncdu - Interactive disk analyzer
ip - Network configuration
ss - Socket statistics
ping - Test connectivity
dig/nslookup - DNS queries
tcpdump - Packet capture
Linux administration is the foundation for Kubernetes node management: node optimization (sysctl tuning), the kubelet running as a systemd service, container logs via journald, and cgroups for resource limits.
Example:
# /etc/sysctl.d/99-kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
For Kubernetes-specific operations, see kubernetes-operations skill.
Linux administration provides knowledge; configuration management automates it. Ansible playbooks automate systemd service creation and system tuning.
For automation at scale, see configuration-management skill.
This skill covers SSH and firewall basics. For advanced security (MFA, certificates, CIS benchmarks, compliance), see security-hardening skill.
CI/CD pipelines deploy to Linux servers using these skills: systemctl for deployment and journalctl for monitoring.
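A pipeline deploy step built on those two commands might look like the sketch below. It assumes the step runs on the target host with enough privileges to manage the unit; the function name `deploy_service` and the unit name in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# Restart a service and verify it came up; print recent logs if it did not.
# Assumes sufficient privileges to manage the unit (for example, run as root).
deploy_service() {
    local svc="${1:?usage: deploy_service UNIT}"

    systemctl restart "$svc" || return 1
    if systemctl is-active --quiet "$svc"; then
        echo "deploy ok: $svc is active"
        return 0
    fi
    echo "deploy failed: $svc is not active" >&2
    journalctl -u "$svc" -n 50 --no-pager >&2
    return 1
}

# Usage: deploy_service myapp.service
```

For a Type=simple unit, is-active can report success even if the process crashes moments later, so a real pipeline would probably add an application-level health check after this step.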
For deployment automation, see building-ci-pipelines skill.
references/systemd-guide.md - Comprehensive systemd reference (unit files, dependencies, targets)
references/performance-tuning.md - Complete sysctl, ulimits, cgroups, I/O scheduler guide
references/filesystem-management.md - LVM, RAID, filesystem types, permissions
references/network-configuration.md - ip/ss commands, netplan, NetworkManager, DNS, firewall
references/security-hardening.md - SSH hardening, firewall, SELinux/AppArmor basics
references/troubleshooting-guide.md - Common issues, diagnostic workflows, solutions
examples/systemd-units/ - Service, timer, and target unit files
examples/scripts/ - Backup, health check, and maintenance scripts
examples/configs/ - sshd_config, sysctl.conf, logrotate examples
Ubuntu/Debian:
Package Manager: apt, Network: netplan, Firewall: ufw, Repositories: /etc/apt/sources.list
RHEL/CentOS/Fedora:
Package Manager: dnf, Network: NetworkManager, Firewall: firewalld, Repositories: /etc/yum.repos.d/, SELinux enabled by default
Arch Linux:
Package Manager: pacman, Network: NetworkManager, Rolling release, AUR for community packages
Official Documentation:
Related Skills:
kubernetes-operations - Container orchestration on Linux
configuration-management - Automate Linux admin at scale
security-hardening - Advanced security and compliance
building-ci-pipelines - Deploy via CI/CD
performance-engineering - Deep performance analysis
Weekly Installs: 61
Repository: ancoleman/ai-design-components
GitHub Stars: 305
First Seen: Jan 28, 2026
Security Audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn)
Installed on: opencode (53), gemini-cli (50), codex (45), github-copilot (44), claude-code (42), cursor (41)