As system administrators and developers, we’ve all been there: your server suddenly slows down, CPU usage spikes, or memory consumption skyrockets. By the time you open top or htop, the culprit process has already disappeared. Those short-lived, resource-hungry processes are notoriously difficult to catch and diagnose.
This frustration led me to create Proc-Monitor – a lightweight, dependency-free Linux process monitoring tool designed specifically to catch these elusive resource hogs and trace them back to their source.
The Problem: Short-Lived Resource Consumers
Traditional monitoring tools like top, htop, or even ps are excellent for real-time snapshots, but they share a common weakness: they only show what’s happening right now. If a process spawns, consumes 90% CPU for 200 milliseconds, and then exits, you’ll likely miss it entirely.
These brief but intense resource spikes can cause:
- Service latency and timeouts
- Degraded user experience
- Cascading failures in microservice architectures
- Mysterious performance issues that seem to “just happen”
Even worse, when you finally do catch a high-resource process, you’re left wondering: Which service or parent process spawned this? Understanding the process hierarchy is crucial for effective troubleshooting.
Introducing Proc-Monitor
Proc-Monitor solves these problems by continuously monitoring the /proc filesystem with configurable check intervals as low as 100ms. When it detects high CPU or RAM usage, it captures comprehensive information including:
- Complete process details (PID, name, command line)
- Resource usage (CPU percentage, RAM usage)
- Parent service detection (which systemd service owns the process)
- Process hierarchy (full parent chain up to init)
- User information (who owns the process)
- Timestamps (when the spike occurred)
All of this happens with zero external dependencies – just Python 3.6+ and the standard library.
Key Features
1. Dual Monitoring Modes
Proc-Monitor supports two distinct monitoring strategies:
Threshold Mode: Captures any process exceeding configured CPU or RAM thresholds. Perfect for catching unexpected spikes and anomalies.
Top-N Mode: Continuously tracks the top N processes by resource usage. Ideal for identifying your system’s most resource-intensive processes over time.
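The dispatch between the two modes can be sketched as follows. The field and config names mirror the documented config.json options, but the structure is illustrative rather than Proc-Monitor's actual internals:

```python
def select_events(processes, config):
    """Pick which sampled processes become events, depending on the mode.

    `processes` is a list of dicts with 'cpu' and 'ram' percentages;
    this structure is illustrative, not the tool's real internals.
    """
    if config["mode"] == "threshold":
        # Report anything that crosses either configured threshold.
        return [p for p in processes
                if p["cpu"] >= config["cpu_threshold"]
                or p["ram"] >= config["ram_threshold"]]
    # top_n mode: always report the N heaviest processes by CPU.
    return sorted(processes, key=lambda p: p["cpu"], reverse=True)[:config["top_n"]]

procs = [{"name": "a", "cpu": 60.0, "ram": 1.0},
         {"name": "b", "cpu": 5.0, "ram": 12.0},
         {"name": "c", "cpu": 2.0, "ram": 1.0}]
print([p["name"] for p in select_events(procs, {"mode": "threshold",
                                                "cpu_threshold": 50.0,
                                                "ram_threshold": 10.0})])
# ['a', 'b']
```

Note that threshold mode reports a process if *either* metric crosses its limit, which is why disabling one of `track_cpu`/`track_ram` (shown later) matters for noise reduction.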
2. Systemd Service Integration
One of Proc-Monitor’s most powerful features is its ability to identify which systemd service spawned a process by parsing cgroup information:
```python
def get_systemd_service(pid):
    """Get the systemd service name for a process."""
    content = read_file_safe(f'/proc/{pid}/cgroup')
    if not content:
        return "Unknown"
    for line in content.split('\n'):
        if '.service' in line:
            parts = line.strip().split('/')
            for part in reversed(parts):
                if '.service' in part:
                    return part
    return "Unknown"
```
This means instead of just knowing that process 12345 was consuming resources, you’ll know it was spawned by apache2.service, docker.service, or your custom application service.
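To make the parsing concrete, here is a standalone version of the same logic operating on a sample cgroup v2 line (the sample content and the helper name are illustrative):

```python
def parse_service(cgroup_content):
    """Extract a systemd service name from /proc/<pid>/cgroup content."""
    for line in cgroup_content.split('\n'):
        if '.service' in line:
            # The service name is the last '.service' path component.
            for part in reversed(line.strip().split('/')):
                if '.service' in part:
                    return part
    return "Unknown"

print(parse_service("0::/system.slice/apache2.service"))  # apache2.service
print(parse_service("0::/user.slice"))                    # Unknown
```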
3. Fast Detection with Configurable Intervals
The tool supports check intervals as low as 0.1 seconds (100ms), making it possible to catch even very brief resource spikes:
```json
{
    "cpu_threshold": 30.0,
    "check_interval": 0.1,
    "track_ram": false
}
```
4. Comprehensive Reporting
When you stop monitoring (CTRL+C), Proc-Monitor generates a detailed JSON report with:
- Aggregated statistics by service
- Individual event details
- Configuration used during monitoring
- Timestamps and resource usage trends
How It Works: Under the Hood
Proc-Monitor directly reads the Linux /proc filesystem without relying on external tools or libraries. Here’s how the core monitoring works:
CPU Usage Calculation
CPU percentage is calculated by comparing process CPU ticks between check intervals:
```python
def calculate_cpu_percent(pid, stat, current_time, total_cpu_delta):
    """Calculate CPU percentage for a process."""
    proc_time = stat['utime'] + stat['stime']
    if pid in prev_proc_stats:
        prev_proc_time, prev_time = prev_proc_stats[pid]
        time_delta = current_time - prev_time
        if time_delta > 0 and total_cpu_delta > 0:
            proc_delta = proc_time - prev_proc_time
            cpu_percent = (proc_delta / total_cpu_delta) * 100 * NUM_CPUS
            return cpu_percent
    return 0.0
```
This approach accounts for multi-core systems and yields accurate CPU percentages for any process observed across at least two samples; at a 100ms interval, that covers most short-lived processes.
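The `total_cpu_delta` input is the change in total system CPU ticks between two samples, taken from the aggregate `cpu` line of /proc/stat. A minimal sketch of deriving one sample (the helper name and sample line are illustrative):

```python
def total_cpu_ticks(stat_first_line):
    """Sum all tick counters from the aggregate 'cpu' line of /proc/stat."""
    fields = stat_first_line.split()
    # fields[0] is the literal label 'cpu'; the rest are per-mode tick
    # counts (user, nice, system, idle, iowait, irq, softirq, steal, ...).
    return sum(int(v) for v in fields[1:])

sample = "cpu  100 5 50 900 10 0 3 0 0 0"
print(total_cpu_ticks(sample))  # 1068
```

Subtracting two such sums taken one check interval apart gives the `total_cpu_delta` used in the percentage formula above.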
Memory Usage Tracking
Memory information is extracted from /proc/<pid>/statm:
```python
def get_process_memory(pid):
    """Get memory usage from /proc/<pid>/statm."""
    content = read_file_safe(f'/proc/{pid}/statm')
    if not content:
        return 0, 0.0
    parts = content.split()
    rss_pages = int(parts[1])
    page_size = os.sysconf('SC_PAGE_SIZE')
    rss_bytes = rss_pages * page_size
    # Calculate percentage of total RAM
    meminfo = read_file_safe('/proc/meminfo')
    # ... calculate percentage
```
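# ... calculate percentage<">
The elided percentage step amounts to reading `MemTotal` from /proc/meminfo and dividing the resident set size by it. A standalone sketch of that lookup (the helper name, sample content, and numbers are illustrative):

```python
def mem_total_kb(meminfo_content):
    """Parse the MemTotal value (in kB) out of /proc/meminfo content."""
    for line in meminfo_content.split('\n'):
        if line.startswith('MemTotal:'):
            # Format: "MemTotal:       16384000 kB"
            return int(line.split()[1])
    return 0

sample = "MemTotal:       16384000 kB\nMemFree:         8000000 kB"
rss_bytes = 12 * 1024 * 1024  # a hypothetical 12 MB resident set
ram_percent = rss_bytes / (mem_total_kb(sample) * 1024) * 100
print(f"{ram_percent:.3f} % of total RAM")
```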
Parent Chain Discovery
Understanding which process spawned the high-resource consumer is crucial:
```python
def get_parent_chain(pid, max_depth=10):
    """Get the parent process chain up to init (PID 1)."""
    chain = []
    current_pid = pid
    for _ in range(max_depth):
        stat = get_process_stat(current_pid)
        if not stat:
            break
        chain.append((current_pid, stat['name']))
        if current_pid == 1:
            break
        current_pid = stat['ppid']
    return chain
```
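The chain returned here is what drives the `Chain:` line in the tool's console output. Rendering it is a one-liner (the helper name is illustrative):

```python
def format_chain(chain):
    """Render a parent chain as 'name(pid) -> ... -> systemd(1)'."""
    return ' -> '.join(f'{name}({pid})' for pid, name in chain)

print(format_chain([(12345, 'stress'), (12300, 'bash'), (1, 'systemd')]))
# stress(12345) -> bash(12300) -> systemd(1)
```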
Getting Started
Installation and Basic Usage
The simplest way to try Proc-Monitor is with a one-line command:
```bash
curl -sL https://raw.githubusercontent.com/cagatayuresin/proc-monitor/main/proc_monitor.py | sudo python3 -
```
Or download and run locally:
```bash
wget https://raw.githubusercontent.com/cagatayuresin/proc-monitor/main/proc_monitor.py
sudo python3 proc_monitor.py
```
Configuration
Create a config.json file for customized monitoring:
```json
{
    "mode": "threshold",
    "top_n": 5,
    "cpu_threshold": 50.0,
    "ram_threshold": 10.0,
    "check_interval": 0.3,
    "output_file": "resource_report.json",
    "track_cpu": true,
    "track_ram": true
}
```
Real-World Use Cases
1. Finding Memory Leaks
Configure aggressive RAM monitoring with a low threshold:
```json
{
    "mode": "threshold",
    "ram_threshold": 5.0,
    "check_interval": 1.0,
    "track_cpu": false,
    "track_ram": true
}
```
This helps identify processes with growing memory consumption over time.
2. Catching CPU Spikes
For debugging sudden CPU usage spikes:
```json
{
    "mode": "threshold",
    "cpu_threshold": 30.0,
    "check_interval": 0.1,
    "track_cpu": true,
    "track_ram": false
}
```
The 100ms check interval ensures even brief spikes are captured.
3. Production System Auditing
Use Top-N mode for continuous monitoring of your most resource-intensive processes:
```json
{
    "mode": "top_n",
    "top_n": 10,
    "check_interval": 0.5
}
```
Example Output
```
[2024-01-15 10:30:45] [CPU] stress (PID:12345)
    CPU: 98.5% | RAM: 0.3% (12.4 MB)
    Service: stress-test.service
    User: root
    Chain: stress(12345) -> bash(12300) -> systemd(1)
    Cmd: /usr/bin/stress --cpu 1
```
The generated JSON report provides even more detail for post-analysis:
```json
{
    "generated_at": "2024-01-15 10:35:22",
    "summary": {
        "total_events": 150,
        "by_service": {
            "apache2.service": {
                "count": 100,
                "processes": [...]
            }
        }
    },
    "events": [...]
}
```
Design Decisions and Trade-offs
Why No External Dependencies?
I deliberately designed Proc-Monitor to use only Python’s standard library for several reasons:
- Universal compatibility: Works on any Linux system with Python 3.6+
- Easy deployment: No pip installs or virtual environments needed
- Minimal attack surface: Fewer dependencies mean fewer security concerns
- Lightweight: No overhead from heavy monitoring frameworks
Why Direct /proc Access?
Reading /proc directly provides:
- Maximum portability across Linux distributions
- No dependency on system utilities that might not be installed
- Fine-grained control over what data to collect
- Minimal performance overhead
Limitations and Considerations
Proc-Monitor is designed for specific use cases and has some intentional limitations:
- Linux only: Requires the /proc filesystem
- Root recommended: Full process information requires elevated privileges
- Disk I/O: Frequent /proc reads can generate disk activity (though minimal)
- Not a replacement: Complements, not replaces, traditional monitoring tools
Performance Considerations
At its default 0.3-second check interval, Proc-Monitor has minimal system impact. However, you can tune performance based on your needs:
- Lower intervals (0.1s): Catches more short-lived processes but uses more CPU
- Higher intervals (1.0s): Lower overhead but might miss brief spikes
- Selective tracking: Disable CPU or RAM tracking if you only need one metric
Future Enhancements
While Proc-Monitor is feature-complete for its intended purpose, potential future additions could include:
- Network usage tracking
- Disk I/O monitoring
- Container/cgroup-aware monitoring
- Alert notifications (email, webhook)
- Web-based dashboard
- Historical trend analysis
Contributing
Proc-Monitor is open source (MIT License) and welcomes contributions! Whether it’s bug reports, feature requests, or pull requests, community involvement helps make the tool better for everyone.
Check out the GitHub repository to get involved.
Conclusion
Proc-Monitor fills a specific gap in the Linux monitoring ecosystem: catching and identifying short-lived, high-resource processes. Its zero-dependency design, dual monitoring modes, and systemd service integration make it a valuable tool for system administrators, DevOps engineers, and developers troubleshooting performance issues.
Whether you’re debugging mysterious CPU spikes, hunting memory leaks, or simply want better visibility into your system’s resource usage, Proc-Monitor provides the insights you need without the complexity of heavyweight monitoring solutions.
Try it today and never wonder “what was that process?” again.
Download: GitHub – proc-monitor
Quick Start: curl -sL https://raw.githubusercontent.com/cagatayuresin/proc-monitor/main/proc_monitor.py | sudo python3 -
License: MIT