Ever had a client chew you out at 2 AM? I have. An investment site I was handling suddenly went down, and users panicked because they couldn't withdraw their money. This wasn't an ordinary app that just displays articles. This was an app that handles people's money. Every second of downtime = lost trust, and lost trust = lost revenue.
That's when it hit me: monitoring isn't a "nice to have". Monitoring is mandatory, especially for finance or investment apps.
Why Monitoring Isn't Optional
I used to think: "Deployed, done." Dead wrong. Production is like a newborn baby; it needs watching 24/7. You can't "set and forget".
Developer expectations:
- Deploy ke server ✅
- Setup SSL ✅
- Done! Time to chill 🍹
Production reality:
- "Why is it suddenly so slow?"
- "Why did the database connection pool run out?"
- "CPU spiking to 100%?"
- "Users keep complaining that checkout fails"
Production isn't the final destination; it's the start of a long journey. Without visibility, you're driving in the dark with no headlights.
Ordinary Apps vs Apps That Handle Money
A catalog app goes down for 30 minutes:
- User: "Oh well, I'll come back later"
- Risk: Low
An investment app goes down for 30 minutes:
- User: "WHERE'S MY MONEY?!" 😱
- Risk: EXTREME
That's why monitoring is a survival mechanism for a financial app. You have to know before users start complaining.
Pain Points I've Run Into
1. Slow Performance: Response time goes from 200ms to 5 seconds. Users are frustrated but haven't complained yet. Without monitoring, you only find out once the server has already crashed.
2. Memory Leak: Memory creeps up slowly, 40% → 60% → 95%. Boom! The server hangs. Usually it's queue jobs that never free their memory.
3. Database Connection Hell: "Too many connections" errors. The connection pool runs dry, whether from long-running queries or scripts that forget to close their connections.
4. Disk Space Full: Log files pile up, storage fills, and the app can no longer write logs or accept file uploads.
What I'll Cover
Based on real experience running financial apps in production:
5 Monitoring Components:
- Server & Infrastructure
- Application Performance (Laravel-specific)
- Database & Cache
- Uptime & Availability
- Business Metrics
A Practical Stack:
- Quick to implement
- Low/zero cost to start
- Actually useful
Time invested in monitoring now = hours of debugging saved later. Plus, you get to sleep soundly.
Part 2: Server & Infrastructure Monitoring - The Foundation You Can't Ignore
Server infrastructure is the foundation of all monitoring. Perfect code is pointless if the server is dying. I once had CPU spike to 100% because of a runaway process. The site was totally unresponsive, full panic mode.
CPU & Memory: Basic Yet Critical
Install htop:
sudo apt install htop -y
In htop you'll see:
- CPU usage per core
- Memory (used, buffers, cache)
- Swap usage (if it's in use, you're in trouble)
Set Up a CPU Alert:
#!/bin/bash
# cpu_monitor.sh - post to Slack when CPU usage crosses the threshold
THRESHOLD=80
# Note: this reads the user-space ("us") figure from top, not total CPU
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
if (( $(echo "$CPU_USAGE > $THRESHOLD" | bc -l) )); then
    curl -X POST https://hooks.slack.com/your-webhook \
        -H 'Content-Type: application/json' \
        -d "{\"text\":\"⚠️ CPU: ${CPU_USAGE}%\"}"
fi
Run it from cron every 5 minutes: */5 * * * * /path/to/cpu_monitor.sh
Memory Leaks in Laravel:
// ❌ Bad
User::all(); // Loads every row into memory
// ✅ Good
User::chunk(100, function ($users) {
    // Process one chunk at a time
});
Disk Space: The Silent Killer
I once got paged at 3 AM because the disk was full. The Laravel logs had grown past 50GB.
# Check space
df -h
du -sh storage/logs/*
Laravel Log Rotation:
// config/logging.php
'daily' => [
'days' => 14, // Keep 14 days only
],
Cleanup Script:
#!/bin/bash
find storage/logs -name "*.log" -mtime +14 -delete
php artisan cache:clear
Network Monitoring
# Install iftop for real-time traffic
sudo apt install iftop
sudo iftop -i eth0
# Bandwidth stats
sudo apt install vnstat
vnstat -d
Netdata - Game Changer
It's seriously powerful but super easy to set up:
bash <(curl -Ss https://my-netdata.io/kickstart.sh)
Access it at http://your-ip:19999. You get:
- Real-time CPU, Memory, Disk, Network
- MySQL/PostgreSQL metrics
- Built-in alerting
Set Up Slack Alerts:
Edit /etc/netdata/health_alarm_notify.conf:
SLACK_WEBHOOK_URL="https://hooks.slack.com/..."
Docker Stats
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
Resource Limits:
# docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
DigitalOcean Monitoring
If you're on DO, install the agent:
curl -sSL https://repos.insights.digitalocean.com/install.sh | sudo bash
You get these for free:
- CPU, Memory, Disk graphs
- Bandwidth tracking
- Email/SMS alerts
Red Flags Checklist
🚨 CRITICAL:
- CPU > 90%
- Memory > 95%
- Free disk < 10%
⚠️ WARNING:
- CPU > 70%
- Memory > 80%
- Free disk < 20%
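The checklist above can be turned into a small classifier, so a script can decide which bucket a reading falls into. This is a minimal sketch, not a finished monitor: `classify_usage` and `classify_free_disk` are hypothetical helper names, values are whole-number percentages, and the disk rows are read as free space remaining.

```bash
#!/usr/bin/env bash
# Classify a usage percentage (CPU or memory) against two thresholds.
# Usage: classify_usage <value> <warning_threshold> <critical_threshold>
classify_usage() {
  local value=$1 warn=$2 crit=$3
  if [ "$value" -gt "$crit" ]; then
    echo "CRITICAL"
  elif [ "$value" -gt "$warn" ]; then
    echo "WARNING"
  else
    echo "OK"
  fi
}

# Free disk space works the other way round: lower is worse.
# Usage: classify_free_disk <free_percent>
classify_free_disk() {
  local free=$1
  if [ "$free" -lt 10 ]; then
    echo "CRITICAL"
  elif [ "$free" -lt 20 ]; then
    echo "WARNING"
  else
    echo "OK"
  fi
}
```

With the thresholds from the checklist, CPU would be checked with `classify_usage "$cpu" 70 90` and memory with `classify_usage "$mem" 80 95`.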
Quick Troubleshooting
top -o %CPU # CPU hog
top -o %MEM # Memory hog
netstat -an | grep ESTABLISHED | wc -l # Count established connections
php artisan queue:failed # List failed queue jobs
sudo systemctl restart nginx # Restart nginx if it's the one stuck
Pro Tips:
- Record baseline metrics
- Monitor trends, not just snapshots
- Automate alerts and cleanup
Part 3: Application Performance Monitoring - Laravel-Specific
Server metrics look great but the app is still slow? I had a case where CPU sat at 30% and memory at 50%, yet the app crawled. It turned out N+1 queries were overwhelming the database.
Laravel Telescope - Production Buddy
Install:
composer require laravel/telescope
php artisan telescope:install
php artisan migrate
Production-Safe Config:
// TelescopeServiceProvider.php
protected function gate()
{
Gate::define('viewTelescope', function ($user) {
return in_array($user->email, ['[email protected]']);
});
}
.env Production:
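TELESCOPE_ENABLED=true
# The two values below are custom settings, not keys read by Telescope's default config as far as I know; wire them up in config/telescope.php if you want them to apply
TELESCOPE_ENTRIES_MAX_LENGTH=1000
TELESCOPE_PRUNE_HOURS=48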
Auto Prune:
// Kernel.php
$schedule->command('telescope:prune --hours=48')->daily();
Daily Monitoring Workflow
1. Check Exceptions
Telescope → Exceptions → Sort by frequency. Red flags:
- The same exception hundreds of times
- Exceptions in payment/withdrawal flows
- Database connection errors
2. Slow Queries
Queries > 1 second = red flag.
// ❌ Bad - 5.2s
$investments = DB::table('investments')
->join('users', ...)
->join('portfolios', ...)
->select('*')
->get();
// ✅ Good - 0.08s
$investments = DB::table('investments')
->join('users', ...)
->select('investments.id', 'users.name')
->limit(100)
->get();
3. Failed Jobs
Jobs tab → Filter "failed". Common causes:
- API timeout
- Database connection lost
- Memory limit
4. Request Performance
Healthy response times:
- API: < 200ms
- Pages: < 500ms
- Heavy: < 2s
N+1 Problem - Laravel's Enemy
This is the #1 cause of performance issues.
// ❌ Bad - 101 queries
$investments = Investment::latest()->take(50)->get();
// In the view:
@foreach($investments as $inv)
{{ $inv->user->name }} // +50 queries
{{ $inv->portfolio->title }} // +50 queries
@endforeach
// ✅ Good - 3 queries
$investments = Investment::with(['user', 'portfolio'])
->latest()
->take(50)
->get();
Laravel Horizon - Queue Monitoring
Install:
composer require laravel/horizon
php artisan horizon:install
Config:
// config/horizon.php (inside the 'environments' array)
'production' => [
    'supervisor-1' => [
        'queue' => ['default', 'high', 'low'],
        'processes' => 10,
        'tries' => 3,
        'timeout' => 300,
    ],
],
Monitor:
- Failed jobs
- Wait time
- Jobs per minute
If the queue size keeps growing, the usual suspects are:
- Not enough workers
- Jobs stuck
- Slow processing
Custom Metrics
public function processWithdrawal($user, $amount)
{
$start = microtime(true);
try {
$result = $this->executeWithdrawal($user, $amount);
Log::channel('metrics')->info('withdrawal_success', [
'user_id' => $user->id,
'amount' => $amount,
'duration_ms' => (microtime(true) - $start) * 1000,
]);
return $result;
} catch (\Exception $e) {
Log::channel('metrics')->error('withdrawal_failed', [
'error' => $e->getMessage(),
]);
throw $e;
}
}
Sentry - Error Tracking
Install:
composer require sentry/sentry-laravel
Config:
// config/sentry.php
return [
'dsn' => env('SENTRY_LARAVEL_DSN'),
'traces_sample_rate' => 0.2,
'send_default_pii' => true, // Sends user identifiers and IPs to Sentry; check this against your privacy rules first
];
What You Get:
- Real-time alerts (Slack/email)
- Stack traces with line numbers
- User context
- Request details
- Breadcrumbs
Load Testing
# 100 concurrent users, 1000 requests
ab -n 1000 -c 100 https://yourapp.com/api/investments
Healthy Targets:
- p50 < 100ms
- p95 < 500ms
- p99 < 1s
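`ab` prints its own percentile table, but if you only have raw latency numbers (for example the `duration_ms` values from a metrics log), the targets above can be checked with a nearest-rank percentile. A minimal sketch, assuming one latency per line on stdin and a whole-number percentile; `percentile` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Nearest-rank percentile over latency samples (one number per line on stdin).
# Usage: printf '%s\n' 120 80 95 | percentile 95
percentile() {
  local p=$1
  sort -n | awk -v p="$p" '
    { values[NR] = $1 }
    END {
      if (NR == 0) exit 1
      # Nearest rank: smallest rank r with r/N >= p/100, i.e. ceil(p*N/100)
      rank = int((p * NR + 99) / 100)
      if (rank < 1) rank = 1
      print values[rank]
    }'
}
```

Nearest-rank always returns a value that was actually observed, which keeps the result easy to trace back to a real request. Other percentile definitions interpolate between samples and can give slightly different numbers.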
Debugbar vs Telescope
Debugbar:
- ✅ Development only
- ❌ NEVER production - security risk
Telescope:
- ✅ Production-ready
- ✅ Historical data
- ✅ Filtering
Performance Checklist
composer install --optimize-autoloader --no-dev
php artisan config:cache
php artisan route:cache
php artisan view:cache
php artisan optimize
Real Performance Wins
Case 1: API Response
- Before: 2.5s
- Issue: N+1 queries
- After: 180ms
- 92% improvement
Case 2: Queue
- Before: 500 jobs/hour
- Issue: Single worker
- After: 5000 jobs/hour
- 10x throughput
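The headline numbers in these two cases are plain arithmetic, and it helps to compute them the same way every time you benchmark. A small sketch with hypothetical helper names (`percent_reduction`, `throughput_ratio`); both take the "before" value first and the "after" value second.

```bash
#!/usr/bin/env bash
# Percentage by which a latency dropped: (before - after) / before * 100
percent_reduction() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (before == 0) exit 1
    printf "%.1f\n", (before - after) / before * 100
  }'
}

# How many times larger the new throughput is: after / before
throughput_ratio() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (before == 0) exit 1
    printf "%.1f\n", after / before
  }'
}
```

For Case 1, `percent_reduction 2500 180` gives 92.8, which matches the roughly 92% quoted. For Case 2, `throughput_ratio 500 5000` gives 10.0.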
Pro Tips:
- Profile in staging, not production
- Baseline before you optimize
- Fix bottlenecks first (80/20)
- Always benchmark after changes
Part 4: Database & Cache Monitoring - The Heart of the App
The database and cache are literally the heart of the app. The server can be strong and the code optimized, but if the database is slow or the cache miss rate is high, they become the bottleneck for everything.
MySQL/PostgreSQL Performance
Enable Slow Query Log:
-- MySQL
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
Analyze Slow Queries:
# Install pt-query-digest
sudo apt install percona-toolkit
# Analyze log
pt-query-digest /var/log/mysql/slow.log
The output tells you:
- Which queries are slow most often
- Total execution time
- Lock time
- Rows examined vs returned
Laravel Query Monitoring
Enable Query Log Temporarily:
// Debug specific request
DB::enableQueryLog();
// Your code here
dd(DB::getQueryLog());
Track Queries in Middleware:
// app/Http/Middleware/LogQueries.php
public function handle($request, Closure $next)
{
DB::listen(function ($query) {
if ($query->time > 1000) { // > 1 second
Log::warning('Slow query detected', [
'sql' => $query->sql,
'time' => $query->time,
'bindings' => $query->bindings,
]);
}
});
return $next($request);
}
Index Optimization
This one gets missed a lot, but it's super critical.
Check Missing Indexes:
-- MySQL - Find queries without indexes
SELECT * FROM sys.statements_with_full_table_scans;
-- Check index usage
SHOW INDEX FROM investments;
Add Proper Indexes:
// Migration
Schema::table('investments', function (Blueprint $table) {
// Single column
$table->index('status');
// Composite index (order matters!)
$table->index(['user_id', 'status', 'created_at']);
// Unique index
$table->unique(['user_id', 'investment_code']);
});
Index Strategy:
// ✅ Good - Uses the composite index (user_id, status, created_at)
Investment::where('user_id', $userId)
    ->where('status', 'active')
    ->orderBy('created_at', 'desc')
    ->get();
// ❌ Bad - Skips user_id, the leftmost column, so the composite index can't be used
// (the order of where() calls doesn't matter; which columns you filter on does)
Investment::where('created_at', '>=', $since)
    ->get();
Database Connection Pool
Monitor Connections:
-- MySQL
SHOW PROCESSLIST;
SHOW STATUS LIKE 'Threads_connected';
SHOW STATUS LIKE 'Max_used_connections';
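Laravel Config:
// config/database.php
'mysql' => [
    // ...driver, host, credentials...
    'options' => [
        PDO::ATTR_PERSISTENT => true, // Reuse connections between requests
    ],
],
// max_connections (e.g. 100) and wait_timeout are MySQL server variables.
// Set them in my.cnf or with SET GLOBAL; Laravel does not read them from this file.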
Too Many Connections?
Common causes:
- Long-running queries
- Connection leaks (connections never closed)
- Insufficient pool size
// ✅ Always roll back on failure so a broken transaction doesn't hold its connection
try {
    DB::beginTransaction();
    // Your queries
    DB::commit();
} catch (\Exception $e) {
    DB::rollBack();
    throw $e;
}
Redis Cache Monitoring
Check Redis Stats:
redis-cli INFO stats
redis-cli INFO memory
Monitor Cache Hit Rate:
redis-cli INFO stats | grep keyspace_hits
redis-cli INFO stats | grep keyspace_misses
Calculate Hit Rate:
Hit Rate = hits / (hits + misses) * 100
Target: > 80% hit rate
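The formula above is easy to script so the hit rate can be logged or alerted on. A minimal sketch: `hit_rate` does the arithmetic on two counters, and `redis_hit_rate` (which needs `redis-cli` and a reachable server) pulls the counters from `INFO stats`. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hit rate in percent (two decimals) from hit and miss counters.
# Usage: hit_rate <hits> <misses>
hit_rate() {
  awk -v h="$1" -v m="$2" 'BEGIN {
    total = h + m
    if (total == 0) { print "0.00"; exit }   # no lookups yet
    printf "%.2f\n", h / total * 100
  }'
}

# Read the live counters from Redis and compute the rate.
redis_hit_rate() {
  local stats hits misses
  stats=$(redis-cli INFO stats | tr -d '\r')   # INFO lines end in CRLF
  hits=$(printf '%s\n' "$stats" | awk -F: '$1 == "keyspace_hits" { print $2 }')
  misses=$(printf '%s\n' "$stats" | awk -F: '$1 == "keyspace_misses" { print $2 }')
  hit_rate "$hits" "$misses"
}
```

Keep in mind that these counters are cumulative since the server started (or since the last `CONFIG RESETSTAT`), so a recent drop in hit rate can be hidden by a long healthy history. Comparing two readings taken some minutes apart gives a more current picture.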
Laravel Cache Events:
// AppServiceProvider.php
use Illuminate\Cache\Events\CacheMissed;
use Illuminate\Cache\Events\KeyForgotten;
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;

public function boot()
{
    // Cache events are dispatched through the event dispatcher.
    // The 'cache' log channel must be defined in config/logging.php.
    Event::listen(CacheMissed::class, function ($event) {
        Log::channel('cache')->info('Cache miss', ['key' => $event->key]);
    });
    Event::listen(KeyForgotten::class, function ($event) {
        Log::channel('cache')->info('Cache cleared', ['key' => $event->key]);
    });
}
Cache Strategy
Tag-Based Cache:
// ✅ Good - Easy to invalidate
Cache::tags(['users', 'investments'])->put('user-investments-'.$userId, $data, 3600);
// Invalidate specific tags
Cache::tags(['investments'])->flush();
Cache Warm-Up:
// Custom command to pre-populate the cache (cache:warm is not built in; you define it yourself)
php artisan cache:warm
// WarmCacheCommand.php
public function handle()
{
$popularInvestments = Investment::popular()->get();
foreach ($popularInvestments as $investment) {
Cache::remember("investment-{$investment->id}", 3600, function () use ($investment) {
return $investment->load(['user', 'portfolio', 'transactions']);
});
}
}
Query Optimization Checklist
1. Select Only Needed Columns:
// ❌ Bad
$users = User::all();
// ✅ Good
$users = User::select('id', 'name', 'email')->get();
2. Use Chunking for Large Data:
// ❌ Bad - Memory intensive
$investments = Investment::all();
// ✅ Good
Investment::chunk(1000, function ($investments) {
foreach ($investments as $investment) {
// Process
}
});
3. Avoid N+1 with Eager Loading:
// ✅ Always eager load relationships
$investments = Investment::with(['user', 'portfolio'])->get();
4. Use Exists Instead of Count:
// ❌ Bad
if (Investment::where('user_id', $userId)->count() > 0) {}
// ✅ Good
if (Investment::where('user_id', $userId)->exists()) {}
Database Backup Monitoring
Automated Backup Script:
#!/bin/bash
# backup.sh
set -o pipefail # Make the pipeline fail when mysqldump fails, not only gzip
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/mysql"
DB_NAME="your_database"
# Create the backup and check the result right away
if mysqldump -u root -p"$DB_PASSWORD" "$DB_NAME" | gzip > "$BACKUP_DIR/backup_$DATE.sql.gz"; then
    echo "Backup success: $DATE" >> /var/log/backup.log
    # Keep only last 7 days
    find "$BACKUP_DIR" -name "backup_*.sql.gz" -mtime +7 -delete
else
    echo "Backup failed: $DATE" >> /var/log/backup.log
    # Send alert
    curl -X POST https://hooks.slack.com/... -d '{"text":"❌ Backup failed"}'
fi
Schedule Backup:
# Crontab - Daily 2 AM
0 2 * * * /path/to/backup.sh
Monitor Backup Size:
#!/bin/bash
# Check if backup size is reasonable
LATEST_BACKUP=$(ls -t /backups/mysql/*.gz | head -1)
BACKUP_SIZE=$(stat -c%s "$LATEST_BACKUP") # GNU stat (Linux); on macOS/BSD use stat -f%z
MIN_SIZE=10485760 # 10MB
if [ "$BACKUP_SIZE" -lt "$MIN_SIZE" ]; then
echo "⚠️ Backup size too small: $BACKUP_SIZE bytes"
# Send alert
fi
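A size check catches empty dumps, but a backup can be big enough and still be a truncated or corrupt archive. One way to cover both is to add a `gzip -t` integrity test. This is a sketch under assumptions: `backup_ok` is a hypothetical helper name, and it only proves the archive is readable, not that the SQL inside restores cleanly.

```bash
#!/usr/bin/env bash
# Return 0 only if the backup exists, is at least min_bytes long,
# and passes gzip's integrity test.
# Usage: backup_ok <file> <min_bytes>
backup_ok() {
  local file=$1 min_bytes=$2 size
  [ -f "$file" ] || return 1
  # wc -c is portable across Linux and macOS, unlike stat's flags
  size=$(( $(wc -c < "$file") ))
  [ "$size" -ge "$min_bytes" ] || return 1
  gzip -t "$file" 2>/dev/null
}
```

A cron job could call `backup_ok "$LATEST_BACKUP" 10485760 || send_alert`, where `send_alert` stands for whatever alert hook you already use. A periodic test restore into a scratch database is still the only real proof that a backup works.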
Real-Time Database Metrics
Create Dashboard Query:
-- Active connections
SELECT COUNT(*) as active_connections
FROM information_schema.PROCESSLIST
WHERE COMMAND != 'Sleep';
-- Database size
SELECT
table_schema AS 'Database',
ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
FROM information_schema.TABLES
GROUP BY table_schema;
-- Table sizes
SELECT
table_name AS 'Table',
ROUND(((data_length + index_length) / 1024 / 1024), 2) AS 'Size (MB)'
FROM information_schema.TABLES
WHERE table_schema = 'your_database'
ORDER BY (data_length + index_length) DESC;
Redis Memory Management
Check Memory Usage:
redis-cli INFO memory | grep used_memory_human
Set Max Memory:
# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
Monitor Key Count:
redis-cli DBSIZE
redis-cli --scan --pattern "cache:*" | wc -l # SCAN-based; avoids KEYS, which blocks Redis on large datasets
Pro Tips
- Index Wisely: Too many indexes = slower writes. Balance read vs write performance.
- Cache Expiry: Set a realistic TTL. Don't cache forever, and don't make it too short either.
- Monitor Query Patterns: Run EXPLAIN regularly to understand how queries execute.
- Connection Pooling: Configure proper pool size based on traffic.
- Regular Maintenance: Run OPTIMIZE TABLE monthly to defragment.
Quick Commands:
# MySQL health check
mysql -e "SHOW GLOBAL STATUS" | grep -E "Threads_connected|Questions|Slow_queries"
# Redis health
redis-cli PING
redis-cli INFO | grep -E "connected_clients|used_memory"
# Laravel queue check
php artisan queue:monitor redis:default --max=100
The database and cache are the backbone of the app. Optimizing here has the biggest performance impact. Next up: uptime monitoring and an alerting system so you can sleep well.
Part 5: Uptime & Alert System - Sleep Well Strategy
Great monitoring is useless without alerts. You can't stay awake 24/7. That's why you need a smart alert system: one that notifies you when it's urgent and doesn't spam you when things are normal.
Uptime Monitoring Tools
UptimeRobot (Free Tier):
- Monitor 50 URLs for free
- Checks every 5 minutes
- Alert via email, SMS, Slack
- Public status page
Setup:
- Sign up at uptimerobot.com
- Add Monitor → HTTP(s)
- URL: https://yourapp.com
- Set alert contacts
Better Uptime (Premium):
If you need something more serious:
- Multi-region monitoring
- 30-second checks
- API endpoint monitoring
- Incident management
- On-call scheduling
Laravel Health Check Endpoint
Don't just monitor the homepage. Build a dedicated endpoint for health checks.
// routes/web.php
Route::get('/health', function () {
$checks = [
'app' => 'ok',
'database' => checkDatabase(),
'cache' => checkCache(),
'queue' => checkQueue(),
'storage' => checkStorage(),
];
$allOk = !in_array('failed', $checks);
return response()->json([
'status' => $allOk ? 'healthy' : 'unhealthy',
'checks' => $checks,
'timestamp' => now(),
], $allOk ? 200 : 503);
});
function checkDatabase() {
try {
DB::connection()->getPdo();
return 'ok';
} catch (\Exception $e) {
return 'failed';
}
}
function checkCache() {
try {
Cache::put('health_check', true, 60);
return Cache::get('health_check') ? 'ok' : 'failed';
} catch (\Exception $e) {
return 'failed';
}
}
function checkQueue() {
try {
$size = Queue::size();
return $size < 1000 ? 'ok' : 'warning';
} catch (\Exception $e) {
return 'failed';
}
}
function checkStorage() {
try {
$disk = disk_free_space('/');
$total = disk_total_space('/');
$percent = ($disk / $total) * 100;
return $percent > 20 ? 'ok' : 'warning';
} catch (\Exception $e) {
return 'failed';
}
}
Point UptimeRobot at /health. If it returns anything other than 200, you get alerted right away.
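It's also handy to probe the endpoint yourself, from a cron job or a deploy script, so you aren't relying on a single external service. A minimal sketch using `curl`: `interpret_health_code` and `probe_health` are hypothetical helper names, and the URL is whatever your app exposes.

```bash
#!/usr/bin/env bash
# Map an HTTP status code returned by /health to a verdict.
interpret_health_code() {
  case "$1" in
    200) echo "healthy" ;;
    503) echo "unhealthy" ;;
    000) echo "unreachable" ;;   # curl reports 000 when no response arrived
    *)   echo "unexpected:$1" ;;
  esac
}

# Probe the endpoint with a 10 second timeout and print the verdict.
# Usage: probe_health https://yourapp.com/health
probe_health() {
  local url=$1 code
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url" || true)
  interpret_health_code "$code"
}
```

Separating the verdict from the network call keeps the decision logic easy to test without a running server. Note that a `warning` check still comes back as HTTP 200 from the endpoint, so catching warnings means reading the JSON body as well.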
Alert Strategy - Tier System
This matters a lot. Don't treat every alert as critical, or you'll end up with alert fatigue.
Tier 1 - CRITICAL (Call/SMS):
- Website completely down
- Payment gateway error
- Database connection failed
- API returning 500 errors > 5 minutes
Tier 2 - HIGH (Slack immediately):
- CPU > 90% for 5 minutes
- Memory > 95%
- Disk space < 10%
- Queue jobs failed > 50
- Response time > 3s
Tier 3 - MEDIUM (Slack batched):
- CPU > 80% for 10 minutes
- Disk space < 30%
- Cache hit rate < 70%
- Unusual traffic patterns
Tier 4 - LOW (Email daily digest):
- Performance degradation trends
- Resource usage patterns
- Security scan results
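The tiers above boil down to two small decisions: which tier a reading belongs to, and which channel that tier goes to. A rough sketch for the CPU rules only, with hypothetical function names and channel labels. The minute counts are how long the reading has been sustained, which your collector would need to track.

```bash
#!/usr/bin/env bash
# Map a tier number to a delivery channel (labels are illustrative).
alert_channel() {
  case "$1" in
    1) echo "sms" ;;
    2) echo "slack-immediate" ;;
    3) echo "slack-batched" ;;
    4) echo "email-digest" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Pick a tier for a sustained CPU reading; prints 0 when no alert is due.
# Usage: cpu_alert_tier <cpu_percent> <minutes_sustained>
cpu_alert_tier() {
  local cpu=$1 minutes=$2
  if [ "$cpu" -gt 90 ] && [ "$minutes" -ge 5 ]; then
    echo 2
  elif [ "$cpu" -gt 80 ] && [ "$minutes" -ge 10 ]; then
    echo 3
  else
    echo 0
  fi
}
```

For example, `alert_channel "$(cpu_alert_tier 95 6)"` resolves to the immediate Slack channel, while a short spike (`cpu_alert_tier 95 2`) produces tier 0 and stays quiet.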
Slack Integration
Setup Webhook:
- Slack → Apps → Incoming Webhooks
- Add to Workspace → Choose channel
- Copy webhook URL
Exception Handler Integration:
// app/Exceptions/Handler.php
use Illuminate\Support\Facades\Http;
public function report(Throwable $exception)
{
if ($this->shouldReport($exception) && app()->environment('production')) {
$this->sendToSlack($exception);
}
parent::report($exception);
}
private function sendToSlack($exception)
{
$webhookUrl = config('services.slack.webhook');
Http::post($webhookUrl, [
'text' => '🚨 Production Error',
'attachments' => [[
'color' => 'danger',
'fields' => [
[
'title' => 'Error',
'value' => get_class($exception),
'short' => true
],
[
'title' => 'Message',
'value' => $exception->getMessage(),
'short' => false
],
[
'title' => 'File',
'value' => $exception->getFile() . ':' . $exception->getLine(),
'short' => false
],
[
'title' => 'URL',
'value' => request()->fullUrl(),
'short' => false
],
]
]]
]);
}
Custom Alerts:
// app/Services/AlertService.php
class AlertService
{
public static function critical($message, $data = [])
{
Http::post(config('services.slack.critical_webhook'), [
'text' => "🔴 CRITICAL: {$message}",
'attachments' => [[
'color' => 'danger',
'fields' => self::formatData($data)
]]
]);
// Send SMS via Twilio/SNS (sendSMS() and formatData() are helpers you implement yourself)
self::sendSMS($message);
}
public static function warning($message, $data = [])
{
Http::post(config('services.slack.webhook'), [
'text' => "⚠️ Warning: {$message}",
'attachments' => [[
'color' => 'warning',
'fields' => self::formatData($data)
]]
]);
}
}
// Usage
AlertService::critical('Database connection failed', [
'server' => config('database.default'),
'timestamp' => now(),
]);
Maintenance Mode Strategy
Graceful Maintenance:
# Enable with a Retry-After header of 60 seconds (add --refresh=60 if you also want browsers to auto-refresh)
php artisan down --retry=60
# Secret bypass token
php artisan down --secret="maintenance-bypass-token"
# Access via: yourapp.com/maintenance-bypass-token
Custom Maintenance Page:
// resources/views/errors/503.blade.php
<!DOCTYPE html>
<html>
<head>
<title>Maintenance Mode</title>
<meta http-equiv="refresh" content="60">
</head>
<body>
<h1>We're Upgrading!</h1>
<p>We'll be back in approximately 10 minutes.</p>
<p>This page will auto-refresh.</p>
</body>
</html>
Scheduled Maintenance:
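// Console/Kernel.php
$schedule->call(function () {
    Artisan::call('down');
    try {
        // Run maintenance tasks
        Artisan::call('backup:run'); // Needs a backup package that registers this command
        Artisan::call('optimize:clear');
        Artisan::call('optimize');
    } finally {
        Artisan::call('up'); // Always bring the site back, even if a task throws
    }
})->weeklyOn(1, '02:00'); // Monday 2 AM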
Daily Monitoring Checklist
Morning Routine (5 minutes):
# 1. Check Telescope exceptions
# Browser → telescope.yourapp.com/exceptions
# 2. Review slow queries
# Telescope → Queries → Sort by duration
# 3. Check failed jobs
php artisan queue:failed
# 4. Server resources
htop
df -h
# 5. Review Slack alerts from overnight
Quick Health Script:
#!/bin/bash
# morning-check.sh
echo "=== Daily Health Check ==="
echo ""
# Server uptime
echo "Server Uptime:"
uptime
# Disk space
echo -e "\nDisk Space:"
df -h | grep -v tmpfs
# CPU Load
echo -e "\nCPU Load:"
top -bn1 | grep "Cpu(s)"
# Memory
echo -e "\nMemory:"
free -h
# Services status
echo -e "\nServices:"
systemctl is-active nginx
systemctl is-active php8.2-fpm
systemctl is-active mysql
systemctl is-active redis
# Laravel checks
echo -e "\nLaravel Queue:"
php /var/www/html/artisan queue:monitor redis:default --max=100
echo -e "\nRedis (cache backend):"
redis-cli PING
echo ""
echo "=== Check Complete ==="
Weekly Review Checklist
Performance Trends:
- Average response time (up or down?)
- Error rate trend
- Traffic patterns
- Resource usage patterns
Cost Optimization:
- Server resource utilization
- Database size growth
- Bandwidth usage
- Unused services/resources
Security:
- Failed login attempts
- Suspicious IP patterns
- SSL certificate expiry
- Dependency updates
Alert Fatigue Prevention
Smart Grouping:
// Don't spam alerts; allow one per key per time window
class AlertThrottler
{
    public static function shouldAlert($key, $minutes = 5)
    {
        // Cache::add() only stores the key if it doesn't exist yet and
        // returns true when it did, so check-and-set is a single step
        return Cache::add("alert_throttle_{$key}", true, now()->addMinutes($minutes));
    }
}
// Usage
if (AlertThrottler::shouldAlert('high_cpu')) {
AlertService::warning('CPU usage high');
}
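The same throttling idea is useful for the shell-based alerts (like the CPU cron script), which don't have Laravel's cache available. A minimal file-based sketch: `should_alert` and `THROTTLE_DIR` are hypothetical names, the window is given in seconds, and the check is not atomic, so two scripts firing in the same instant could both get through.

```bash
#!/usr/bin/env bash
# Directory holding one timestamp file per alert key.
THROTTLE_DIR="${THROTTLE_DIR:-/tmp/alert_throttle}"

# Return 0 (go ahead and alert) at most once per window for a given key.
# Usage: should_alert <key> [window_seconds]
should_alert() {
  local key=$1 window_seconds=${2:-300}
  local stamp="$THROTTLE_DIR/$key" now last
  mkdir -p "$THROTTLE_DIR"
  now=$(date +%s)
  if [ -f "$stamp" ]; then
    last=$(cat "$stamp")
    if [ $((now - last)) -lt "$window_seconds" ]; then
      return 1   # already alerted inside the window
    fi
  fi
  echo "$now" > "$stamp"   # remember when we last alerted
  return 0
}
```

In a cron script this would wrap the notification, for example `should_alert high_cpu 300 && curl ...`, so a CPU spike that lasts an hour produces one message every five minutes instead of one per check.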
Action Plan Template
When an alert comes in, don't panic. Follow the checklist:
Website Down:
1. Check server status: ping yourserver.com
2. SSH into the server
3. Check services: systemctl status nginx php8.2-fpm mysql
4. Check logs: tail -f /var/log/nginx/error.log
5. Restart if needed: sudo systemctl restart nginx
6. Check application log: tail -f storage/logs/laravel.log
Slow Performance:
1. Check server resources: htop
2. Check active connections: netstat -an | wc -l
3. Check slow queries: tail /var/log/mysql/slow.log
4. Check queue size: php artisan queue:monitor redis:default
5. Clear cache if safe: php artisan cache:clear
Database Issues:
1. Check connections: SHOW PROCESSLIST
2. Kill long queries if safe
3. Check disk space: df -h
4. Restart MySQL: sudo systemctl restart mysql
5. Check error log: tail /var/log/mysql/error.log
The Right Mindset
Prevention > Reaction:
- Set up monitoring before problems show up
- Document runbooks for common issues
- Test alert system regularly
Trust but Verify:
- Monitoring tools can raise false positives
- Always verify manually before taking panic action
- Keep calm, follow checklist
Continuous Improvement:
- Review incidents weekly
- Update alert thresholds based on patterns
- Automate repetitive responses
Wrapping Up - Action Steps
Week 1: Setup Foundation
- [ ] Install server monitoring (Netdata)
- [ ] Setup Telescope
- [ ] Create health check endpoint
- [ ] Configure UptimeRobot
Week 2: Configure Alerts
- [ ] Slack integration
- [ ] Alert tier system
- [ ] Test alert delivery
- [ ] Document response procedures
Month 1: Optimize
- [ ] Review alert patterns
- [ ] Adjust thresholds
- [ ] Add custom metrics
- [ ] Train team on procedures
Remember:
- Monitoring isn't paranoia, it's prevention
- Sleep well knowing the system is under watch
- User trust = business continuity
- Time invested now = hours saved later
Final Thought:
Good monitoring is invisible. Users don't know it's there, but you know everything's fine. And when something does go wrong, you know first, before any user complains.
Good monitoring = good sleep. And good sleep = better code. Circle of life, Simba.
Ever been hit by a production issue in the middle of the night? Share your story in the comments; it might help another developer!
Resources:
- Laravel Telescope Docs
- Netdata
- UptimeRobot
- Join the community: BuildWithAngga Discord to share experiences