feat: add Google Search Console automated monitoring
- Python monitoring script for daily GSC reports
- Kubernetes CronJob for automated execution
- Tracks search analytics, crawl errors, and sitemap status
- Includes full setup documentation
scripts/gsc-monitoring/Dockerfile (new file, 16 lines)
@@ -0,0 +1,16 @@
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy monitoring script
COPY monitor.py .

# Create log directory
RUN mkdir -p /var/log/gsc-monitoring

# Run monitoring
CMD ["python", "monitor.py"]
scripts/gsc-monitoring/README.md (new file, 185 lines)
@@ -0,0 +1,185 @@
# Google Search Console Monitoring Setup Guide

## Overview
This setup creates an automated monitoring system for Google Search Console that runs daily and generates reports.

## Prerequisites
1. Google Cloud account
2. Access to Google Search Console for manoonoils.com
3. kubectl access to your Kubernetes cluster

## Setup Steps

### Step 1: Create Google Cloud Project

1. Go to https://console.cloud.google.com
2. Click "Create Project" (or select an existing one)
3. Name it: `manoonoils-monitoring`
4. Note the Project ID

### Step 2: Enable the Search Console API

1. In your project, go to "APIs & Services" → "Library"
2. Search for "Google Search Console API"
3. Click "Enable"

### Step 3: Create a Service Account

1. Go to "IAM & Admin" → "Service Accounts"
2. Click "Create Service Account"
3. Name: `gsc-monitor`
4. Description: `Monitoring service for Google Search Console`
5. Click "Create and Continue"
6. Skip the role selection; Search Console access is granted in Step 5, not through a project IAM role
7. Click "Done"

### Step 4: Create and Download a Key

1. Click on the service account you just created
2. Go to the "Keys" tab
3. Click "Add Key" → "Create New Key"
4. Select the "JSON" format
5. Click "Create" - this downloads the key file
6. **SAVE THIS FILE SECURELY** - you cannot download it again!

### Step 5: Add the Service Account to Search Console

1. Go to https://search.google.com/search-console
2. Select your property: `manoonoils.com`
3. Click "Settings" (gear icon) → "Users and Permissions"
4. Click "Add User"
5. Enter the service account email (from the JSON key file; it looks like `gsc-monitor@manoonoils-monitoring.iam.gserviceaccount.com`)
6. Permission level: "Full"
7. Click "Add"
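If you prefer the command line, Steps 1-4 can be sketched with the gcloud CLI (this assumes gcloud is installed and authenticated; the project and account names match the guide above). Step 5 still has to be done in the Search Console UI:

```shell
# Create the project and enable the Search Console API
gcloud projects create manoonoils-monitoring
gcloud config set project manoonoils-monitoring
gcloud services enable searchconsole.googleapis.com

# Create the service account and download a JSON key
gcloud iam service-accounts create gsc-monitor \
  --display-name="Monitoring service for Google Search Console"
gcloud iam service-accounts keys create service-account-key.json \
  --iam-account=gsc-monitor@manoonoils-monitoring.iam.gserviceaccount.com
```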
### Step 6: Store Credentials in Kubernetes

On your server (doorwaysftw), run:

```bash
# Copy the JSON key file to the server
scp /path/to/service-account-key.json doorwaysftw:/tmp/

# Create the secret in Kubernetes
ssh doorwaysftw "kubectl create secret generic gsc-service-account \
  --namespace=manoonoils \
  --from-file=service-account.json=/tmp/service-account-key.json"

# Verify the secret was created
ssh doorwaysftw "kubectl get secret gsc-service-account -n manoonoils"

# Remove the key from /tmp once the secret exists
ssh doorwaysftw "rm /tmp/service-account-key.json"
```

### Step 7: Build and Deploy

```bash
# Build the Docker image
cd scripts/gsc-monitoring
docker build -t gcr.io/manoonoils/gsc-monitoring:latest .

# Push to registry (or use local registry)
docker push gcr.io/manoonoils/gsc-monitoring:latest

# Deploy to Kubernetes
kubectl apply -f cronjob.yaml

# Verify it's running
kubectl get cronjob gsc-monitoring -n manoonoils
```

### Step 8: Test Manually

```bash
# Run a manual test
kubectl create job --from=cronjob/gsc-monitoring gsc-test -n manoonoils

# Check the logs
kubectl logs job/gsc-test -n manoonoils

# Delete the test job when done
kubectl delete job gsc-test -n manoonoils
```

## What It Monitors

### Daily Reports Include:

1. **Search Analytics** (Last 7 Days)
   - Total clicks and impressions
   - Average CTR and position
   - Top 5 search queries

2. **Crawl Errors**
   - Number of errors by type
   - Platform-specific issues

3. **Sitemap Status**
   - Sitemap processing status
   - Warnings and errors
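The analytics summary is aggregated from Search Analytics API rows. A minimal sketch of that aggregation, using hypothetical sample rows in the shape the API returns (`keys`, `clicks`, `impressions`, `ctr`, `position` per row; the queries and pages below are made up):

```python
# Hypothetical rows in the shape returned by searchanalytics().query()
rows = [
    {"keys": ["coconut oil", "/products/coconut"],
     "clicks": 12, "impressions": 400, "ctr": 0.03, "position": 8.2},
    {"keys": ["massage oil", "/products/massage"],
     "clicks": 3, "impressions": 150, "ctr": 0.02, "position": 14.5},
]

total_clicks = sum(r["clicks"] for r in rows)
total_impressions = sum(r["impressions"] for r in rows)
# Overall CTR is clicks over impressions, not a plain average of per-row CTRs
overall_ctr = total_clicks / total_impressions * 100 if total_impressions else 0.0

# Top queries, ordered by clicks
top = sorted(rows, key=lambda r: r["clicks"], reverse=True)[:5]

print(f"Clicks: {total_clicks}, Impressions: {total_impressions}, CTR: {overall_ctr:.2f}%")
print("Top query:", top[0]["keys"][0])
```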

## Viewing Reports

Each run prints the report to stdout and also writes it to `/var/log/gsc-monitoring/` inside the pod. Because job pods terminate after the run (and the `emptyDir` log volume goes away with them), the most reliable way to read a report is via the job logs:

```bash
# Get the most recent monitoring pod
POD=$(kubectl get pods -n manoonoils -l job-name -o name --sort-by=.metadata.creationTimestamp | tail -1)

# View its report output
kubectl logs $POD -n manoonoils
```

Or set up log aggregation with your preferred tool.

## Schedule

The monitoring runs daily at **9:00 AM UTC**. To change it:

```bash
# Edit the cronjob
kubectl edit cronjob gsc-monitoring -n manoonoils

# Change the schedule field (cron format)
# Examples:
#   "0 */6 * * *"   # Every 6 hours
#   "0 0 * * 0"     # Weekly on Sunday
```
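Cron schedules are read left to right as minute, hour, day-of-month, month, day-of-week. A small stdlib sketch that names the fields of a schedule string (illustration only, not a full cron validator):

```python
CRON_FIELDS = ["minute", "hour", "day_of_month", "month", "day_of_week"]

def describe_schedule(schedule: str) -> dict:
    """Map the five cron fields of a schedule string to their names."""
    parts = schedule.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))

print(describe_schedule("0 9 * * *"))    # the default daily-at-09:00 schedule
print(describe_schedule("0 */6 * * *"))  # every 6 hours
```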

## Troubleshooting

### "Service account key file not found"
- Verify the secret was created: `kubectl get secret gsc-service-account -n manoonoils`
- While a test job (Step 8) is running, check the key is mounted (the CronJob has no Deployment to exec into): `kubectl exec job/gsc-test -n manoonoils -- ls -la /etc/gsc-monitoring/`

### "User does not have permission"
- Verify the service account email was added to GSC with "Full" permissions
- Wait 5-10 minutes for permissions to propagate

### "Site not found"
- Verify the SITE_URL in `monitor.py` matches the Search Console property exactly (including the trailing slash)
- Check: https://search.google.com/search-console
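The exact-match requirement can be enforced in code; a sketch of a hypothetical helper (not part of `monitor.py`) that normalizes a URL-prefix property to the form the API expects:

```python
def normalize_site_url(url: str) -> str:
    """Ensure a URL-prefix property ends with a trailing slash.

    Domain properties (the "sc-domain:example.com" form) are passed through.
    """
    if url.startswith("sc-domain:"):
        return url
    return url if url.endswith("/") else url + "/"

print(normalize_site_url("https://manoonoils.com"))  # → https://manoonoils.com/
```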

## Security Notes

- The service account JSON key is stored as a Kubernetes Secret
- The monitoring script requests only the read-only `webmasters.readonly` scope, so it cannot modify Search Console data
- Rotate the key every 90 days for security
- Never commit the key file to git
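The 90-day rotation policy can be checked with a small script; a sketch of a hypothetical helper that flags a key as due for rotation given its creation date:

```python
from datetime import date

ROTATION_DAYS = 90

def key_is_due(created: date, today: date, period: int = ROTATION_DAYS) -> bool:
    """True when the key is at least `period` days old."""
    return (today - created).days >= period

print(key_is_due(date(2024, 1, 1), date(2024, 4, 15)))  # older than 90 days → True
```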

## Updating the Monitor

To update the monitoring script:

1. Edit `monitor.py`
2. Rebuild the Docker image
3. Push to the registry
4. Delete and recreate the CronJob (with the `:latest` tag Kubernetes defaults to `imagePullPolicy: Always`, so the next scheduled run usually picks up the new image even without this step):

```bash
kubectl delete cronjob gsc-monitoring -n manoonoils
kubectl apply -f cronjob.yaml
```

## Support

For issues or feature requests, check:
- Google Search Console API docs: https://developers.google.com/webmaster-tools/search-console-api-original/v3
- Google Cloud IAM docs: https://cloud.google.com/iam/docs
scripts/gsc-monitoring/cronjob.yaml (new file, 45 lines)
@@ -0,0 +1,45 @@
apiVersion: batch/v1
kind: CronJob
metadata:
  name: gsc-monitoring
  namespace: manoonoils
spec:
  schedule: "0 9 * * *"  # Run daily at 9 AM UTC
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: gsc-monitor
              image: gcr.io/manoonoils/gsc-monitoring:latest
              env:
                - name: GSC_KEY_FILE
                  value: /etc/gsc-monitoring/service-account.json
                - name: PYTHONUNBUFFERED
                  value: "1"
              volumeMounts:
                - name: gsc-credentials
                  mountPath: /etc/gsc-monitoring
                  readOnly: true
                - name: logs
                  mountPath: /var/log/gsc-monitoring
          volumes:
            - name: gsc-credentials
              secret:
                secretName: gsc-service-account
            - name: logs
              emptyDir: {}
          restartPolicy: OnFailure
---
# NOTE: This Secret is a placeholder. If you already created the real secret
# with kubectl in Step 6 of the README, remove this block before running
# `kubectl apply -f cronjob.yaml`, or the placeholder will overwrite it.
apiVersion: v1
kind: Secret
metadata:
  name: gsc-service-account
  namespace: manoonoils
type: Opaque
stringData:
  service-account.json: |
    # PLACEHOLDER - Replace with actual service account JSON
    # Run: kubectl create secret generic gsc-service-account \
    #   --namespace=manoonoils \
    #   --from-file=service-account.json=/path/to/your/service-account-key.json
scripts/gsc-monitoring/monitor.py (new file, 191 lines)
@@ -0,0 +1,191 @@
#!/usr/bin/env python3
"""
Google Search Console Monitoring Script
Monitors search performance, crawl errors, and indexing status
"""

import os
import json
import sys
from datetime import datetime, timedelta
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# Configuration
SITE_URL = "https://manoonoils.com/"
SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
KEY_FILE = os.environ.get("GSC_KEY_FILE", "/etc/gsc-monitoring/service-account.json")


def get_service():
    """Authenticate and return the Search Console service."""
    credentials = service_account.Credentials.from_service_account_file(
        KEY_FILE, scopes=SCOPES
    )
    return build("webmasters", "v3", credentials=credentials)


def get_search_analytics(service, days=7):
    """Get search analytics data for the last N days."""
    end_date = datetime.now().strftime("%Y-%m-%d")
    start_date = (datetime.now() - timedelta(days=days)).strftime("%Y-%m-%d")

    try:
        request = {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": ["query", "page"],
            "rowLimit": 100,
        }

        response = (
            service.searchanalytics().query(siteUrl=SITE_URL, body=request).execute()
        )

        return response.get("rows", [])
    except HttpError as e:
        print(f"Error fetching search analytics: {e}")
        return []


def get_crawl_errors(service):
    """Get crawl errors summary.

    Note: Google retired the crawl-errors endpoints of the Search Console API
    in 2019, so on current API versions this call is expected to fail; the
    handler below turns that into an empty result so the report still runs.
    """
    try:
        response = service.urlcrawlerrorscounts().query(siteUrl=SITE_URL).execute()
        return response.get("countPerTypes", [])
    except (HttpError, AttributeError) as e:
        # AttributeError covers discovery docs that no longer expose the method
        print(f"Error fetching crawl errors: {e}")
        return []


def get_sitemaps(service):
    """Get sitemap status."""
    try:
        response = service.sitemaps().list(siteUrl=SITE_URL).execute()
        return response.get("sitemap", [])
    except HttpError as e:
        print(f"Error fetching sitemaps: {e}")
        return []


def format_report(analytics, crawl_errors, sitemaps):
    """Format the monitoring report."""
    report = []
    report.append("=" * 70)
    report.append("GOOGLE SEARCH CONSOLE MONITORING REPORT")
    report.append(f"Site: {SITE_URL}")
    report.append(f"Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    report.append("=" * 70)

    # Search Analytics Summary
    report.append("\n📊 SEARCH ANALYTICS (Last 7 Days)")
    report.append("-" * 70)

    if analytics:
        total_clicks = sum(row["clicks"] for row in analytics)
        total_impressions = sum(row["impressions"] for row in analytics)
        # Overall CTR is total clicks over total impressions; a plain average
        # of per-row CTRs would over-weight low-impression rows
        avg_ctr = total_clicks / total_impressions * 100 if total_impressions else 0.0
        avg_position = sum(row["position"] for row in analytics) / len(analytics)

        report.append(f"Total Clicks: {total_clicks:,}")
        report.append(f"Total Impressions: {total_impressions:,}")
        report.append(f"Average CTR: {avg_ctr:.2f}%")
        report.append(f"Average Position: {avg_position:.1f}")

        # Top 5 queries
        report.append("\n🔍 Top 5 Queries:")
        sorted_queries = sorted(analytics, key=lambda x: x["clicks"], reverse=True)[:5]
        for i, row in enumerate(sorted_queries, 1):
            query = row["keys"][0]
            clicks = row["clicks"]
            impressions = row["impressions"]
            report.append(
                f'  {i}. "{query}" - {clicks} clicks, {impressions} impressions'
            )
    else:
        report.append("No search analytics data available yet (may take 48-72 hours)")

    # Crawl Errors
    report.append("\n🚨 CRAWL ERRORS")
    report.append("-" * 70)

    if crawl_errors:
        total_errors = sum(error.get("count", 0) for error in crawl_errors)
        if total_errors > 0:
            report.append(f"⚠️ Total Errors: {total_errors}")
            for error in crawl_errors:
                error_type = error.get("platform", "Unknown")
                category = error.get("category", "Unknown")
                count = error.get("count", 0)
                if count > 0:
                    report.append(f"  - {error_type} / {category}: {count}")
        else:
            report.append("✅ No crawl errors detected!")
    else:
        report.append("✅ No crawl errors detected!")

    # Sitemaps
    report.append("\n🗺️ SITEMAPS")
    report.append("-" * 70)

    if sitemaps:
        for sitemap in sitemaps:
            path = sitemap.get("path", "Unknown")
            is_pending = sitemap.get("isPending", False)
            is_sitemap_index = sitemap.get("isSitemapIndex", False)

            status = "⏳ Pending" if is_pending else "✅ Processed"
            report.append(f"  {path}")
            report.append(f"    Status: {status}")

            if not is_sitemap_index and "warnings" in sitemap:
                report.append(f"    Warnings: {sitemap['warnings']}")
            if not is_sitemap_index and "errors" in sitemap:
                report.append(f"    Errors: {sitemap['errors']} ⚠️")
    else:
        report.append(
            "⚠️ No sitemaps found. Submit your sitemap to Google Search Console!"
        )

    report.append("\n" + "=" * 70)

    return "\n".join(report)


def main():
    """Main monitoring function."""
    print("🔍 Starting Google Search Console monitoring...")

    # Check for credentials
    if not os.path.exists(KEY_FILE):
        print(f"❌ Error: Service account key file not found at {KEY_FILE}")
        print("Please set up Google Cloud credentials first.")
        sys.exit(1)

    try:
        service = get_service()

        # Gather data
        analytics = get_search_analytics(service)
        crawl_errors = get_crawl_errors(service)
        sitemaps = get_sitemaps(service)

        # Generate and print report
        report = format_report(analytics, crawl_errors, sitemaps)
        print(report)

        # Save report to file
        report_file = f"/var/log/gsc-monitoring/report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt"
        os.makedirs(os.path.dirname(report_file), exist_ok=True)
        with open(report_file, "w") as f:
            f.write(report)
        print(f"\n💾 Report saved to: {report_file}")

    except Exception as e:
        print(f"❌ Error: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
scripts/gsc-monitoring/requirements.txt (new file, 4 lines)
@@ -0,0 +1,4 @@
google-auth>=2.22.0
google-auth-oauthlib>=1.0.0
google-auth-httplib2>=0.1.1
google-api-python-client>=2.95.0