Job / CronJob
Completed Jobs
kube_job_status_succeeded{namespace!~"kube-.*|openshift-.*"}

Failed Jobs
kube_job_status_failed{namespace!~"kube-.*|openshift-.*"}

Failed Jobs (alert-ready)
kube_job_status_failed > 0

Slow-completing Jobs (expected completions not reached)
kube_job_spec_completions - kube_job_status_succeeded - kube_job_status_failed > 0

Short-lived Pods (typical signature of batch Jobs)
count_over_time(
  kube_pod_container_status_running{namespace!~"kube-.*|openshift-.*"}[15m]
) < 15

CronJob
Suspended CronJobs
kube_cronjob_spec_suspend != 0

Late CronJobs (next schedule overdue by more than 1h)
time() - kube_cronjob_next_schedule_time > 3600

Last successful run (timestamp)
kube_cronjob_status_last_successful_time{namespace!~"kube-.*|openshift-.*"}

CPU/Memory per Job (per-container breakdown, useful when the Job has init containers or sidecars)
sum by (namespace, pod, container) (
  rate(container_cpu_usage_seconds_total{
    namespace="<NS>",
    pod=~"<JOB_NAME>-.*",
    container!="",
    container!="POD"
  }[5m])
)
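
The heading covers memory as well, but only a CPU query is given. A possible memory counterpart is sketched here, assuming the cAdvisor metric container_memory_working_set_bytes is scraped alongside the CPU metric. The choice of that metric is an assumption and does not come from the original page. The <NS> and <JOB_NAME> placeholders are the same as in the CPU query.

```promql
# Assumption: working-set memory from cAdvisor; not part of the original page.
# It is a gauge, so it is summed directly, with no rate() and no range selector.
sum by (namespace, pod, container) (
  container_memory_working_set_bytes{
    namespace="<NS>",
    pod=~"<JOB_NAME>-.*",
    container!="",
    container!="POD"
  }
)
```

The result is in bytes per container and uses the same grouping labels as the CPU query, so the two can be shown side by side.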