last9

Unified Observability at Scale

Pinned

  1. gpu-telemetry (Public)

    GPU Observability with workload attribution. One OTLP agent per node ties hardware metrics (NVIDIA, AMD, Intel Gaudi) to the K8s pod or Slurm job burning the GPU.

    Python 46 4

  2. last9-mcp-server (Public)

    Last9 MCP Server

    Go 58 12

  3. awesome-sre-agents (Public)

    A curated list of AI-powered DevOps & SRE (Site Reliability Engineering) agents, tools, and resources for automating and enhancing reliability practices

    71 22

  4. mithai (Public)

    AI agent framework for infrastructure operations.

    Python 12

  5. slo-computer (Public)

    SLOs, error windows, and alerts are complicated. Here is an attempt to make them easy.

    Go 133 3

  6. terraform-provider-last9 (Public)

    Terraform provider for Last9. Manages alerts, notification channels, drop rules, forward rules, and scheduled search alerts.

    Go 4

Repositories

Showing 10 of 98 repositories
