Key Responsibilities:
Design, implement, and maintain data pipelines to ingest and process OpenShift telemetry (metrics, logs, traces) at scale.
Stream OpenShift telemetry via Kafka (producers, topics, schemas) and build resilient consumer services for transformation and enrichment.
Engineer data models and routing for multi-tenant observability; ensure lineage, quality, and SLAs across the stream layer.
Integrate processed telemetry into Splunk for visualization, dashboards, alerting, and analytics to achieve Observability Level 4 (proactive insights); a minimal consume-enrich-forward sketch follows this list.
Implement schema management (Avro/Protobuf), governance, and versioning for telemetry events.
Build automated validation, replay, and backfill mechanisms for data reliability and recovery.
Instrument services with OpenTelemetry; standardize tracing, metrics, and structured logging across platforms.
Use LLMs to enhance observability capabilities (e.g., query assistance, anomaly summarization, runbook generation).
Collaborate with platform, SRE, and application teams to integrate telemetry, alerts, and SLOs.
Ensure security, compliance, and best practices for data pipelines and observability platforms.
Document data flows, schemas, dashboards, and operational runbooks.
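As a shape reference for the streaming and Splunk responsibilities above, the following is a minimal sketch of a consume-enrich-forward loop. It assumes the confluent-kafka and requests Python packages and a Splunk HTTP Event Collector (HEC) endpoint; the broker address, topic, consumer group, token, index, and sourcetype values are illustrative placeholders, not values defined by this role.

    import json
    import requests
    from confluent_kafka import Consumer

    SPLUNK_HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # hypothetical endpoint
    SPLUNK_TOKEN = "REPLACE_WITH_HEC_TOKEN"                                       # hypothetical token

    consumer = Consumer({
        "bootstrap.servers": "kafka:9092",        # hypothetical broker address
        "group.id": "telemetry-enricher",         # hypothetical consumer group
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["openshift.telemetry.metrics"])  # hypothetical topic name

    def enrich(event: dict) -> dict:
        # Attach tenant/routing metadata used downstream in Splunk dashboards and alerts.
        event["tenant"] = event.get("namespace", "unknown")
        event["pipeline_stage"] = "enriched"
        return event

    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            event = enrich(json.loads(msg.value()))
            # Forward the enriched event to Splunk over the HTTP Event Collector.
            requests.post(
                SPLUNK_HEC_URL,
                headers={"Authorization": f"Splunk {SPLUNK_TOKEN}"},
                json={"event": event, "sourcetype": "openshift:telemetry", "index": "observability"},
                timeout=5,
            )
    finally:
        consumer.close()

In practice the pipeline would add batching toward HEC, dead-letter handling, and per-tenant routing; the sketch only shows where those concerns attach.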
Required Skills:
Hands-on experience building streaming data pipelines with Kafka (producers/consumers, schema registry, Kafka Connect/ksqlDB/Kafka Streams).
Proficiency with OpenShift/Kubernetes telemetry (OpenTelemetry, Prometheus) and CLI tooling.
Experience integrating telemetry into Splunk (HEC, UF, sourcetypes, CIM), building dashboards and alerting.
Strong data engineering skills in Python (or similar) for ETL/ELT, enrichment, and validation.
Knowledge of event schemas (Avro/Protobuf/JSON), contracts, and backward/forward compatibility; see the schema-evolution sketch after this list.
Familiarity with observability standards and practices; ability to drive toward Level 4 maturity (proactive monitoring, automated insights).
Understanding of hybrid cloud and multi-cluster telemetry patterns.
Security and compliance for data pipelines: secret management, RBAC, encryption in transit/at rest.
Good problem-solving skills and ability to work in a collaborative team environment.
Strong communication and documentation skills.
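As a shape reference for the schema-compatibility skill above, the following is a minimal sketch of a backward-compatible Avro evolution using the fastavro Python package; record and field names are illustrative, not part of this posting. The new field carries a default, so a reader on the new schema can still resolve records written with the old one, which is the guarantee a schema registry compatibility check enforces.

    import io
    from fastavro import schemaless_writer, schemaless_reader

    # v1: the contract existing consumers were written against.
    schema_v1 = {
        "type": "record",
        "name": "TelemetryEvent",
        "fields": [
            {"name": "cluster", "type": "string"},
            {"name": "metric", "type": "string"},
            {"name": "value", "type": "double"},
        ],
    }

    # v2 adds an optional field with a default (null must be the first union branch).
    schema_v2 = {
        "type": "record",
        "name": "TelemetryEvent",
        "fields": [
            {"name": "cluster", "type": "string"},
            {"name": "metric", "type": "string"},
            {"name": "value", "type": "double"},
            {"name": "tenant", "type": ["null", "string"], "default": None},
        ],
    }

    # Encode a record with the old writer schema.
    buf = io.BytesIO()
    schemaless_writer(buf, schema_v1, {"cluster": "prod-east", "metric": "cpu_usage", "value": 0.42})
    buf.seek(0)

    # Decode it with the new reader schema: the missing field resolves to its default.
    decoded = schemaless_reader(buf, schema_v1, schema_v2)
    print(decoded)  # {'cluster': 'prod-east', 'metric': 'cpu_usage', 'value': 0.42, 'tenant': None}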
Key Responsibilities:
Provide technical support and troubleshooting for IT systems and software within the diagnostic center.
Install, configure, and maintain hardware and software systems.
Respond promptly to IT-related issues and queries, ensuring minimal disruption to operations.
Collaborate with other departments to ensure seamless IT integration and functionality.
Maintain accurate records of IT issues and resolutions, and prepare reports as required.
Assist in the implementation of IT projects and updates to improve system efficiency.
Skills:
Proficient in hardware and software troubleshooting.
Expertise in network configuration and management.
Strong knowledge of IT security protocols.
Competent in database management and support.
Ability to provide remote IT support solutions.
Required Qualifications:
Bachelor's degree in Information Technology, Computer Science, or a related field.
Certification in IT support or related IT certifications (CompTIA, CCNA) is preferred.