ArlingtonVARecruiter Since 2001
the smart solution for Arlington jobs

Site Reliability Engineer (Splunk ITSI)

Company: Jobot
Location: Arlington
Posted on: January 16, 2022

Job Description:

100% Remote / Site Reliability Engineer / Opportunity to improve lives for our Veterans / Splunk ITSI and modern performance monitoring

This Jobot Job is hosted by Blake Williams

Are you a fit? Easy Apply now by clicking the "Apply" button and sending us your resume.

Salary $120,000 - $170,000 per year

A Bit About Us

Support our Veterans! We are on a mission to transform government IT, driving efficiency and taxpayer value to improve the lives of our Veterans. We know how to drive IT transformation so federal agencies can work faster and more easily, enabling them to focus on their important roles in serving the needs of our citizens. Our team supports application and performance monitoring for over 800 applications used by the VA and Veterans Hospitals across the USA.

Why join us?

There's one thing that unites us, no matter where in the world we live: we are transformers. Transformer is more than just a catchy title; it's the core of who we are as a company. We wholeheartedly embrace this title, both in how we approach our customer work and in the workplace culture we create. We offer tons of benefits for our employees.

  • 401(k) & Roth retirement plans w/ company match up to 4%
  • Generous Paid time off
  • 10 paid holidays
  • Medical, dental, & vision insurance
  • Flexible Spending Accounts (FSA, DCA, transit, & parking)
  • Health Savings Accounts (HSA) with employer contribution
  • Life and AD&D insurance; 1x salary, up to $300K max
  • Short and long-term disability
  • Voluntary life/AD&D (employee, spouse, & child)
  • Legal plan
  • Pet insurance
  • Critical illness, accident, & hospital insurance
  • Employee assistance program
  • Employee referral program
  • Bereavement/jury duty leave
  • Spot bonus program

Job Details

As a Site Reliability Engineer (Splunk ITSI), your focus will be on developing solutions to complex business monitoring problems in Splunk ITSI, directly supporting the efforts of other SREs and Enterprise Command Center (ECC) monitoring initiatives. A successful candidate will be able to lead business and technical system owners through the identification of Key Performance Indicators that will be used for service mapping and generation of system health scores. A background in statistics and mathematics is required to leverage Splunk's machine learning capabilities, and the candidate must understand which models and techniques should be used to instrument the given applications and set appropriate alerting thresholds. This position will be dedicated to Splunk support. It will also support the migration from Splunk SaaS to a GovCloud-based instance of Splunk.

Job Responsibilities

  • Translate business requirements, service level agreements (SLAs), and service level objectives (SLOs) into monitoring requirements.
  • Utilize technical area expertise to develop technical solutions to solve the business problem as an organic part of the organization's operational and functional baseline.
  • Develop a template-based approach to service mappings in ITSI.
  • Utilize Splunk ITSI to create dynamic thresholds and interface with data scientists if a more advanced statistical model is required.
  • Support Major Incidents by adjusting existing monitoring or instrumenting new monitoring to address monitoring deficiencies.
  • Support Triage efforts during Major Incidents by deconstructing application performance, interoperability, instrumentation, and human factors to facilitate resolution and development of resilient solutions. Support Problem Management's enterprise root cause analysis (RCA) processes in collaboration with appropriate Office of Information and Technology (OIT) organizations.
  • Capture technical information from the relevant stakeholders and synthesize it into useful information in various formats for OIT senior management and other VA components.
  • Create overarching strategies for the design and development of service trees and for gaining the most value out of ITSI.

Ideal Background

  • 5+ years of SRE experience using Splunk ITSI
  • Splunk IT Service Intelligence Certified Admin and Splunk Accredited ITSI Implementation certifications would be ideal
  • Ability to develop and implement service dependencies, service maps, KPIs, and thresholds in Splunk ITSI Service Analyzer and Glass Tables
  • Advanced understanding of DevOps and Site Reliability Engineering (SRE) principles
  • Experience designing and implementing orchestration and automation
  • Experience with other modern performance monitoring and diagnostics tools (for example, AppD, Dynatrace, Wireshark, etc.)
  • Technical expertise across multiple technology areas and the ability to diagnose complex issues spanning many technologies
  • Bachelor's degree (or 10 years of professional experience in lieu of a degree)
  • Ability to pass a background check including fingerprinting and a Public Trust Clearance

Interested in hearing more? Easy Apply now by clicking the "Apply" button.
