Site Reliability Engineering Lead - Front Office (SecDB)

Goldman Sachs

Job Description

More about this job

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the requirements of our internal and external users. We look for engineers who are motivated to collaborate with our businesses to build and run sustainable production systems, which can evolve and adapt to changes in our fast-paced, global business environment.

How you will fulfill your potential

  • Direct exposure to best-in-class Site Reliability Engineering (SRE) team, disciplines and tooling
  • Define and evangelize a strong incident management process and establish a blameless post-mortem culture
  • Design, develop and support the Firm's primary business-critical trading infrastructure
  • Manage a team of highly skilled engineers independently, in addition to contributing as part of a highly collaborative and globally dispersed team
  • Create and support automation tooling to improve the reliability of the platform and to increase the productivity of the team
  • Provide critical day-to-day support for a massive-scale distributed system
  • Manage incident risk effectively, determining severity level and business impact and working with stakeholders
  • The successful candidate will have outstanding verbal and written communications, a natural ability to learn in a fast-paced environment, and will be a self-starter with plenty of initiative.

Skills and Experience
  • Programming expertise in Python, Perl and shell scripting
  • Strong communication skills with a track record of working and collaborating with global teams
  • Strong analytical skills with the ability to break down and communicate complex issues, ideas and solutions
  • Experience managing performance, availability and scale for mid- to large-sized systems
  • People management and development experience is a distinct advantage
  • In-depth knowledge of Unix systems is a prerequisite, as is a willingness to learn new languages and programming paradigms (functional programming, for example)
  • Previous experience in a blameless SRE environment would be a plus

Preferred Qualifications
  • Degree in computer science or engineering, or equivalent industry experience
  • Hands-on experience with storage and networking stacks
  • Works effectively and thrives in a team while able to operate independently; self-motivation is essential
  • Strong verbal and written communication skills


At Goldman Sachs, we commit our people, capital and ideas to help our clients, shareholders and the communities we serve to grow. Founded in 1869, we are a leading global investment banking, securities and investment management firm. Headquartered in New York, we maintain offices around the world.

We believe who you are makes you better at what you do. We're committed to fostering and advancing diversity and inclusion in our own workplace and beyond by ensuring every individual within our firm has a number of opportunities to grow professionally and personally, from our training and development opportunities and firmwide networks to benefits, wellness and personal finance offerings and mindfulness programs. Learn more about our culture, benefits, and people at .

We're committed to finding reasonable accommodations for candidates with special needs or disabilities during our recruiting process. Learn more: https://

© The Goldman Sachs Group, Inc., 2020. All rights reserved.
Goldman Sachs is an equal employment/affirmative action employer Female/Minority/Disability/Veteran/Sexual Orientation/Gender Identity
