Senior Database Reliability Engineer
About the Team
Slack's Database Reliability Engineering team builds and operates the platform services that store data at Slack. We write software to manage thousands of stateful hosts, providing many petabytes of online database capacity. We are also in the midst of transitioning Slack’s core MySQL infrastructure to use Vitess’ flexible sharding and management capabilities. Review our recent presentation slides: Migrating to Vitess at (Slack) Scale.
Slack has a positive, diverse, and supportive culture—we look for people who are curious, inventive, and work to be a little better every single day. In our work together we aim to be smart, humble, hardworking and, above all, collaborative. If this sounds like a good fit for you, why not say hello?
About the Role
What you will be doing
- Leading larger projects, from start to finish, where scope is mostly understood
- Designing and developing new highly-available infrastructure to meet the needs of our growing and evolving product
- Writing software to make the database infrastructure self-managing and self-service
- Advising feature teams on how we can support the database needs of new features under development
- Writing code to capture data about service performance, and creating tools and dashboards to provide insight into that data
- Participating in the Database Reliability Engineering on-call rotation, triaging and addressing production issues as they arise
- Contributing to internal tools that help us improve our operations processes, manage our infrastructure, and scale our systems
What you should have
- You have curiosity about how things work
- You've been developing and operating high-traffic Internet applications and can point to things you’ve worked on
- You've deployed server software on Linux, and then operated it at scale. You’ve debugged its problems, and analyzed and optimized its performance
- You are a strong communicator. Explaining complex technical concepts to designers, support, and other engineers is no problem for you
- You enjoy helping onboard new team members, mentoring, and teaching others
- Professional experience operating at least one distributed data storage system, at scale and in a team environment. Some examples include: a relational database like MySQL, a search engine like Solr, or a streaming message bus like Kafka
- Bachelor's degree in Computer Science, Engineering or related field, or equivalent training, fellowship, or work experience
- Solid competency in software engineering, using functional or imperative programming languages such as PHP, Python, Ruby, Go, C, or Java (used without frameworks)
- Experience using distributed storage systems scaled out across hundreds or thousands of servers
- Experience expressing complex questions in SQL, especially MySQL
- Experience using deployment automation/configuration management (Chef a plus)
- Experience with virtualized environments, especially Amazon Web Services
- Experience in a startup environment
Slack is a layer of the business technology stack that brings together people, data, and applications – a single place where people can effectively work together, find important information, and access hundreds of thousands of critical applications and services to do their best work. From global Fortune 100 companies to corner markets, businesses and teams of all kinds use Slack to bring the right people together with all the right information. Slack is headquartered in San Francisco, CA and has ten offices around the world. For more information on how Slack makes teams better connected, visit slack.com.
Ensuring a diverse and inclusive workplace where we learn from each other is core to Slack’s values. We welcome people of different backgrounds, experiences, abilities and perspectives. We are an equal opportunity employer and a pleasant and supportive place to work.
Come do the best work of your life here at Slack.