Software Engineer, Data Ingestion
Slack is looking for a software engineer to join our Data Ingestion team. You will build scalable backend services and tools that help partner teams (Data Science, Business Intelligence, Application Engineering, Machine Learning, and IT) implement, deploy, and analyze data assets autonomously and with minimal friction. You will play a meaningful role in making their interactions with the Data Warehouse pleasant and productive.
On the Data Ingestion team, we build and operate the platform that ingests data into our Data Warehouse. We write software that manages ingestion from thousands of stateful hosts and from stateless real-time logging streams producing hundreds of billions of records per day. As Slack’s data grows (along with the number of customers, features, and employees), our goal is to build a highly scalable and resilient ingestion platform that acquires high-quality data efficiently and provides easy-to-use workflow and orchestration capabilities for our users to manage the data lifecycle.
As a member of the Data Ingestion team, you will help design and build abstractions that hide the complexity of the underlying Big Data stack and let partners focus on their strengths: data modeling, data analysis, search, or machine learning. You have deep technical skills, are a self-starter, are detail- and quality-oriented, and are passionate about driving data-driven decisions and having a huge impact at Slack!
What you will be doing
- Optimize the end-to-end workflow of data users at Slack (from crafting libraries to providing abstractions used to define jobs, schedule data pipelines or access data assets).
- Provide transparency into our data flows (comprehensive view of sources, transformations, sinks, data lineage).
- Automate and handle the lifecycle of data sets (schema evolution, metadata management, change and backfill management, deprecation and migration).
- Streamline the creation of new data sets with accessible frameworks and Domain Specific Languages (DSL).
- Improve the data quality and reliability of the pipelines (monitoring and failure detection).
- Supply reusable backend abstractions to ingest or access data sets (batch or low latency APIs).
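To make the "accessible frameworks and Domain Specific Languages" bullet concrete: the idea is to let a data user declare a dataset's steps and dependencies without touching the underlying Big Data stack. As a purely illustrative sketch (none of these names come from Slack's actual platform), a minimal decorator-based pipeline DSL in Python might look like:

```python
from typing import Callable, Dict, List

# Hypothetical minimal pipeline DSL: steps register themselves via a
# decorator, declare upstream dependencies, and run() executes them in
# dependency order, threading a shared context dict between steps.
class Pipeline:
    def __init__(self, name: str):
        self.name = name
        self.steps: Dict[str, Callable[[dict], dict]] = {}
        self.deps: Dict[str, List[str]] = {}

    def step(self, *, after: List[str] = ()):
        def register(fn):
            self.steps[fn.__name__] = fn
            self.deps[fn.__name__] = list(after)
            return fn
        return register

    def run(self) -> dict:
        done, context = set(), {}
        while len(done) < len(self.steps):
            progressed = False
            for name, fn in self.steps.items():
                if name not in done and all(d in done for d in self.deps[name]):
                    context = fn(context)
                    done.add(name)
                    progressed = True
            if not progressed:
                raise ValueError("cyclic or missing dependency")
        return context

pipeline = Pipeline("daily_events")

@pipeline.step()
def extract(ctx):
    ctx["raw"] = [1, 2, 3]  # stand-in for reading from a real source
    return ctx

@pipeline.step(after=["extract"])
def transform(ctx):
    ctx["clean"] = [x * 10 for x in ctx["raw"]]
    return ctx

result = pipeline.run()
print(result["clean"])  # [10, 20, 30]
```

In a production orchestrator (e.g. Airflow, which this role works with), the same declarations would be scheduled, retried, and monitored rather than run in-process, but the user-facing abstraction is the same: name your steps, state their dependencies, and let the platform handle the rest.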
What you should have
- 2+ years of experience working with Big Data technologies (e.g., Spark, Kafka, Hadoop, Hive, Presto, Parquet, Airflow, EMR, S3).
- Good understanding of polyglot data persistence (relational, key/value, document, column, graph).
- Skilled at designing and building robust backend data services (distributed systems, concurrency models, microservices).
- Strong dedication to code quality, automation and operational excellence: unit/integration tests, scripts, workflows.
- Expertise in object-oriented and/or functional programming languages (e.g. Python, Java/Scala, Go).
- Excellent written and verbal communication and interpersonal skills; able to effectively collaborate with partners.
- Bachelor's degree in Computer Science, Engineering or related field, or equivalent training, fellowship, or work experience.
Slack has transformed business communication. It’s the leading channel-based messaging platform, used by millions to align their teams, unify their systems, and drive their businesses forward. Only Slack offers a secure, enterprise-grade environment that can scale with the largest companies in the world. It is a new layer of the business technology stack where people can work together more effectively, connect all their other software tools and services, and find the information they need to do their best work. Slack is where work happens.
Ensuring a diverse and inclusive workplace where we learn from each other is core to Slack’s values. We welcome people of different backgrounds, experiences, abilities and perspectives. We are an equal opportunity employer and a pleasant and supportive place to work.
Come do the best work of your life here at Slack.