Description
Experience: 3-6+ Years
Location: Noida/Gurugram/Remote
Skills: Python, PySpark, SQL, Azure Data Factory, Databricks, Data Lake, Azure Functions, data pipelines.
- Design and engineer cloud and big data solutions; develop a modern data analytics lake.
- Develop and maintain data pipelines for batch and stream processing using modern cloud or open-source ETL/ELT tools.
- Liaise with business teams and technical leads to gather requirements, identify data sources and data quality issues, design target data structures, develop pipelines and data processing routines, perform unit testing, and support UAT.
- Implement continuous integration, continuous deployment, and other DevOps practices.
- Create, document, and manage data guidelines, governance, and lineage metrics.
- Provide technical leadership in designing and developing distributed, high-throughput, low-latency, highly available data processing and data systems.
- Build monitoring tools for server-side components; work cohesively in a team distributed across India.
- Identify, design, and implement internal process improvements and tools to automate data processing and ensure data integrity while meeting data security standards.
- Build tools that improve data discovery and consumption across the organization's consumption models: data marts, warehouses, APIs, and ad hoc data exploration.
- Create data views and data-as-a-service APIs from big data stores to feed analysis engines, visualization engines, etc.
- Work with data scientists and the business analytics team to assist with data ingestion and data-related technical issues.