Lead Data Engineer
Sparks, MD or Raleigh, NC area. The Lead Data Engineer will play a critical implementation role on the Data Engineering and Data Services team and will be responsible for the design, development, troubleshooting, optimization, and tuning of data pipeline solutions on the next-generation data and analytics platform, which is being developed with leading-edge big data technologies in a highly secure cloud infrastructure.
The Lead Data Engineer will serve as a liaison to platform user groups ensuring successful implementation of capabilities on the new platform. The Lead Data Engineer will also take a lead role on functional teams or projects.
● Deliver end-to-end data and analytics capabilities, including data ingest, data transformation, data science, and data visualization in collaboration with Data and Analytics stakeholder groups.
● Design and deploy databases and data pipelines to support analytics projects.
● Develop scalable and fault-tolerant workflows.
● Clearly document issues, solutions, findings, and recommendations to be shared internally & externally.
● Learn and apply tools and technologies proficiently, including:
○ Languages: SQL (standard and DB-specific), Python, Scala, Bash
○ Frameworks: Hadoop, Spark, Kafka
○ Cloud Computing: AWS
○ Tools/Products: Data Science Studio, Alteryx, Jupyter, Tableau, Power BI
● Optimize the performance of queries and dashboards.
● Develop and deliver clear, compelling briefings to internal and external stakeholders on findings, recommendations, and solutions.
● Analyze client data & systems to determine whether requirements can be met.
● Test and validate data pipelines, transformations, datasets, reports, and dashboards built by the team.
● Develop and communicate solutions architectures and present solutions to both business and technical stakeholders.
● Provide end user support to other data engineers and analysts.
● Be a team leader and take a lead role on functional teams or projects.
● Lead others to solve complex problems; use sophisticated analytical thought to exercise judgment and identify innovative solutions.
● Interpret internal/external business challenges and recommend best practices to improve products, processes, or services.
· Expert experience in the following:
o SQL, Python, PySpark. Other programming languages (R, Scala, SAS, Java, etc.) are a plus
o Data and analytics technologies including SQL/NoSQL/Graph databases, ETL, and BI
o CI/CD and related tools such as GitLab, AWS CodeCommit, etc.
o AWS services including EMR, Glue, Athena, Batch, Lambda, CloudWatch, DynamoDB, EC2, CloudFormation, IAM, and EDS
· Solid scripting skills (e.g., bash/shell scripts, Python)
· Proven work experience in the following:
o Data streaming technologies
o Big Data technologies including Hadoop, Spark, Hive, Teradata, etc.
o Linux command-line operations
o Networking knowledge (OSI network layers, TCP/IP, virtualization)
· Candidate should be able to lead the team, communicate with the business, and gather and interpret business requirements.
· Experience with agile delivery methodologies using Jira or similar tools.
· Experience working with remote teams.
· AWS Solutions Architect / Developer / Data Analytics Specialty certifications; Professional certification is a plus.
· Bachelor’s Degree in computer science or relevant field; Master’s Degree is a plus.
· 8-12 years of relevant experience or equivalent combination of experience and education