Lead Data Engineer
Employment Type: Full-Time
Developing mission-critical systems that help keep people safe is what we do. At General Dynamics Mission Systems, you’ll be part of the team that helps heroes make a true impact. The work we do is important. The challenges we face are career-defining. The opportunity we can offer is one-of-a-kind.
We apply advanced technologies such as Artificial Intelligence, Blockchain, AR/VR, Cloud Native and Quantum Physics to solve our customers’ missions in cyber, RF, undersea, interstellar space and everything in between.
As a Lead Data Engineer, you’ll lead model and simulation activities as you participate in requirements analysis and management, functional analysis, performance analysis, system design, trade studies, systems integration and test (verification). It’s your chance to step up to the challenge and prove you’re ready to lead the world.
REPRESENTATIVE DUTIES AND TASKS:
We are seeking a Senior Data Engineer to support the Insider Threat mission. Data Engineers work with various security system data owners to automate data integration and collection strategies, and work closely with the data science team to ensure data cleanliness and accuracy.
- Support the data science team by designing, developing and implementing scalable ETL processes that load disparate datasets into a Hadoop infrastructure
- Design, develop, implement and maintain data ingestion processes for various disparate datasets using StreamSets (experience with StreamSets not mandatory)
- Develop processes to identify data drift and malformed records
- Develop technical documentation and standard operating procedures
- Mentor new and junior data engineers
- Lead technical tasks for small teams or projects
KNOWLEDGE SKILLS AND ABILITIES:
- Working knowledge of entity resolution systems
- Experience with messaging systems like Kafka
- Experience with NoSQL and/or graph databases like MongoDB or ArangoDB
- Any of the following databases: SQL, MongoDB, Oracle, Postgres
- Working experience with ETL processing
- Working experience with data workflow products like StreamSets or NiFi
- Working experience with Python, RESTful API services, and JDBC
- Experience with Hadoop and Hive/Impala
- Experience with Cloudera Data Science Workbench is a plus
- Understanding of PySpark
- Leadership experience
- Creative thinker
- Ability to multi-task
- Excellent use and understanding of data engineering concepts, principles, and theories