Big Data Engineer

    • Job Tracking ID: 512530-616418
    • Job Location: Schaumburg, IL
    • Job Level: Mid Career (2+ years)
    • Level of Education: BA/BS
    • Job Type: Full-Time/Regular
    • Date Updated: April 25, 2018
    • Years of Experience: 5 - 7 Years
    • Starting Date: May 1, 2018


Job Description:

The Big Data Engineer’s primary responsibilities are to build Cogensia’s big data ecosystem, integrate data from various sources, and support the platform. The Engineer will work closely with teammates to design optimal solutions using best practices, and is responsible for ensuring the data ecosystem is highly scalable and responsive by writing efficient queries and maintaining optimal performance and availability.

 

The Big Data Engineer also creates ETL, batch, and automated processes on top of big datasets and builds big data warehouses used for reporting and analysis by Cogensia’s data scientists. This role will build and support ingestion and connection solutions for the business’s data science teams as well as outside partners and clients.

 

Responsibilities:

  • Work closely with subject matter experts (SMEs) and implement agreed-upon solutions using best practices

  • Select and integrate any Big Data tools and frameworks required to provide requested capabilities

  • Implement ETL processes (a minimal PySpark sketch follows this list)

  • Monitor performance and advise on any necessary infrastructure changes

  • Manage Hadoop, Spark, or EMR clusters, including all associated services

  • Design, implement, and tune tables, queries, stored procedures, indexes, etc.

  • Provide technical support to members of the TS and SA teams, as well as project support across client engagements

  • Work with geographically dispersed teams, embracing Agile and DevOps practices and driving their adoption to enable greater technology and business value

  • Stay current with relevant technology in order to maintain and improve authored applications

  • Assume other responsibilities as requested/required
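
For illustration, the ETL responsibility above might look like the following minimal PySpark batch sketch. The landing-zone path, the "events" dataset, and all column names are hypothetical assumptions for illustration, not Cogensia specifics.

    # Minimal PySpark batch ETL sketch (hypothetical paths and columns).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

    # Extract: read raw JSON events from a (hypothetical) landing zone.
    raw = spark.read.json("hdfs:///landing/events/2018-04-25/")

    # Transform: drop malformed rows, normalize the timestamp, stamp the load date.
    clean = (raw.dropna(subset=["event_id", "user_id"])
                .withColumn("event_ts", F.to_timestamp("event_ts"))
                .withColumn("load_date", F.current_date()))

    # Load: write partitioned Parquet for downstream Hive/Impala queries.
    (clean.write
          .mode("overwrite")
          .partitionBy("load_date")
          .parquet("hdfs:///warehouse/events/"))

    spark.stop()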

Experience and Skills:

  • Bachelor's or advanced degree in Computer Science/IT or related field

  • 5+ years of relevant experience

  • Proficient understanding of distributed computing principles

  • Ability to diagnose and resolve ongoing cluster operation issues

  • Proficiency with Hadoop, Spark, MapReduce, HDFS

  • Experience building stream-processing systems using solutions such as Storm or Spark Streaming (see the streaming sketch at the end of this posting)

  • Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala

  • Experience with integration of data from multiple data sources

  • Experience with NoSQL databases such as HBase, Cassandra, or MongoDB, and with columnar data warehouses such as Redshift

  • Knowledge of various data ingestion and ETL techniques and frameworks, such as Flume

  • Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O

  • Good understanding of Lambda Architecture, along with its advantages and drawbacks

  • Experience with a Hadoop distribution such as Cloudera, MapR, or Hortonworks

  • Experience writing Python scripts and using Python libraries

  • Experience desired with data warehousing design concepts: dimensional modeling, star/snowflake schemas, ETL/ELT, data marts, analytic playgrounds, and reporting techniques (a star-schema query sketch appears at the end of this posting)

  • Experience working with Agile software development methodologies, namely Scrum

  • Proven experience with team collaboration, release management, system and performance monitoring

  • Ability to work well with people from many different disciplines and varying degrees of technical experience

  • Strong organizational, presentation, and customer service skills

  • Excellent analytical and problem-solving skills to detect and resolve potential issues

  • Strong time management skills and the ability to handle multiple tasks at once
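
As a hedged illustration of the stream-processing requirement, here is a minimal Spark Structured Streaming sketch (the successor API to Spark Streaming; Storm would be an equally valid choice). The Kafka broker address, topic name, and event schema are assumptions for illustration, and the job also assumes the spark-sql-kafka connector package is available.

    # Minimal Spark Structured Streaming sketch (hypothetical Kafka source).
    # Assumes the spark-sql-kafka connector package is on the classpath.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    spark = SparkSession.builder.appName("clickstream-counts").getOrCreate()

    # Hypothetical click-event schema.
    schema = StructType([
        StructField("user_id", StringType()),
        StructField("page", StringType()),
        StructField("event_ts", TimestampType()),
    ])

    # Read a stream of JSON click events from a (hypothetical) Kafka topic.
    clicks = (spark.readStream
                   .format("kafka")
                   .option("kafka.bootstrap.servers", "broker:9092")
                   .option("subscribe", "clicks")
                   .load()
                   .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
                   .select("e.*"))

    # Count page views per 5-minute window, tolerating 10 minutes of late data.
    counts = (clicks.withWatermark("event_ts", "10 minutes")
                    .groupBy(F.window("event_ts", "5 minutes"), "page")
                    .count())

    # Print running counts; a production job would target a durable sink.
    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()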
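
Likewise, a sketch of the dimensional-modeling and Big Data querying requirements: a typical star-schema rollup run as Hive-style SQL through Spark. The fact_orders and dim_customer tables, their surrogate key, and the revenue measure are hypothetical, not an actual Cogensia model.

    # Minimal star-schema rollup sketch (hypothetical warehouse tables).
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("star-schema-demo")
             .enableHiveSupport()  # query Hive-managed warehouse tables
             .getOrCreate())

    # Join the fact table to a dimension on its surrogate key, then
    # aggregate a measure by a dimension attribute.
    report = spark.sql("""
        SELECT d.customer_segment,
               SUM(f.order_amount) AS total_revenue
        FROM   fact_orders f
        JOIN   dim_customer d
          ON   f.customer_key = d.customer_key
        GROUP  BY d.customer_segment
    """)

    report.show()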