Data Product Engineer
Location: Buffalo, NY
Duration: 6+ months
Pay Rate: TBD
This resource will start remotely but will be expected to work on-site full-time when the client requests it; the client is currently aiming for sometime in Q1.
Some of the most common responsibilities include:
• Participate in detailed technical design, development, implementation, and support of data applications.
• Develop, construct, test, automate, and maintain data pipelines for enterprise and non-enterprise platforms.
• Maintain standards compliance and ensure development artifacts align with the patterns/frameworks designed by software engineering teams.
• Solve business problems with the right data architecture.
• Identify ways to improve data reliability, efficiency, and quality.
• Use large data sets to address business issues.
• Prepare data for predictive and prescriptive modeling.
• Leverage data to discover tasks that can be automated.
• Deliver updates to stakeholders based on analytics.
• Familiarity with Spark programming paradigms (batch and stream processing).
• Familiarity with MDM concepts and Informatica MDM tooling.
• Understanding of the different data abstractions Spark offers for different use cases, the use of optimal data formats, and other optimization techniques.
• Strong programming skills in at least one of the following languages: Java (J2EE strongly preferred) or Scala. Familiarity with a scripting language such as Python, as well as Unix/Linux shells.
• Strong knowledge of writing optimized Spark and Hive SQL, and experience tuning poorly performing queries.
• Outstanding programming and debugging skills. Strong knowledge of common algorithms and data structures.
• Experience creating APIs (REST/Protobuf).
• Experience with data integration patterns (point-to-point, pub/sub, reply-to queue).
• Experience in developing containerized services.
• Good understanding of job scheduling and workflow orchestration through enterprise scheduling tools, preferably CA Automic or Control-M.
• Strong experience with SQL and relational databases such as PostgreSQL, MySQL, Teradata, SQL Server, and Oracle.
• Experience with Elasticsearch/Lucene, graph databases, or NoSQL stores.
• Experience in data wrangling.
• Analytical background for finding hidden patterns in data.
• Good understanding of data modeling concepts.
• Familiarity with one or more stream-processing/queuing technologies such as Spark Streaming, Kafka, Kinesis, or Flink preferred.
• Familiarity and prior experience with Agile / Scrum development methodologies.
• Prior experience deploying to cloud platforms, preferably Azure or AWS.
• Familiarity with any object-oriented programming language.
• Prior experience with Continuous Integration/Continuous Delivery tools and pipelines such as Jenkins, Maven, and Gradle.
• Experience working in a regulated industry is preferred.