The primary objective of a Data Engineer focused on Integration Services is to prepare datasets for modeling or analytics through data profiling, cleansing, and linking methodologies. Data Engineers in this role understand that data quality is affected by the way data is entered, stored, and managed. With attention to detail, they improve data quality, shorten the implementation life cycle, and deepen internal and external analysts' understanding of the data. In the process, they discover meaning in the data itself that drives value for our clients.
Our data scientists, insights strategists, and clients count on Data Engineers to verify the reliability and effectiveness of the data and to pinpoint gaps. Data Engineers know they have succeeded when they deliver a cleaned, linked dataset and provide the required documentation to waiting team members.
Essential Duties & Responsibilities:
- Acquire pre-identified datasets for analytics and supply appropriate documentation for the team.
- Perform data profiling tasks (querying, mining, etc.) to determine risks to the proposed analytics solution based on the quality of, and anomalies within, existing data sources.
- Interface internally and externally to understand business rules and identify gaps across data definitions. Perform data cleaning and table joins using knowledge of the client’s business rules.
- Participate in Engineering, Product Delivery, and Client calls to facilitate knowledge transfer and clarification of business rules and client data flow.
- Create source-to-target maps, data audits, or other reference documents and deliver them as input into the next workflow. Data dictionaries must have concise, consistent, and unambiguous definitions.
- Recommend and implement enhancements that standardize and streamline processes, assure data quality and reliability, and reduce processing time to meet client expectations.
- Communicate progress and completion to project team. Escalate roadblocks that may impact delivery schedule.
- Actively participate in Quality Assurance procedures.
- Create hourly, daily, weekly, or monthly automated processes built on proven and completed work.
- Additional duties as assigned to ensure client and company success.
- Follow company policies and procedures that protect sensitive data and maintain compliance with established security standards and best practices.
Required Qualifications:
- Bachelor’s degree in Computer Science, Computer Engineering, Mathematics, or a related field, or 3+ years of relevant work experience.
- Experience working with relational database structures, SQL, and/or flat files, and with performing table joins, web crawling, and web development.
- Proficiency in one or more of the following programming languages: PHP, Java, or Python; familiarity with Node.js.
- Natural curiosity about what’s hidden in the data, expressed through exploration, attention to detail, and the ability to see the big picture, much like assembling a 10,000-piece puzzle.
- Resourceful self-starter who gets things done and is productive working independently or collaboratively; ours is a fast-paced entrepreneurial environment with performance expectations and deadlines.
- Ability to learn quickly and contribute ideas that make the team, processes, and product better.
- Ability to communicate your ideas, verbally and in writing, so that data engineers, analysts, and management can understand them.
- Ability to defend your professional decisions and present evidence that your ideas and processes are correct.
- Share our values for growth, relationships, integrity, and passion for making data analytics-ready.
Preferred Qualifications:
- Experience working in one of the following industries: healthcare, financial services, media, or legal.
- Experience working with commercial systems built on relational databases, such as electronic medical records or other clinical systems, customer relationship management software, or accounting systems.
- Familiarity with various data management methodologies, data exploration techniques, data quality assurance practices, and data discovery/visualization tools.
- Prior experience supporting business intelligence operations and managing technical, business, and process metadata related to data warehousing.
- Experience working with NoSQL, Hive, MapReduce, and other Big Data technologies is preferred but not required.
- Experience with distributed and/or parallel systems, or knowledge of the underlying concepts.
- Willing to train the right candidate.
Location: South Bend, IN