Data Engineering & Pipelining Lead, Auckland

Last update 2024-04-18
Expires 2024-05-18
ID #2001465289
New Zealand, Auckland, Auckland
Modified January 20, 2024


We are excited to be expanding our Life Science Data Analytics Platform group in our Boston and Cambridge offices, and we are looking for a Data Science & Pipelining Lead to join it.
Arrayo assists top-tier clients in life sciences in implementing effective data analytics strategies. We make sure that data assets are available and accessible for advanced analytics, so that their inherent value can be realized more readily.
The Data Science & Pipelining Lead will be responsible for driving the socialization and utilization of technologies, algorithms, models, and methods for science-driven data analytics R&D projects. As an Arrayo team member, you will work to understand users' requirements and drive the definition, design, implementation, and validation of cutting-edge pipelines and models used to process and analyze diverse sources of data.

Develop data pipelines to extract, transform, and load data from various data sources in various forms.
Work in collaboration with key scientific personnel to build, test, adapt, support, and validate pipelines with integration into production systems.
Manage the definition, design, implementation, and validation of data pipelines and models to analyze data from diverse sources.
Write custom scripts to extract data from unstructured/semi-structured sources.
Make effective use of advanced pipeline technologies, including Prefect, Nextflow, Airflow, Cromwell, KNIME, Databricks, Luigi, petl, and AWS Data Pipeline.
Deliver solutions in an efficient, agile manner.
Contribute to many different projects in a dynamic, fast-moving environment.
Collaboratively translate scientific and business questions into data and analytics requirements.
Drive rapid prototyping for further implementation of analytical products.
Partner with SMEs to translate modeling outputs into business language.
Work with IT resources to enable appropriate data flow/data models.

B.S. in information systems, computer science, computer engineering, or a related field, with 4+ years of experience working within bioinformatics, genomics, genetics, or other science-related environments. M.S. or Ph.D. is preferred.
Knowledge of a subset of analytical approaches (e.g., machine learning, statistical analysis, predictive modeling, visual analytics).
Proficiency building, running, and monitoring pipelines in cloud computing environments.
Experience in commonly used command-line NGS tools is a plus (BWA, SAMTools, Bowtie2, Picard, PINDEL, GATK, etc.).
Ability to understand and communicate statistical measures for interrogating the quality of data manipulation is preferred.
Demonstrated ability to communicate efficiently and work effectively with a team of scientists and other engineers.
Experience with SQL and modeling relational databases. PostgreSQL experience preferred.
Experience using and designing web services and REST APIs.
Knowledge of software development best practices.
Experience working in cloud computing environments (e.g., AWS, Azure) is preferred.
Arrayo is an Equal Employment Opportunity employer and as such does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.


Job details:

Job type: Full time
Contract type: Permanent
Salary type: Monthly
Occupation: Data engineering & pipelining lead
