
Data Engineer

Seargin is a dynamic multinational tech company operating in 50 countries. At Seargin, we drive innovation and create projects that shape the future and greatly enhance the quality of life. You will find our solutions in the space industry, in research that supports scientists developing cancer drugs, and in innovative technological implementations for industrial clients worldwide. These are just some of the areas in which we operate.

Position:

Data Engineer

Location:

Remote

Country:

EU

Form of employment:

B2B

Experience level:

Senior

Responsibilities:

  • Data Accessibility and Sharing

    Enable the finding, accessing, processing, publishing, and sharing of biomedical data to facilitate insights for secondary use

  • Secondary Use Data Engine Development

    Develop and maintain the EDIS end-to-end engine designed for secondary use and primary exploration, ensuring seamless integration with externally generated data sources

  • Integration of Real-World Data (RWD)

    Integrate real-world data from both clinical and non-clinical sources into the data ecosystem to enhance data richness and insight generation

  • ETL Process Design and Implementation

    Design and implement efficient Extract, Transform, Load (ETL) processes to ensure high-quality data is available for analysis and reporting

  • Data Warehouse Enhancements

    Work on enhancements to the data warehouse infrastructure, optimizing performance and scalability to support large volumes of biomedical data

  • Collaboration with Data Scientists

    Collaborate with data scientists and analysts to understand data requirements and ensure data availability for analytics and reporting

  • Data Quality Assurance

    Implement data quality checks and validation processes to ensure the integrity and accuracy of data throughout its lifecycle

  • Monitoring and Maintenance

    Monitor data pipelines and workflows, troubleshooting issues as they arise and performing routine maintenance to ensure data reliability

  • Documentation of Data Processes

    Document data workflows, processes, and standards to maintain transparency and facilitate onboarding for new team members

  • AWS Development

    Utilize AWS services (e.g., S3, Redshift, Lambda) to build and manage cloud-based data solutions that support data storage, processing, and analysis

  • Performance Optimization

    Optimize data processing and storage solutions for efficiency and performance, ensuring quick access to large datasets

  • Compliance and Security

    Ensure compliance with relevant data protection regulations and implement security measures to protect sensitive biomedical data


Requirements:

  • Programming Language Proficiency

    4+ years of experience with programming languages commonly used for data pipelines, such as Python or R

  • SQL Expertise

    4+ years of experience working with SQL for data querying and manipulation

  • Data Pipeline Maintenance Experience

    3+ years of experience in maintaining and optimizing data pipelines to ensure data accuracy and efficiency

  • Diverse Storage Solutions Experience

    3+ years of experience working with various types of storage solutions, including filesystem, relational databases, MPP (Massively Parallel Processing), and NoSQL databases

  • Data Architecture Knowledge

    3+ years of experience with data architecture concepts, including data modeling, metadata management, workflow management, ETL/ELT processes, real-time streaming, data quality, and distributed systems

  • Cloud Technologies Experience

    3+ years of experience with cloud technologies, particularly for data pipelines, using tools such as Airflow, Glue, and Dataflow, and services such as Elastic, Redshift, BigQuery, Lambda, S3, and EBS

  • Relational Database Knowledge

    Strong knowledge of relational databases, including schema design and query optimization (optional)

  • Programming in Java and/or Scala

    1+ years of experience in Java and/or Scala for data processing and application development

  • Data Serialization Skills

    Very good knowledge of data serialization formats such as JSON, XML, and YAML

  • Version Control and DevOps Tools

    Excellent knowledge of Git and Gitflow, along with experience in DevOps tools such as Docker, Bamboo, Jenkins, and Terraform

  • Performance Analysis and Troubleshooting

    Ability to conduct performance analysis, troubleshooting, and remediation of data pipelines (optional)

  • Unix Proficiency

    Excellent knowledge of Unix/Linux environments for data processing and management

  • Data Quality Assurance

    Experience in implementing data quality checks and validation processes to ensure data integrity and reliability throughout the pipeline

  • Data Security Practices

    Understanding of data security practices and regulations, particularly concerning sensitive health data and compliance standards in the pharmaceutical industry

  • Data Visualization Tools Knowledge

    Familiarity with data visualization tools (e.g., Tableau, Power BI) to support data-driven decision-making and reporting

  • Collaboration Skills

    Strong collaboration skills to work effectively with cross-functional teams, including data scientists, business analysts, and pharmaceutical partners


Apply & join the team



