Data Engineer - Basel, Switzerland - MDPI

    1 week ago

    Full-time
    Description

    A pioneer in scholarly open access publishing, MDPI has supported academic communities since 1996. Our mission is to foster open scientific exchange in all forms, across all disciplines. We operate more than 400 diverse, peer-reviewed, open access journals supported by over 66,000 academic editors. We serve scholars from around the world to ensure the latest research is openly and broadly available.

    MDPI is headquartered in Switzerland with additional offices in Europe, Asia and North America. We are committed to ensuring that high quality research is made available as quickly as possible. We also support sustainability projects, with sustainability as a key theme in many journals and through the MDPI Sustainability Foundation.

    To strengthen our AI / Data team, we are looking for a Data Engineer who will play a key role in supporting us with data modelling, quality, validation, analysis, governance, warehousing and ETL pipelines. This is a full-time, permanent position based in our office in Basel, Switzerland.

    Tasks & Responsibilities

  • Data modelling: design and implement data models that support business requirements and optimise for performance and scalability on large datasets.
  • Data quality and validation: design and implement data quality checks within the pipelines to ensure data consistently meets expectations.
  • Data integration: integrate data from all sources within the company and make it equally available for local and remote organisations.
  • Data analysis: develop and maintain insightful reports and dashboards that surface business metrics, trends and performance.
  • Data governance: help ensure data compliance and harmonisation by collaborating closely with the data governance team on its policies and regulations.
  • Extract-Transform-Load (ETL) pipelines: create and maintain fast and efficient data pipelines from different relational and non-relational sources (MySQL, Apache Solr, MongoDB, APIs) into a centralised data lake and data warehouse, using the Apache Airflow orchestrator (see the sketch after this list).
  • Data Warehouse tuning and optimisation: contribute to the optimisation and performance of the orchestrator and data warehouse to ensure long-term environment health and scalability.
  • Cross-functional collaboration: collaborate within an international environment to provide your expertise to data scientists, analysts and business stakeholders.
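
    For a concrete picture of the day-to-day work, the following is a minimal sketch of an Airflow pipeline with an in-pipeline quality check, in the spirit of the ETL and data quality responsibilities above. The source table, hosts, credentials and checks are illustrative assumptions, not a description of our actual pipelines.

    # Minimal Airflow DAG sketch: extract from MySQL, validate, load into the warehouse.
    # All names (tables, hosts, credentials) are hypothetical placeholders.
    from datetime import datetime

    import pandas as pd
    from airflow.decorators import dag, task
    from sqlalchemy import create_engine

    @dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
    def mysql_to_warehouse():
        @task
        def extract() -> list[dict]:
            # Hypothetical MySQL source (uses the pymysql driver).
            engine = create_engine("mysql+pymysql://etl:***@mysql-host/source_db")
            df = pd.read_sql("SELECT id, title FROM articles", engine)
            return df.to_dict(orient="records")

        @task
        def validate(rows: list[dict]) -> list[dict]:
            # In-pipeline quality check: fail fast rather than load bad data.
            if not rows:
                raise ValueError("extracted batch is empty")
            if any(row["id"] is None for row in rows):
                raise ValueError("found rows with a NULL primary key")
            return rows

        @task
        def load(rows: list[dict]) -> None:
            # Hypothetical warehouse target: append into a staging table.
            engine = create_engine("postgresql+psycopg2://etl:***@warehouse-host/dwh")
            pd.DataFrame(rows).to_sql("stg_articles", engine, if_exists="append", index=False)

        load(validate(extract()))

    mysql_to_warehouse()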
    Requirements

  • Degree in computer science or a data engineering certification.
  • Knowledge of the Python programming language.
  • Experience with Pandas or Polars (dataframes), SQLAlchemy (database toolkit), or Streamlit is a plus (see the sketch after this list).
  • Hands-on experience with SQL.
  • Experience with relational databases. Knowledge of MySQL and PostgreSQL-specific features is a plus.
  • Good understanding of NoSQL databases: Apache Solr, MongoDB.
  • Excellent problem-solving, communication, and organisational skills.
  • Technical expertise in data mining and modelling techniques.
  • Basic understanding of Machine Learning models.
  • Proven ability to work independently and within a team.
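
    As an illustration of how the tools above fit together, here is a minimal sketch that pulls a monthly aggregate from the warehouse with SQLAlchemy and Pandas and renders it with Streamlit (run with: streamlit run app.py). The table, columns and connection string are illustrative assumptions.

    # Minimal dashboard sketch; table, columns and connection string are hypothetical.
    import pandas as pd
    import streamlit as st
    from sqlalchemy import create_engine

    engine = create_engine("postgresql+psycopg2://reader:***@warehouse-host/dwh")

    st.title("Submissions per month")

    # Aggregate in SQL so only the small result set reaches the app.
    df = pd.read_sql(
        """
        SELECT date_trunc('month', submitted_at) AS month,
               count(*) AS submissions
        FROM stg_submissions
        GROUP BY 1
        ORDER BY 1
        """,
        engine,
    )

    st.line_chart(df.set_index("month")["submissions"])
    st.dataframe(df)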
    What we offer

  • The opportunity to contribute to the academic/scientific community;
  • Flexible working hours;
  • 25 Home Office Days per year;
  • Support for a healthy lifestyle;
  • Team bond strengthening through team-building events;
  • Professional growth opportunities with our global training system;
  • Working in a collaborative, diverse, and socially responsible team;
  • Company retreat facility;
  • Full-coverage insurance for accidents/daily sickness;
  • Prime location near Basel train station and city center.