Data Engineer - Abu Dhabi, United Arab Emirates - Marc Ellis

Description

Job Type: 12 Months, Extendable

Responsibilities:
- Design, construct, install, test, and maintain highly scalable data pipelines, with a focus on supporting machine learning models and analytics.
- Oversee and manage the data architecture on the Azure platform for efficient data operations.
- Work closely with data scientists, ML engineers, and stakeholders to ensure that data is accessible, consistent, and reliable for ongoing projects.
- Develop and maintain APIs for data access and manipulation, and integrate with external data services as needed.
- Manage and optimize data storage solutions, including relational databases, search engines such as Elasticsearch, and NoSQL databases, to support the requirements of machine learning models.
- Implement processes to monitor data quality and ensure that production data is accurate and available for key stakeholders.
- Collaborate with ML engineers on data-related technical issues and provide architectural guidance and solutions.
- Ensure compliance with data security and privacy policies.
- Maintain clear, up-to-date documentation, including data dictionaries, metadata, and architectural diagrams.

Requirements:
- Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field, or equivalent work experience.
- 3+ years of experience in a Data Engineering role.
- Expert-level proficiency in Python programming.
- Experience handling large volumes of unstructured data (ideally TBs); proficiency beyond structured databases such as MySQL.
- Proven experience managing data at large scale, both in terms of storage (GBs) and number of records/files (multi-million), as the role involves handling TBs of data.
- Experience working with audio, video, or image data.
- Experience using Hugging Face models for various tasks.
- Knowledge of and skills in various data science techniques and NLP.
- Familiarity with machine learning frameworks such as TensorFlow and PyTorch.
- Strong understanding of data warehousing concepts, ETL processes, and data modeling.
- Knowledge of DevOps, CI/CD methods, and containerization technologies such as Docker and Kubernetes.
- Experience with real-time data processing.
- Experience with visualization tools such as Kibana and Power BI; Neo4j for graph visualization is a plus.
- Experience with PySpark and big data handling is a plus.