A new start-up company is now forming a new Data team in Hong Kong.
- Extract large-scale data from the internet to support the training and optimization of foundational models.
- Conduct extensive cleaning and preprocessing of external data to ensure its quality for model training.
- Develop and maintain automated pipelines for continuous data updates and validation.
- Collaborate closely with the algorithm team to grasp data requirements and facilitate effective data transfer.
- Supply high-quality, cleaned datasets to enhance the performance of the algorithm team’s models.
- Bachelor’s degree in Computer Science, Electrical Engineering, or a related field.
- Expertise in data crawling and cleaning.
- Proficiency in Python, with skills in web requests, data security, HTML, and JavaScript.
- Experience with data warehousing and the ability to manage both internal and external data sources.
- Passion for problem-solving in data engineering and statistics.
- Self-motivated with strong communication skills.
- Proficient in written and spoken English and Chinese (Cantonese or Mandarin).