DataKind Spring Internship: Empowering Change Through Data Science

One of the most rewarding learning experiences of my professional career has been my internship with DataKind, a 'Data for Good' organization that uses data science and AI to tackle the world’s toughest challenges. I had the privilege of working directly under Matthew Harris, Ph.D., the head of Data Science at DataKind, exploring recent developments in the field of Artificial Intelligence.

Throughout my internship, I worked extensively with AI technologies, with a primary focus on Large Language Models. I developed a metadata prediction tool built on OpenAI’s 'text-davinci-002', a generative AI model for text completion and prediction tasks.

Humanitarian Exchange Language (HXL) is a metadata language framework used by humanitarian aid workers to share resources and datasets for analysis, augmenting the work carried out in the field. Metadata is essentially "data about data": well-logged metadata lets analysts approach a dataset with informed context about what it contains. My task was to use large language models to predict metadata for datasets on the United Nations Humanitarian Data Exchange.

This project involved extensive data engineering with Spark and data querying with SQL. After building out a comprehensive data ingestion pipeline, I moved on to data analysis: building features from the ingested data that would be used to train the machine learning model.

Large language models usually take their input in the form of a prompt. A prompt, as the name indicates, is an instruction provided by the user to the model. This project gave me extensive experience in the emerging practice of prompt engineering. Since the entire project was built in Python, it also allowed me to sharpen my software engineering skills.
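To make the idea of a prompt concrete, here is a minimal sketch of how a few-shot prompt for predicting a column's HXL tag might be assembled in Python. It is not the project's actual code: the function names, the example columns, and the prompt wording are all illustrative assumptions, and the step of sending the prompt to a model is left out.

```python
from typing import List, Sequence, Tuple

# Hypothetical few-shot examples: (column header, sample values, HXL tag).
# These rows are made up for illustration, not taken from the project.
FEW_SHOT_EXAMPLES: List[Tuple[str, List[str], str]] = [
    ("Province", ["Kabul", "Herat"], "#adm1+name"),
    ("Organisation", ["UNICEF", "WFP"], "#org+name"),
    ("People affected", ["1200", "450"], "#affected"),
]


def format_column(header: str, samples: Sequence[str]) -> str:
    """Describe one dataset column and leave the tag for the model to fill in."""
    return f"Column: {header}\nSample values: {', '.join(samples)}\nHXL tag:"


def build_prompt(
    header: str,
    samples: Sequence[str],
    examples: Sequence[Tuple[str, List[str], str]] = FEW_SHOT_EXAMPLES,
) -> str:
    """Build a few-shot prompt: an instruction, solved examples, then the open query."""
    parts = ["Predict the HXL hashtag for each dataset column."]
    for ex_header, ex_samples, ex_tag in examples:
        # Each solved example shows the model the expected answer format.
        parts.append(f"{format_column(ex_header, ex_samples)} {ex_tag}")
    # The final block ends at "HXL tag:" so the model completes it.
    parts.append(format_column(header, samples))
    return "\n\n".join(parts)


prompt = build_prompt("Country", ["Kenya", "Somalia"])
print(prompt)
```

The resulting string would be passed to a text completion model, which is expected to continue the text after the final "HXL tag:" with its predicted tag. Ending the prompt on an unfinished pattern is a common way to steer completion-style models toward a short, structured answer.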

Working on real-world data science projects at DataKind allowed me to understand the significance of collaboration and teamwork in a professional setting. I was fortunate to be part of a supportive team that fostered a culture of knowledge sharing and encouraged innovative solutions to challenges.

In conclusion, my spring internship at DataKind as a Data Scientist has been a transformative experience. The knowledge and skills acquired during this internship will undoubtedly shape my future endeavors in the field of data analytics and beyond.


#UTDMSBA #UTD #JSOM #HIREJSOM #AI #DATAFORGOOD #INTERNSHIP

Nelsa Peña, MPA (she/her)

Human Resources/People Ops Professional

10 months

Yay! It has been great having you as an intern this past couple of months!
