DataKind Spring Internship: Empowering Change Through Data Science
Shashank Ravishankar
Data Scientist | Capgemini North America | AI & Analytics Practice
One of the most rewarding learning experiences of my professional career has been my internship with DataKind. DataKind is a 'Data for Good' organization that uses data science and AI to tackle the world’s toughest challenges. I had the privilege of working directly under Matthew Harris, Ph.D., the head of Data Science at DataKind, exploring some of the latest technologies in the field of Artificial Intelligence.
Throughout my internship, I worked extensively with AI technologies, with a primary focus on Large Language Models. I developed a metadata prediction tool using OpenAI’s 'text-davinci-002', a generative AI model for text completion and prediction tasks.
Humanitarian Exchange Language (HXL) is a metadata language framework used by humanitarian aid workers to share resources and datasets for analysis, supporting the work carried out in the field. Metadata is essentially "data about data". Well-maintained metadata gives analysts the context they need to work with their data. My task focused on using large language models to predict metadata for datasets on the United Nations Humanitarian Data Exchange.
This project involved extensive data engineering tasks with Spark and data querying with SQL. After building out a comprehensive data ingestion pipeline, the next task focused on data analysis. This involved building features from the ingested data that would be used to train the machine learning model.
Large language models usually take their input in the form of a prompt. A prompt, as the name indicates, is an instruction provided by the user to the model. This project gave me extensive experience in the emerging practice of prompt engineering. Since the entire project was built in Python, this opportunity also allowed me to sharpen my software engineering skills.
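To make the idea of a prompt concrete, here is a minimal Python sketch of how a few-shot prompt for HXL tag prediction might be assembled. This is an illustration only and not the code from the project: the function names, the prompt wording, and the example columns are all hypothetical, and the sketch stops at building the prompt text rather than calling any model.

```python
from typing import List, Sequence, Tuple

# Hypothetical few-shot examples: (column header, sample values, HXL tag).
# The column names and values are made up for illustration.
FEW_SHOT_EXAMPLES: List[Tuple[str, List[str], str]] = [
    ("Province", ["Kabul", "Herat"], "#adm1+name"),
    ("Report date", ["2023-01-15", "2023-02-01"], "#date"),
    ("People affected", ["1200", "450"], "#affected"),
]


def format_column(header: str, samples: Sequence[str], max_samples: int = 3) -> str:
    """Describe one dataset column, leaving the HXL tag open for completion."""
    shown = ", ".join(str(s) for s in samples[:max_samples])
    return f"Column header: {header}\nSample values: {shown}\nHXL tag:"


def build_prompt(header: str, samples: Sequence[str]) -> str:
    """Build a few-shot prompt: an instruction, solved examples, then the new column."""
    parts = ["Predict the HXL hashtag for each dataset column."]
    for ex_header, ex_samples, ex_tag in FEW_SHOT_EXAMPLES:
        # Solved examples end with the known tag.
        parts.append(f"{format_column(ex_header, ex_samples)} {ex_tag}")
    # The final block ends at "HXL tag:" so a completion model fills in the tag.
    parts.append(format_column(header, samples))
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Organisation", ["UNICEF", "WFP"]))
```

The resulting string is what would be sent to a text completion model, whose continuation after the final "HXL tag:" is read as the predicted tag. Much of prompt engineering lies in choosing which solved examples to include and how much of each column to show.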
Working on real-world data science projects at DataKind taught me the significance of collaboration and teamwork in a professional setting. I was fortunate to be part of a supportive team that fostered a culture of knowledge sharing and encouraged innovative solutions to challenges.
In conclusion, my spring internship at DataKind as a Data Scientist has been a transformative experience. The knowledge and skills acquired during this internship will undoubtedly shape my future endeavors in the field of data analytics and beyond.
#UTDMSBA #UTD #JSOM #HIREJSOM #AI #DATAFORGOOD #INTERNSHIP
Human Resources/People Ops Professional
10 months ago: Yay! It has been great having you as an intern these past couple of months!