Data Science + Industry
Dr. Puneett B.
Director @ American Express Enterprise Data Office | Generative Artificial Intelligence (AI) | Data Science & Business Intelligence (BI) | Trainer | Consultant | Deep Learning | Machine Learning | Risk Analytics
Thanks to the democratization of AI, developers are now finding it easier to design and integrate AI-driven decision-making and data-driven insights into user experiences and development workflows. Here are a few examples of how data science is "applied" to real-world applications across the industry:
The figure shows other domains and examples for applying data science techniques. Want to explore other applications? Check out the Review & Self Study section below.
Data Science + Research
While real-world applications often focus on industry use cases at scale, research applications and projects can be useful from two perspectives:
For students, these research projects can provide both learning and collaboration opportunities that improve your understanding of the topic and broaden your awareness of, and engagement with, relevant people or teams working in areas of interest. So what do research projects look like and how can they make an impact?
Let's look at one example - the MIT Gender Shades Study by Joy Buolamwini (MIT Media Lab), with a signature research paper co-authored with Timnit Gebru (then at Microsoft Research), that focused on
Results showed that although overall classification accuracy was good, error rates differed noticeably between subgroups - misgendering was more frequent for women and for people with darker skin types, which indicates bias.
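To make the idea of comparing error rates between subgroups concrete, here is a minimal sketch in Python. It is not the study's method or data: the group names, labels, and the helper `error_rate_by_group` are hypothetical, and the records are made-up toy values chosen only to show how a reasonable overall error rate can hide a large gap between groups.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: error_rate} for (group, true_label, predicted_label) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Made-up toy records for illustration only (NOT data from the study).
records = [
    ("group_a", "F", "F"), ("group_a", "M", "M"),
    ("group_a", "F", "F"), ("group_a", "M", "M"),
    ("group_b", "F", "M"), ("group_b", "M", "M"),
    ("group_b", "F", "F"), ("group_b", "F", "M"),
]

# Overall error rate across all records, ignoring group membership.
overall = sum(truth != pred for _, truth, pred in records) / len(records)

# Per-group error rates and the gap between the best and worst group.
rates = error_rate_by_group(records)
gap = max(rates.values()) - min(rates.values())

print(f"overall error rate: {overall:.2f}")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
print(f"largest gap between groups: {gap:.2f}")
```

Reporting the per-group rates next to the overall figure is the point of the sketch: a single aggregate number cannot reveal which subgroup carries the errors.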
Key Outcomes: Raised awareness that data science needs more representative datasets (balanced subgroups) and more inclusive teams (diverse backgrounds) to recognize and eliminate or mitigate such biases earlier in AI solutions. Research efforts like this are also instrumental in many organizations defining principles and practices for responsible AI to improve fairness across their AI products and processes.
Want to learn about relevant research efforts in Microsoft?
Data Science + Humanities
Digital Humanities has been defined as "a collection of practices and approaches combining computational methods with humanistic inquiry". Stanford projects like "rebooting history" and "poetic thinking" illustrate the linkage between Digital Humanities and Data Science - emphasizing techniques like network analysis, information visualization, spatial and text analysis that can help us revisit historical and literary data sets to derive new insights and perspective.
Want to explore and extend a project in this space?
Check out "Emily Dickinson and the Meter of Mood" - a great example from Jen Looper that asks how we can use data science to revisit familiar poetry and re-evaluate its meaning and the contributions of its author in new contexts. For instance, can we predict the season in which a poem was authored by analyzing its tone or sentiment - and what does this tell us about the author's state of mind over the relevant period?
To answer that question, we follow the steps of our data science lifecycle:
Using this workflow, we can explore the seasonal impacts on the sentiment of the poems and fashion our own perspectives on the author. Try it out yourself - then extend the notebook to ask other questions or visualize the data in new ways!
You can use some of the tools in the Digital Humanities toolkit to pursue these avenues of inquiry.
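As a rough illustration of the seasonal sentiment question, the sketch below scores short texts with a tiny word list and averages the scores per season. Everything in it is an assumption made for illustration: the word lists, the function names `sentiment_score` and `mean_sentiment_by_season`, and the placeholder texts (which are not real poems). The notebook mentioned above may use a different sentiment technique entirely; a real analysis would use a proper sentiment model or service.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, deliberately tiny sentiment lexicon.
POSITIVE = {"joy", "bright", "warm", "hope"}
NEGATIVE = {"grief", "cold", "dark", "alone"}


def sentiment_score(text):
    """Score text in [-1, 1]: (positive words - negative words) / total words."""
    words = text.lower().split()
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


def mean_sentiment_by_season(poems):
    """Average the sentiment score of (season, text) pairs per season."""
    scores = defaultdict(list)
    for season, text in poems:
        scores[season].append(sentiment_score(text))
    return {season: mean(values) for season, values in scores.items()}


# Placeholder texts, not real poems; the season labels are made up.
poems = [
    ("summer", "bright joy warm day"),
    ("summer", "hope at dawn today"),
    ("winter", "cold dark grief now"),
    ("winter", "alone with hope again"),
]

by_season = mean_sentiment_by_season(poems)
for season, score in sorted(by_season.items()):
    print(f"{season}: {score:+.3f}")
```

Swapping the lexicon scorer for a stronger sentiment model only requires replacing `sentiment_score`; the per-season grouping stays the same.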
Data Science + Sustainability
The 2030 Agenda For Sustainable Development - adopted by all United Nations members in 2015 - identifies 17 goals including ones that focus on Protecting the Planet from degradation and the impact of climate change. The Microsoft Sustainability initiative supports these goals by exploring ways in which technology solutions can support and build more sustainable futures with a focus on 4 goals - being carbon negative, water positive, zero waste, and bio-diverse by 2030.
Tackling these challenges in a scalable and timely manner requires cloud-scale thinking - and large-scale data. The Planetary Computer initiative provides 4 components to help data scientists and developers in this effort:
The Planetary Computer Project is currently in preview (as of Sep 2021) - here's how you can get started contributing to sustainability solutions using data science.
Think about how you can use data visualization to expose or amplify relevant insights into areas like climate change and deforestation. Or think about how insights can be used to create new user experiences that motivate behavioral changes for more sustainable living.
Data Science + Students
We've talked about real-world applications in industry and research and explored data science application examples in digital humanities and sustainability. So how can you build your skills and share your expertise as a data science beginner?
Here are some examples of data science student projects to inspire you.