Discussing ChatGPT
Príncipe Zanguilo
Business Transformation at AXCO - CMG Consulting Group #Data #AI #ML #Research
Context
When ChatGPT was launched, we both (Domingos Pelezo and I) got excited and started using it, replacing Stack Overflow for specific programming topics and Google for general topics on data science, AI, and machine learning. Amazed by its potential, and knowing that ChatGPT was not connected to any search engine, we asked ourselves: what would it be like if it were connected to Google's engine?
We raised our question about Google because we consider it the most widely used search engine. Before ChatGPT, Google was the first place we went with questions on any subject. But because Google is a large corporation with some level of government control, two main concerns arose.
Bias
Here comes OpenAI, the creator of ChatGPT (built on top of GPT-3), founded to counterbalance Google on data privacy and, above all, on bias. When it was founded, it was an open-source non-profit company, at least according to Elon Musk, one of its co-founders. But now that Microsoft has invested more than 10 billion dollars in the project, we believe ChatGPT will become just one more tool that reinforces bias in society. Because the models will be fed with Bing's data, we believe the bias will consistently favor Western values and liberal ideologies, something that has already been reported in examples shared on the internet.
Now that we know that connecting ChatGPT to a search engine creates a very big issue and a hard challenge to solve, our next question is: to what magnitude will the bias be reinforced, considering that ChatGPT linked to Bing can even express attitude?
To answer the question above, we believe it is important to first understand whether the bias comes from the data, from the algorithm, or from the engineers during decision-making and data training. Beyond the information the model is fed, what conclusions or relations is the model building from that information, and is that something OpenAI can still control, or does only the AI control it? OpenAI and Microsoft should clarify this to the public.
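One simple way to probe the first of these sources, bias coming from the data, is to inspect the distribution of the training corpus before any training happens. Below is a minimal sketch in Python; the labels and the toy corpus are hypothetical, purely for illustration, not a description of how OpenAI builds its datasets:

```python
from collections import Counter

def label_skew(samples):
    """Return the share of each label in a (text, label) dataset.

    A heavily skewed distribution is a hint that bias may enter
    the model from the data itself, before any algorithmic choice.
    """
    counts = Counter(label for _, label in samples)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Hypothetical toy corpus: opinion snippets tagged by viewpoint.
corpus = [
    ("text a", "western"),
    ("text b", "western"),
    ("text c", "western"),
    ("text d", "non-western"),
]

print(label_skew(corpus))  # {'western': 0.75, 'non-western': 0.25}
```

An audit like this only surfaces representation imbalance; it says nothing about bias introduced later by the algorithm or by the engineers' decisions, which is exactly why those other two sources also need scrutiny.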
We agreed that we should consider the options available to tackle the problem. First, we should know whether the model will be retrained with unbiased data; second, whether new data is being added to counter the skew the model has overfitted to, or whether it is being retrained on a completely new dataset.
Should we allow personal data sharing with foundational models to enable a customized experience, and at what risk? This is a question for the public.
For now, our recommendations are: